CN111291417B - Method and device for protecting data privacy of multi-party combined training object recommendation model - Google Patents
- Publication number
- CN111291417B (application CN202010384206.4A)
- Authority
- CN
- China
- Prior art keywords
- kth
- matrix
- fragment
- user
- parties
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the specification provides a method for a multi-party joint training object recommendation model for protecting data privacy, wherein multiple parties are N parties, and a total scoring matrix of P objects by M users is maintained together. The method is executed by any kth party in the N parties, and comprises the following steps: firstly, secret sharing is carried out to obtain the following matrix fragments: a kth scoring matrix segment of the total scoring matrix, a kth object matrix segment of the initialized object feature matrices of the P objects, and a kth user matrix segment of the initialized user feature matrices of the M users; and performing repeated iterative updating, specifically, performing secret sharing matrix operation with other N-1 parties based on the matrix fragments to obtain a kth fragment of the object updating gradient and the user updating gradient, and further updating the kth object matrix fragment and the kth user matrix fragment. After the multiple iterative updating is finished, the N parties respectively exchange the updated matrix fragments to carry out matrix reconstruction, and further establish respective object recommendation models.
Description
Technical Field
One or more embodiments of the present disclosure relate to the technical field of information security, and in particular, to a method and an apparatus for a multi-party joint training object recommendation model for protecting data privacy.
Background
In a recommendation system, a single data party usually learns the potential preferences of a user feature vector and an object feature vector by using stored scoring data of a plurality of users on a plurality of objects, and then predicts the scoring of some objects by a certain user by using the learned feature vectors of the users and the objects, wherein some objects are usually objects for which the certain user has not performed scoring behavior, and then performs object recommendation to the certain user according to the predicted scoring.
As research interest grows in multiple parties performing machine learning collaboratively on their combined data, different data parties that store different scoring data hope to use the scoring data held by all of them to learn the user feature vectors and the object feature vectors jointly, so as to improve the accuracy of the learned feature vectors and, in turn, the accuracy and effectiveness of the predicted scores. The difficulty lies in how to ensure the privacy and security of each party's data, including user data, object data, scoring data and the like, during the learning process.
Therefore, a solution is needed, so that multiple parties can jointly learn feature vectors for constructing respective object recommendation models on the premise of protecting data privacy. However, no relevant solution currently exists.
Disclosure of Invention
One or more embodiments of the present specification describe a method for protecting a multi-party joint training object recommendation model for data privacy, which can achieve multi-party collaborative updating of a user feature vector and an object feature vector while ensuring the data privacy security of each party, thereby constructing respective object recommendation models and further making more accurate object recommendation.
According to a first aspect, a method for multi-party joint training of an object recommendation model with data privacy protection is provided, where the multiple parties are N parties that jointly maintain a total scoring matrix of M users for P objects. The method is performed by any kth party of the N parties and includes:
acquiring the kth scoring matrix fragment of the total scoring matrix through secret sharing;
acquiring, through secret sharing, the kth object matrix fragment of the initialized object feature matrix of the P objects and the kth user matrix fragment of the initialized user feature matrix of the M users; and
performing a plurality of iterative updates, wherein any iterative update includes, for any first user and any first object:
acquiring, from the kth scoring matrix fragment, the kth scoring fragment of the first user's score for the first object; acquiring the kth object feature fragment of the first object from the kth object matrix fragment of the previous iteration; and acquiring the kth user feature fragment of the first user from the kth user matrix fragment of the previous iteration;
based on the kth user feature fragment and the kth object feature fragment, performing a secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment of the feature vector similarity between the first user and the first object, and taking the difference between the kth similarity fragment and the kth scoring fragment as the kth error fragment;
based on the kth error fragment and the kth object feature fragment, calculating the user feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth user gradient fragment;
based on the kth error fragment and the kth user feature fragment, calculating the object feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth object gradient fragment;
updating the kth user feature fragment according to the kth user gradient fragment; and updating the kth object feature fragment according to the kth object gradient fragment.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix; the obtaining of the kth scoring matrix segment of the total scoring matrix through secret sharing includes: splitting the kth scoring submatrix into N fragments, reserving the kth fragment, and correspondingly sending the other N-1 fragments to other N-1 parties; and receiving k-th fragments of other scoring matrixes from other N-1 parties; and splicing the kth fragment of the kth scoring submatrix with the kth fragments of other scoring submatrixes to form the kth scoring matrix fragment.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix, the kth scoring submatrix consisting of the scores given by the M users to the $P_k$ objects of the kth party.
In a specific embodiment, acquiring the kth object matrix fragment of the initialized object feature matrix of the P objects includes: initializing $P_k$ object feature vectors corresponding to the $P_k$ objects to form a kth object sub-matrix; splitting the kth object sub-matrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; receiving, from the other N-1 parties, the kth fragments of the other initialized object sub-matrices; and splicing the kth fragment of the kth object sub-matrix with the kth fragments of the other object sub-matrices to form the kth object matrix fragment.
In a specific embodiment, the kth party is a designated party for initializing the user feature vector; the acquiring the initialized kth user matrix segment of the user feature matrices of the M users includes: initializing M user characteristic vectors corresponding to M users to form a user characteristic matrix; and splitting the user characteristic matrix into N fragments through secret sharing, reserving the k-th user matrix fragment, and correspondingly sending other N-1 fragments to other N-1 parties.
In a specific embodiment, after the performing the plurality of iterative updates, the method further comprises: respectively sending the updated kth user characteristic fragment to other N-1 parties, and receiving the updated other user characteristic fragments from other N-1 parties; and splicing the updated kth user characteristic fragment and the received other user characteristic fragments to form an updated user characteristic matrix.
In a specific embodiment, after the performing the plurality of iterative updates, the method further comprises: correspondingly sending the kth fragment belonging to other object sub-matrixes in the updated kth object matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth object sub-matrix from other N-1 parties; and splicing the kth fragment belonging to the kth object sub-matrix in the updated kth object matrix fragment with the received other fragments to form an updated kth object sub-matrix.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix, the kth scoring submatrix consisting of the scores given by the $M_k$ users of the kth party to the P objects.
In a specific embodiment, acquiring the kth user matrix fragment of the initialized user feature matrix of the M users includes: initializing $M_k$ user feature vectors corresponding to the $M_k$ users to form a kth user sub-matrix; splitting the kth user sub-matrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; receiving, from the other N-1 parties, the kth fragments of the other initialized user sub-matrices; and splicing the kth fragment of the kth user sub-matrix with the kth fragments of the other user sub-matrices to form the kth user matrix fragment.
In a specific embodiment, the kth party is a designated party for initializing the object feature vector; the acquiring the initialized kth object matrix slice of the object feature matrices of the P objects includes: initializing P object feature vectors corresponding to the P objects to form an object feature matrix; and splitting the object feature matrix into N fragments through secret sharing, reserving the k-th object matrix fragment, and correspondingly sending the other N-1 fragments to other N-1 parties.
In a specific embodiment, after the performing the plurality of iterative updates, the method further comprises: respectively sending the updated kth object feature fragment to other N-1 parties, and receiving the updated other object feature fragments from other N-1 parties; and splicing the updated kth object feature fragment and the received other object feature fragments to form an updated object feature matrix.
In a specific embodiment, after the performing the plurality of iterative updates, the method further comprises: correspondingly sending the kth fragment belonging to other user sub-matrixes in the updated kth user matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth user sub-matrix from other N-1 parties; and splicing the kth fragment belonging to the kth user sub-matrix in the updated kth user matrix fragment with the received other fragments to form an updated kth user sub-matrix.
According to a second aspect, an apparatus for multi-party joint training of an object recommendation model with data privacy protection is provided, where the multiple parties are N parties that jointly maintain a total scoring matrix of M users for P objects. The apparatus is integrated in any kth party of the N parties and includes:
a scoring fragment acquisition unit configured to acquire the kth scoring matrix fragment of the total scoring matrix through secret sharing;
an object fragment acquisition unit configured to acquire, through secret sharing, the kth object matrix fragment of the initialized object feature matrix of the P objects;
a user fragment acquisition unit configured to acquire, through secret sharing, the kth user matrix fragment of the initialized user feature matrix of the M users; and
an iterative update unit configured to perform a plurality of iterative updates, wherein any one iterative update is performed by the following modules:
a multi-fragment acquisition module configured to, for any first user and any first object, acquire from the kth scoring matrix fragment the kth scoring fragment of the first user's score for the first object, acquire the kth object feature fragment of the first object from the kth object matrix fragment of the previous iteration, and acquire the kth user feature fragment of the first user from the kth user matrix fragment of the previous iteration;
an error fragment calculation module configured to perform, based on the kth user feature fragment and the kth object feature fragment, a secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment of the feature vector similarity between the first user and the first object, and to take the difference between the kth similarity fragment and the kth scoring fragment as the kth error fragment;
a user gradient fragment calculation module configured to calculate, based on the kth error fragment and the kth object feature fragment, the user feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth user gradient fragment;
an object gradient fragment calculation module configured to calculate, based on the kth error fragment and the kth user feature fragment, the object feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth object gradient fragment;
a user fragment updating module configured to update the kth user feature fragment according to the kth user gradient fragment; and
an object fragment updating module configured to update the kth object feature fragment according to the kth object gradient fragment.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of the first aspect.
In summary, with the method and apparatus provided in this specification, the parties never exchange score data, object data, or user data in plaintext. The user feature matrix and the object feature matrix are likewise split into feature fragments, and each party only maintains and iteratively updates its own fragments; the feature matrices are reconstructed only after the iterations are complete, in order to construct the respective object recommendation models. The security of the private data during joint training is therefore protected, while the prediction accuracy of the object recommendation model can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 shows a schematic diagram of a total scoring matrix according to one embodiment;
FIG. 2 shows a schematic diagram of a total scoring matrix composed of N scoring submatrices, according to one embodiment;
FIG. 3 shows a schematic diagram of a total scoring matrix composed of N scoring submatrices, according to another embodiment;
FIG. 4 shows a schematic diagram of a scoring submatrix forming an overall scoring matrix according to yet another embodiment;
FIG. 5 is a schematic diagram illustrating an implementation of a method for a multi-party joint training object recommendation model disclosed in an embodiment of the present disclosure;
FIG. 6 illustrates a flowchart of a method for a multi-party federated training object recommendation model to protect data privacy, in accordance with one embodiment;
FIG. 7 illustrates a process diagram for two parties to perform a secret sharing matrix operation, according to one embodiment;
FIG. 8 is a schematic diagram illustrating an apparatus structure of a multi-party joint training object recommendation model for protecting data privacy according to an embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
As described above, object recommendation is currently implemented in a recommendation system by learning feature vectors of users and objects. The learning process typically includes learning the potential vector preferences of users and objects by matrix decomposition to fit existing user-object scores.
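In particular, the scores given by a set of users to a set of objects are represented by a matrix $R$, where each element $r_{ij}$ of $R$ represents the score given by user $i$ to object $j$. The process of learning the feature vectors can be written as the following regularized squared-error objective:
$$\min_{U,V} L = \sum_{(i,j)} \left(r_{ij} - u_i^{T} v_j\right)^2 + \lambda \left(\sum_i \|u_i\|^2 + \sum_j \|v_j\|^2\right) \qquad (1)$$
where $L$ denotes the loss function, $\min_{U,V} L$ denotes minimizing $L$, $u_i$ and $v_j$ respectively denote the feature vector of user $i$ and the feature vector of object $j$, $U$ and $V$ respectively denote the set of user feature vectors and the set of object feature vectors, $\|u_i\|$ and $\|v_j\|$ respectively denote the two-norm of $u_i$ and the two-norm of $v_j$, and $\lambda$ is a hyperparameter. The first sum runs over the user-object pairs for which a score exists.
In the above equation (1), the first term constrains the relationship between the scores and the feature vector similarity, i.e., the higher the score, the more similar the feature vectors are expected to be. The second term is a regularization term that prevents overfitting.
Further, differentiating formula (1) with respect to $u_i$ and $v_j$ separately yields the gradients of the loss function with respect to the user and the object:
$$\frac{\partial L}{\partial u_i} = -2 \sum_{j} \left(r_{ij} - u_i^{T} v_j\right) v_j + 2\lambda u_i, \qquad \frac{\partial L}{\partial v_j} = -2 \sum_{i} \left(r_{ij} - u_i^{T} v_j\right) u_i + 2\lambda v_j \qquad (2)$$
where the superscript $T$ indicates transposition and each sum runs over the existing scores. Thus, the user vector $u_i$ can be updated according to its loss gradient $\partial L / \partial u_i$, and the object vector $v_j$ can be updated according to its loss gradient $\partial L / \partial v_j$.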
As can be seen from the above process, the learning process includes several core operations: calculating the dot product of the user feature vector and the object feature vector as the similarity between the user feature vector and the object feature vector, further calculating the user gradient by combining the user-object score and the object feature vector according to the similarity, and calculating the object gradient by combining the user-object score and the user feature vector according to the similarity; then, the user feature vector is updated with the calculated user gradient, and the object feature vector is updated with the calculated object gradient.
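The core operations listed above can be sketched in plaintext for a single user-object pair. This is only an illustration of the single-party computation, not the secret-shared protocol of the specification; the function names, the learning rate `lr`, and the regularization weight `lam` are assumptions, and constant factors of the gradient are absorbed into the learning rate.

```python
def dot(a, b):
    """Dot product, used here as the feature vector similarity."""
    return sum(x * y for x, y in zip(a, b))


def sgd_step(u_i, v_j, r_ij, lr=0.1, lam=0.0):
    """One plaintext update for a single existing score r_ij.

    lr and lam are illustrative values, not taken from the specification.
    """
    err = dot(u_i, v_j) - r_ij                               # similarity minus score
    grad_u = [err * v + lam * u for u, v in zip(u_i, v_j)]   # user gradient
    grad_v = [err * u + lam * v for u, v in zip(u_i, v_j)]   # object gradient
    new_u = [u - lr * g for u, g in zip(u_i, grad_u)]
    new_v = [v - lr * g for v, g in zip(v_j, grad_v)]
    return new_u, new_v, err


# Fit one score repeatedly; the error shrinks as the vectors are updated.
u, v, r = [0.1, 0.2], [0.3, 0.1], 1.0
errors = []
for _ in range(200):
    u, v, err = sgd_step(u, v, r)
    errors.append(abs(err))
```

In the multi-party setting, each of these quantities (similarity, error, gradients, updated vectors) is held only as fragments, which is what makes the computation harder than this single-party version.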
In the case where the user feature vector and the object feature vector are learned unilaterally independently, the above calculation can be easily performed. However, under the condition of multi-party joint learning of the feature vectors, users or objects related to scoring data stored by each party may be different, and each party needs to learn the user feature vector of the user related to each party and the object feature vector of the object related to each party.
For ease of description, assuming that the above-mentioned parties are N parties, the scoring data stored by the parties collectively involve M users and P objects, where N, M and P are integers greater than 1. Therefore, the above problem can be described as how to implement the above operations without revealing plaintext data of each party under the scenario that N parties commonly maintain total scoring matrices of M users for P objects, so as to implement multi-party joint learning of user feature vectors and object feature vectors.
To address the above problem, the inventor proposes the following. First, through secret sharing, each party obtains a user matrix fragment of the user feature matrix of the M users, an object matrix fragment of the object feature matrix of the P objects, and a scoring matrix fragment of the total scoring matrix. Then, using secret sharing matrix multiplication, the above operations are decomposed into secure operations on fragments, and are carried out through interaction among the parties and joint computation on the fragment results, thereby achieving secure collaborative learning.
Next, the scenario in which the N parties jointly maintain the total scoring matrix is introduced. As described above, the N parties jointly maintain a total scoring matrix of the M users for the P objects. FIG. 1 shows a schematic diagram of the total scoring matrix according to one embodiment, in which the element located in the ith row and jth column represents the score given by user $U_i$ to object $V_j$. Specifically, each of the N parties maintains a portion of the total scoring matrix, which accordingly consists of N portions maintained by the N parties. It is to be understood that, first, some elements may be default (missing), indicating that the corresponding user has not scored the corresponding object, and any user or any object in the total scoring matrix only needs to have at least one valid (non-default) score associated with it; second, the N portions are mutually exclusive, and the composition of each portion has multiple possibilities depending on how the actual scoring data is distributed.
In one embodiment, each of the N parties maintains one scoring submatrix of the total scoring matrix. In a specific embodiment, the N parties each store scores of the same users for different objects. FIG. 2 shows a schematic diagram of a total scoring matrix composed of N scoring submatrices according to one embodiment, where any kth party has a kth scoring submatrix consisting of the scores given by the M users to the $P_k$ objects of the kth party. In another specific embodiment, the N parties each store scores of different users for the same objects. FIG. 3 shows a schematic diagram of a total scoring matrix composed of N scoring submatrices according to another embodiment, where any kth party has a kth scoring submatrix consisting of the scores given by the $M_k$ users of the kth party to the P objects.
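The two partitions described above can be sketched as follows. The sample matrix, the block sizes, and the helper names `column_blocks` and `row_blocks` are illustrative assumptions; `None` stands for a default (missing) score.

```python
def column_blocks(matrix, widths):
    """Split a matrix into column blocks: same users, different objects."""
    blocks, start = [], 0
    for w in widths:
        blocks.append([row[start:start + w] for row in matrix])
        start += w
    return blocks


def row_blocks(matrix, heights):
    """Split a matrix into row blocks: different users, same objects."""
    blocks, start = [], 0
    for h in heights:
        blocks.append(matrix[start:start + h])
        start += h
    return blocks


# Total scoring matrix for M = 3 users (rows) and P = 4 objects (columns).
R = [
    [5, 3, None, 1],
    [4, None, 2, 1],
    [None, 1, 5, 4],
]

by_object = column_blocks(R, [2, 1, 1])  # three parties with P_k = 2, 1, 1
by_user = row_blocks(R, [1, 2])          # two parties with M_k = 1, 2
```

In the first layout every party sees all M users but only its own $P_k$ objects; in the second every party sees all P objects but only its own $M_k$ users.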
In a further specific embodiment, FIG. 4 is a schematic diagram illustrating a total scoring matrix composed of scoring submatrices according to yet another embodiment. As shown in FIG. 4, among the N parties there are two parties storing scores of the same users for different objects, two parties storing scores of different users for the same objects, and two parties storing scores of different users for different objects.
In another embodiment, each of the N parties maintains several scoring submatrices of the total scoring matrix.
From the above, the total scoring matrix can be composed in various ways, depending on how the scoring data is distributed among the N parties.
On the other hand, it should be noted that "party" in the N-party may refer to a data holder, a data party, or a participant, wherein each party may be implemented as any device, platform, server, or cluster having computing and processing capabilities; the user may be an individual user, an enterprise entity, or a merchant, and may specifically be embodied as a user identifier agreed by multiple data parties in a unified manner, such as a mobile phone number, a user number, and the like.
Further, in one embodiment, the object may be a commodity, and the commodity may have various forms including a physical commodity (e.g., a garment, a video), a virtual electronic commodity (e.g., an electronic book, a cloud storage space, a traffic, etc.), an online service or an offline service, and the like. In an exemplary scenario, the N parties are different e-commerce platforms, and the types of goods provided may include books, clothes, food, electronic products, living goods, and so on. In another exemplary scenario, the N parties are different APP (Application) download platforms, and the provided goods may include education APP, office APP, entertainment APP, and so on. In yet another exemplary scenario, where the N parties are different service experience platforms, the offered goods may include a variety of service items, such as taxi service, beauty service, fitness service, and so on. In another embodiment, the object may be a user. In one exemplary scenario, the N parties are different friend-making platforms, where users may score each other. In another exemplary scenario, party N is a house rental platform, where teamed roommates can be scored against each other.
In the above, a scene in which the N parties commonly maintain the total scoring matrix of the P objects scored by the M users is introduced. It is to be understood that object data, user data, score data and the like maintained by the N parties respectively belong to privacy data, and plaintext exchange cannot be performed in the joint training process so as to ensure the security of the privacy data.
In order to jointly update the feature vectors without revealing private data, FIG. 5 is a schematic diagram illustrating an implementation of the method for multi-party joint training of an object recommendation model disclosed in an embodiment of the present specification. Any kth party acquires, through secret sharing, the kth scoring matrix fragment $\langle R \rangle_k$ of the total scoring matrix $R$, the kth object matrix fragment $\langle V \rangle_k$ of the object feature matrix $V$ of the P objects, and the kth user matrix fragment $\langle U \rangle_k$ of the user feature matrix $U$ of the M users, so that each of the N parties obtains one matrix fragment of each of the three matrices.
Each party then performs multiple iterative updates using the matrix fragments acquired through secret sharing. In any iterative update, for any first user $u_i$ and first object $v_j$, the kth party first acquires the kth scoring fragment $\langle r_{ij} \rangle_k$ of the first user's score for the first object from the kth scoring matrix fragment, acquires the kth object feature fragment $\langle v_j \rangle_k$ of the first object from the kth object matrix fragment of the previous iteration, and acquires the kth user feature fragment $\langle u_i \rangle_k$ of the first user from the kth user matrix fragment of the previous iteration. Then, based on $\langle u_i \rangle_k$ and $\langle v_j \rangle_k$, it performs a secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment of the feature vector similarity between the first user and the first object, and takes the difference between the kth similarity fragment and $\langle r_{ij} \rangle_k$ as the kth error fragment $\langle e_{ij} \rangle_k$. Next, based on $\langle e_{ij} \rangle_k$ and $\langle v_j \rangle_k$, it calculates the user feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, obtaining the kth user gradient fragment; and based on $\langle e_{ij} \rangle_k$ and $\langle u_i \rangle_k$, it calculates the object feature update gradient in the same way, obtaining the kth object gradient fragment. Finally, it updates $\langle u_i \rangle_k$ according to the kth user gradient fragment and updates $\langle v_j \rangle_k$ according to the kth object gradient fragment. In this manner, multiple iterative updates can be carried out.
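The similarity and error steps of one iteration can be simulated in a single process as a rough sketch. Everything protocol-specific here is an assumption rather than the specification's own secret sharing matrix operation: additive fragments modulo a prime, small integer feature values instead of real-valued ones, and multiplication triples produced by a hypothetical trusted dealer (`beaver_triple`).

```python
import secrets

PRIME = 2**61 - 1  # modulus for fragment arithmetic (assumed)


def share(value, n):
    """Split an integer into n additive fragments modulo PRIME."""
    parts = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % PRIME]


def reveal(parts):
    """Recombine fragments; in a real run this requires all parties."""
    return sum(parts) % PRIME


def beaver_triple(n):
    """Hypothetical trusted dealer: fragments of a, b and c = a * b."""
    a, b = secrets.randbelow(PRIME), secrets.randbelow(PRIME)
    return share(a, n), share(b, n), share(a * b % PRIME, n)


def shared_multiply(x_parts, y_parts):
    """Multiply two shared values; only the masked d and e are opened."""
    n = len(x_parts)
    a_parts, b_parts, c_parts = beaver_triple(n)
    d = reveal([(x - a) % PRIME for x, a in zip(x_parts, a_parts)])
    e = reveal([(y - b) % PRIME for y, b in zip(y_parts, b_parts)])
    z_parts = [(c + d * b + e * a) % PRIME
               for a, b, c in zip(a_parts, b_parts, c_parts)]
    z_parts[0] = (z_parts[0] + d * e) % PRIME  # one party adds the public term
    return z_parts


def shared_dot(u_parts, v_parts):
    """u_parts[f][k] is party k's fragment of feature f."""
    total = [0] * len(u_parts[0])
    for uf, vf in zip(u_parts, v_parts):
        prod = shared_multiply(uf, vf)
        total = [(t + p) % PRIME for t, p in zip(total, prod)]
    return total


N = 3
u_i, v_j, r_ij = [2, 3], [4, 1], 5
u_parts = [share(x, N) for x in u_i]
v_parts = [share(x, N) for x in v_j]
r_parts = share(r_ij, N)

sim_parts = shared_dot(u_parts, v_parts)                           # similarity fragments
err_parts = [(s - r) % PRIME for s, r in zip(sim_parts, r_parts)]  # error fragments
```

Here `reveal(sim_parts)` recovers the dot product 11 and `reveal(err_parts)` recovers the error 6, while no single party's fragment reveals either value. The gradient fragments follow the same pattern, multiplying the error fragments by the feature fragments.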
Once the whole iterative process is finished, the parties exchange their matrix fragments to reconstruct the feature matrices. The situation where each of the parties maintains scoring data of the same users for different objects is illustrated in FIG. 1, where the kth party reconstructs the user feature matrix $U$ and the object sub-matrix $V_k$, and then uses $U$ and $V_k$ to construct its object recommendation model, which predicts the score of any one of the M users for any one of the $P_k$ objects and recommends objects according to the predicted scores.
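A minimal sketch of this final stage follows, assuming additive fragments so that reconstruction is an element-wise sum of the exchanged fragments. The fragment values and the helper names `reconstruct`, `predict`, and `recommend` are made up for illustration.

```python
def reconstruct(fragments):
    """Element-wise sum of the fragments received from all parties."""
    rows, cols = len(fragments[0]), len(fragments[0][0])
    return [[sum(f[r][c] for f in fragments) for c in range(cols)]
            for r in range(rows)]


def predict(U, V, i, j):
    """Predicted score: dot product of user i's and object j's vectors."""
    return sum(a * b for a, b in zip(U[i], V[j]))


def recommend(U, V, i, already_scored):
    """Return the unscored object with the highest predicted score."""
    candidates = [j for j in range(len(V)) if j not in already_scored]
    return max(candidates, key=lambda j: predict(U, V, i, j))


# Fragments held by three parties after the last iteration (made-up values).
U_frags = [[[0.5, 1.0]], [[0.5, -0.5]], [[0.0, 0.5]]]
Vk_frags = [
    [[1.0, 0.0], [0.5, 0.5], [2.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.5], [-1.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0], [0.0, -0.5]],
]

U = reconstruct(U_frags)     # user feature matrix, one user
V_k = reconstruct(Vk_frags)  # object sub-matrix of the kth party, three objects
best = recommend(U, V_k, 0, already_scored={1})
```

Only the feature matrices are reconstructed at this point; the scoring data of the other parties is never revealed.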
In the whole iterative updating process, multiple parties do not exchange the score data, the object data and the plaintext of the user data, the user characteristic matrix and the object characteristic matrix are also split into characteristic fragments, the characteristic fragments are respectively maintained for iterative updating, and the characteristic matrix can be reconstructed until iteration is finished. Therefore, the security of the private data in the joint updating process is greatly enhanced.
In the following, a specific process of jointly learning feature vectors by multiple parties and further constructing respective object recommendation models is described.
Specifically, FIG. 6 is a flowchart illustrating a method for multi-party joint training of an object recommendation model with data privacy protection according to an embodiment, where the multiple parties are N parties that jointly maintain a total scoring matrix of M users for P objects. The method is performed by any kth party of the N parties, and the kth party may be implemented by any apparatus, device, or server cluster with computing and processing capabilities.
As shown in fig. 6, the method comprises the steps of:
Step S610, acquiring the kth scoring matrix fragment of the total scoring matrix through secret sharing.
Step S620, acquiring, through secret sharing, the kth object matrix fragment of the initialized object feature matrix of the P objects and the kth user matrix fragment of the initialized user feature matrix of the M users.
Step S630, performing a plurality of iterative updates, where any iterative update includes:
Step S631, for any first user and first object, acquiring the kth scoring fragment of the first user's score for the first object from the kth scoring matrix fragment; acquiring the kth object feature fragment of the first object from the kth object matrix fragment of the previous iteration; and acquiring the kth user feature fragment of the first user from the kth user matrix fragment of the previous iteration.
Step S632, based on the kth user feature fragment and the kth object feature fragment, performing a secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment of the feature vector similarity between the first user and the first object, and taking the difference between the kth similarity fragment and the kth scoring fragment as the kth error fragment.
Step S633, based on the kth error fragment and the kth object feature fragment, calculating the user feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth user gradient fragment.
Step S634, based on the kth error fragment and the kth user feature fragment, calculating the object feature update gradient by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth object gradient fragment.
Step S635, updating the kth user feature fragment according to the kth user gradient fragment, and updating the kth object feature fragment according to the kth object gradient fragment.
The steps are as follows:
step S610, obtaining the kth scoring matrix segment of the total scoring matrix through secret sharing.
In one embodiment, the kth party holds the kth scoring submatrix $R_k$ of the total scoring matrix described above. On this basis, this step may include the following. On the one hand, the kth party splits the kth scoring submatrix $R_k$ into N fragments $\langle R_k\rangle_1, \dots, \langle R_k\rangle_N$, keeps the kth fragment $\langle R_k\rangle_k$, and sends the other N-1 fragments to the corresponding other N-1 parties; for example, when k ≠ 1, this includes sending the 1st fragment $\langle R_k\rangle_1$ to the 1st party. On the other hand, the kth party receives from the other N-1 parties the kth fragments $\langle R_t\rangle_k$ ($t \neq k$) of their scoring submatrices. The kth fragment $\langle R_k\rangle_k$ of the kth scoring submatrix and the kth fragments $\langle R_t\rangle_k$ of the other scoring submatrices are then spliced to form the kth scoring matrix fragment $\langle R\rangle_k$.
In another embodiment, the kth scoring submatrix may consist of a plurality of submatrices. Accordingly, in this step, each of these submatrices is split into N fragments, the kth fragment is kept, and the other N-1 fragments are sent to the corresponding other N-1 parties. The kth party also receives the kth fragments of the other scoring submatrices from the other N-1 parties. The kth fragments of the kth party's own submatrices and the kth fragments of the other scoring submatrices are then collected into the kth scoring matrix fragment $\langle R\rangle_k$.
It should be noted that splitting the kth scoring submatrix into N fragments can be implemented with existing secret sharing techniques, which are not described in detail here.
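As an illustration of the splitting described above, the following is a minimal sketch of additive secret sharing over integers modulo a fixed value. It is not taken from the patent: the modulus `Q`, the function names `split_matrix` and `reconstruct`, and the choice of additive sharing are all assumptions made for this example, and real score data would first need an integer (for example fixed-point) encoding.

```python
import secrets

Q = 2**32  # hypothetical modulus; the text does not fix one here


def split_matrix(matrix, n_parties, modulus=Q):
    """Split an integer matrix into n additive fragments that sum to it mod modulus."""
    rows, cols = len(matrix), len(matrix[0])
    # The first n-1 fragments are uniformly random.
    fragments = [
        [[secrets.randbelow(modulus) for _ in range(cols)] for _ in range(rows)]
        for _ in range(n_parties - 1)
    ]
    # The last fragment is chosen so that all fragments sum to the original matrix.
    last = [
        [(matrix[r][c] - sum(f[r][c] for f in fragments)) % modulus for c in range(cols)]
        for r in range(rows)
    ]
    return fragments + [last]


def reconstruct(fragments, modulus=Q):
    """Recombine additive fragments into the original matrix."""
    rows, cols = len(fragments[0]), len(fragments[0][0])
    return [
        [sum(f[r][c] for f in fragments) % modulus for c in range(cols)]
        for r in range(rows)
    ]


# Party k would keep fragments[k] of its own submatrix and send the others away.
scores_k = [[5, 0, 3], [1, 4, 2]]
fragments = split_matrix(scores_k, n_parties=3)
print(reconstruct(fragments) == scores_k)
```

No single fragment reveals anything about the scores on its own; only the sum of all N fragments does, which is why each party can safely hold the kth fragment of every other party's submatrix.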
Thus, the kth party obtains the kth scoring matrix fragment $\langle R\rangle_k$. By analogy, each of the N parties obtains its corresponding scoring matrix fragment. Next, in step S620, the kth party acquires, through secret sharing, the kth object matrix fragment of the initialized object feature matrix of the P objects and the kth user matrix fragment of the initialized user feature matrix of the M users.
In one embodiment, the N parties each store the scoring data of the M users for a part of the P objects. In this case, the kth party holds the kth scoring submatrix of the total scoring matrix, which consists of the scores given by the M users to the $P_k$ objects of the kth party.
Further, in a specific embodiment, obtaining the kth object matrix fragment may include: first, initializing $P_k$ object feature vectors corresponding to the $P_k$ objects to form the kth object submatrix $V_k$; then, splitting the kth object submatrix $V_k$ into N fragments through secret sharing, keeping the kth fragment $\langle V_k\rangle_k$, and sending the other N-1 fragments to the corresponding other N-1 parties; receiving from the other N-1 parties the kth fragments $\langle V_t\rangle_k$ ($t \neq k$) of their initialized object submatrices; and finally, splicing the kth fragment $\langle V_k\rangle_k$ of the kth object submatrix with the kth fragments $\langle V_t\rangle_k$ of the other object submatrices to form the kth object matrix fragment $\langle V\rangle_k$.
In a specific embodiment, since the score data stored by each of the N parties relates to all M users, any one of the parties may be designated to initialize the user feature vectors and to perform secret sharing with the other parties based on the initialized user feature vectors.
In one example, the kth party is the designated party for initializing the user feature vectors. Accordingly, obtaining the kth user matrix fragment may include: first, initializing M user feature vectors corresponding to the M users to form the user feature matrix $U$; then, splitting the user feature matrix $U$ into N fragments through secret sharing, keeping the kth user matrix fragment $\langle U\rangle_k$, and sending the other N-1 fragments to the corresponding other N-1 parties. In another example, the kth party is not the designated party. Accordingly, obtaining the kth user matrix fragment may include receiving the kth user matrix fragment $\langle U\rangle_k$ from the designated party among the other N-1 parties. In this way, each of the N parties obtains its corresponding user matrix fragment.
In one embodiment, the N parties each store the scoring data of a part of the M users for the P objects. In this case, the kth party holds the kth scoring submatrix of the total scoring matrix, which consists of the scores given by the $M_k$ users of the kth party to the P objects.
Further, in a specific embodiment, obtaining the kth user matrix fragment may include: first, initializing $M_k$ user feature vectors corresponding to the $M_k$ users to form the kth user submatrix $U_k$; then, splitting the kth user submatrix $U_k$ into N fragments through secret sharing, keeping the kth fragment $\langle U_k\rangle_k$, and sending the other N-1 fragments to the corresponding other N-1 parties; receiving from the other N-1 parties the kth fragments $\langle U_t\rangle_k$ ($t \neq k$) of their initialized user submatrices; and finally, splicing the kth fragment $\langle U_k\rangle_k$ of the kth user submatrix with the kth fragments $\langle U_t\rangle_k$ of the other user submatrices to form the kth user matrix fragment $\langle U\rangle_k$.
In a specific embodiment, since the score data stored by each of the N parties relates to all P objects, any one of the parties may be designated to initialize the object feature vectors and to perform secret sharing with the other parties based on the initialized object feature vectors.
In one example, the kth party is the designated party for initializing the object feature vectors. Accordingly, obtaining the kth object matrix fragment may include: first, initializing P object feature vectors corresponding to the P objects to form the object feature matrix $V$; then, splitting the object feature matrix $V$ into N fragments through secret sharing, keeping the kth object matrix fragment $\langle V\rangle_k$, and sending the other N-1 fragments to the corresponding other N-1 parties. In another example, the kth party is not the designated party. Accordingly, obtaining the kth object matrix fragment may include receiving the kth object matrix fragment $\langle V\rangle_k$ from the designated party among the other N-1 parties. In this way, each of the N parties obtains its corresponding object matrix fragment.
From the above, the kth party obtains the initialized kth object matrix fragment $\langle V\rangle_k$ and the initialized kth user matrix fragment $\langle U\rangle_k$. By analogy, each of the N parties obtains its corresponding initialized object matrix fragment and user matrix fragment.
Based on the kth scoring matrix fragment $\langle R\rangle_k$, the kth object matrix fragment $\langle V\rangle_k$ and the kth user matrix fragment $\langle U\rangle_k$ acquired above, the kth party performs multiple iterative updates in step S630, where any one iterative update includes the following steps S631 to S635.
First, in step S631, for any first user $u_i$ and first object $v_j$, the kth party acquires the kth scoring fragment $\langle r_{ij}\rangle_k$ of the first user's score for the first object from the kth scoring matrix fragment; acquires the kth object feature fragment $\langle v_j\rangle_k$ of the first object from the kth object matrix fragment of the previous iteration; and acquires the kth user feature fragment $\langle u_i\rangle_k$ of the first user from the kth user matrix fragment of the previous iteration. It is to be understood that the kth object feature fragment $\langle v_j\rangle_k$ and the kth user feature fragment $\langle u_i\rangle_k$ are the kth fragments of the jth object feature vector $v_j$ and the ith user feature vector $u_i$, respectively; therefore, $\langle v_j\rangle_k$ and $\langle u_i\rangle_k$ are both vectors.
Then, in step S632, based on the kth user feature fragment $\langle u_i\rangle_k$ and the kth object feature fragment $\langle v_j\rangle_k$, the kth party performs a secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment $\langle u_i^{T} v_j\rangle_k$ of the feature vector similarity of the first user and the first object, and takes the difference between the kth similarity fragment $\langle u_i^{T} v_j\rangle_k$ and the kth scoring fragment $\langle r_{ij}\rangle_k$ as the kth error fragment $\langle e_{ij}\rangle_k$.
Specifically, referring to formula (1) above, the similarity between the first user $u_i$ and the first object $v_j$ can be calculated as $u_i^{T} v_j$. In the embodiments of this specification, however, $u_i$ and $v_j$ are each split into N fragments distributed across the N parties. On this basis, the inventors propose using a secret sharing matrix operation so that the N parties jointly compute the similarity between a user and an object from their respective user feature fragments and object feature fragments.
It should be noted that the secret sharing matrix operation method itself is known. For ease of understanding, the following first describes the secret sharing matrix operation using two parties as an example, and then describes how the kth party, based on its kth user feature fragment $\langle u_i\rangle_k$ and kth object feature fragment $\langle v_j\rangle_k$, performs the secret sharing matrix operation with the other N-1 parties to obtain the kth similarity fragment $\langle u_i^{T} v_j\rangle_k$.
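FIG. 7 illustrates a process in which two parties perform a secret sharing matrix operation, according to one embodiment. The two parties are the 0th party and the 1st party. Assume that the two parties have obtained in advance, through secret sharing, fragments of auxiliary values $A$, $B$ and $C$ for the secret sharing matrix operation: the 0th party holds the fragments $\langle A\rangle_0$, $\langle B\rangle_0$ and $\langle C\rangle_0$, and the 1st party holds the fragments $\langle A\rangle_1$, $\langle B\rangle_1$ and $\langle C\rangle_1$. Here $A$ and $B$ are random numbers from a finite field $\mathbb{Z}_q$, where $q$ is a hyperparameter, and $C = A \cdot B \bmod q$, where $A \cdot B$ denotes the product of $A$ and $B$ and $\bmod$ denotes the remainder operation.
As shown in FIG. 7, the 0th party holds the fragment $\langle X\rangle_0$ of data $X$ and the fragment $\langle Y\rangle_0$ of data $Y$, the 1st party holds the fragment $\langle X\rangle_1$ of data $X$ and the fragment $\langle Y\rangle_1$ of data $Y$, and the two parties want to jointly compute $X \cdot Y$.
Next, the 0th party computes $\langle E\rangle_0$ and $\langle F\rangle_0$ from the fragments $\langle X\rangle_0$, $\langle Y\rangle_0$, $\langle A\rangle_0$ and $\langle B\rangle_0$ that it holds; specifically, $\langle E\rangle_0 = \langle X\rangle_0 - \langle A\rangle_0$ and $\langle F\rangle_0 = \langle Y\rangle_0 - \langle B\rangle_0$. The 1st party computes $\langle E\rangle_1$ and $\langle F\rangle_1$ from the fragments $\langle X\rangle_1$, $\langle Y\rangle_1$, $\langle A\rangle_1$ and $\langle B\rangle_1$ that it holds; specifically, $\langle E\rangle_1 = \langle X\rangle_1 - \langle A\rangle_1$ and $\langle F\rangle_1 = \langle Y\rangle_1 - \langle B\rangle_1$.
Then, the 0th party and the 1st party exchange the calculated fragments of $E$ and $F$. Thus, both parties hold $\langle E\rangle_0$, $\langle E\rangle_1$, $\langle F\rangle_0$ and $\langle F\rangle_1$, and reconstruct $E$ and $F$, where $E = \langle E\rangle_0 + \langle E\rangle_1 = X - A$ and $F = \langle F\rangle_0 + \langle F\rangle_1 = Y - B$.
Then, the 0th party calculates $\langle Z\rangle_0$ and the 1st party calculates $\langle Z\rangle_1$, where $\langle Z\rangle_0 = E \cdot \langle Y\rangle_0 + \langle A\rangle_0 \cdot F + \langle C\rangle_0$ and $\langle Z\rangle_1 = E \cdot \langle Y\rangle_1 + \langle A\rangle_1 \cdot F + \langle C\rangle_1$.
Finally, the 0th party and the 1st party exchange the calculated fragments of $Z$. Thus, both parties hold $\langle Z\rangle_0$ and $\langle Z\rangle_1$ and reconstruct $Z = \langle Z\rangle_0 + \langle Z\rangle_1$. Because $Z = E \cdot Y + A \cdot F + C = (X - A) \cdot Y + A \cdot (Y - B) + A \cdot B = X \cdot Y$, reconstructing $Z$ yields $X \cdot Y$.
From the above, the two parties can compute $X \cdot Y$ by the secret sharing matrix operation without either party revealing its data in plaintext. Moreover, because $\langle Z\rangle_0 + \langle Z\rangle_1 = X \cdot Y$, $\langle Z\rangle_0$ and $\langle Z\rangle_1$ are in fact secret sharing fragments of $X \cdot Y$. That is, if the reconstruction in the last step is omitted, the 0th party and the 1st party can perform the secret sharing matrix operation based on the data $X$ and $Y$ to each obtain a secret sharing fragment of $X \cdot Y$. It should be noted that FIG. 7 shows the secret sharing matrix operation performed by two parties, and the operation can be extended to more than two parties. Briefly, the data $X$ and $Y$ are split into as many secret sharing fragments as there are participants and distributed among them, the auxiliary values $A$, $B$ and $C$ are likewise split into the same number of secret sharing fragments and distributed among the participants, and finally each participant computes its own secret sharing fragment of $X \cdot Y$.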
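The two-party procedure above can be sketched in a few lines. This is a hedged illustration, not the patent's implementation: the modulus `Q`, the helper names, and the use of a local `make_triple` function as a stand-in for however the auxiliary values are actually generated and distributed are all assumptions. The per-party result uses the form Z_i = E·Y_i + A_i·F + C_i, which is one common way to combine the masked values.

```python
import secrets

Q = 2**31 - 1  # hypothetical modulus; the text treats it as a hyperparameter


def rand_matrix(rows, cols):
    return [[secrets.randbelow(Q) for _ in range(cols)] for _ in range(rows)]


def add(a, b):
    return [[(x + y) % Q for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[(x - y) % Q for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % Q
             for j in range(len(b[0]))] for i in range(len(a))]


def share2(m):
    """Split m into two additive fragments (party 0, party 1)."""
    s0 = rand_matrix(len(m), len(m[0]))
    return [s0, sub(m, s0)]


def make_triple(n, k, p):
    """Dealer stand-in: fragments of random A (n x k), B (k x p) and C = A.B."""
    a, b = rand_matrix(n, k), rand_matrix(k, p)
    return share2(a), share2(b), share2(matmul(a, b))


def beaver_matmul(x_sh, y_sh, a_sh, b_sh, c_sh):
    """Return the two parties' fragments of X.Y without reconstructing X or Y."""
    # Each party masks its own fragments locally.
    e_sh = [sub(x_sh[i], a_sh[i]) for i in range(2)]
    f_sh = [sub(y_sh[i], b_sh[i]) for i in range(2)]
    # The masked fragments are exchanged, so both parties know E and F.
    e, f = add(*e_sh), add(*f_sh)
    # Party i computes its fragment Z_i = E.Y_i + A_i.F + C_i.
    return [add(add(matmul(e, y_sh[i]), matmul(a_sh[i], f)), c_sh[i])
            for i in range(2)]


x, y = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
z_sh = beaver_matmul(share2(x), share2(y), *make_triple(2, 2, 2))
print(add(*z_sh) == matmul(x, y))
```

Only the masked values E and F are ever exchanged; since A and B are uniformly random, E and F carry no information about X and Y.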
The secret sharing matrix operation has been described above. It can be seen that multiple parties can, based on the fragments of data $X$ and $Y$ that they hold, each obtain a secret sharing fragment of $X \cdot Y$ through the secret sharing matrix operation. On this basis, in the embodiments of this specification, the N parties can, based on the fragments of the user feature vector $u_i$ and the object feature vector $v_j$ that they hold, each obtain a secret sharing fragment of $u_i^{T} v_j$ through the secret sharing matrix operation, where $u_i^{T} v_j = \sum_{t=1}^{N} \langle u_i^{T} v_j\rangle_t$.
In this step, the kth party performs the secret sharing matrix operation with the other N-1 parties based on the acquired kth user feature fragment $\langle u_i\rangle_k$ and kth object feature fragment $\langle v_j\rangle_k$ to obtain the kth fragment $\langle u_i^{T} v_j\rangle_k$ of $u_i^{T} v_j$, that is, the kth similarity fragment. Further, the kth party takes the difference between the kth similarity fragment $\langle u_i^{T} v_j\rangle_k$ and the kth scoring fragment $\langle r_{ij}\rangle_k$ as the kth error fragment $\langle e_{ij}\rangle_k$.
This can be shown as follows. Because $u_i^{T} v_j = \sum_{t=1}^{N} \langle u_i^{T} v_j\rangle_t$ and $r_{ij} = \sum_{t=1}^{N} \langle r_{ij}\rangle_t$, the difference between $u_i^{T} v_j$ and $r_{ij}$ equals the sum, over the N parties, of the differences between $\langle u_i^{T} v_j\rangle_t$ and $\langle r_{ij}\rangle_t$. That is, the difference between $\langle u_i^{T} v_j\rangle_k$ and $\langle r_{ij}\rangle_k$ is a fragment $\langle e_{ij}\rangle_k$ of the error $e_{ij}$.
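The N-party similarity and error computation can be sketched as follows. This is an illustration under stated assumptions, not the patent's code: values are integers modulo a hypothetical `Q`, the auxiliary values are generated inside `similarity_fragments` as a stand-in for a separate distribution step, and the error is taken as similarity minus score (the opposite convention only flips signs).

```python
import secrets

Q = 2**61 - 1  # hypothetical modulus


def share(vec, n):
    """Split an integer vector into n additive fragments mod Q."""
    parts = [[secrets.randbelow(Q) for _ in vec] for _ in range(n - 1)]
    last = [(v - sum(p[i] for p in parts)) % Q for i, v in enumerate(vec)]
    return parts + [last]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b)) % Q


def similarity_fragments(u_sh, v_sh):
    """Each party k ends with a fragment of the inner product u.v."""
    n, dim = len(u_sh), len(u_sh[0])
    # Stand-in for pre-distributed auxiliary values a, b and c = a.b.
    a = [secrets.randbelow(Q) for _ in range(dim)]
    b = [secrets.randbelow(Q) for _ in range(dim)]
    a_sh, b_sh = share(a, n), share(b, n)
    c_sh = [s[0] for s in share([dot(a, b)], n)]
    # Masked fragments are exchanged; every party reconstructs e = u - a, f = v - b.
    e = [sum(u_sh[k][i] - a_sh[k][i] for k in range(n)) % Q for i in range(dim)]
    f = [sum(v_sh[k][i] - b_sh[k][i] for k in range(n)) % Q for i in range(dim)]
    # Party k's fragment: e.v_k + a_k.f + c_k.
    return [(dot(e, v_sh[k]) + dot(a_sh[k], f) + c_sh[k]) % Q for k in range(n)]


def error_fragments(sim_sh, r_sh):
    """Local step: party k subtracts its score fragment from its similarity fragment."""
    return [(s - r) % Q for s, r in zip(sim_sh, r_sh)]


def decode(x):
    """Map a residue back to a signed integer."""
    return x if x <= Q // 2 else x - Q


n = 3
sim = similarity_fragments(share([1, 2, 3], n), share([4, 0, 2], n))
err = error_fragments(sim, [s[0] for s in share([7], n)])
print(decode(sum(err) % Q))  # signed value of (similarity - score)
```

The error step needs no communication at all: subtraction of additive fragments is purely local, which is the property the derivation above relies on.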
After the kth error fragment $\langle e_{ij}\rangle_k$ is obtained as described above, on the one hand, in step S633, the kth party calculates the user feature update gradient based on the kth error fragment $\langle e_{ij}\rangle_k$ and the kth object feature fragment $\langle v_j\rangle_k$ by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth user gradient fragment.
Specifically, see formula (2) above for calculating the user update gradient. In conjunction with the foregoing, the gradient is determined by the product $e_{ij} v_j$ of the error and the object feature vector, together with a regularization term in $u_i$. For the product $e_{ij} v_j$, the kth party may, based on the kth error fragment $\langle e_{ij}\rangle_k$ and the kth object feature fragment $\langle v_j\rangle_k$, perform a secret sharing matrix operation with the other N-1 parties to obtain the kth fragment $\langle e_{ij} v_j\rangle_k$ of $e_{ij} v_j$. It is to be understood that the regularization term in formula (1) is optional, and accordingly the regularization term in formula (2) is optional. In one embodiment, the regularization term is not considered, and in this step the calculated $\langle e_{ij} v_j\rangle_k$ is taken as the kth user gradient fragment. In another embodiment, the regularization term is considered, and in this step the regularization contribution of the kth user feature fragment $\langle u_i\rangle_k$ is subtracted from the calculated $\langle e_{ij} v_j\rangle_k$ to obtain the kth user gradient fragment.
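A sketch of the user gradient fragment computation follows. It rests on several assumptions that are not in the text: integer values modulo a hypothetical `Q`, an integer regularization weight `lam`, auxiliary values generated inline, and a loss of the form ½(u·v − r)² + (lam/2)·|u|² with the error defined as similarity minus score. Under that convention the regularization contribution is added; with the opposite error convention (score minus similarity) it is subtracted, as the wording above has it.

```python
import secrets

Q = 2**61 - 1  # hypothetical modulus


def share(vec, n):
    """Split an integer vector into n additive fragments mod Q."""
    parts = [[secrets.randbelow(Q) for _ in vec] for _ in range(n - 1)]
    last = [(v - sum(p[i] for p in parts)) % Q for i, v in enumerate(vec)]
    return parts + [last]


def user_gradient_fragments(e_sh, v_sh, u_sh, lam=0):
    """e_sh: scalar fragments of the error; v_sh, u_sh: vector fragments."""
    n, dim = len(e_sh), len(v_sh[0])
    # Stand-in for pre-distributed auxiliary values: scalar a, vector b, c = a*b.
    a = secrets.randbelow(Q)
    b = [secrets.randbelow(Q) for _ in range(dim)]
    a_sh = [s[0] for s in share([a], n)]
    b_sh = share(b, n)
    c_sh = share([a * x % Q for x in b], n)
    # Masked values are exchanged and reconstructed by every party.
    d = sum(e_sh[k] - a_sh[k] for k in range(n)) % Q
    f = [sum(v_sh[k][i] - b_sh[k][i] for k in range(n)) % Q for i in range(dim)]
    # Party k: fragment of e*v, plus its purely local regularization contribution.
    return [[(d * v_sh[k][i] + a_sh[k] * f[i] + c_sh[k][i] + lam * u_sh[k][i]) % Q
             for i in range(dim)] for k in range(n)]


def decode(x):
    """Map a residue back to a signed integer."""
    return x if x <= Q // 2 else x - Q


n = 3
e_sh = [s[0] for s in share([3], n)]
grad = user_gradient_fragments(e_sh, share([4, 0, 2], n), share([1, 2, 3], n), lam=2)
print([decode(sum(col) % Q) for col in zip(*grad)])
```

Only the product of the error and the object vector needs interaction; the regularization term is linear in the user feature fragment, so each party handles it locally. The object gradient fragment follows by swapping the roles of the user and object fragments.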
Thus, the kth user gradient fragment can be obtained. On the other hand, in step S634, the kth party calculates the object feature update gradient based on the kth error fragment $\langle e_{ij}\rangle_k$ and the kth user feature fragment $\langle u_i\rangle_k$ by performing a secret sharing matrix operation with the other N-1 parties, to obtain the kth object gradient fragment. It should be noted that formulas (2) and (3) are symmetric to a certain degree, so the description of step S634 can refer to the description of step S633 and is not repeated here.
Through steps S633 and S634 above, the kth party obtains the kth user gradient fragment and the kth object gradient fragment. Then, in step S635, the kth user feature fragment $\langle u_i\rangle_k$ is updated according to the kth user gradient fragment, and the kth object feature fragment $\langle v_j\rangle_k$ is updated according to the kth object gradient fragment.
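Specifically, the updates may be performed by the following formula (4) and formula (5), respectively:
$$\langle u_i\rangle_k \leftarrow \langle u_i\rangle_k - \alpha \, \langle g_{u_i}\rangle_k \tag{4}$$
$$\langle v_j\rangle_k \leftarrow \langle v_j\rangle_k - \beta \, \langle g_{v_j}\rangle_k \tag{5}$$
where $\langle g_{u_i}\rangle_k$ and $\langle g_{v_j}\rangle_k$ denote the kth user gradient fragment and the kth object gradient fragment, and $\alpha$ and $\beta$ are learning rates. Both are hyperparameters; because they are set manually, they may be the same or different. In this way, the kth user feature fragment and the kth object feature fragment can be updated.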
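Because the update is linear in the fragments, each party can apply it locally and the recombined result equals the plaintext update. The sketch below illustrates this with exact rational arithmetic; sharing over rationals rather than a finite field, the descent form of the update, and all function names are assumptions made only to keep the example short and exact.

```python
import random
from fractions import Fraction


def share(vec, n, rng):
    """Split a vector into n additive fragments (over rationals, for illustration)."""
    parts = [[Fraction(rng.randint(-1000, 1000)) for _ in vec] for _ in range(n - 1)]
    last = [Fraction(v) - sum(p[i] for p in parts) for i, v in enumerate(vec)]
    return parts + [last]


def update_fragment(feature_frag, grad_frag, lr):
    """Local step on one party: move the feature fragment against its gradient fragment."""
    return [x - lr * g for x, g in zip(feature_frag, grad_frag)]


def reconstruct(frags):
    return [sum(col) for col in zip(*frags)]


rng = random.Random(0)
u, g, alpha = [1, 2], [4, -2], Fraction(1, 10)
u_sh, g_sh = share(u, 3, rng), share(g, 3, rng)
new_sh = [update_fragment(uf, gf, alpha) for uf, gf in zip(u_sh, g_sh)]
# Recombining the updated fragments gives u - alpha * g.
print(reconstruct(new_sh) == [Fraction(3, 5), Fraction(11, 5)])
```

In a finite-field setting the learning rate would have to be handled through a fixed-point encoding, which this sketch deliberately leaves out.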
As can be seen from the above, the kth party can perform any iterative update by performing steps S631 to S635, so as to update the kth segment of the user feature vector of the first user and update the kth segment of the object feature vector of the first object according to the score of the first object by any first user. It should be noted that actually, in a certain iterative update, besides the kth slice of the single user feature vector and the single object feature vector, the kth slices of the multiple user feature vectors and the multiple object feature vectors may be updated in batches, and the size of the batch depends on the sampling number (batch size) of each update. Thus, the kth party can realize multiple iterative updates by repeatedly executing steps S631 to S635 for multiple times, and specifically, can realize the updates to the kth user matrix partition and the kth object matrix partition. By analogy, the N parties can cooperatively realize the updating of the user feature matrix and the object feature matrix, that is, the updating of the M user feature vectors and the P object feature vectors.
In one embodiment, after the process of the multiple iterative updates is finished, the multiple parties exchange matrix fragments thereof to perform feature matrix reconstruction.
In one embodiment, in a scenario where N parties each store scores of M users for a part of objects in P objects, each party needs to reconstruct the user feature matrix and an object submatrix of a corresponding part of objects in the object feature matrix. The embodiment of the method is that, on one hand, the kth party sends the updated kth user feature fragments to other N-1 parties respectively, receives the updated other user feature fragments from other N-1 parties, and then splices the updated kth user feature fragments and the received other user feature fragments to form an updated user feature matrix. On the other hand, the kth party correspondingly sends the kth fragment belonging to other object sub-matrixes in the updated kth object matrix fragment to other N-1 parties, and receives the updated other fragments belonging to the kth object sub-matrixes from other N-1 parties; and then, splicing the kth fragment belonging to the kth object sub-matrix in the updated kth object matrix fragment with the received other fragments to form an updated kth object sub-matrix.
Therefore, the k-th party can reconstruct the user characteristic matrix and the k-th object sub-matrix, and further construct an object recommendation model based on the reconstructed user characteristic matrix and the k-th object sub-matrix. For example, in order to select some objects from the objects of the kth party and recommend the selected objects to a certain user, the certain user and a plurality of candidate objects may be input into an object recommendation model, so the object recommendation model may obtain a user feature vector of the certain user from the reconstructed user feature matrix, obtain a plurality of object feature vectors corresponding to the plurality of candidate objects from the reconstructed kth object sub-matrix, then calculate similarities between the user feature vector and the plurality of object feature vectors respectively, score the user prediction on the plurality of candidate objects, and output the candidate objects with prediction scores greater than a predetermined threshold as recommendation objects recommended to the certain user.
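The recommendation step described above can be sketched as follows, assuming the feature vectors have already been reconstructed and that similarity is the inner product of the user and object feature vectors. The function name `recommend`, the dictionary layout, and the sample values are hypothetical.

```python
def recommend(user_vec, candidate_vecs, threshold):
    """Return candidate objects whose predicted score exceeds the threshold.

    user_vec: reconstructed feature vector of the target user.
    candidate_vecs: mapping from object id to its reconstructed feature vector.
    """
    scores = {
        obj: sum(a * b for a, b in zip(user_vec, vec))
        for obj, vec in candidate_vecs.items()
    }
    return [obj for obj, score in scores.items() if score > threshold]


candidates = {"a": [1, 2], "b": [0, 1], "c": [3, 0]}
print(recommend([2, 1], candidates, threshold=3))
```

Here the predicted scores are 4, 1 and 6, so objects "a" and "c" pass the threshold of 3.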
In another embodiment, in a scenario where N parties each store scores of P objects by some users in M users, each party needs to reconstruct the object feature matrix and the user submatrix of the corresponding user in the user feature matrix. The embodiment of the method is that, on one hand, the kth party sends the updated kth object feature fragments to other N-1 parties respectively, receives the updated other object feature fragments from other N-1 parties, and then splices the updated kth object feature fragments and the received other object feature fragments to form an updated object feature matrix. On the other hand, the kth party correspondingly sends the kth fragments belonging to other user sub-matrixes in the updated kth user matrix fragments to other N-1 parties, and receives the updated other fragments belonging to the kth user sub-matrixes from other N-1 parties; and then, splicing the kth fragment belonging to the kth user sub-matrix in the updated kth user matrix fragment with the received other fragments to form an updated kth user sub-matrix. Therefore, the k-th party can reconstruct the object feature matrix and the k-th user sub-matrix, and further construct the user recommendation model based on the reconstructed object feature matrix and the k-th user sub-matrix.
In summary, with the method disclosed in the embodiments of this specification, the multiple parties do not exchange score data, object data or user data in plaintext. In addition, the user feature matrix and the object feature matrix are split into feature fragments, each party maintains and iteratively updates only its own fragments, and the feature matrices are reconstructed only after the iterations end. Each party can then construct its own object recommendation model and make more accurate object recommendations, while the security of private data during joint training is effectively protected.
Corresponding to the training method, the embodiment of the specification also discloses a training device. Fig. 8 is a schematic structural diagram of an apparatus for a multi-party joint training object recommendation model for protecting data privacy according to an embodiment, where multiple parties are N parties, and the N parties jointly maintain a total scoring matrix of P objects by M users, and the apparatus is integrated in any kth party of the N parties. The apparatus may be implemented by any computing unit or server cluster having computing, processing capabilities.
As shown in FIG. 8, the apparatus 800 includes:
a score fragment obtaining unit 810, configured to obtain the kth scoring matrix fragment of the total scoring matrix through secret sharing;
an object fragment obtaining unit 820, configured to obtain the kth object matrix fragment of the initialized object feature matrix of the P objects through secret sharing;
a user fragment obtaining unit 830, configured to obtain the kth user matrix fragment of the initialized user feature matrix of the M users through secret sharing;
an iterative update unit 840, configured to perform multiple iterative updates, where any one iterative update is performed by the following modules:
a multi-fragment obtaining module 841, configured to, for any first user and first object, acquire the kth scoring fragment of the first user's score for the first object from the kth scoring matrix fragment; acquire the kth object feature fragment of the first object from the kth object matrix fragment of the previous iteration; and acquire the kth user feature fragment of the first user from the kth user matrix fragment of the previous iteration;
an error fragment calculation module 842, configured to perform a secret sharing matrix operation with the other N-1 parties based on the kth user feature fragment and the kth object feature fragment to obtain the kth similarity fragment of the feature vector similarity of the first user and the first object, and to take the difference between the kth similarity fragment and the kth scoring fragment as the kth error fragment;
a user gradient fragment calculation module 843, configured to calculate a user feature update gradient by performing a secret sharing matrix operation with the other N-1 parties based on the kth error fragment and the kth object feature fragment, to obtain the kth user gradient fragment;
an object gradient calculation module 844, configured to calculate an object feature update gradient by performing a secret sharing matrix operation with the other N-1 parties based on the kth error fragment and the kth user feature fragment, to obtain the kth object gradient fragment;
a user fragment updating module 845, configured to update the kth user feature fragment according to the kth user gradient fragment;
an object fragment updating module 846, configured to update the kth object feature fragment according to the kth object gradient fragment.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix; the score segment obtaining unit 810 is specifically configured to: splitting the kth scoring submatrix into N fragments, reserving the kth fragment, and correspondingly sending the other N-1 fragments to other N-1 parties; and receiving k-th fragments of other scoring matrixes from other N-1 parties; and splicing the kth fragment of the kth scoring submatrix with the kth fragment of other scoring submatrixes to form the kth scoring matrix fragment.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix, and the kth scoring submatrix consists of the scores given by the M users to the $P_k$ objects of the kth party.
In a specific embodiment, the object fragment obtaining unit 820 is specifically configured to: initialize $P_k$ object feature vectors corresponding to the $P_k$ objects to form a kth object submatrix; split the kth object submatrix into N fragments through secret sharing, keep the kth fragment, and send the other N-1 fragments to the corresponding other N-1 parties; receive the kth fragments of the initialized other object submatrices from the other N-1 parties; and splice the kth fragment of the kth object submatrix with the kth fragments of the other object submatrices to form the kth object matrix fragment.
In a specific embodiment, the kth party is a designated party for initializing the user feature vector; the user segment obtaining unit 830 is specifically configured to: initializing M user characteristic vectors corresponding to M users to form a user characteristic matrix; and splitting the user characteristic matrix into N fragments through secret sharing, reserving the k-th user matrix fragment, and correspondingly sending other N-1 fragments to other N-1 parties.
In a specific embodiment, the apparatus 800 further includes a user matrix reconstructing unit 851 configured to: respectively sending the updated kth user characteristic fragment to other N-1 parties, and receiving the updated other user characteristic fragments from other N-1 parties; and splicing the updated kth user characteristic fragment and the received other user characteristic fragments to form an updated user characteristic matrix.
In a specific embodiment, the apparatus 800 further includes an object sub-matrix reconstructing unit 852 configured to: correspondingly sending the kth fragment belonging to other object sub-matrixes in the updated kth object matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth object sub-matrix from other N-1 parties; and splicing the kth fragment belonging to the kth object sub-matrix in the updated kth object matrix fragment with the received other fragments to form an updated kth object sub-matrix.
In one embodiment, the kth party has a kth scoring submatrix of the total scoring matrix, and the kth scoring submatrix consists of the scores given by the $M_k$ users of the kth party to the P objects.
In a specific embodiment, the user fragment obtaining unit 830 is specifically configured to: initialize $M_k$ user feature vectors corresponding to the $M_k$ users to form a kth user submatrix; split the kth user submatrix into N fragments through secret sharing, keep the kth fragment, and send the other N-1 fragments to the corresponding other N-1 parties; receive the kth fragments of the initialized other user submatrices from the other N-1 parties; and splice the kth fragment of the kth user submatrix with the kth fragments of the other user submatrices to form the kth user matrix fragment.
In a specific embodiment, the kth party is a designated party for initializing the object feature vector; the object fragment obtaining unit 820 is specifically configured to: initializing P object feature vectors corresponding to the P objects to form an object feature matrix; and splitting the object feature matrix into N fragments through secret sharing, reserving the k-th object matrix fragment, and correspondingly sending the other N-1 fragments to other N-1 parties.
In a specific embodiment, the apparatus 800 further comprises an object matrix reconstruction unit 861 configured to: respectively sending the updated kth object feature fragment to other N-1 parties, and receiving the updated other object feature fragments from other N-1 parties; and splicing the updated kth object feature fragment and the received other object feature fragments to form an updated object feature matrix.
In a specific embodiment, the apparatus 800 further includes a user sub-matrix reconstruction unit 862 configured to: correspondingly sending the kth fragment belonging to other user sub-matrixes in the updated kth user matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth user sub-matrix from other N-1 parties; and splicing the kth fragment belonging to the kth user sub-matrix in the updated kth user matrix fragment with the received other fragments to form an updated kth user sub-matrix.
In summary, with the apparatus disclosed in the embodiments of this specification, the multiple parties do not exchange score data, object data or user data in plaintext. In addition, the user feature matrix and the object feature matrix are split into feature fragments, each party maintains and iteratively updates only its own fragments, and the feature matrices are reconstructed only after the iterations end, so that each party can construct its own object recommendation model and make more accurate object recommendations.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 6.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method described in connection with fig. 6.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The embodiments above describe the objects, technical solutions and advantages of the present invention in further detail. It should be understood that they are only exemplary embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, improvement and the like made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.
Claims (24)
1. A method for protecting a multi-party joint training object recommendation model of data privacy is disclosed, wherein the multi-party is N parties, the N parties jointly maintain total scoring matrixes of M users for P objects, the method is executed by any kth party in the N parties, and the method comprises the following steps:
acquiring a kth scoring matrix fragment of the total scoring matrix through secret sharing;
acquiring the k-th object matrix fragment of the initialized object feature matrix of the P objects and the k-th user matrix fragment of the initialized user feature matrix of the M users through secret sharing;
performing a plurality of iterative updates, wherein any iterative update comprises:
for any first user and any first object, acquiring a kth scoring fragment of the first object from the kth scoring matrix fragment by the first user; acquiring a kth object characteristic fragment of the first object from the kth object matrix fragment of the previous iteration; acquiring a kth user characteristic fragment of the first user from the kth user matrix fragment of the last iteration;
based on the kth user feature fragment and the kth object feature fragment, performing secret sharing matrix operation with other N-1 parties to obtain a kth similarity fragment of the feature vector similarity of the first user and the first object, and taking the difference between the kth similarity fragment and the kth scoring fragment as a kth error fragment;
calculating a user characteristic updating gradient by performing secret sharing matrix operation with other N-1 parties based on the kth error fragment and the kth object characteristic fragment to obtain a kth user gradient fragment;
calculating an object characteristic updating gradient by performing secret sharing matrix operation with other N-1 parties based on the kth error fragment and the kth user characteristic fragment to obtain a kth object gradient fragment;
updating the kth user characteristic fragment according to the kth user gradient fragment; updating the kth object feature fragment according to the kth object gradient fragment;
the kth scoring matrix fragment, the kth object matrix fragment and the kth user matrix fragment are all matrix fragments owned by the kth party;
the kth party has a kth scoring submatrix of the total scoring matrix; the acquiring of the kth scoring matrix fragment of the total scoring matrix through secret sharing comprises:
splitting the kth scoring submatrix into N fragments, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other scoring submatrices;
and splicing the kth fragment of the kth scoring submatrix with the kth fragments of other scoring submatrixes to form the kth scoring matrix fragment.
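The iterative update recited in claim 1 can be illustrated with a minimal single-process simulation. This is a sketch under stated assumptions, not the claimed implementation: it assumes additive secret sharing over floating-point numbers, a dot product as the similarity measure, a plain gradient step with a hypothetical learning rate `lr`, and products computed with Beaver triples handed out by a simulated dealer. The claim itself only recites a "secret sharing matrix operation" and does not specify any of these choices. The names `share`, `reveal`, `shared_mul` and `update_step` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def share(x, n):
    """Split x into n additive shares whose sum equals x."""
    x = np.asarray(x, dtype=float)
    masks = [rng.normal(size=x.shape) for _ in range(n - 1)]
    return masks + [x - sum(masks)]

def reveal(shares):
    """Reconstruct a value by summing all parties' shares."""
    return sum(shares)

def shared_mul(xs, ys):
    """Elementwise product of two shared values using a Beaver triple.

    Assumption: the triple (a, b, a*b) comes from a simulated dealer.
    """
    n = len(xs)
    a = rng.normal(size=np.shape(xs[0]))
    b = rng.normal(size=np.shape(ys[0]))
    a_s, b_s, c_s = share(a, n), share(b, n), share(a * b, n)
    d = reveal([x - ai for x, ai in zip(xs, a_s)])  # opened value x - a
    e = reveal([y - bi for y, bi in zip(ys, b_s)])  # opened value y - b
    zs = [ci + d * bi + e * ai for ci, ai, bi in zip(c_s, a_s, b_s)]
    zs[0] = zs[0] + d * e  # the public term is added by one party only
    return zs

def update_step(u_s, v_s, r_s, lr=0.1):
    """One update for a (user, object) pair; each argument is a list of N shares."""
    sim_s = [p.sum() for p in shared_mul(u_s, v_s)]  # shares of the dot product u . v
    err_s = [s - r for s, r in zip(sim_s, r_s)]      # error share = similarity share - score share
    gu_s = shared_mul(err_s, v_s)                    # shares of err * v (user gradient)
    gv_s = shared_mul(err_s, u_s)                    # shares of err * u (object gradient)
    new_u = [u - lr * g for u, g in zip(u_s, gu_s)]  # each party updates its user share locally
    new_v = [v - lr * g for v, g in zip(v_s, gv_s)]  # each party updates its object share locally
    return new_u, new_v

# Usage: three parties, 4-dimensional feature vectors, one observed score.
u, v, r = rng.normal(size=4), rng.normal(size=4), 3.0
new_u_s, new_v_s = update_step(share(u, 3), share(v, 3), share(r, 3))
```

Summing the returned shares gives the same result as the plaintext step `u - lr * (u @ v - r) * v`, while no single share reveals `u`, `v` or `r`. A practical system would more likely use fixed-point arithmetic over a finite ring and a triple-generation protocol that avoids a trusted dealer.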
2. The method of claim 1, wherein the kth scoring submatrix is composed of the scores of the M users for P_k objects in the kth party.
3. The method of claim 2, wherein acquiring the kth object matrix fragment of the initialized object feature matrix of the P objects comprises:
initializing P_k object feature vectors corresponding to the P_k objects to form a kth object submatrix;
splitting the kth object submatrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other initialized object submatrices;
and splicing the kth fragment of the kth object sub-matrix and the kth fragments of other object sub-matrices to form the kth object matrix fragment.
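The split, exchange and splice steps of claim 3 can be sketched as follows. This is a hedged illustration that simulates all N parties in one process and assumes additive sharing with random floating-point masks; the claim does not fix the sharing scheme. The function names `split_additive` and `build_object_fragments` are hypothetical.

```python
import numpy as np

def split_additive(matrix, n, rng):
    """Split a matrix into n additive shares whose sum equals the matrix."""
    masks = [rng.normal(size=matrix.shape) for _ in range(n - 1)]
    return masks + [matrix - sum(masks)]

def build_object_fragments(submatrices, rng):
    """Return, for each party k, its k-th object matrix fragment."""
    n = len(submatrices)
    # shares[j][k] is the k-th fragment of party j's object submatrix.
    shares = [split_additive(s, n, rng) for s in submatrices]
    # Party k keeps shares[k][k], receives shares[j][k] from every other party j,
    # and splices them row-wise in party order.
    return [np.concatenate([shares[j][k] for j in range(n)], axis=0)
            for k in range(n)]

rng = np.random.default_rng(1)
# Three parties holding P_k = 2, 3 and 1 objects, each with a 4-dimensional feature vector.
subs = [rng.normal(size=(p_k, 4)) for p_k in (2, 3, 1)]
fragments = build_object_fragments(subs, rng)
```

Each fragment has the shape of the full P x d object feature matrix, and the fragments sum to the row-wise concatenation of all object submatrices, so every party holds one share of every object's feature vector.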
4. The method of claim 2, wherein the kth party is a designated party that initializes user feature vectors; and acquiring the kth user matrix fragment of the initialized user feature matrix of the M users comprises:
initializing M user characteristic vectors corresponding to M users to form a user characteristic matrix;
and splitting the user characteristic matrix into N fragments through secret sharing, reserving the k-th user matrix fragment, and correspondingly sending other N-1 fragments to other N-1 parties.
5. The method of claim 2, wherein after the performing a plurality of iterative updates, the method further comprises:
respectively sending the updated kth user characteristic fragment to other N-1 parties, and receiving the updated other user characteristic fragments from other N-1 parties;
and splicing the updated kth user characteristic fragment and the received updated other user characteristic fragments to form an updated user characteristic matrix.
6. The method of claim 3, wherein after the performing a plurality of iterative updates, the method further comprises:
correspondingly sending the kth fragment belonging to other object sub-matrixes in the updated kth object matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth object sub-matrix from other N-1 parties;
and splicing the kth fragment belonging to the kth object sub-matrix in the updated kth object matrix fragment with the received updated other fragments to form an updated kth object sub-matrix.
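The recovery step of claim 6 can be sketched as below. This assumes additive sharing, under which "splicing" the received fragments amounts to an elementwise sum; that reading is an assumption, not something the claim states. It also assumes each party's objects occupy a contiguous block of rows in party order. The helper `recover_submatrix` and the 0-based party index `k` are illustrative.

```python
import numpy as np

def recover_submatrix(fragments, row_counts, k):
    """Sum every party's share of the rows belonging to party k's object submatrix."""
    start = sum(row_counts[:k])
    stop = start + row_counts[k]
    # fragments[k][start:stop] is held locally; the other slices arrive from the other parties.
    return sum(f[start:stop] for f in fragments)

rng = np.random.default_rng(2)
row_counts = [2, 3, 1]                          # P_k for three parties
full = rng.normal(size=(sum(row_counts), 4))    # stands in for the updated object feature matrix
masks = [rng.normal(size=full.shape) for _ in range(2)]
fragments = masks + [full - sum(masks)]         # additive fragments held by the three parties
sub_1 = recover_submatrix(fragments, row_counts, k=1)
```

Only the rows of party k's own objects are reconstructed at party k, so the updated feature vectors of objects held by other parties stay hidden from it.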
7. The method of claim 1, wherein the kth party has a kth scoring submatrix of the total scoring matrix, the kth scoring submatrix being composed of the scores of M_k users in the kth party for the P objects.
8. The method of claim 7, wherein acquiring the kth user matrix fragment of the initialized user feature matrix of the M users comprises:
initializing M_k user feature vectors corresponding to the M_k users to form a kth user submatrix;
splitting the kth user submatrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other initialized user submatrices;
and splicing the kth fragment of the kth user sub-matrix with the kth fragments of other user sub-matrices to form the kth user matrix fragment.
9. The method of claim 7, wherein the kth party is a designated party that initializes object feature vectors; and acquiring the kth object matrix fragment of the initialized object feature matrix of the P objects comprises:
initializing P object feature vectors corresponding to the P objects to form an object feature matrix;
and splitting the object feature matrix into N fragments through secret sharing, reserving the k-th object matrix fragment, and correspondingly sending the other N-1 fragments to other N-1 parties.
10. The method of claim 7, wherein after the performing a plurality of iterative updates, the method further comprises:
respectively sending the updated kth object feature fragment to other N-1 parties, and receiving the updated other object feature fragments from other N-1 parties;
and splicing the updated kth object feature fragment and the received updated other object feature fragments to form an updated object feature matrix.
11. The method of claim 8, wherein after the performing a plurality of iterative updates, the method further comprises:
correspondingly sending the kth fragment belonging to other user sub-matrixes in the updated kth user matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth user sub-matrix from other N-1 parties;
and splicing the kth fragment belonging to the kth user sub-matrix in the updated kth user matrix fragment with the received updated other fragments to form an updated kth user sub-matrix.
12. A device for multi-party joint training of an object recommendation model with data privacy protection, wherein the multiple parties are N parties that jointly maintain a total scoring matrix of M users for P objects, the device being integrated in any kth party of the N parties and comprising:
the scoring fragment acquisition unit is configured to acquire the kth scoring matrix fragment of the total scoring matrix through secret sharing;
the object fragment acquisition unit is configured to acquire the k-th object matrix fragment of the initialized object feature matrix of the P objects through secret sharing;
the user fragment acquisition unit is configured to acquire initialized kth user matrix fragments of the user feature matrices of the M users through secret sharing;
an iterative update unit configured to perform a plurality of iterative updates, wherein any one iterative update is performed by:
a multi-fragment acquisition module configured to, for any first user and any first object, acquire, from the kth scoring matrix fragment, a kth scoring fragment of the first user's score for the first object; acquire a kth object feature fragment of the first object from the kth object matrix fragment of the previous iteration; and acquire a kth user feature fragment of the first user from the kth user matrix fragment of the previous iteration;
the error fragment calculation module is configured to perform secret sharing matrix operation with other N-1 parties based on the kth user feature fragment and the kth object feature fragment to obtain a kth similarity fragment of feature vector similarity of the first user and the first object, and taking the difference between the kth similarity fragment and the kth score fragment as a kth error fragment;
the user gradient fragment calculation module is configured to calculate a user characteristic update gradient by performing secret sharing matrix operation with other N-1 parties based on the kth error fragment and the kth object characteristic fragment to obtain a kth user gradient fragment;
the object gradient calculation module is configured to calculate an object characteristic update gradient by performing secret sharing matrix operation with other N-1 parties based on the kth error fragment and the kth user characteristic fragment to obtain a kth object gradient fragment;
a user segment updating module configured to update the kth user feature segment according to the kth user gradient segment;
an object fragment updating module configured to update the kth object feature fragment according to the kth object gradient fragment;
the kth scoring matrix fragment, the kth object matrix fragment and the kth user matrix fragment are all matrix fragments owned by the kth party;
the kth party has a kth scoring submatrix of the total scoring matrix; the score fragment acquisition unit is specifically configured to:
splitting the kth scoring submatrix into N fragments, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other scoring submatrices;
and splicing the kth fragment of the kth scoring submatrix with the kth fragments of other scoring submatrixes to form the kth scoring matrix fragment.
13. The apparatus of claim 12, wherein the kth scoring submatrix is composed of the scores of the M users for P_k objects in the kth party.
14. The apparatus according to claim 13, wherein the object slice obtaining unit is specifically configured to:
initializing P_k object feature vectors corresponding to the P_k objects to form a kth object submatrix;
splitting the kth object submatrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other initialized object submatrices;
and splicing the kth fragment of the kth object sub-matrix and the kth fragments of other object sub-matrices to form the kth object matrix fragment.
15. The apparatus of claim 13, wherein the kth party is a designated party that initializes a user feature vector; the user fragment acquiring unit is specifically configured to:
initializing M user characteristic vectors corresponding to M users to form a user characteristic matrix;
and splitting the user characteristic matrix into N fragments through secret sharing, reserving the k-th user matrix fragment, and correspondingly sending other N-1 fragments to other N-1 parties.
16. The apparatus of claim 13, wherein the apparatus further comprises a user matrix reconstruction unit configured to:
respectively sending the updated kth user characteristic fragment to other N-1 parties, and receiving the updated other user characteristic fragments from other N-1 parties;
and splicing the updated kth user characteristic fragment and the received updated other user characteristic fragments to form an updated user characteristic matrix.
17. The apparatus according to claim 14, wherein the apparatus further comprises an object sub-matrix reconstruction unit configured to:
correspondingly sending the kth fragment belonging to other object sub-matrixes in the updated kth object matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth object sub-matrix from other N-1 parties;
and splicing the kth fragment belonging to the kth object sub-matrix in the updated kth object matrix fragment with the received updated other fragments to form an updated kth object sub-matrix.
18. The apparatus of claim 12, wherein the kth party has a kth scoring submatrix of the total scoring matrix, the kth scoring submatrix being composed of the scores of M_k users in the kth party for the P objects.
19. The apparatus according to claim 18, wherein the user slice acquiring unit is specifically configured to:
initializing M_k user feature vectors corresponding to the M_k users to form a kth user submatrix;
splitting the kth user submatrix into N fragments through secret sharing, retaining the kth fragment, and sending the other N-1 fragments to the other N-1 parties respectively; and receiving, from the other N-1 parties, the kth fragments of the other initialized user submatrices;
and splicing the kth fragment of the kth user sub-matrix with the kth fragments of other user sub-matrices to form the kth user matrix fragment.
20. The apparatus of claim 18, wherein the kth party is a designated party that initializes an object feature vector; the object fragment obtaining unit is specifically configured to:
initializing P object feature vectors corresponding to the P objects to form an object feature matrix;
and splitting the object feature matrix into N fragments through secret sharing, reserving the k-th object matrix fragment, and correspondingly sending the other N-1 fragments to other N-1 parties.
21. The apparatus according to claim 19, wherein the apparatus further comprises an object matrix reconstruction unit configured to:
respectively sending the updated kth object feature fragment to other N-1 parties, and receiving the updated other object feature fragments from other N-1 parties;
and splicing the updated kth object feature fragment and the received updated other object feature fragments to form an updated object feature matrix.
22. The apparatus of claim 19, wherein the apparatus further comprises a user sub-matrix reconstruction unit configured to:
correspondingly sending the kth fragment belonging to other user sub-matrixes in the updated kth user matrix fragment to other N-1 parties, and receiving the updated other fragments belonging to the kth user sub-matrix from other N-1 parties;
and splicing the kth fragment belonging to the kth user sub-matrix in the updated kth user matrix fragment with the received updated other fragments to form an updated kth user sub-matrix.
23. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed in a computer, causes the computer to perform the method of any of claims 1-11.
24. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that when executed by the processor implements the method of any of claims 1-11.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010384206.4A CN111291417B (en) | 2020-05-09 | 2020-05-09 | Method and device for protecting data privacy of multi-party combined training object recommendation model |
PCT/CN2021/092179 WO2021227959A1 (en) | 2020-05-09 | 2021-05-07 | Data privacy protected multi-party joint training of object recommendation model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010384206.4A CN111291417B (en) | 2020-05-09 | 2020-05-09 | Method and device for protecting data privacy of multi-party combined training object recommendation model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111291417A (en) | 2020-06-16 |
CN111291417B (en) | 2020-08-28 |
Family ID: 71029680
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010384206.4A Active CN111291417B (en) | 2020-05-09 | 2020-05-09 | Method and device for protecting data privacy of multi-party combined training object recommendation model |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111291417B (en) |
WO (1) | WO2021227959A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111291417B (en) * | 2020-05-09 | 2020-08-28 | 支付宝(杭州)信息技术有限公司 | Method and device for protecting data privacy of multi-party combined training object recommendation model |
CN112016698A (en) * | 2020-08-28 | 2020-12-01 | 深圳前海微众银行股份有限公司 | Factorization machine model construction method and device and readable storage medium |
CN112800466B (en) * | 2021-02-10 | 2022-04-22 | 支付宝(杭州)信息技术有限公司 | Data processing method and device based on privacy protection and server |
CN113094739B (en) * | 2021-03-05 | 2022-04-22 | 支付宝(杭州)信息技术有限公司 | Data processing method and device based on privacy protection and server |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10216954B2 (en) * | 2016-06-27 | 2019-02-26 | International Business Machines Corporation | Privacy detection of a mobile application program |
CN107145792B (en) * | 2017-04-07 | 2020-09-15 | 哈尔滨工业大学深圳研究生院 | Multi-user privacy protection data clustering method and system based on ciphertext data |
CN107563841B (en) * | 2017-08-03 | 2021-02-05 | 电子科技大学 | Recommendation system based on user score decomposition |
CN109034398B (en) * | 2018-08-10 | 2023-09-12 | 深圳前海微众银行股份有限公司 | Gradient lifting tree model construction method and device based on federal training and storage medium |
WO2020077573A1 (en) * | 2018-10-17 | 2020-04-23 | Alibaba Group Holding Limited | Secret sharing with no trusted initializer |
CN109902109B (en) * | 2019-02-20 | 2021-04-30 | 北京邮电大学 | Multi-party collaborative data mining method and device |
CN111079022B (en) * | 2019-12-20 | 2023-10-03 | 深圳前海微众银行股份有限公司 | Personalized recommendation method, device, equipment and medium based on federal learning |
CN111291417B (en) * | 2020-05-09 | 2020-08-28 | 支付宝(杭州)信息技术有限公司 | Method and device for protecting data privacy of multi-party combined training object recommendation model |
- 2020-05-09: CN application CN202010384206.4A (patent CN111291417B), status: Active
- 2021-05-07: WO application PCT/CN2021/092179 (WO2021227959A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN111291417A (en) | 2020-06-16 |
WO2021227959A1 (en) | 2021-11-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111291417B (en) | Method and device for protecting data privacy of multi-party combined training object recommendation model | |
JP6615362B2 (en) | Method and apparatus for obtaining user caricature | |
CN111738361B (en) | Joint training method and device for business model | |
CN112818290B (en) | Method and device for determining object feature correlation in privacy data by multiparty combination | |
WO2015148422A1 (en) | Recommendation system with dual collaborative filter usage matrix | |
CN110896488B (en) | Recommendation method for live broadcast room and related equipment | |
CN108062692B (en) | Recording recommendation method, device, equipment and computer readable storage medium | |
EP3806070A1 (en) | Secret aggregate function calculation system, secret calculation device, secret aggregate function calculation method, and program | |
CN112396456A (en) | Advertisement pushing method and device, storage medium and terminal | |
CN111260449A (en) | Model training method, commodity recommendation device and storage medium | |
CN108022144A (en) | The method and device of data object information is provided | |
CN111797319B (en) | Recommendation method, recommendation device, recommendation equipment and storage medium | |
CN114691167A (en) | Method and device for updating machine learning model | |
CN111582979A (en) | Clothing matching recommendation method and device and electronic equipment | |
WO2015153240A1 (en) | Directed recommendations | |
CN112016698A (en) | Factorization machine model construction method and device and readable storage medium | |
CN113271486B (en) | Interactive video processing method, device, computer equipment and storage medium | |
US11676015B2 (en) | Personalized recommendations using a transformer neural network | |
TW201926087A (en) | Question pushing method and device | |
Wang et al. | QPIN: a quantum-inspired preference interactive network for E-commerce recommendation | |
US20180039712A1 (en) | Systems and methods for matching users | |
US20210357955A1 (en) | User search category predictor | |
US20220067052A1 (en) | Providing dynamically customized rankings of game items | |
CN113407988A (en) | Method and device for determining effective value of service data characteristic of control traffic | |
CN110232393B (en) | Data processing method and device, storage medium and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40029452; Country of ref document: HK |