CN112449009A - SVD-based federated learning recommendation system communication compression method and device - Google Patents
- Publication number: CN112449009A
- Application number: CN202011274868.2A
- Authority
- CN
- China
- Prior art keywords
- data
- classification
- uploaded
- gradient
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/55—Push-based network services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention provides an SVD (singular value decomposition) based communication compression method and system for a federated learning recommendation system, wherein the method comprises the following steps: acquiring the gradient data to be uploaded by the current client; grouping the gradient data to be uploaded based on the target-number order and a preset number of target numbers to obtain multiple groups of gradient data; clustering each group row-wise with a preset clustering algorithm to obtain classification data with classification labels, and determining the classification label corresponding to each target number; generating classification label data according to each target number's classification label and the target-number order; and sending the classification data and the classification label data to a server. Data compression is thus achieved by clustering the data to be uploaded, improving the data compression ratio while reducing the influence of compression on the accuracy of the recommendation system.
Description
Technical Field
The invention relates to the technical field of computer network applications, and in particular to an SVD-based communication compression method and device for a federated learning recommendation system.
Background
After years of development, recommendation systems have become increasingly intelligent: they can learn people's preferences comprehensively and recommend accurately. With the popularization of smartphones, the number of network users has again grown explosively, and traditional recommendation systems must face problems such as server resource shortages and insufficient computing capacity. In addition, to recommend more accurately, a recommendation system may collect a wide range of user information, and a user terminal such as a mobile phone stores a large amount of user information, including private personal content. If this information is not protected, security problems such as privacy leakage easily occur.
Based on the above problems, the concept of federated learning based on model averaging has been proposed. The training step is moved to the user side, so the user does not need to upload personal information to a server and only needs to upload the trained gradient. This approach addresses both user privacy protection and the shortage of server computing resources. In the SVD-based federated learning recommendation system, however, the volume of gradient data to be uploaded is large while the upload bandwidth of a user terminal such as a mobile phone is limited; if the gradient data is uploaded directly without compression, transmission efficiency suffers greatly. Existing communication compression methods mainly include random masking, rank reduction, and deep gradient compression; however, when these methods are applied to the SVD-based federated learning recommendation system, they either compress the data poorly or degrade the accuracy of the system's recommendation model.
Disclosure of Invention
In view of this, the embodiment of the present invention provides a communication compression method and apparatus for an SVD-based federated learning recommendation system, so as to overcome the lack, in the prior art, of a communication compression method applicable to an SVD-based federated learning recommendation system.
The embodiment of the invention provides a communication compression method of a federated learning recommendation system based on SVD, which is applied to a client and comprises the following steps:
acquiring gradient data to be uploaded of a current client;
based on the target number of the gradient data to be uploaded, clustering the gradient data to be uploaded by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification label corresponding to each target number;
generating classification label data according to the classification label corresponding to each target number and the target number sequence;
and sending the classification data and the classification label data to a server.
Optionally, the obtaining of gradient data to be uploaded at the current client includes:
acquiring local gradient data of the current client, and receiving previous round of global gradient data fed back by the server;
and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain the gradient data to be uploaded.
Optionally, the clustering, based on the target number of the gradient data to be uploaded, the gradient data to be uploaded by using a preset clustering algorithm to obtain classification data with classification tags, and determining the classification tag corresponding to each target number includes:
grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded;
and based on the target numbers of the gradient data to be uploaded, clustering each group of gradient data to be uploaded by adopting a preset clustering algorithm to obtain classification data with classification labels, and determining the classification labels corresponding to the target numbers.
Optionally, the gradient data to be uploaded is matrix data with target numbers, and the preset number of target numbers is obtained as follows:
acquiring the numbers of rows and columns of the matrix data and the target compression ratio;
and calculating the preset number of target numbers from the numbers of rows and columns and the target compression ratio.
Optionally, the classification data with classification labels includes: first classification data with a first classification tag and second classification data with a second classification tag.
Optionally, the generating of the classification label data according to the target-number order and the classification labels corresponding to the target numbers includes:
sequentially acquiring classification labels corresponding to 32 target numbers according to the target number sequence to combine into 32-bit binary data;
and sequentially converting the 32-bit binary data into Int type data to generate the classification label data.
Optionally, the preset number of target numbers is an integer multiple of 32.
The embodiment of the invention also provides a communication compression apparatus for the SVD-based federated learning recommendation system, which is applied to a client and comprises:
the acquisition module is used for acquiring gradient data to be uploaded of the current client;
the first processing module is used for clustering the gradient data to be uploaded by adopting a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classified data with a classified label and determining the classified label corresponding to each target number;
the second processing module is used for generating classification label data according to the classification labels corresponding to the target numbers and the target number sequence;
and the third processing module is used for sending the classification data and the classification label data to a server.
An embodiment of the present invention further provides an electronic device, comprising a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions and the processor executes the computer instructions to perform the communication compression method for the SVD-based federated learning recommendation system provided by the embodiment of the invention.
The embodiment of the invention also provides a computer-readable storage medium storing computer instructions for causing a computer to execute the communication compression method for the SVD-based federated learning recommendation system provided by the embodiment of the invention.
The technical scheme of the invention has the following advantages:
the embodiment of the invention provides a communication compression method and a communication compression system of a federal learning recommendation system based on SVD (singular value decomposition), wherein gradient data to be uploaded of a current client side are obtained; grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded; clustering each group of gradient data to be uploaded according to rows by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification labels corresponding to the target numbers; generating classification label data according to the classification label corresponding to each target number and the target number sequence; and sending the classification data and the classification label data to a server. Therefore, data compression is achieved by clustering the data to be uploaded, and by uploading the classified data with the classified labels and the classified label data containing the classified labels corresponding to the target numbers and the target number sequence, the server can achieve higher reduction degree of the gradient data reduced by the classified labels, so that the accuracy of the recommendation model finally generated by the recommendation system is guaranteed, and the influence of data compression on the accuracy of the recommendation system is reduced under the condition of improving the data compression rate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a communication compression method of a federated learning recommendation system based on SVD in an embodiment of the present invention;
FIG. 2 is a diagram illustrating a processing result of gradient data to be uploaded according to an embodiment of the present invention;
FIG. 3 is another diagram illustrating the processing result of gradient data to be uploaded according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a communication compression device of the SVD-based federated learning recommendation system in an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical features mentioned in the different embodiments of the invention described below can be combined with each other as long as they do not conflict with each other.
At present, the existing communication compression methods mainly include random mask, rank reduction, deep gradient compression, etc., wherein,
reducing the rank: the basic idea of the method for optimizing the communication between federal learning provided by Google is to take a matrix to be uploaded as a product of two small matrices, wherein one of the two small matrices is generated by using a random seed, and the other small matrix is used as upload data. If the original matrix to be uploaded is:assuming that the maximum rank of the matrix H is k (k is a fixed value) ("k")]It can therefore be assumed that the matrix H is the product of two matrices AB, namely:
however, in the Federal SVD recommendation system, the gradient Qi required to be uploaded each time is a vector of 200k (15-30), that is, the rank of the matrix H in the above formula is at most 30, and if k is set to be equal to 30, the compression effect is not achieved, and if k is less than 30, then (d 2-k) d1 pieces of information are necessarily lost. This method is not well suited for this recommendation system.
Random masking: similar to rank reduction, this method uploads a smaller matrix in place of the original, except that it was originally proposed for compressing sparse matrices. A small fraction of the values in the sparse matrix is randomly selected to represent the whole matrix, so the data to be uploaded can be greatly compressed, and the small error this introduces is entirely acceptable in comparison. For dense matrices, however, this approach is fundamentally infeasible.
Deep gradient compression: for deep neural networks, deep gradient compression performs excellently. It thresholds the gradient vectors, keeping only components above a threshold, which reduces unnecessary parameter transmission between network layers; the components withheld in one round are accumulated into the gradients of the next round, so no training detail is lost. However, this method is not suitable for the federated SVD recommendation system: the clients are independent and their data are isolated, so the gradient of each training round plays an irreplaceable role globally. Each round must upload the complete gradient rather than a threshold-filtered one; otherwise the recommendation accuracy of the final model suffers particularly large errors.
Given the difficulty of applying existing communication compression methods to the SVD-based federated learning recommendation system, the embodiment of the invention provides a communication compression method designed specifically for such a system; as shown in fig. 1, the method mainly comprises the following steps:
step S101: and acquiring gradient data to be uploaded of the current client. Specifically, in the SVD-based federal learning recommendation system, each client uploads gradient data obtained after local data training of the client to the server in each round, the gradient data to be uploaded is in a form of a gradient matrix containing target codes, then the server performs model averaging according to the received gradient matrix of each client to obtain a global gradient matrix, and feeds back the obtained global gradient matrix to each client until the recommendation model of the recommendation system is trained, and recommendation targets are recommended to users through the trained recommendation model, for example: when the recommendation system is used for recommending movies for the user, the target codes are movie numbers of all movies to be recommended, and in the gradient data to be uploaded, the target codes are sorted according to a fixed order, for example, sorted from small to large according to the codes, and correspondingly, the global gradient data fed back by the server are also sorted according to the same order.
Step S102: based on the target numbers of the gradient data to be uploaded, cluster the gradient data using a preset clustering algorithm to obtain classification data with classification labels, and determine the classification label corresponding to each target number. Specifically, in the embodiment of the present invention the preset clustering algorithm is KMeans++, which experiments show yields a good clustering result; in practical applications other clustering algorithms, such as mean-shift clustering, may also be used, and the invention is not limited in this respect. A classification label is assigned to each class of clustered data, so the labels establish the correspondence between the gradient row of each target number and its class. The server can therefore restore the current client's gradient data more accurately from the per-target-number classification labels, reducing the impact on the accuracy of the recommendation model.
Step S103: generate the classification label data according to the classification label corresponding to each target number and the target-number order. Specifically, the classification labels corresponding to the target numbers are arranged in the order of the target numbers to obtain the classification label data, so that the server can determine each target number's classification label directly from the classification label data.
Step S104: and sending the classification data and the classification label data to a server. Specifically, the matrix data formed by the classification data and the classification tag data may be packaged together and then uploaded to the server.
Through the above steps S101 to S104, the communication compression method for the SVD-based federated learning recommendation system provided in the embodiment of the present invention achieves data compression by clustering the data to be uploaded. By uploading the classification data with classification labels together with classification label data containing each target number's label in target-number order, the server can restore the gradient data from the classification labels with high fidelity, ensuring the accuracy of the recommendation model finally generated by the recommendation system and thus reducing the influence of data compression on accuracy while improving the data compression ratio.
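A minimal sketch of how a server might restore a client's gradient under the scheme summarized above, assuming each group's classification data holds two cluster-center rows and the per-row 0/1 labels select between them; all names and values here are illustrative, not taken from the patent.

```python
import numpy as np

def restore_group(centroids, labels):
    """Rebuild one group's delta-gradient rows: row i of the restored
    group is the centroid row selected by labels[i]."""
    return centroids[labels]

# Illustrative group of K=4 rows with N=3 columns.
centroids = np.array([[0.1, 0.2, 0.3],    # class-0 center
                      [0.9, 0.8, 0.7]])   # class-1 center
labels = np.array([0, 1, 1, 0])           # one label per target number
delta = restore_group(centroids, labels)

prev_global = np.zeros((4, 3))            # previous round's global gradient
local_gradient = prev_global + delta      # restored client gradient
```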
Specifically, in an embodiment, the step S101 specifically includes the following steps:
step S201: and acquiring local gradient data of the current client, and receiving the previous round of global gradient data fed back by the server.
Step S202: and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain gradient data to be uploaded.
Specifically, if each round directly used the client's local gradient data as the gradient data to be uploaded, clustering could require a large amount of computation, because the gradient values are uncorrelated and may differ widely. Moreover, in a federated system the gradients of two successive rounds often barely change during training. The difference between the client's current-round local gradient data and the global gradient data fed back by the server in the previous round can therefore be used as the gradient data to be uploaded; after receiving the compressed gradient data, the server restores the client's local gradient data using the previous round's global gradient data it has stored. This speeds up clustering and reduces the computation of the compression process without affecting the training of the recommendation model.
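The difference encoding described above can be sketched as follows, assuming the rows of both matrices are aligned by target number; the values are illustrative.

```python
import numpy as np

local = np.array([[0.50, 0.40],            # current-round local gradient
                  [0.30, 0.20]])
prev_global = np.array([[0.48, 0.41],      # previous-round global gradient
                        [0.29, 0.22]])

# Small-valued deltas cluster more easily than raw gradients.
to_upload = local - prev_global
# Server-side inversion using its stored copy of prev_global.
restored = prev_global + to_upload
```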
Specifically, in an embodiment, the step S102 specifically includes the following steps:
step S301: and grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded. Specifically, since the gradient data to be uploaded is matrix data with a target number, the preset target number is obtained by the following method: acquiring the row and column number and target compression multiplying power of matrix data; and calculating the number of preset target numbers according to the number of rows and columns and the target compression multiplying power. In the embodiment of the present invention, a calculation formula of the compression ratio is shown as formula (1):
where r denotes a compression magnification, N denotes a column number of the matrix data, I denotes a row number (i.e., the number of object numbers) of the matrix data, and K denotes a preset object number.
The above formula is simplified to obtain:
therefore, since the gradient data to be uploaded are known (i.e. the number of rows and columns of the matrix data is determined), the relationship between the compression ratio and the number of the preset target numbers can be obtained through the above formulas (1) and (2), and therefore the number of the preset target numbers can be obtained according to the compression ratio requirement set by the recommendation system. Of course, in practical applications, the number of preset target numbers may also be set empirically, and then the compression ratio may be estimated by the above formulas (1) and (2).
Step S302: based on the target numbers of the gradient data to be uploaded, cluster each group with a preset clustering algorithm to obtain classification data with classification labels, and determine the classification label corresponding to each target number. Specifically, the matrix data may be grouped by target number and each group clustered separately; after all groups are clustered, the per-group classification results are merged to form the classification data, and each target number's classification label within its group is determined. Since the preset number of target numbers (i.e., the number of target numbers per group) is fixed, the classification labels of each group can be represented by natural numbers from 0 to K−1 without data confusion. The result of grouped clustering of the gradient data to be uploaded is shown in fig. 2: to the left of the arrow is the matrix data with target numbers (i.e., the gradient data to be uploaded), whose first column holds the target numbers; to the right of the arrow are, in order, the classification data with classification labels (first column: the labels) and the classification label data composed of each target number's label.
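The grouping and per-group clustering of steps S301 and S302 can be sketched as follows, using scikit-learn's KMeans with k-means++ initialization as a stand-in for the patent's preset clustering algorithm; two clusters per group are assumed (as in the embodiment described later), and all names and shapes are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def compress_groups(grad, K, n_clusters=2):
    """Split the (I x N) delta-gradient matrix into row groups of size K
    (rows ordered by target number) and cluster each group's rows.
    Returns per-group cluster centers (the classification data to upload)
    and per-row labels (the classification labels)."""
    centroids, labels = [], []
    for start in range(0, grad.shape[0], K):
        group = grad[start:start + K]
        km = KMeans(n_clusters=n_clusters, init="k-means++",
                    n_init=10, random_state=0).fit(group)
        centroids.append(km.cluster_centers_)
        labels.append(km.labels_)
    return centroids, labels

rng = np.random.default_rng(0)
grad = rng.standard_normal((64, 5))       # I=64 rows, N=5 columns
cents, labs = compress_groups(grad, K=32) # two groups of 32 rows
```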
Specifically, as can be seen from formula (2) above, since N can be regarded as a constant, r ≈ N when K is much larger than N; but the value of N is fixed, so the compression ratio of this scheme has a small upper bound. The reason is that the number of rows I is too large and each classification label is stored as one 32-bit Int, so the classification label data occupies a large amount of data space. To further improve the data compression ratio, in the embodiment of the present invention the classification data with classification labels is limited to two classes: first classification data with a first classification label and second classification data with a second classification label. Limiting each group's clustering result to two classes reduces the data quantity of each group's classification data and improves the compression ratio. Further, the first classification label is 0, the second classification label is 1, and the preset number of target numbers is an integer multiple of 32. The step S103 then includes the following steps:
step S401: and sequentially acquiring the classification labels corresponding to the 32 target numbers according to the sequence of the target numbers to combine into 32-bit binary data.
Step S402: sequentially convert each 32-bit binary value into Int type data to generate the classification label data. If fewer than 32 target numbers remain, the tail is padded with 0s to form 32 binary bits, which are then converted into Int type data for uploading.
Thus, by limiting the number of classes in each group's clustering result to 2 and representing the classification labels by 0 and 1, every target number's label value is 0 or 1 and occupies only 1 bit; the binary labels of each group's target numbers are combined, 32 bits at a time, into corresponding Int type data for uploading. This solves the problem of the classification label data being too large in the earlier scheme, reducing the communication volume during transmission and improving the compression ratio.
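The 32-bit packing of steps S401 and S402 can be sketched as follows; `pack_labels` and `unpack_labels` are hypothetical helper names, with tail zero-padding as described above.

```python
def pack_labels(labels):
    """Pack a sequence of 0/1 classification labels into 32-bit integers,
    zero-padding the final chunk to a full 32 bits."""
    packed = []
    for start in range(0, len(labels), 32):
        chunk = list(labels[start:start + 32])
        chunk += [0] * (32 - len(chunk))          # pad tail with 0s
        value = 0
        for bit in chunk:
            value = (value << 1) | bit            # MSB-first packing
        packed.append(value)
    return packed

def unpack_labels(packed, count):
    """Recover the first `count` labels from the packed integers."""
    bits = []
    for value in packed:
        bits.extend((value >> (31 - i)) & 1 for i in range(32))
    return bits[:count]
```

Each label now costs 1 bit instead of a 32-bit Int, which is exactly the saving the serialized scheme relies on.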
On the basis of the grouped clustering shown in fig. 2, the processing result of serialized grouped clustering is shown in fig. 3. The compression ratio is now calculated as shown in formula (3):

r = 32·N·I / (32·(2·N·I/K) + I)    (3)

where r denotes the compression ratio, N denotes the number of columns of the matrix data, I denotes the number of rows (i.e., the number of target numbers), and K denotes the preset number of target numbers; the classification label data now costs only 1 bit per row instead of 32.

The above formula simplifies to:

r = 32·N·K / (64·N + K)    (4)

Since the classification labels corresponding to the target numbers are combined in groups of 32 bits, the value of K is a multiple of 32, that is,

K = 2^n (n is a positive integer of 5 or more)
Theoretically, assuming N = 20, when K = 4096 the compression ratio is r ≈ 487, which is very good. Experiments show that the accuracy of the compressed recommendation system differs only slightly from that of the uncompressed system, and the error is within an acceptable range.
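As a check, the serialized compression ratio can be computed as follows. The formula is an assumption reconstructed to match the reported theoretical ratios (two 32-bit cluster-center rows per K-row group plus 1 label bit per row), not quoted verbatim from the source.

```python
def serialized_compression_ratio(N, K):
    """Reconstructed ratio for the serialized scheme: 32*I*N original bits
    versus 32*(2*N*I/K) bits of cluster-center data plus I bits of packed
    labels; I cancels out of the ratio."""
    return 32 * N * K / (64 * N + K)

r = serialized_compression_ratio(N=20, K=4096)
```

With N = 20 this reproduces the reported values for K = 512 through 4096, including r ≈ 487 at K = 4096.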
Under different numbers of clients, the local gradient data to be uploaded by each client was compressed with the communication compression method for the SVD-based federated learning recommendation system provided by the embodiment of the invention (serialized grouped clustering for short) before uploading, and compared against direct uploading without compression. The specific results are shown in Table 1. They show that the compression method causes the system recommendation error (RMSE) to increase slightly, i.e., the recommendation accuracy of the model deviates slightly, but the deviation values are small and acceptable in view of the gradient compression ratio achieved.
Table 1: influence of compression algorithm on accuracy under different number of clients
In addition, comparison experiments were run with different preset numbers of target numbers (i.e., K values), again compressing each client's local gradient data with the communication compression method provided by the embodiment of the invention (serialized grouped clustering for short) before uploading; the results are shown in Table 2. They show that increasing K increases the compression factor but costs the recommendation system some accuracy. The compression method greatly compresses the uploaded gradient data while keeping the RMSE deviation within the system's acceptable range, and is therefore superior to existing compression methods.
Table 2: different K-value corresponding compression ratio and RMSE
K value | 0 (uncompressed) | 512 | 1024 | 2048 | 4096 |
Convergence time (s) | 937 | 1909 | 1767 | 1694 | 1605 |
Average RMSE | 0.8036 | 0.8074 | 0.8075 | 0.8072 | 0.8068 |
Theoretical compression ratio | - | 183 | 284 | 393 | 487 |
Actual compression ratio | - | 182 | 281 | 387 | 478 |
The communication compression method for the SVD-based federated learning recommendation system provided by the embodiment of the invention improves the data compression ratio while reducing the influence of data compression on the accuracy of the system's recommendation model, thereby also accelerating the convergence of recommendation model training.
An embodiment of the invention also provides an SVD-based federated learning recommendation system communication compression device. As shown in fig. 4, the device comprises:
the obtaining module 101 is configured to obtain gradient data to be uploaded of a current client. For details, refer to the related description of step S101 in the above method embodiment.
The first processing module 102 is configured to cluster the gradient data to be uploaded using a preset clustering algorithm, based on the target numbers of the gradient data to be uploaded, to obtain classification data with classification labels, and to determine the classification label corresponding to each target number. For details, refer to the related description of step S102 in the above method embodiment.
The second processing module 103 is configured to generate classification label data according to the classification label corresponding to each target number and the target number sequence. For details, refer to the related description of step S103 in the above method embodiment.
And the third processing module 104 is configured to send the classification data and the classification label data to the server. For details, refer to the related description of step S104 in the above method embodiment.
Through the cooperation of the above components, the SVD-based federated learning recommendation system communication compression device provided in the embodiment of the present invention compresses data by clustering the data to be uploaded. Because the client uploads both the classification data with classification labels and the classification label data, which records the classification label corresponding to each target number in target-number order, the server can restore the gradient data from the classification labels with a high degree of fidelity. This ensures the accuracy of the recommendation model finally generated by the recommendation system, thereby reducing the influence of data compression on recommendation accuracy while improving the data compression ratio.
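The client-side pipeline implemented by these modules can be sketched as follows. This is a minimal illustrative approximation, not the patented implementation: since claims 5-7 describe binary classification labels, each group of gradient values is clustered into two classes here, so only the two cluster centroids plus one 0/1 label per value need to be transmitted. All function names, the group size, and the simple 1-D 2-means routine are assumptions of this sketch.

```python
import numpy as np

def two_means_1d(values, iters=20):
    """Cluster 1-D values into two classes; return (2 centroids, 0/1 labels)."""
    c = np.array([values.min(), values.max()], dtype=np.float64)
    labels = np.zeros(values.shape, dtype=np.uint8)
    for _ in range(iters):
        # Assign each value to the nearer of the two centroids.
        labels = (np.abs(values - c[1]) < np.abs(values - c[0])).astype(np.uint8)
        for k in (0, 1):
            members = values[labels == k]
            if members.size:
                c[k] = members.mean()
    return c, labels

def compress(delta, group_size=32):
    """Group the gradient delta in target-number order, cluster each group
    into two classes, and return per-group centroids plus a flat label array."""
    pad = (-delta.size) % group_size            # pad so groups are full
    padded = np.concatenate([delta, np.zeros(pad)])
    groups = padded.reshape(-1, group_size)
    centroids = np.empty((groups.shape[0], 2))
    labels = np.empty(groups.shape, dtype=np.uint8)
    for i, g in enumerate(groups):
        centroids[i], labels[i] = two_means_1d(g)
    return centroids, labels.ravel()[: delta.size]

def decompress(centroids, labels, group_size=32):
    """Server side: restore each gradient value to its cluster centroid."""
    out = np.empty(labels.size)
    for i, lab in enumerate(labels):
        out[i] = centroids[i // group_size, lab]
    return out
```

Under these assumptions, a group of 32 float32 values (1024 bits) is replaced by two centroids plus 32 label bits; the actual compression ratios of Table 2 additionally depend on the full encoding and the chosen K.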
An embodiment of the present invention also provides an electronic device. As shown in fig. 5, the electronic device may include a processor 901 and a memory 902, where the processor 901 and the memory 902 may be connected by a bus or in another manner; fig. 5 takes a bus connection as an example.
The memory 902, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the methods in the method embodiments of the present invention. The processor 901 runs the non-transitory software programs, instructions, and modules stored in the memory 902 to execute the various functional applications and data processing of the processor, that is, to implement the methods in the above method embodiments.
The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created by the processor 901, and the like. Further, the memory 902 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 902 may optionally include memory located remotely from the processor 901, which may be connected to the processor 901 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 902 and, when executed by the processor 901, perform the methods in the above method embodiments.
The specific details of the electronic device may be understood by referring to the corresponding related descriptions and effects in the above method embodiments, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of the above kinds of memories.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.
Claims (10)
1. A communication compression method of a federated learning recommendation system based on SVD is applied to a client, and is characterized by comprising the following steps:
acquiring gradient data to be uploaded of a current client;
based on the target number of the gradient data to be uploaded, clustering the gradient data to be uploaded by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification label corresponding to each target number;
generating classification label data according to the classification label corresponding to each target number and the target number sequence;
and sending the classification data and the classification label data to a server.
2. The method of claim 1, wherein the obtaining gradient data to be uploaded at a current client comprises:
acquiring local gradient data of the current client, and receiving previous round of global gradient data fed back by the server;
and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain the gradient data to be uploaded.
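The difference calculation of claim 2 is a simple element-wise subtraction; a minimal sketch, with hypothetical array values:

```python
import numpy as np

# Hypothetical values; in practice these come from local training and the server.
local_gradient = np.array([0.30, -0.12, 0.05])
previous_global_gradient = np.array([0.28, -0.10, 0.00])

# The client uploads only the per-target-number difference, not the raw gradient.
gradient_to_upload = local_gradient - previous_global_gradient
```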
3. The method according to claim 1, wherein the clustering the gradient data to be uploaded by using a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classification data with classification tags, and determining the classification tag corresponding to each target number comprises:
grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded;
and based on the target numbers of the gradient data to be uploaded, clustering each group of gradient data to be uploaded by adopting a preset clustering algorithm to obtain classification data with classification labels, and determining the classification labels corresponding to the target numbers.
4. The method according to claim 1, wherein the gradient data to be uploaded is matrix data with target numbers, and the preset number of target numbers is obtained by:
acquiring the row and column number and the target compression rate of the matrix data;
and calculating the number of the preset target numbers according to the row number and the column number and the target compression multiplying power.
5. The method of claim 1, wherein the classification data with classification labels comprises: first classification data with a first classification tag and second classification data with a second classification tag.
6. The method of claim 5, wherein the first classification label is 0 and the second classification label is 1, and wherein generating classification label data according to the classification label corresponding to each target number and the target number sequence comprises:
sequentially acquiring classification labels corresponding to 32 target numbers according to the target number sequence to combine into 32-bit binary data;
and sequentially converting the 32-bit binary data into Int type data to generate the classification label data.
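The packing step of claim 6 can be illustrated as follows. The function names are illustrative; note also that a real implementation storing signed 32-bit Int values would reinterpret words of 2^31 or more as negative, which this sketch ignores by using plain Python integers.

```python
def pack_labels(labels):
    """Pack 0/1 classification labels, 32 at a time and in target-number
    order, into a list of 32-bit integers (the step of claim 6)."""
    assert len(labels) % 32 == 0, "label count must be a multiple of 32 (claim 7)"
    words = []
    for i in range(0, len(labels), 32):
        word = 0
        for bit in labels[i:i + 32]:
            word = (word << 1) | bit   # shift in one label bit at a time
        words.append(word)
    return words

def unpack_labels(words):
    """Inverse operation on the server: recover the 0/1 label sequence."""
    labels = []
    for word in words:
        for shift in range(31, -1, -1):
            labels.append((word >> shift) & 1)
    return labels
```

Packing reduces each run of 32 one-bit labels from 32 stored values to a single integer, which is what makes the classification label data cheap to transmit alongside the classification data.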
7. The method according to claim 5, wherein the preset number of target numbers is an integer multiple of 32.
8. An SVD-based federated learning recommendation system communication compression device, applied to a client, characterized by comprising:
the acquisition module is used for acquiring gradient data to be uploaded of the current client;
the first processing module is used for clustering the gradient data to be uploaded by adopting a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classified data with a classified label and determining the classified label corresponding to each target number;
the second processing module is used for generating classification label data according to the classification labels corresponding to the target numbers and the target number sequence;
and the third processing module is used for sending the classification data and the classification label data to a server.
9. An electronic device, comprising:
a memory and a processor communicatively coupled to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011274868.2A CN112449009B (en) | 2020-11-12 | 2020-11-12 | SVD-based communication compression method and device for Federal learning recommendation system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112449009A true CN112449009A (en) | 2021-03-05 |
CN112449009B CN112449009B (en) | 2023-01-10 |
Family
ID=74737868
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011274868.2A Active CN112449009B (en) | 2020-11-12 | 2020-11-12 | SVD-based communication compression method and device for Federal learning recommendation system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112449009B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018226527A1 (en) * | 2017-06-08 | 2018-12-13 | D5Ai Llc | Data splitting by gradient direction for neural networks |
US20190147366A1 (en) * | 2017-11-13 | 2019-05-16 | International Business Machines Corporation | Intelligent Recommendations Implemented by Modelling User Profile Through Deep Learning of Multimodal User Data |
CN110297848A (en) * | 2019-07-09 | 2019-10-01 | 深圳前海微众银行股份有限公司 | Recommended models training method, terminal and storage medium based on federation's study |
CN111079022A (en) * | 2019-12-20 | 2020-04-28 | 深圳前海微众银行股份有限公司 | Personalized recommendation method, device, equipment and medium based on federal learning |
CN111324812A (en) * | 2020-02-20 | 2020-06-23 | 深圳前海微众银行股份有限公司 | Federal recommendation method, device, equipment and medium based on transfer learning |
CN111582505A (en) * | 2020-05-14 | 2020-08-25 | 深圳前海微众银行股份有限公司 | Federal modeling method, device, equipment and computer readable storage medium |
CN111865815A (en) * | 2020-09-24 | 2020-10-30 | 中国人民解放军国防科技大学 | Flow classification method and system based on federal learning |
Non-Patent Citations (2)
Title |
---|
LAIZHONG CUI, ET AL.: "CREAT: Blockchain-assisted Compression Algorithm of Federated Learning for Content Caching in Edge Computing", 《IEEE INTERNET OF THINGS JOURNAL》 *
JIA YANYAN; ZHANG ZHAO; FENG JIAN; WANG CHUNKAI: "Application of Federated Learning Models in Classified Data Processing", 《JOURNAL OF CHINA ACADEMY OF ELECTRONICS AND INFORMATION TECHNOLOGY》 *
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114125070A (en) * | 2021-11-10 | 2022-03-01 | 深圳大学 | Communication method, system, electronic device and storage medium for quantization compression |
WO2023092323A1 (en) * | 2021-11-24 | 2023-06-01 | Intel Corporation | Learning-based data compression method and system for inter-system or inter-component communications |
CN114339252A (en) * | 2021-12-31 | 2022-04-12 | 深圳大学 | Data compression method and device |
CN114339252B (en) * | 2021-12-31 | 2023-10-31 | 深圳大学 | Data compression method and device |
CN114861790A (en) * | 2022-04-29 | 2022-08-05 | 深圳大学 | Method, system and device for optimizing federal learning compression communication |
CN115022316A (en) * | 2022-05-20 | 2022-09-06 | 阿里巴巴(中国)有限公司 | End cloud cooperative data processing system, method, equipment and computer storage medium |
CN115022316B (en) * | 2022-05-20 | 2023-08-11 | 阿里巴巴(中国)有限公司 | End cloud collaborative data processing system, method, equipment and computer storage medium |
WO2024060400A1 (en) * | 2022-09-20 | 2024-03-28 | 天翼电子商务有限公司 | Discrete variable preprocessing method in vertical federated learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||