CN112449009A - SVD-based federated learning recommendation system communication compression method and device - Google Patents

SVD-based federated learning recommendation system communication compression method and device

Info

Publication number: CN112449009A (application number CN202011274868.2A; granted as CN112449009B)
Authority: CN (China)
Prior art keywords: data, classification, uploaded, gradient, target
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 刘刚, 谭向前, 周明洋, 蔡树彬
Original and current assignee: Shenzhen University
Application filed by Shenzhen University; priority to CN202011274868.2A
Publication of CN112449009A; application granted; publication of CN112449009B
Current legal status: Active; anticipated expiration tracked

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/55 Push-based network services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2135 Feature extraction based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The invention provides a communication compression method and system for an SVD (singular value decomposition)-based federated learning recommendation system. The method comprises the following steps: acquiring the gradient data to be uploaded of the current client; grouping the gradient data to be uploaded based on the target-number order and the preset number of target numbers, obtaining multiple groups of gradient data; clustering each group of gradient data by rows with a preset clustering algorithm to obtain classification data with classification labels, and determining the classification label corresponding to each target number; generating classification label data from the classification label of each target number and the target-number order; and sending the classification data and the classification label data to a server. Data compression is thus achieved by clustering the data to be uploaded, which improves the data compression ratio while reducing the influence of compression on the accuracy of the recommendation system.

Description

SVD-based federated learning recommendation system communication compression method and device
Technical Field
The invention relates to the technical field of computer network application, in particular to a communication compression method and device of a federal learning recommendation system based on SVD.
Background
After years of development, recommendation systems have become increasingly intelligent: they can comprehensively learn people's preferences and serve them accurately. With the popularization of smartphones, the number of network users has again grown explosively, and traditional recommendation systems have to face problems such as server resource shortages and insufficient computing capacity. In addition, to recommend more accurately, a recommendation system may collect a wide range of user information, and a user terminal such as a mobile phone stores a large amount of user information, including private personal content. If this information is not protected, security problems such as privacy disclosure can easily occur.
To address the above problems, the concept of federated learning based on model averaging has been proposed. The training step is moved to the user side, so the user does not need to upload personal information to a server; only the trained gradient needs to be uploaded. This solves both user privacy protection and the shortage of server computing resources. For an SVD-based federated learning recommendation system, however, the volume of gradient data to be uploaded is large, while the upload bandwidth of a user terminal such as a mobile phone is limited; if the gradient data is uploaded directly without compression, transmission efficiency suffers greatly. Existing communication compression methods mainly include random masking, rank reduction, and deep gradient compression; however, when these methods are applied to the SVD-based federated learning recommendation system, either the compression effect is poor or the accuracy of the whole system's recommendation model is affected after compression.
Disclosure of Invention
In view of this, the embodiment of the present invention provides a communication compression method and apparatus for an SVD-based federated learning recommendation system, so as to overcome the problem in the prior art of lacking a communication compression method applicable to an SVD-based federated learning recommendation system.
The embodiment of the invention provides a communication compression method of a federated learning recommendation system based on SVD, which is applied to a client and comprises the following steps:
acquiring gradient data to be uploaded of a current client;
based on the target number of the gradient data to be uploaded, clustering the gradient data to be uploaded by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification label corresponding to each target number;
generating classification label data according to the classification label corresponding to each target number and the target number sequence;
and sending the classification data and the classification label data to a server.
Optionally, the obtaining of gradient data to be uploaded at the current client includes:
acquiring local gradient data of the current client, and receiving previous round of global gradient data fed back by the server;
and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain the gradient data to be uploaded.
Optionally, the clustering, based on the target number of the gradient data to be uploaded, the gradient data to be uploaded by using a preset clustering algorithm to obtain classification data with classification tags, and determining the classification tag corresponding to each target number includes:
grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded;
and based on the target numbers of the gradient data to be uploaded, clustering each group of gradient data to be uploaded by adopting a preset clustering algorithm to obtain classification data with classification labels, and determining the classification labels corresponding to the target numbers.
Optionally, the gradient data to be uploaded is matrix data with target numbers, and the preset number of target numbers is obtained as follows:
acquiring the numbers of rows and columns of the matrix data and the target compression ratio;
and calculating the preset number of target numbers from the numbers of rows and columns and the target compression ratio.
Optionally, the classification data with classification labels includes: first classification data with a first classification tag and second classification data with a second classification tag.
Optionally, the generating of the classification label data according to the sequence of the target numbers and the classification labels corresponding to the target numbers includes:
sequentially acquiring classification labels corresponding to 32 target numbers according to the target number sequence to combine into 32-bit binary data;
and sequentially converting the 32-bit binary data into Int type data to generate the classification label data.
Optionally, the preset target number is an integer multiple of 32.
The embodiment of the invention also provides a communication compression device of the federal learning recommendation system based on SVD, which is applied to a client and comprises the following components:
the acquisition module is used for acquiring gradient data to be uploaded of the current client;
the first processing module is used for clustering the gradient data to be uploaded by adopting a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classified data with a classified label and determining the classified label corresponding to each target number;
the second processing module is used for generating classification label data according to the classification labels corresponding to the target numbers and the target number sequence;
and the third processing module is used for sending the classification data and the classification label data to a server.
An embodiment of the present invention further provides an electronic device, including: a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions so as to perform the communication compression method of the SVD-based federated learning recommendation system provided by the embodiment of the invention.
The embodiment of the invention also provides a computer-readable storage medium, which stores computer instructions for enabling the computer to execute the communication compression method of the SVD-based federated learning recommendation system provided by the embodiment of the invention.
The technical scheme of the invention has the following advantages:
the embodiment of the invention provides a communication compression method and a communication compression system of a federal learning recommendation system based on SVD (singular value decomposition), wherein gradient data to be uploaded of a current client side are obtained; grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded; clustering each group of gradient data to be uploaded according to rows by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification labels corresponding to the target numbers; generating classification label data according to the classification label corresponding to each target number and the target number sequence; and sending the classification data and the classification label data to a server. Therefore, data compression is achieved by clustering the data to be uploaded, and by uploading the classified data with the classified labels and the classified label data containing the classified labels corresponding to the target numbers and the target number sequence, the server can achieve higher reduction degree of the gradient data reduced by the classified labels, so that the accuracy of the recommendation model finally generated by the recommendation system is guaranteed, and the influence of data compression on the accuracy of the recommendation system is reduced under the condition of improving the data compression rate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a communication compression method of a federated learning recommendation system based on SVD in an embodiment of the present invention;
FIG. 2 is a diagram illustrating a processing result of gradient data to be uploaded according to an embodiment of the present invention;
FIG. 3 is another diagram illustrating the processing result of gradient data to be uploaded according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a communication compression device of the SVD-based federated learning recommendation system in an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical features mentioned in the different embodiments of the invention described below can be combined with each other as long as they do not conflict with each other.
At present, the existing communication compression methods mainly include random mask, rank reduction, deep gradient compression, etc., wherein,
reducing the rank: the basic idea of the method for optimizing the communication between federal learning provided by Google is to take a matrix to be uploaded as a product of two small matrices, wherein one of the two small matrices is generated by using a random seed, and the other small matrix is used as upload data. If the original matrix to be uploaded is:
H, a d1 × d2 matrix.
Assuming that the maximum rank of the matrix H is k (k a fixed value), the matrix H can be written as the product of two matrices A and B, namely:
H = A · B, where A is a d1 × k matrix and B is a k × d2 matrix.
However, in the federated SVD recommendation system, the gradient Qi to be uploaded each round is a matrix of about 200k rows and 15 to 30 columns, that is, the rank of the matrix H in the above formula is at most 30. If k is set equal to 30, no compression is achieved; if k is less than 30, then (d2 − k) × d1 pieces of information are necessarily lost. This method is therefore not well suited to this recommendation system.
Random masking: similar to rank reduction, this method uploads a smaller matrix in place of the original, except that it was originally proposed for compressing sparse matrices. A small, randomly selected subset of the values in a sparse matrix is used to represent the whole matrix, so the data to be uploaded can be greatly compressed, and the small error this introduces is entirely acceptable compared with the savings. For dense matrices, however, this approach is fundamentally infeasible.
Deep gradient compression: for deep neural networks, deep gradient compression performs excellently. Gradient components below a threshold are retained locally rather than transmitted, which reduces unnecessary parameter transmission between network layers; in the next training round, the gradients retained from the previous round are added to the current round's gradients, so no training detail is lost. However, this method is not suitable for the federated SVD recommendation system: the clients are independent and their data isolated, and each round's gradient has an irreplaceable effect on the global model. Each round must upload the complete gradient rather than a threshold-filtered one; otherwise the recommendation accuracy of the final model suffers a particularly large error.
Based on the problem that the existing communication compression method is difficult to be applied to the SVD-based federated learning recommendation system, the embodiment of the invention provides a communication compression method specially aiming at the SVD-based federated learning recommendation system, and as shown in fig. 1, the communication compression method mainly comprises the following steps:
step S101: and acquiring gradient data to be uploaded of the current client. Specifically, in the SVD-based federal learning recommendation system, each client uploads gradient data obtained after local data training of the client to the server in each round, the gradient data to be uploaded is in a form of a gradient matrix containing target codes, then the server performs model averaging according to the received gradient matrix of each client to obtain a global gradient matrix, and feeds back the obtained global gradient matrix to each client until the recommendation model of the recommendation system is trained, and recommendation targets are recommended to users through the trained recommendation model, for example: when the recommendation system is used for recommending movies for the user, the target codes are movie numbers of all movies to be recommended, and in the gradient data to be uploaded, the target codes are sorted according to a fixed order, for example, sorted from small to large according to the codes, and correspondingly, the global gradient data fed back by the server are also sorted according to the same order.
Step S102: based on the target numbers of the gradient data to be uploaded, cluster the gradient data with a preset clustering algorithm to obtain classification data with classification labels, and determine the classification label corresponding to each target number. Specifically, in the embodiment of the present invention the preset clustering algorithm is KMeans++; experiments show that KMeans++ yields clustering results with good clustering quality. In practical applications, other clustering algorithms such as mean-shift clustering can also be used; the invention is not limited in this respect. In the embodiment of the invention, a classification label is set for each class of clustered data; the classification labels then establish the correspondence between the gradient data of each target number and each class, so the server can more accurately restore the gradient data to be uploaded of the current client from the classification label of each target number, reducing the impact on the accuracy of the recommendation model of the recommendation system.
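The per-group clustering in step S102 can be sketched as follows. This is a minimal two-class k-means in plain numpy; for reproducibility it uses the greedy variant of the ++ seeding (first centre is the first row, second centre is the row farthest from it), whereas true KMeans++ samples centres randomly. Function and variable names are illustrative, not from the patent.

```python
import numpy as np

def two_class_kmeans(rows, n_iter=20):
    """Cluster gradient rows into 2 classes, as in step S102.
    Greedy ++-style seeding: the first centre is the first row and
    the second is the row farthest from it (true KMeans++ samples
    the second centre proportionally to squared distance)."""
    c0 = rows[0]
    d2 = ((rows - c0) ** 2).sum(axis=1)
    c1 = rows[d2.argmax()]
    centroids = np.stack([c0, c1]).astype(float)
    labels = np.zeros(len(rows), dtype=int)
    for _ in range(n_iter):
        # assign every row to its nearest centroid
        dist = ((rows[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # move each centroid to the mean of its assigned rows
        for k in range(2):
            if (labels == k).any():
                centroids[k] = rows[labels == k].mean(axis=0)
    return labels, centroids
```

Each row then keeps only a 1-bit classification label, and the group itself is represented by its two centroid rows.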
Step S103: generate classification label data from the classification label corresponding to each target number and the target-number order. Specifically, the classification labels corresponding to the target numbers are arranged in target-number order to obtain the classification label data, so the server can determine the classification label corresponding to each target number directly from the classification label data.
Step S104: and sending the classification data and the classification label data to a server. Specifically, the matrix data formed by the classification data and the classification tag data may be packaged together and then uploaded to the server.
Through steps S101 to S104, the communication compression method for the SVD-based federated learning recommendation system provided in the embodiment of the present invention achieves data compression by clustering the data to be uploaded. By uploading the classification data with classification labels together with the classification label data (the classification label of each target number, in target-number order), the server can restore the gradient data from the classification labels with high fidelity, which preserves the accuracy of the recommendation model finally generated by the recommendation system and thus reduces the influence of data compression on recommendation accuracy while improving the data compression ratio.
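On the server side, the restoration implied above reduces to a table lookup: each target number's gradient row is approximated by the centroid of the class its label names. A minimal sketch, assuming numpy arrays; names are illustrative, not from the patent.

```python
import numpy as np

def restore_group(centroids, labels):
    """Approximate each row of one group by the centroid of the class
    its classification label points to (fancy indexing: row i becomes
    centroids[labels[i]])."""
    return centroids[np.asarray(labels)]

def restore_gradient(groups):
    """Concatenate the restored groups back into the full gradient
    matrix, in target-number order. `groups` is a list of
    (centroids, labels) pairs, one per group."""
    return np.concatenate([restore_group(c, l) for c, l in groups])
```
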
Specifically, in an embodiment, the step S101 specifically includes the following steps:
step S201: and acquiring local gradient data of the current client, and receiving the previous round of global gradient data fed back by the server.
Step S202: and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain gradient data to be uploaded.
Specifically, if in each iteration the local gradient data of the current client were used directly as the gradient data to be uploaded, clustering could require a large amount of computation, because the gradient values are uncorrelated and may differ greatly from one another. Moreover, in a federated system the gradients of two successive rounds may barely change during training. Therefore, the difference between the client's local gradient data in the current round and the global gradient data fed back by the server in the previous round is used as the gradient data to be uploaded; after receiving the compressed gradient data, the server can restore the client's local gradient data using the previous round's global gradient data that it has stored. This speeds up clustering and reduces the amount of computation in the compression process without affecting the training of the recommendation model.
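Steps S201 and S202 amount to an element-wise difference aligned by target number, with the inverse applied on the server. A sketch under the assumption that both matrices share the same row order; names are illustrative.

```python
import numpy as np

def gradient_to_upload(local_grad, prev_global_grad):
    """Step S202: the data to be uploaded is the difference between the
    client's current local gradient and the previous round's global
    gradient, row-aligned by target number."""
    assert local_grad.shape == prev_global_grad.shape
    return local_grad - prev_global_grad

def restore_local(delta, prev_global_grad):
    """Server side: recover the client's local gradient by adding back
    the stored previous-round global gradient."""
    return prev_global_grad + delta
```
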
Specifically, in an embodiment, the step S102 specifically includes the following steps:
step S301: and grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded. Specifically, since the gradient data to be uploaded is matrix data with a target number, the preset target number is obtained by the following method: acquiring the row and column number and target compression multiplying power of matrix data; and calculating the number of preset target numbers according to the number of rows and columns and the target compression multiplying power. In the embodiment of the present invention, a calculation formula of the compression ratio is shown as formula (1):
r = (I × N) / ((I / K) × 2N + I) = (K × N) / (2N + K)    (1)
where r denotes the compression ratio, N denotes the number of columns of the matrix data, I denotes the number of rows (i.e., the number of target numbers), and K denotes the preset number of target numbers (the number of target numbers per group).
The above formula is simplified to obtain:
r = N / (1 + 2N / K)    (2)
therefore, since the gradient data to be uploaded are known (i.e. the number of rows and columns of the matrix data is determined), the relationship between the compression ratio and the number of the preset target numbers can be obtained through the above formulas (1) and (2), and therefore the number of the preset target numbers can be obtained according to the compression ratio requirement set by the recommendation system. Of course, in practical applications, the number of preset target numbers may also be set empirically, and then the compression ratio may be estimated by the above formulas (1) and (2).
Step S302: based on the target numbers of the gradient data to be uploaded, cluster each group of gradient data with the preset clustering algorithm to obtain classification data with classification labels, and determine the classification label corresponding to each target number. Specifically, the matrix data may be grouped by target number and each group clustered separately; after all groups are clustered, the per-group classification results are merged to obtain the classification data, and the classification label of each target number within its group is determined. Since the preset number of target numbers (i.e., the number of target numbers per group) is fixed, the classification label within each group can be represented by a natural number from 0 to K − 1 without data confusion. The result of group-wise clustering of the gradient data to be uploaded is shown in fig. 2: to the left of the arrow is the matrix data with target numbers (i.e., the gradient data to be uploaded), whose first column holds the target numbers; to the right of the arrow are, in order, the classification data with classification labels (first column: the classification labels) and the classification label data composed of the classification label of each target number.
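The grouping-then-clustering of steps S301 and S302 can be sketched as below. `cluster_two` is any two-class clusterer returning (labels, centroids); the trivial `sign_split` used here (split on the sign of the first column) is only a stand-in for KMeans++, and all names are illustrative rather than from the patent.

```python
import numpy as np

def compress_by_groups(grad, K, cluster_two):
    """Steps S301/S302 sketch: split the I x N gradient matrix into
    consecutive groups of K rows (target-number order), cluster each
    group into two classes, and return per-group (centroids, labels)."""
    per_group = []
    for start in range(0, len(grad), K):
        group = grad[start:start + K]
        labels, centroids = cluster_two(group)
        per_group.append((centroids, labels))
    return per_group

def sign_split(rows):
    """Toy two-class 'clusterer' used only for illustration."""
    labels = (rows[:, 0] > 0).astype(int)
    centroids = np.stack([
        rows[labels == k].mean(axis=0) if (labels == k).any()
        else np.zeros(rows.shape[1])
        for k in (0, 1)
    ])
    return labels, centroids
```

In a real client, `sign_split` would be replaced by the KMeans++ clustering described in step S102.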
Specifically, as formula (2) shows, N can be regarded as a constant, and when K is much larger than N, r ≈ N; since the value of N is fixed, the compression ratio of this method has an upper limit, and a small one. The reason is that the number of rows I (the number of target numbers) is large and each classification label is stored as one 32-bit Int, so the classification label data occupies a large amount of data space. To further improve the data compression ratio, in the embodiment of the present invention the classification data with classification labels is divided into: first classification data with a first classification label and second classification data with a second classification label. Limiting the clustering result of each group to two classes reduces the data volume of each group's classification data and improves the compression ratio. Further, the first classification label is 0, the second classification label is 1, and the preset number of target numbers is an integer multiple of 32. The step S103 then includes the following steps:
step S401: and sequentially acquiring the classification labels corresponding to the 32 target numbers according to the sequence of the target numbers to combine into 32-bit binary data.
Step S402: convert the 32-bit binary data sequentially into Int type data to generate the classification label data. If fewer than 32 target numbers remain at the end, zeros are appended until 32 binary digits are formed, which are then converted into Int type data for uploading.
Thus, by limiting each group's clustering result to 2 classes and representing the classification labels by 0 and 1, the label value of every target number is guaranteed to be 0 or 1 and occupies only 1 bit; the binary labels of each group's target numbers are gathered, and every 32 bits are combined into one Int value to be uploaded. This solves the problem that the classification label data was previously too large, thereby reducing the communication volume during transmission and improving the compression ratio.
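Steps S401 and S402 can be sketched as a bit-packing round trip. The bit order (most-significant bit first) and the use of unsigned words are assumptions; the patent only specifies 32 labels per Int with zero padding at the tail.

```python
import numpy as np

def pack_labels(labels):
    """Steps S401/S402: pack 0/1 classification labels, 32 per 32-bit
    word, zero-padding the tail (most-significant bit first here;
    a real Int32 would reinterpret the word as signed)."""
    labels = np.asarray(labels, dtype=np.uint8)
    pad = (-len(labels)) % 32
    labels = np.concatenate([labels, np.zeros(pad, dtype=np.uint8)])
    words = []
    for i in range(0, len(labels), 32):
        word = 0
        for bit in labels[i:i + 32]:
            word = (word << 1) | int(bit)
        words.append(word)
    return words

def unpack_labels(words, n):
    """Inverse of pack_labels: recover the first n labels."""
    bits = []
    for word in words:
        bits.extend((word >> (31 - j)) & 1 for j in range(32))
    return bits[:n]
```

The server uses the unpacked labels, in target-number order, to look up each row's class.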
Building on the group-wise clustering shown in fig. 2, the processing result of serialized group clustering is shown in fig. 3; the compression ratio is then given by formula (3):
r = (I × N) / ((I / K) × 2N + I / 32) = (K × N) / (2N + K / 32)    (3)
where r denotes the compression ratio, N denotes the number of columns of the matrix data, I denotes the number of rows (i.e., the number of target numbers), and K denotes the preset number of target numbers (the number of target numbers per group).
The above formula is simplified to obtain:
r = N / (2N / K + 1 / 32) = 32N / (1 + 64N / K)    (4)
since the classification labels corresponding to the target numbers are combined in 32-bit groups, the value of K is a multiple of 32, that is, K is a multiple of K
K=2n(n is a positive integer of 5 or more)
Theoretically, assuming N = 20, when K = 4096 the compression ratio r is 487, a very satisfactory value. Experiments show that the accuracy of the recommendation system after compression differs only slightly from that of the system without compression, and the error is within an acceptable range.
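The serialized ratio can be checked numerically against the figures in the text and in Table 2 (the formula form is reconstructed here; the residual differences of under 1 against the tabulated theoretical ratios are rounding):

```python
def compression_ratio_packed(N, K):
    """Formula (3)/(4): compressed group = 2 centroid rows (2*N values)
    + K/32 packed 32-bit label words, so r = K*N / (2*N + K/32),
    which tends to 32*N as K grows."""
    return K * N / (2 * N + K / 32)

# ratios for N = 20 at the K values used in Table 2
for K in (512, 1024, 2048, 4096):
    print(K, round(compression_ratio_packed(20, K), 1))
```

With N = 20 the theoretical ceiling is 32 × 20 = 640, which is why the ratio keeps climbing with K in Table 2.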
With different numbers of clients, the local gradient data each client uploads to the server was compressed with the communication compression method of the SVD-based federated learning recommendation system provided by the embodiment of the invention (serialized group clustering for short) before uploading, and compared against direct uploading without compression. The specific experimental results are shown in Table 1. The results show that the proposed compression method causes the system recommendation error (RMSE) to increase slightly, that is, the recommendation accuracy of the recommendation model deviates slightly, but the deviations are small and, weighed against the gradient compression ratio achieved, within an acceptable range.
Table 1: influence of compression algorithm on accuracy under different number of clients
[Table 1 appears as an image in the original publication.]
In addition, with different preset numbers of target numbers (i.e., different K values), the local gradient data to be uploaded to the server by a client was compressed with the serialized group clustering method provided by the embodiment of the invention and then uploaded, and a comparison experiment was carried out; the results are shown in Table 2. The results show that increasing K raises the compression factor but costs the recommendation system some accuracy. Moreover, the proposed method greatly compresses the uploaded gradient data while keeping the RMSE deviation within the system's acceptable range, and is therefore superior to existing compression methods.
Table 2: Compression factor and RMSE for different K values

| K value                        | 0 (uncompressed) | 512    | 1024   | 2048   | 4096   |
| Convergence time (s)           | 937              | 1909   | 1767   | 1694   | 1605   |
| Average RMSE                   | 0.8036           | 0.8074 | 0.8075 | 0.8072 | 0.8068 |
| Theoretical compression factor | -                | 183    | 284    | 393    | 487    |
| Actual compression factor      | -                | 182    | 281    | 387    | 478    |
The SVD-based federated learning recommendation system communication compression method provided by the embodiment of the invention improves the data compression ratio while reducing the influence of data compression on the accuracy of the recommendation model, thereby accelerating the convergence of recommendation model training.
The embodiment of the invention also provides an SVD-based federated learning recommendation system communication compression device. As shown in fig. 4, the device comprises:
the obtaining module 101 is configured to obtain gradient data to be uploaded of a current client. For details, refer to the related description of step S101 in the above method embodiment.
The first processing module 102 is configured to cluster the gradient data to be uploaded by using a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classification data with classification tags, and determine the classification tag corresponding to each target number. For details, refer to the related description of step S102 in the above method embodiment.
The second processing module 103 is configured to generate classification tag data according to the classification tag corresponding to each object number and the sequence of the object numbers. For details, refer to the related description of step S103 in the above method embodiment.
And the third processing module 104 is configured to send the classification data and the classification label data to the server. For details, refer to the related description of step S104 in the above method embodiment.
Through the coordinated operation of the above components, the SVD-based federated learning recommendation system communication compression device provided in the embodiment of the present invention compresses the data to be uploaded by clustering it. By uploading the classification data with classification tags together with classification tag data that records, in target-number order, the classification tag corresponding to each target number, the device enables the server to restore the gradient data from the classification tags with high fidelity. This ensures the accuracy of the recommendation model finally generated by the recommendation system and thus reduces the influence of data compression on recommendation accuracy while improving the data compression ratio.
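The client-side flow implemented by modules 101-104 can be sketched as follows. This is a minimal illustration only: it assumes a simple two-cluster k-means as the "preset clustering algorithm" (the patent does not fix a particular one), two classification labels 0 and 1 as in claims 5-6, and a group size that is a multiple of 32 as in claim 7. All function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def two_means(values, iters=10):
    """Minimal 1-D k-means with k=2 (Lloyd's algorithm); a stand-in for the
    patent's unspecified preset clustering algorithm."""
    centroids = np.array([values.min(), values.max()], dtype=np.float64)
    labels = np.zeros(len(values), dtype=np.uint8)
    for _ in range(iters):
        # label 1 if the value is closer to centroid 1, else label 0
        labels = (np.abs(values - centroids[0]) >
                  np.abs(values - centroids[1])).astype(np.uint8)
        for c in (0, 1):
            if np.any(labels == c):
                centroids[c] = values[labels == c].mean()
    return centroids, labels

def pack_labels(labels):
    """Pack binary classification labels in target-number order,
    32 per 32-bit integer (cf. claims 6-7)."""
    assert len(labels) % 32 == 0  # claim 7: group size is a multiple of 32
    packed = []
    for i in range(0, len(labels), 32):
        word = 0
        for bit in labels[i:i + 32]:
            word = (word << 1) | int(bit)  # first label lands in bit 31
        packed.append(word)
    return packed

# Hypothetical gradient vector of 64 entries standing in for the
# gradient data to be uploaded by the client.
rng = np.random.default_rng(0)
grad = rng.normal(size=64).astype(np.float32)

centroids, labels = two_means(grad)     # module 102: cluster, assign labels
packed = pack_labels(labels)            # module 103: build label data
payload = (centroids.tolist(), packed)  # module 104: send to the server
```

On the server side, an approximate gradient would be rebuilt by unpacking each integer back into 32 labels and substituting the corresponding centroid for each label, so only two floats plus one bit per gradient value travel over the network instead of one float per value.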
An embodiment of the present invention also provides an electronic device. As shown in fig. 5, the electronic device may include a processor 901 and a memory 902, where the processor 901 and the memory 902 may be connected by a bus or in another manner; fig. 5 takes a bus connection as an example.
Processor 901 may be a Central Processing Unit (CPU). The Processor 901 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or combinations thereof.
The memory 902, which is a non-transitory computer readable storage medium, may be used for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods in the method embodiments of the present invention. The processor 901 executes various functional applications and data processing of the processor by executing non-transitory software programs, instructions and modules stored in the memory 902, that is, implements the methods in the above-described method embodiments.
The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor 901, and the like. Further, the memory 902 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 902 may optionally include memory located remotely from the processor 901, which may be connected to the processor 901 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 902, which when executed by the processor 901 performs the methods in the above-described method embodiments.
The specific details of the electronic device may be understood by referring to the corresponding related descriptions and effects in the above method embodiments, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. An SVD-based federated learning recommendation system communication compression method, applied to a client, characterized by comprising the following steps:
acquiring gradient data to be uploaded of a current client;
based on the target number of the gradient data to be uploaded, clustering the gradient data to be uploaded by adopting a preset clustering algorithm to obtain classified data with classification labels, and determining the classification label corresponding to each target number;
generating classification label data according to the classification label corresponding to each target number and the target number sequence;
and sending the classification data and the classification label data to a server.
2. The method of claim 1, wherein the obtaining gradient data to be uploaded at a current client comprises:
acquiring local gradient data of the current client, and receiving previous round of global gradient data fed back by the server;
and based on the target number of the local gradient data, performing difference calculation on the local gradient data and the previous round of global gradient data to obtain the gradient data to be uploaded.
3. The method according to claim 1, wherein the clustering the gradient data to be uploaded by using a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classification data with classification tags, and determining the classification tag corresponding to each target number comprises:
grouping the gradient data to be uploaded based on the target number sequence and the preset target number of the gradient data to be uploaded to obtain multiple groups of gradient data to be uploaded;
and based on the target numbers of the gradient data to be uploaded, clustering each group of gradient data to be uploaded by adopting a preset clustering algorithm to obtain classification data with classification labels, and determining the classification labels corresponding to the target numbers.
4. The method according to claim 1, wherein the gradient data to be uploaded is matrix data with target numbers, and the preset number of target numbers is obtained by:
acquiring the row and column number and the target compression rate of the matrix data;
and calculating the number of the preset target numbers according to the row number and the column number and the target compression multiplying power.
5. The method of claim 1, wherein the classification data with classification labels comprises: first classification data with a first classification tag and second classification data with a second classification tag.
6. The method of claim 5, wherein the first class label is 0 and the second class label is 1, and wherein the generating of class label data according to the sequence of object numbers and the class labels corresponding to the object numbers comprises:
sequentially acquiring classification labels corresponding to 32 target numbers according to the target number sequence to combine into 32-bit binary data;
and sequentially converting the 32-bit binary data into Int type data to generate the classification label data.
7. The method according to claim 5, wherein the preset target number is an integer multiple of 32.
8. An SVD-based federated learning recommendation system communication compression device, applied to a client, characterized by comprising:
the acquisition module is used for acquiring gradient data to be uploaded of the current client;
the first processing module is used for clustering the gradient data to be uploaded by adopting a preset clustering algorithm based on the target number of the gradient data to be uploaded to obtain classified data with a classified label and determining the classified label corresponding to each target number;
the second processing module is used for generating classification label data according to the classification labels corresponding to the target numbers and the target number sequence;
and the third processing module is used for sending the classification data and the classification label data to a server.
9. An electronic device, comprising:
a memory and a processor communicatively coupled to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.
CN202011274868.2A 2020-11-12 2020-11-12 SVD-based communication compression method and device for Federal learning recommendation system Active CN112449009B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011274868.2A CN112449009B (en) 2020-11-12 2020-11-12 SVD-based communication compression method and device for Federal learning recommendation system

Publications (2)

Publication Number Publication Date
CN112449009A true CN112449009A (en) 2021-03-05
CN112449009B CN112449009B (en) 2023-01-10

Family

ID=74737868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011274868.2A Active CN112449009B (en) 2020-11-12 2020-11-12 SVD-based communication compression method and device for Federal learning recommendation system

Country Status (1)

Country Link
CN (1) CN112449009B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018226527A1 (en) * 2017-06-08 2018-12-13 D5Ai Llc Data splitting by gradient direction for neural networks
US20190147366A1 (en) * 2017-11-13 2019-05-16 International Business Machines Corporation Intelligent Recommendations Implemented by Modelling User Profile Through Deep Learning of Multimodal User Data
CN110297848A (en) * 2019-07-09 2019-10-01 深圳前海微众银行股份有限公司 Recommended models training method, terminal and storage medium based on federation's study
CN111079022A (en) * 2019-12-20 2020-04-28 深圳前海微众银行股份有限公司 Personalized recommendation method, device, equipment and medium based on federal learning
CN111324812A (en) * 2020-02-20 2020-06-23 深圳前海微众银行股份有限公司 Federal recommendation method, device, equipment and medium based on transfer learning
CN111582505A (en) * 2020-05-14 2020-08-25 深圳前海微众银行股份有限公司 Federal modeling method, device, equipment and computer readable storage medium
CN111865815A (en) * 2020-09-24 2020-10-30 中国人民解放军国防科技大学 Flow classification method and system based on federal learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LAIZHONG CUI et al.: "CREAT: Blockchain-assisted Compression Algorithm of Federated Learning for Content Caching in Edge Computing", IEEE Internet of Things Journal *
JIA Yanyan; ZHANG Zhao; FENG Jian; WANG Chunkai: "Application of a federated learning model in classified data processing", Journal of China Academy of Electronics and Information Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114125070A (en) * 2021-11-10 2022-03-01 深圳大学 Communication method, system, electronic device and storage medium for quantization compression
WO2023092323A1 (en) * 2021-11-24 2023-06-01 Intel Corporation Learning-based data compression method and system for inter-system or inter-component communications
CN114339252A (en) * 2021-12-31 2022-04-12 深圳大学 Data compression method and device
CN114339252B (en) * 2021-12-31 2023-10-31 深圳大学 Data compression method and device
CN114861790A (en) * 2022-04-29 2022-08-05 深圳大学 Method, system and device for optimizing federal learning compression communication
CN115022316A (en) * 2022-05-20 2022-09-06 阿里巴巴(中国)有限公司 End cloud cooperative data processing system, method, equipment and computer storage medium
CN115022316B (en) * 2022-05-20 2023-08-11 阿里巴巴(中国)有限公司 End cloud collaborative data processing system, method, equipment and computer storage medium
WO2024060400A1 (en) * 2022-09-20 2024-03-28 天翼电子商务有限公司 Discrete variable preprocessing method in vertical federated learning

Also Published As

Publication number Publication date
CN112449009B (en) 2023-01-10

Similar Documents

Publication Publication Date Title
CN112449009B (en) SVD-based communication compression method and device for Federal learning recommendation system
Ma et al. Layer-wised model aggregation for personalized federated learning
CN110222048B (en) Sequence generation method, device, computer equipment and storage medium
CN110084365B (en) Service providing system and method based on deep learning
Zhu et al. Network latency estimation for personal devices: A matrix completion approach
CN110941598A (en) Data deduplication method, device, terminal and storage medium
CN105138647A (en) Travel network cell division method based on Simhash algorithm
WO2019019649A1 (en) Method and apparatus for generating investment portfolio product, storage medium and computer device
WO2023024749A1 (en) Video retrieval method and apparatus, device, and storage medium
CN114138231B (en) Method, circuit and SOC for executing matrix multiplication operation
CN115496970A (en) Training method of image task model, image recognition method and related device
CN110266834B (en) Area searching method and device based on internet protocol address
CN110135465B (en) Model parameter representation space size estimation method and device and recommendation method
CN109428774B (en) Data processing method of DPI equipment and related DPI equipment
CN107342798B (en) Method and device for determining codebook
CN104391916A (en) GPEH data analysis method and device based on distributed computing platform
CN113807370A (en) Data processing method, device, equipment, storage medium and computer program product
Ma et al. Efl: Elastic federated learning on non-iid data
CN115329032B (en) Learning data transmission method, device, equipment and storage medium based on federated dictionary
CN115905871B (en) Matrix similarity-based network transmission file information rapid judging method and system
CN109831469B (en) Network recovery method, device, server and storage medium
Li et al. A novel data compression technique incorporated with computer offloading in RGB-D SLAM
CN117437010A (en) Resource borrowing level prediction method, device, equipment, storage medium and program product
CN117081822A (en) Traffic detection method, traffic detection device, communication equipment, storage medium and chip
Guo et al. Aggregated Learning: A Deep Learning Framework Based on Information-Bottleneck Vector Quantization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant