CN110990776B - Coding distributed computing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN110990776B
Authority
CN
China
Prior art keywords
matrix
shift
data
data packets
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911201661.XA
Other languages
Chinese (zh)
Other versions
CN110990776A (en
Inventor
代明军
郑子英
张胜利
王晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a coded distributed computing method and device, computer equipment and a storage medium. The method comprises: when a combination characteristic is to be obtained by coding, acquiring the matrix and the vector to be coded; dividing the matrix to be coded into K equal parts in the vertical direction to obtain N data packets, where N ≥ K; transposing each of the N data packets to obtain N intermediate data packets; shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices; transposing the vector and multiplying it by each of the N synthesis matrices to obtain the combination characteristic; and outputting the combination characteristic to a terminal for display at the terminal. The invention reduces the extra computational burden that the existing CDC framework introduces during encoding and decoding and speeds up both processes, thereby accelerating the whole training process.

Description

Coding distributed computing method, device, computer equipment and storage medium
Technical Field
The present invention relates to computers, and more particularly, to a method, apparatus, computer device, and storage medium for encoding distributed computing.
Background
Internet of Things terminals often need to perform large-scale computing tasks such as large-scale matrix-vector multiplication. In recent years the number of edge devices has grown considerably; these devices have a certain amount of computing power and can be regarded as the working nodes of edge computing. Edge computing processes data at or near the source where the data is generated. However, edge devices often cannot bear a heavy computational load, for example because of weak computing capacity or excessive battery consumption. Matrix-vector multiplication matters across data analysis, data processing and artificial intelligence: many machine learning algorithms, such as gradient descent and logistic regression, involve it throughout training, and applications based on image recognition and speech recognition usually train deep learning models, whose training also involves large-scale matrix-vector multiplication. In the big-data era, as data volumes grow, a single computer can no longer complete the training, and coded distributed computing is required.
The above computation mostly adopts the traditional CDC (Coded Distributed Computation) framework for encoding and decoding. The core idea of CDC is to obtain the (N, K) combination characteristic through coding: K original independent computing tasks are encoded into N tasks (N ≥ K), and any K of the N computation results suffice to recover the original K results. A CDC system computes a matrix multiplied by a vector as shown in FIG. 1: A is divided into K equal parts in the vertical direction and encoded into N data packets $C_1, C_2, \ldots, C_N$, which are sent to nodes 1, 2, …, N respectively, while x is broadcast to all N nodes. The N working nodes compute $C_1 x, C_2 x, \ldots, C_N x$ respectively, and each returns its result to the master node when it finishes. The master node recovers Ax after receiving any K of the N results. Existing CDC based on linear combination (LC) introduces a large number of multiplications and divisions, so the encoding and decoding stages of LC-based CDC carry a heavy computational burden and increase the overall training time.
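For reference, the linear-combination scheme described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the Vandermonde generator matrix `G`, the function names `lc_encode` and `lc_decode`, and the use of NumPy are not taken from the patent. The sketch makes visible the multiplications in the encoding step and the K × K linear solve in the decoding step.

```python
import numpy as np

def lc_encode(A, K, G):
    """Encode the K row blocks of A into N = G.shape[0] coded blocks by linear combination."""
    blocks = np.split(A, K, axis=0)              # A_1..A_K, each (H/K) x W
    return [sum(G[i, r] * blocks[r] for r in range(K)) for i in range(G.shape[0])]

def lc_decode(results, worker_ids, G):
    """Recover Ax from the results of any K workers by solving a K x K system."""
    Y = np.stack(results)                        # row i is C_i x for a responding worker
    X = np.linalg.solve(G[worker_ids, :], Y)     # row r is A_r x
    return X.reshape(-1)

rng = np.random.default_rng(0)
H, W, K, N = 6, 3, 2, 4                          # illustrative sizes (assumed)
A = rng.standard_normal((H, W))
x = rng.standard_normal(W)
# Assumed generator: any K rows of a Vandermonde matrix with distinct nodes are invertible.
G = np.vander(np.arange(1, N + 1), K, increasing=True).astype(float)

coded = lc_encode(A, K, G)
worker_ids = [1, 3]                              # any K of the N workers
results = [coded[i] @ x for i in worker_ids]
Ax_hat = lc_decode(results, worker_ids, G)
```

The encoding multiplies every entry of A by a coefficient for each coded block, and the decoding inverts a K × K system; these are the costs the shift-and-add approach aims to avoid.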
Therefore, there is a need to devise a new approach to reduce the additional computational burden introduced by existing CDC frameworks in encoding and decoding, and to speed up the encoding and decoding process, and thus the overall training process.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a coding distributed computing method, a device, computer equipment and a storage medium.
To achieve the above purpose, the present invention adopts the following technical scheme: a coded distributed computing method, comprising:
when a combination characteristic is to be obtained by coding, acquiring the matrix and the vector to be coded;
dividing the matrix to be coded into K equal parts in the vertical direction to obtain N data packets, where N ≥ K;
transposing each of the N data packets to obtain N intermediate data packets;
shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices;
transposing the vector and multiplying it by each of the N synthesis matrices to obtain the combination characteristic;
and outputting the combination characteristic to a terminal for display at the terminal.
The further technical scheme is as follows: shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices includes:
determining the preset shift matrix according to the values of N and K to obtain a target shift matrix;
extracting the data of each row of the N intermediate data packets to obtain the matrices to be shifted;
shifting each matrix to be shifted according to the data of the corresponding row of the target shift matrix to obtain shift data;
and filling the positions of the shift data left empty by the shift with zeros, and adding the shift data column by column to obtain N synthesis matrices.
The further technical scheme is as follows: the preset shift matrix includes: [formula missing in source] or [formula missing in source].
The further technical scheme is as follows: the preset shift matrix includes: [formula missing in source].
The further technical scheme is as follows: the preset shift matrix includes: [formula missing in source], where N = M + K, K is a given number, and M corresponds to the number of synthesis matrices.
The further technical scheme is as follows: after outputting the combination characteristic to the terminal for display at the terminal, the method further includes:
when decoding is carried out, acquiring the target shift matrix;
performing element recovery on the combination characteristic according to the target shift matrix to obtain a decoding result;
and feeding back the decoding result to the terminal so that the decoding result is displayed at the terminal.
The invention also provides an encoded distributed computing device comprising:
an acquisition unit, used for acquiring the matrix and the vector to be coded when a combination characteristic is to be obtained by coding;
an equal-division unit, used for dividing the matrix to be coded into K equal parts in the vertical direction to obtain N data packets, where N ≥ K;
a data packet transposition unit, used for transposing each of the N data packets to obtain N intermediate data packets;
a shift-and-add unit, used for shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices;
a characteristic forming unit, used for transposing the vector and multiplying it by each of the N synthesis matrices to obtain the combination characteristic;
and a characteristic output unit, used for outputting the combination characteristic to the terminal for display at the terminal.
The further technical scheme is as follows: the shift addition unit includes:
a shift matrix determining subunit, used for determining the preset shift matrix according to the values of N and K to obtain a target shift matrix;
an extraction subunit, used for extracting the data of each row of the N intermediate data packets to obtain the matrices to be shifted;
a shift subunit, used for shifting each matrix to be shifted according to the data of the corresponding row of the target shift matrix to obtain shift data;
and a superposition subunit, used for filling the positions of the shift data left empty by the shift with zeros and adding the shift data column by column to obtain N synthesis matrices.
The invention also provides a computer device which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the method when executing the computer program.
The invention also provides a storage medium storing a computer program which, when executed by a processor, implements the method described above.
Compared with the prior art, the invention has the beneficial effects that: the invention realizes linear independence by carrying out shift addition on a real number domain, and the whole coding process only comprises shift and addition operation, thereby avoiding a large amount of multiplication operation in a linear combination coding stage and greatly reducing the calculation burden; the extra calculation burden introduced by the existing CDC framework in the encoding and decoding process is reduced, and the encoding and decoding processes are accelerated, so that the whole training process is accelerated.
The invention is further described below with reference to the drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of prior-art coded distributed computing;
FIG. 2 is a schematic diagram of an application scenario of a coded distributed computing method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a coded distributed computing method according to an embodiment of the present invention;
FIG. 4 is a schematic sub-flowchart of a coded distributed computing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of dividing a matrix to be coded into K equal parts in the vertical direction according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of transposing each of N data packets according to an embodiment of the present invention;
FIG. 7 is a first schematic diagram of a shift operation according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an encoding operation according to an embodiment of the present invention;
FIG. 9 is a second schematic diagram of a shift operation according to an embodiment of the present invention;
FIG. 10 is a flowchart of a coded distributed computing method according to another embodiment of the present invention;
FIG. 11 is a schematic block diagram of a coded distributed computing device according to an embodiment of the present invention;
FIG. 12 is a schematic block diagram of a coded distributed computing device according to another embodiment of the present invention;
FIG. 13 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 2 and fig. 3, fig. 2 is a schematic diagram of an application scenario of a coding distributed computing method according to an embodiment of the present invention. Fig. 3 is a schematic flow chart of a coding distributed computing method according to an embodiment of the present invention. The coding distributed computing method is applied to a server. The server performs data interaction with one or more terminals, and the terminals can be electronic equipment with computing capability and communication capability, such as smart phones, tablet computers, notebook computers, desktop computers and the like.
The combination characteristic may be the result output by a model, with the matrix being sample data and the vector being a parameter. The server encodes the input matrix and sends the coded matrices together with the vector to the terminals; the terminals perform the matrix-vector multiplications and return the results to the server for decoding; and the result that meets the requirement, namely the combination characteristic, is output for display at the terminal.
As shown in FIG. 3, the method includes the following steps S110 to S160.
S110, when a combination characteristic is to be obtained by coding, acquiring the matrix and the vector to be coded.
In this embodiment, the server encodes the input matrix and vector and sends them to the terminal, the terminal performs the matrix-vector multiplication and returns the result to the server for decoding, and the result that meets the requirement, namely the combination characteristic, is output. During training, for example, the matrix may be training sample data and the vector may be the parameters of the model.
S120, dividing the matrix to be coded into K equal parts in the vertical direction to obtain N data packets, where N ≥ K.
In this embodiment, a data packet is one of the sub-matrices obtained by dividing the matrix to be coded into K equal parts in the vertical direction. As shown in FIG. 5, the matrix A is divided into K equal parts in the vertical direction, denoted $A_1$ to $A_K$; $A_1$ to $A_K$ are the data packets, and $A_{[j]i}$ denotes the j-th row of $A_i$. In FIG. 4, matrix A has H rows and W columns and vector x has W rows and 1 column; W in FIG. 5 indicates that each sub-matrix $A_i$ has W rows after transposition. K′ and W′ are index sets.
S130, transposing each of the N data packets to obtain N intermediate data packets.
In this embodiment, an intermediate data packet is the matrix obtained by transposing a data packet.
Referring to FIG. 6, $A_1$ to $A_K$ are transposed to obtain $A_1^T$ to $A_K^T$. Each matrix $A_r^T$ comprises W rows, $A^T_{[1]r}$ to $A^T_{[W]r}$, where $A^T_{[j]r}$ is the j-th row of $A_r^T$, r ∈ K′, j ∈ W′.
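Steps S120 and S130 can be illustrated with a short NumPy sketch. The sizes, the variable names `packets` and `intermediate`, and the use of `np.split` are assumptions made for illustration; the patent does not prescribe an implementation.

```python
import numpy as np

H, W, K = 8, 2, 2                                 # illustrative sizes (assumed)
A = np.arange(H * W, dtype=float).reshape(H, W)   # matrix to be coded, H x W

# S120: divide A into K equal parts in the vertical direction (by rows).
packets = np.split(A, K, axis=0)                  # A_1..A_K, each (H/K) x W

# S130: transpose each data packet to obtain the intermediate data packets.
intermediate = [A_i.T for A_i in packets]         # each W x (H/K)

# Row j of an intermediate packet is column j of the original packet.
row_j = intermediate[0][1]
```

Here H must be divisible by K; `np.split` raises an error otherwise, so a real implementation would need to pad A or choose K accordingly.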
S140, shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices.
In this embodiment, the preset shift matrix is a matrix that contains the shift applied to every intermediate data packet to be shifted; a synthesis matrix is the matrix formed by shifting the intermediate data packets according to the entries of the preset shift matrix and adding the shifted data column by column.
In this embodiment, the shift-and-addition (SA) operation is performed on the data packets by column. To do so, the columns of the data packets are transposed into rows, the SA operation is performed on these rows, and the same SA operation is performed on the other corresponding rows. The overall operation is equivalent to performing SA operations on the columns of $A_1$ to $A_K$.
Of course, in other embodiments, since each encoding applies SA to multiple rows at the same time, N rows are taken side by side from the intermediate data packets, one row per intermediate data packet, and the SA operation is performed on these rows. The same SA operation is performed on the other corresponding rows. The SA operation is thus equivalent to performing the corresponding SA operations on the transposes of $A_1$ to $A_K$, that is, the matrices are shifted right by different numbers of elements and then added over the real number domain.
Because SA includes a right-shift operation, a coded packet is longer than an original packet. If the shift-and-add operation is performed by row, the resulting coding matrix has dimension $\frac{H}{K} \times L_{op1}$, where $L_{op1}$ is larger than W and differs from one coding matrix to another. Because $L_{op1} \neq W$, the dimensions of the resulting coding matrix no longer match those of x for multiplication.
If the shift-and-add operation is performed by column, the resulting coding matrix has dimension $W \times L_{op2}$, where $L_{op2}$ is larger than $\frac{H}{K}$ and differs from one coding matrix to another. The first dimension of this coding matrix is W, the same as the dimension of x, so this embodiment selects the column-wise shift-and-add operation as the SA coding mode.
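The dimension argument can be checked with a small NumPy sketch. The helper `shift_right`, the all-ones data and the chosen sizes are assumptions for illustration only.

```python
import numpy as np

def shift_right(M, d, length):
    """Shift every row of M right by d entries and zero-pad each row to `length` columns."""
    out = np.zeros((M.shape[0], length))
    out[:, d:d + M.shape[1]] = M
    return out

H, W, K, d = 8, 3, 2, 1                           # illustrative sizes and shift (assumed)
A1 = np.ones((H // K, W))                         # one data packet, (H/K) x W
x = np.ones(W)

# Shifting the rows of A_1 lengthens the W dimension: (H/K) x (W + d) no longer matches x.
row_shifted = shift_right(A1, d, W + d)

# Shifting the rows of A_1^T lengthens the H/K dimension: W x (H/K + d), so x^T C is defined.
col_shifted = shift_right(A1.T, d, H // K + d)
y = x @ col_shifted                               # row vector of length H/K + d
```

With the row-wise variant, `row_shifted @ x` would fail because the inner dimensions differ (W + d versus W), which is the mismatch described above.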
$A_1^T$ to $A_K^T$ are shifted right by different displacements according to the preset shift matrix and added over the real number domain to obtain $C_1$ to $C_N$. That is, $C_i = A_i^T$ for i ∈ K′, and for i ∈ M′, $C_{K+i} = \sum_{r=1}^{K} A_r^T(D_{ir})$, where $D_{ir}$ is the number of real numbers by which $A_r^T$ is shifted right when $C_{K+i}$ is constructed, i.e. the shift, and $A_r^T(D_{ir})$ denotes $A_r^T$ shifted right by $D_{ir}$ real numbers, that is, displaced by $D_{ir}$ units, r ∈ K′. M′ is an index set, and M corresponds to the number of synthesis matrices: the first K are uncoded and the last M are coded.
In other embodiments, referring to FIG. 6, the SA operation may be described as a row shift-and-add operation. The corresponding j-th rows $A^T_{[j]1}$ to $A^T_{[j]K}$ are first extracted from the intermediate data packets; the shifts $D_{i1}$ to $D_{iK}$ are then obtained in the same way as for $C_1$ to $C_N$ above, and the SA operation is performed on the extracted data to generate the j-th row of $C_{K+i}$: $C_{[j]K+i} = \sum_{r=1}^{K} A^T_{[j]r}(D_{ir})$, where $X_{[j]i}$ denotes the j-th row of $X_i$.
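A minimal sketch of the shift-and-add construction of the synthesis matrices follows. The function name `sa_encode`, its argument layout, and the example shift matrix are assumptions for illustration; only the rule "shift each transposed packet right by its entry of the shift matrix, zero-fill, and add" comes from the text.

```python
import numpy as np

def sa_encode(A, K, D):
    """Return C_1..C_N: K uncoded transposed packets followed by M shift-and-add packets.

    D is an M x K integer shift matrix; D[i][r] is the right shift applied to A_r^T
    when the i-th coded packet is built.
    """
    D = np.asarray(D)
    packets_T = [A_r.T for A_r in np.split(A, K, axis=0)]   # each W x (H/K)
    W, L = packets_T[0].shape
    coded = list(packets_T)                                  # C_1..C_K are uncoded
    for shifts in D:
        C = np.zeros((W, L + shifts.max()))                  # empty positions stay 0
        for A_r_T, d in zip(packets_T, shifts):
            C[:, d:d + L] += A_r_T                           # shift right by d, then add
        coded.append(C)
    return coded

A = np.arange(16, dtype=float).reshape(8, 2)                 # illustrative 8 x 2 matrix
coded = sa_encode(A, K=2, D=[[0, 0], [0, 1]])                # (N, K) = (4, 2)
```

Only slicing and addition are used, so no element-wise multiplication occurs during encoding. Each coded packet is padded only to its own maximum shift, which is why the coded packets can differ in length.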
In one embodiment, referring to fig. 4, the step S140 may include steps S141 to S144.
S141, determining a preset shift matrix according to the numerical values of N and K to obtain a target shift matrix.
In the present embodiment, the preset shift matrix needs to satisfy the following conditions:
Ax must be recoverable from any K of the N calculation results $x^T C_1$ to $x^T C_N$, i.e. the preset shift matrix must ensure that $C_1$ to $C_N$ possess the (N, K) combination characteristic;
the N calculation results $x^T C_1$ to $x^T C_N$ should be saw-tooth decodable, so that the decoding stage incurs a small computational burden;
the largest element $D_{max}$ of the shift matrix should be as small as possible.
According to the above conditions, in this embodiment, for a given value of K the preset shift matrix may be $D_{Inc\text{-}Diff}$ [formula missing in source], where N = M + K and M corresponds to the number of synthesis matrices; or $D_{Cyc\text{-}Shift}$ [formula missing in source], for K not equal to 2, 3, 4; or a submatrix formed by any distinct rows of $D_{Cyc\text{-}Shift}$, with N ≤ 2K. For the particular values K = 2, 3, 4, the optimal preset shift matrices are given as [formula missing in source]. $D_{Inc\text{-}Diff}$ and $D_{Cyc\text{-}Shift}$ are two different coding schemes. Compared with $D_{Inc\text{-}Diff}$, $D_{Cyc\text{-}Shift}$ has a lower cost, but N is limited to K ≤ N ≤ 2K; under the $D_{Inc\text{-}Diff}$ scheme, N may be any value greater than K.
Of course, in other embodiments the optimal preset shift matrix may be another matrix, for example [formula missing in source], depending on the actual situation.
Once N and K are determined, the preset shift matrix is also determined and the target shift matrix can be formed. N and K are determined from the matrix and the number of equal parts input by the terminal.
S142, extracting the data of each row of the N intermediate data packets to obtain the matrices to be shifted.
In this embodiment, a matrix to be shifted is a single-row, multi-column matrix formed by the data of one row of an intermediate data packet.
S143, shifting each matrix to be shifted according to the data of the corresponding row of the target shift matrix to obtain shift data.
In this embodiment, shift data is the data formed by shifting a matrix to be shifted by the number of real numbers given by the corresponding value in the target shift matrix, for example $c_3$ and $c_4$ as shown in FIG. 7.
S144, filling the positions of the shift data left empty by the shift with zeros, and adding the shift data column by column to obtain N synthesis matrices.
The positions left empty in the shift data by the shift are filled with the value zero.
Referring to FIG. 7, suppose for example that a matrix comprises the data packets $a_1$ and $a_2$, which are encoded into 4 synthesis matrices $c_1$ to $c_4$, where $c_1 = a_1$, $c_2 = a_2$, and $c_3$, $c_4$ are obtained by SA operations; more specifically, $a_1$ and $a_2$ are shifted right by different numbers of real numbers and then added over the real number domain to form $c_3$ and $c_4$. In $c_3$ and $c_4$, the right shifts of $(a_1, a_2)$ are (0, 0) and (0, 1) respectively, which are the first and second rows of the target shift matrix $D = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ for (N, K) = (4, 2). The length of $c_4$ is then 5, unlike the other three synthesis matrices. The shift operation leaves empty positions, which are set to 0. The first two elements of $c_4$ are $a_{1,1}$ and $a_{1,2} + a_{2,1}$.
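The (N, K) = (4, 2) example can be reproduced numerically. The concrete values of `a1` and `a2` and the use of `np.pad` for the zero filling are assumptions for illustration.

```python
import numpy as np

a1 = np.array([1.0, 2.0, 3.0, 4.0])       # illustrative values (assumed)
a2 = np.array([10.0, 20.0, 30.0, 40.0])

c1, c2 = a1, a2                            # the first K = 2 codewords are uncoded
c3 = a1 + a2                               # shifts (0, 0): plain addition
# Shifts (0, 1): a2 moves right by one position; empty positions are zero-filled.
c4 = np.pad(a1, (0, 1)) + np.pad(a2, (1, 0))

# c3 == [11, 22, 33, 44]
# c4 == [1, 12, 23, 34, 40]   (length 5; first two entries a_{1,1} and a_{1,2} + a_{2,1})
```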
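S150, transposing the vector and multiplying it by each of the N synthesis matrices to obtain the combination characteristic.
In this embodiment, the combination characteristic refers to the product of the matrix-vector multiplication.
Computing Ax is equivalent to computing $x^T A^T$, that is, to computing $(A_1 x)^T = x^T A_1^T = x^T C_1$ to $(A_K x)^T = x^T A_K^T = x^T C_K$, which are obtained from the N calculation results $x^T C_1$ to $x^T C_N$ by saw-tooth decoding. The coded row vectors that are calculated are $x^T C_{K+i}$, i ∈ M′.
For example, let the coding have the combination characteristic (N, K) = (4, 2), where $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$, $A_r = \begin{pmatrix} a_{r,1} & b_{r,1} \\ a_{r,2} & b_{r,2} \\ a_{r,3} & b_{r,3} \\ a_{r,4} & b_{r,4} \end{pmatrix}$ for r = 1, 2, and $x = (x_1, x_2)^T$. First, $A_1$ and $A_2$ are transposed to obtain $A_1^T = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ b_{1,1} & b_{1,2} & b_{1,3} & b_{1,4} \end{pmatrix}$ and $A_2^T = \begin{pmatrix} a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ b_{2,1} & b_{2,2} & b_{2,3} & b_{2,4} \end{pmatrix}$. The first rows of $A_1^T$ and $A_2^T$ are taken out and denoted $a_1 = (a_{1,1}, a_{1,2}, a_{1,3}, a_{1,4})$ and $a_2 = (a_{2,1}, a_{2,2}, a_{2,3}, a_{2,4})$; the second rows of $A_1^T$ and $A_2^T$ are taken out and denoted $b_1 = (b_{1,1}, b_{1,2}, b_{1,3}, b_{1,4})$ and $b_2 = (b_{2,1}, b_{2,2}, b_{2,3}, b_{2,4})$. SA encoding of $a_1, a_2$ gives $c_1$ to $c_4$. The same operation is performed on the second rows of $A_1^T$ and $A_2^T$, i.e. SA encoding of $b_1, b_2$ gives $d_1$ to $d_4$: $d_1 = b_1$, $d_2 = b_2$, $d_3 = b_1 + b_2 = (b_{1,1} + b_{2,1}, b_{1,2} + b_{2,2}, b_{1,3} + b_{2,3}, b_{1,4} + b_{2,4})$, and $d_4$, the sum of $b_1$ and $b_2$ shifted right by one position, is $(b_{1,1}, b_{1,2} + b_{2,1}, b_{1,3} + b_{2,2}, b_{1,4} + b_{2,3}, b_{2,4})$. The synthesis matrices formed are $C_i = \begin{pmatrix} c_i \\ d_i \end{pmatrix}$, i = 1, …, 4, and the combination characteristic is $x^T C_1$ to $x^T C_4$.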
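The worked example can be followed end to end in a short sketch. The numeric values, the helper `sa`, and the assumption that the matrix is 8 × 2 with two row blocks of four rows each (inferred from the four-element rows in the example) are illustrative only.

```python
import numpy as np

# First rows (a_r) and second rows (b_r) of the transposed packets; values are assumed.
a1, a2 = np.array([1., 2., 3., 4.]), np.array([5., 6., 7., 8.])
b1, b2 = np.array([9., 10., 11., 12.]), np.array([13., 14., 15., 16.])
x = np.array([2.0, -1.0])

# The original 8 x 2 matrix: A_1 stacked on A_2, row j of A_r being (a_{r,j}, b_{r,j}).
A = np.vstack([np.column_stack([a1, b1]), np.column_stack([a2, b2])])

def sa(u, v, shift):
    """Add u to v shifted right by `shift`, zero-filling the empty positions."""
    return np.pad(u, (0, shift)) + np.pad(v, (shift, 0))

C1, C2 = np.vstack([a1, b1]), np.vstack([a2, b2])    # uncoded: A_1^T and A_2^T
C3 = np.vstack([sa(a1, a2, 0), sa(b1, b2, 0)])       # rows c_3 and d_3
C4 = np.vstack([sa(a1, a2, 1), sa(b1, b2, 1)])       # rows c_4 and d_4
results = [x @ C for C in (C1, C2, C3, C4)]          # x^T C_1 .. x^T C_4
```

In this sketch `results[0]` and `results[1]` equal $A_1 x$ and $A_2 x$ directly, while `results[2]` and `results[3]` are the coded results from which the same values can be recovered by subtraction.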
S160, outputting the combination characteristic to a terminal for display at the terminal.
The combination characteristic is output to the terminal so that it can be viewed at the terminal.
In the encoding stage, linear independence is achieved by shift-and-addition over the real number domain. SA comprises only shift and addition operations, which avoids the large number of multiplications of the linear-combination encoding stage and greatly reduces the computational burden.
The coded distributed computing method described above achieves linear independence by shift-and-addition over the real number domain; the whole encoding process consists only of shift and addition operations, which avoids the large number of multiplications of the linear-combination encoding stage and greatly reduces the computational burden. In the decoding stage, the saw-tooth decoding that corresponds to shift-and-addition is used; it requires only simple back-substitution, which avoids the large number of multiplications and divisions of the linear-code decoding stage and greatly reduces the computational burden. The extra computational burden that the existing CDC framework introduces during encoding and decoding is thereby reduced and both processes are accelerated, which speeds up the whole training process.
Fig. 10 is a flowchart of a coding distributed computing method according to another embodiment of the present invention. As shown in fig. 10, the encoding distributed computing method of the present embodiment includes steps S210 to S290. Steps S210 to S260 are similar to steps S110 to S160 in the above embodiment, and are not described herein. Steps S270 to S290 added in the present embodiment are described in detail below.
S270, when decoding is carried out, acquiring the target shift matrix;
S280, performing element recovery on the combination characteristic according to the target shift matrix to obtain a decoding result;
S290, feeding back the decoding result to the terminal so that the decoding result is displayed at the terminal.
For example, consider recovering $a_1$ and $a_2$ from $c_3$ and $c_4$. As shown in FIG. 7, the bracketed index at the lower left of each element indicates the order in which the corresponding element is recovered. Since the first element of $c_4$ is $a_{1,1}$ alone, $a_{1,1}$ is recovered directly without any other element and has index 1. Subtracting $a_{1,1}$ from the first element of $c_3$ recovers $a_{2,1}$, with index 2. Similarly, subtracting $a_{2,1}$ from the second element of $c_4$ recovers $a_{1,2}$, with index 3. The process continues until all elements are recovered.
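The recovery order described above amounts to a simple back-substitution loop. The function name `sawtooth_decode_pair` and the test values are assumptions; the sketch covers only the shift pattern (0, 0) and (0, 1) of this example.

```python
import numpy as np

def sawtooth_decode_pair(c3, c4):
    """Recover a1, a2 from c3 = a1 + a2 and c4 = a1 + (a2 shifted right by 1)."""
    n = len(c3)
    a1, a2 = np.zeros(n), np.zeros(n)
    for j in range(n):
        # c4[j] = a1[j] + a2[j-1]; a2[j-1] is already known (absent when j = 0).
        a1[j] = c4[j] - (a2[j - 1] if j > 0 else 0.0)
        # c3[j] = a1[j] + a2[j]; a1[j] has just been recovered.
        a2[j] = c3[j] - a1[j]
    return a1, a2

c3 = np.array([11.0, 22.0, 33.0, 44.0])          # a1 + a2 for the assumed values
c4 = np.array([1.0, 12.0, 23.0, 34.0, 40.0])     # a1 + shifted a2
a1_hat, a2_hat = sawtooth_decode_pair(c3, c4)
# a1_hat == [1, 2, 3, 4], a2_hat == [10, 20, 30, 40]
```

Every step is a single subtraction, so the decoder uses no multiplication or division.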
The decoding process when computing a matrix multiplied by a vector is shown in FIG. 8; the only difference is that the codewords to be saw-tooth decoded change from $C_i$ to $x^T C_i$. As shown in FIG. 9, when (N, K) = (4, 2), the target shift matrix is $D = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, and the corresponding $x^T C_i$ are as shown in FIG. 8, where adjacent elements are separated by vertical dashed lines. Initially, the first element of $x^T C_4$ is actually $A_{[1]1} x$ and can be recovered directly. Then, subtracting $A_{[1]1} x$ from the first element of $x^T C_3$ recovers $A_{[1]2} x$. Similarly, subtracting $A_{[1]2} x$ from the second element of $x^T C_4$ recovers $A_{[2]1} x$. The process continues until all elements $A_{[j]1} x$ and $A_{[j]2} x$ are recovered, where $j \in \{1, \ldots, H/K\}$. Finally, the $A_{[j]1} x$ and the $A_{[j]2} x$ are each concatenated vertically to obtain $A_1 x$ and $A_2 x$, which completes the decoding process.
For example, in the example above the coding has the combination characteristic (N, K) = (4, 2). When decoding, the first element of $x^T C_4$ is actually the first element of Ax. Subtracting this element from the first element of $x^T C_3$ gives $a_{2,1} x_1 + b_{2,1} x_2$, which is actually the 5th element of Ax. Subtracting $a_{2,1} x_1 + b_{2,1} x_2$ from the second element of $x^T C_4$ gives $a_{1,2} x_1 + b_{1,2} x_2$, which is actually the 2nd element of Ax. The process continues until all elements of Ax are recovered.
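The same back-substitution idea can be written generically, so that Ax is recovered from any K of the N results. The formulation below (repeatedly solving any codeword position that has exactly one unknown left) is an assumed illustration, not the patent's stated procedure; the function `sawtooth_decode`, the `None` marker for uncoded packets and the random test data are likewise assumptions.

```python
import numpy as np
from itertools import combinations

def sawtooth_decode(received, shifts, K, L):
    """Recover K length-L vectors from K received codewords by back-substitution.

    shifts[i][r] is the right shift of source r in codeword i, or None when
    source r does not take part in that codeword (uncoded packets).
    """
    u = np.full((K, L), np.nan)                  # NaN marks a value not yet recovered
    progress = True
    while np.isnan(u).any() and progress:
        progress = False
        for y, row in zip(received, shifts):
            for p in range(len(y)):
                # Source entries that are summed at position p of this codeword.
                terms = [(r, p - d) for r, d in enumerate(row)
                         if d is not None and 0 <= p - d < L]
                missing = [t for t in terms if np.isnan(u[t])]
                if len(missing) == 1:            # exactly one unknown: solve by subtraction
                    known = sum(u[t] for t in terms if t != missing[0])
                    u[missing[0]] = y[p] - known
                    progress = True
    if np.isnan(u).any():
        raise ValueError("codewords are not saw-tooth decodable")
    return u

rng = np.random.default_rng(1)
H, W, K = 8, 2, 2
A, x = rng.standard_normal((H, W)), rng.standard_normal(W)
s1, s2 = np.split(A @ x, K)                      # A_1 x and A_2 x, used only to build test data
# By linearity, x^T C_i equals the same shift-and-add applied to A_1 x and A_2 x.
codewords = [s1, s2, s1 + s2, np.pad(s1, (0, 1)) + np.pad(s2, (1, 0))]
shift_rows = [(0, None), (None, 0), (0, 0), (0, 1)]
recovered = [sawtooth_decode([codewords[i] for i in pair],
                             [shift_rows[i] for i in pair], K, H // K).reshape(-1)
             for pair in combinations(range(4), K)]
```

Each entry of `recovered` corresponds to one choice of two responding nodes out of four, which illustrates the (N, K) = (4, 2) combination characteristic for this shift matrix.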
Suppose Ax is likewise computed with N nodes, where $A \in R^{H \times W}$ and $x \in R^{W \times 1}$, and A is divided into K equal parts by row. The computational burden of the CDC framework based on linear combination and of the method of this embodiment is compared over three stages: the encoding stage, the computing stage and the decoding stage. The numbers of element-wise multiplications, additions and divisions introduced in the three stages are shown in Table 1, Table 2 and Table 3.
TABLE 1. Number of element-wise multiplications
| Stage | CDC framework based on linear combination | The method of the present embodiment |
| --- | --- | --- |
| Encoding stage | $HWN$ | 0 |
| Calculation stage | 0 | $D_{max}$ |
| Decoding stage | $HK + o(K^3)$ | 0 |
| Total amount | $H(WN+K) + o(K^3)$ | $D_{max}$ |
TABLE 2. Number of element-wise additions
| Stage | CDC framework based on linear combination | The method of the present embodiment |
| --- | --- | --- |
| Encoding stage | $HWN$ | $HWM + D_{max}$ |
| Calculation stage | 0 | $D_{max}$ |
| Decoding stage | $HK + o(K^3)$ | $HM + D_{max}$ |
| Total amount | $H(WN+K) + o(K^3)$ | $HM(W+1) + D_{max}$ |
TABLE 3. Number of element-wise divisions
| Stage | CDC framework based on linear combination | The method of the present embodiment |
| --- | --- | --- |
| Encoding stage | 0 | 0 |
| Calculation stage | 0 | 0 |
| Decoding stage | $o(K^3)$ | 0 |
| Total amount | $o(K^3)$ | 0 |
Since the matrix is encoded to form a combination characteristic according to the method of the present embodiment, D is more than D according to the CDC framework based on linear combination max Elements, create additional computational burden during the computation phase. Therefore, the multiplication of the calculation stage of the CDC framework based on the linear combination is set to 0 as the addition amount. The method of the present embodiment calculates x T C K+i At most by additional D max Columns containing additional D max The element-wise multiplication and addition are performed M times, thus combining the characteristics x T C K+i CDC box based on linear combinationMultiple rack lead-in D max The next element-wise multiplication is added, as shown in table 1), neither of which involves division during the calculation phase.
Encoding and decoding stages of the CDC framework based on linear combination: as shown in Table 1 and Table 2, the encoding stage and the decoding stage introduce HWN and HK + o(K^3) element-wise multiplications and additions, respectively. Here matrix A is an H × W matrix, with W denoting the number of columns of matrix A. Decoding additionally requires o(K^3) element-wise divisions, where o(K^3) is the cost of inverting a K × K matrix.
Encoding and decoding stages of the method of the present embodiment: consider multiplication first. There is no multiplication in the encoding and decoding stages, so the multiplication and division counts are set to 0, as shown in Table 1 and Table 3. Consider addition next. In the encoding stage, obtaining C_{K+i} involves K additions, and C_{K+i} has at most W × (H/K + D_max) elements, so obtaining all M matrices C_{K+i} involves the number of element-wise additions listed for the encoding stage in Table 2. In the decoding stage, recovering each codeword from x^T C_{K+i} requires K additions, M−1 codewords are recovered after (M−1)K additions, and the length of x^T C_{K+i} is at most H/K + D_max, so the decoding stage involves the number of element-wise additions listed for the decoding stage in Table 2. There is no division in the encoding and decoding stages.
Totals: H, the number of rows of matrix A, is generally much larger than the other parameters, so the terms containing H dominate the terms that do not. In terms of the total number of element-wise multiplications, the computational cost of the method of the present embodiment is therefore significantly lower than that of the CDC framework based on linear combination. Because M < N, H(WN + K) is generally much greater than HM(W + 1), so in terms of the total number of element-wise additions the computational cost of the method of the present embodiment is also significantly lower than that of the CDC framework based on linear combination. In terms of the total number of element-wise divisions, the CDC framework based on linear combination requires o(K^3), while the method of this embodiment requires 0.
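The totals in Tables 1 to 3 can be evaluated numerically to see which terms dominate. The following is an illustrative sketch only: the function names and the sample parameter values are assumptions introduced here, and each o(K^3) entry is replaced by a plain K**3 placeholder so that the totals can be computed.

```python
def linear_cdc_totals(H, W, N, K):
    """Totals for the CDC framework based on linear combination (Tables 1-3).

    Assumption: each o(K^3) entry is approximated by K**3 purely so that
    the totals can be evaluated numerically.
    """
    k3 = K ** 3
    return {
        "mul": H * (W * N + K) + k3,
        "add": H * (W * N + K) + k3,
        "div": k3,
    }


def shift_add_totals(H, W, M, d_max):
    """Totals for the method of the present embodiment (Tables 1-3)."""
    return {
        "mul": d_max,
        "add": H * M * (W + 1) + d_max,
        "div": 0,
    }


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only for illustration.
    H, W, K, M, d_max = 100_000, 50, 4, 4, 3
    N = M + K
    base = linear_cdc_totals(H, W, N, K)
    ours = shift_add_totals(H, W, M, d_max)
    for op in ("mul", "add", "div"):
        print(op, base[op], ours[op])
```

With H much larger than the other parameters, the printed addition totals differ roughly by the factor M(W + 1)/(WN + K), while the multiplication and division totals of the present method stay independent of H.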
In summary, compared with the CDC framework based on linear combination, the method of this embodiment significantly reduces the total amount of computation for multiplication, addition and division. Consider the ratio of the total computational burden of the method of the present embodiment to that of the CDC framework based on linear combination. As H tends to infinity, this ratio tends to 0 for multiplication, to a constant determined by the addition totals in Table 2 for addition, and to 0 for division. Note that addition is the least computationally intensive of the three operations. In effect, the method of the present embodiment avoids the heavy computational burden that a large number of multiplications and divisions introduce in the encoding and decoding stages, and reduces the computational burden of the addition operation to a fraction of the original. In the decoding stage, the zigzag (sawtooth) decoding corresponding to SA is adopted. Zigzag decoding only requires simple back-substitution, so the large number of multiplications and divisions of the linear-code decoding stage is avoided.
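The limiting ratios can be read off the totals in Tables 1 to 3. The following worked sketch is not taken from the original text; it assumes that W, N, K, M and D_max are held fixed while H grows.

```latex
\begin{align*}
\text{multiplication:}\quad
  & \lim_{H\to\infty}\frac{D_{\max}}{H(WN+K)+o(K^{3})} = 0,\\
\text{addition:}\quad
  & \lim_{H\to\infty}\frac{HM(W+1)+D_{\max}}{H(WN+K)+o(K^{3})}
    = \frac{M(W+1)}{WN+K},\\
\text{division:}\quad
  & \frac{0}{o(K^{3})} = 0 .
\end{align*}
```

With N = M + K, the denominator WN + K equals WM + K(W + 1), so the addition ratio is below 1 whenever M < K(W + 1).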
Fig. 11 is a schematic block diagram of a coded distributed computing device 300 provided by an embodiment of the present invention. As shown in Fig. 11, corresponding to the above coded distributed computing method, the present invention further provides a coded distributed computing device 300. The coded distributed computing device 300, which may be configured in a server, includes units for performing the coded distributed computing method described above. Specifically, referring to Fig. 11, the coded distributed computing device 300 includes an acquisition unit 301, an equally dividing unit 302, a data packet transposition unit 303, a shift addition unit 304, a feature forming unit 305, and a feature output unit 306.
The acquisition unit 301 is configured to acquire a matrix and a vector to be encoded when encoding is performed to obtain a combination characteristic; the equally dividing unit 302 is configured to divide the matrix to be encoded into K equal parts in the vertical direction, so as to obtain N data packets, where N is greater than or equal to K; the data packet transposition unit 303 is configured to perform a transposition operation on each of the N data packets, so as to obtain N intermediate data packets; the shift addition unit 304 is configured to shift and add the N intermediate data packets according to a preset shift matrix, so as to obtain N synthesis matrices; the feature forming unit 305 is configured to transpose the vector and then calculate the transposed vector with each of the N synthesis matrices, so as to obtain the combination characteristic; and the feature output unit 306 is configured to output the combination characteristic to a terminal, so as to display the combination characteristic on the terminal.
In an embodiment, the shift addition unit 304 includes a shift matrix determining subunit, an intercepting subunit, a shift subunit, and a superposition subunit.
The shift matrix determining subunit is configured to determine the preset shift matrix according to the values of N and K, so as to obtain a target shift matrix; the intercepting subunit is configured to intercept the data of each row in the N intermediate data packets, so as to obtain a matrix to be shifted; the shift subunit is configured to shift the matrix to be shifted according to the data of the corresponding row in the target shift matrix, so as to obtain shift data; and the superposition subunit is configured to fill the vacant positions of the shift data with zeros and superimpose the shift data located in the same column, so as to obtain the N synthesis matrices.
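The shift, zero-fill and superposition steps can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `shift_and_add`, the right-shift convention, and the example shift row `[0, 1]` are assumptions, since the entries of the preset shift matrices are not reproduced in this text.

```python
def shift_and_add(packets, shifts):
    """Combine equally sized packets into one synthesis matrix.

    packets: list of K matrices (lists of rows) of identical shape,
             standing in for the transposed (intermediate) data packets.
    shifts:  one row of a shift matrix, i.e. K non-negative column shifts
             (hypothetical values; the real shift matrices are not given here).
    """
    rows = len(packets[0])
    length = len(packets[0][0])
    d_max = max(shifts)
    # Positions left vacant by shifting are zero-filled.
    out = [[0] * (length + d_max) for _ in range(rows)]
    for packet, t in zip(packets, shifts):
        for r in range(rows):
            for c in range(length):
                # Shift right by t columns, then superimpose same-column data.
                out[r][c + t] += packet[r][c]
    return out


if __name__ == "__main__":
    a = [[1, 2, 3]]
    b = [[10, 20, 30]]
    print(shift_and_add([a, b], [0, 1]))  # [[1, 12, 23, 30]]
```

Only additions are used, which matches the cost analysis above: the synthesis matrix is at most D_max columns longer than an unshifted packet.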
Fig. 12 is a schematic block diagram of a coded distributed computing device 300 provided by another embodiment of the present invention. As shown in Fig. 12, the coded distributed computing device 300 of the present embodiment adds a matrix acquisition unit 307, a recovery unit 308, and a result feedback unit 309 to the above embodiment.
The matrix acquisition unit 307 is configured to acquire the target shift matrix when decoding is performed; the recovery unit 308 is configured to perform element recovery on the combination characteristic according to the target shift matrix, so as to obtain a decoding result; and the result feedback unit 309 is configured to feed back the decoding result to the terminal, so as to display the decoding result on the terminal.
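Element recovery by back-substitution can be illustrated for the smallest case, K = 2. The sketch below is an assumption-laden example, not the decoder of the embodiment: it supposes two coded vectors built with the hypothetical shift rows [0, 0] and [0, 1], and the function name `zigzag_decode_k2` is introduced here for illustration.

```python
def zigzag_decode_k2(c_same, c_shift):
    """Recover y0 and y1 (each of length L) from two coded vectors.

    c_same  = y0 + y1                       (length L, shift row [0, 0])
    c_shift = y0 + (y1 shifted right by 1)  (length L + 1, shift row [0, 1])
    Both shift rows are hypothetical examples.
    """
    length = len(c_same)
    y0 = [0] * length
    y1 = [0] * length
    # Only y0 contributes to the first position of c_shift.
    y0[0] = c_shift[0]
    for i in range(length):
        # Each known element exposes one unknown element of the other packet.
        y1[i] = c_same[i] - y0[i]
        if i + 1 < length:
            y0[i + 1] = c_shift[i + 1] - y1[i]
    return y0, y1


if __name__ == "__main__":
    # y0 = [1, 2, 3], y1 = [10, 20, 30]
    print(zigzag_decode_k2([11, 22, 33], [1, 12, 23, 30]))
```

The loop uses subtraction only, with no multiplication, division or matrix inversion, which is the property the cost comparison relies on.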
It should be noted that, as will be clearly understood by those skilled in the art, for the specific implementation of the coded distributed computing device 300 and each of its units, reference may be made to the corresponding description in the foregoing method embodiments; for convenience and brevity, the description is not repeated here.
The coded distributed computing device 300 described above may be implemented in the form of a computer program that can run on a computer device as shown in Fig. 13.
Referring to fig. 13, fig. 13 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device may be a server, and the server may be an independent server or a server cluster formed by a plurality of servers.
With reference to FIG. 13, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a coded distributed computing method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when executed by the processor 502, the computer program 5032 causes the processor 502 to perform a coded distributed computing method.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the structure shown in fig. 13 is merely a block diagram of some of the structures associated with the present application and does not constitute a limitation of the computer device 500 to which the present application is applied, and that a particular computer device 500 may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
acquiring, when encoding is performed to obtain a combination characteristic, a matrix and a vector to be encoded; dividing the matrix to be encoded into K equal parts in the vertical direction to obtain N data packets, where N is greater than or equal to K; performing a transposition operation on each of the N data packets to obtain N intermediate data packets; shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices; transposing the vector and then calculating the transposed vector with each of the N synthesis matrices to obtain the combination characteristic; and outputting the combination characteristic to a terminal for display at the terminal.
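The steps above can be sketched end to end on a small example. This is a hedged illustration under stated assumptions: all function names are introduced here, the shift matrix `[[0, 0], [0, 1]]` is hypothetical, and the shift direction is assumed to be to the right.

```python
def split_rows(a, k):
    """Split matrix a (a list of rows) into k equal row blocks."""
    size = len(a) // k
    return [a[i * size:(i + 1) * size] for i in range(k)]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def shift_and_add(packets, shifts):
    """Shift each packet right by its shift, zero-fill, and superimpose."""
    rows, length = len(packets[0]), len(packets[0][0])
    out = [[0] * (length + max(shifts)) for _ in range(rows)]
    for packet, t in zip(packets, shifts):
        for r in range(rows):
            for c in range(length):
                out[r][c + t] += packet[r][c]
    return out


def vec_mat(x, m):
    """Multiply row vector x (length W) by matrix m (W rows)."""
    return [sum(x[r] * m[r][c] for r in range(len(x)))
            for c in range(len(m[0]))]


def encode_and_compute(a, x, k, shift_matrix):
    """Divide, transpose, shift-and-add, then compute x^T C for each C."""
    packets = [transpose(block) for block in split_rows(a, k)]
    synthesis = [shift_and_add(packets, row) for row in shift_matrix]
    return [vec_mat(x, c) for c in synthesis]


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6], [7, 8]]   # H = 4, W = 2
    x = [1, 1]
    # Hypothetical shift matrix with two rows.
    print(encode_and_compute(a, x, 2, [[0, 0], [0, 1]]))
    # [[14, 22], [3, 18, 15]]
```

Here Ax = [3, 7, 11, 15]; the second output [3, 18, 15] equals [3, 7, 0] + [0, 11, 15], showing that each combination characteristic is itself a shifted sum of the per-block products.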
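The preset shift matrix includes: [shift matrix image not reproduced] for (N, K) = (4, 2), or [shift matrix image not reproduced] for (N, K) = (4, 2); [shift matrix image not reproduced] for (N, K) = (6, 3); [shift matrix image not reproduced] for (N, K) = (8, 4). The preset shift matrix includes: [shift matrix image not reproduced], where K ≠ 2, 3, 4. The preset shift matrix includes: [shift matrix image not reproduced], where N = M + K, K is a given number, and M corresponds to the number of synthesis matrices.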
In an embodiment, when implementing the step of shifting and adding N intermediate data packets according to a preset shift matrix to obtain N composite matrices, the processor 502 specifically implements the following steps:
determining the preset shift matrix according to the values of N and K to obtain a target shift matrix; intercepting the data of each row in the N intermediate data packets to obtain a matrix to be shifted; shifting the matrix to be shifted according to the data of the corresponding row in the target shift matrix to obtain shift data; and filling the vacant positions of the shift data with zeros and superimposing the shift data located in the same column to obtain the N synthesis matrices.
In an embodiment, after implementing the step of outputting the combination characteristic to the terminal for display at the terminal, the processor 502 further implements the following steps:
when decoding is carried out, a target shift matrix is obtained; element recovery is carried out on the combination characteristics according to the target shift matrix so as to obtain a decoding result; and feeding back the decoding result to the terminal so as to display the decoding result on the terminal.
It should be appreciated that, in embodiments of the present application, the processor 502 may be a central processing unit (CPU), and the processor 502 may also be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Those skilled in the art will appreciate that all or part of the flow in a method embodying the above described embodiments may be accomplished by computer programs instructing the relevant hardware. The computer program comprises program instructions, and the computer program can be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring, when encoding is performed to obtain a combination characteristic, a matrix and a vector to be encoded; dividing the matrix to be encoded into K equal parts in the vertical direction to obtain N data packets, where N is greater than or equal to K; performing a transposition operation on each of the N data packets to obtain N intermediate data packets; shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices; transposing the vector and then calculating the transposed vector with each of the N synthesis matrices to obtain the combination characteristic; and outputting the combination characteristic to a terminal for display at the terminal.
The preset shift matrix includes: [shift matrix image not reproduced] for (N, K) = (4, 2), or [shift matrix image not reproduced] for (N, K) = (4, 2); [shift matrix image not reproduced] for (N, K) = (6, 3); [shift matrix image not reproduced] for (N, K) = (8, 4). The preset shift matrix includes: [shift matrix image not reproduced], where K ≠ 2, 3, 4. The preset shift matrix includes: [shift matrix image not reproduced], where N = M + K, K is a given number, and M corresponds to the number of synthesis matrices.
In one embodiment, when the processor executes the computer program to perform the shifting and adding processing on the N intermediate data packets according to a preset shift matrix to obtain N composite matrices, the following steps are specifically implemented:
determining the preset shift matrix according to the values of N and K to obtain a target shift matrix; intercepting the data of each row in the N intermediate data packets to obtain a matrix to be shifted; shifting the matrix to be shifted according to the data of the corresponding row in the target shift matrix to obtain shift data; and filling the vacant positions of the shift data with zeros and superimposing the shift data located in the same column to obtain the N synthesis matrices.
In an embodiment, after executing the computer program to implement the step of outputting the combination characteristic to the terminal for display at the terminal, the processor further implements the following steps:
when decoding is carried out, a target shift matrix is obtained; element recovery is carried out on the combination characteristics according to the target shift matrix so as to obtain a decoding result; and feeding back the decoding result to the terminal so as to display the decoding result on the terminal.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be combined, divided and deleted according to actual needs. In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The integrated unit may be stored in a storage medium if it is implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention.
While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various equivalent changes and substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (7)

1. A coded distributed computing method, comprising:
acquiring, when encoding is performed to obtain a combination characteristic, a matrix and a vector to be encoded;
dividing the matrix to be encoded into K equal parts in the vertical direction to obtain N data packets, wherein N is greater than or equal to K;
performing a transposition operation on each of the N data packets to obtain N intermediate data packets;
shifting and adding the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices;
transposing the vector and then calculating the transposed vector with each of the N synthesis matrices to obtain the combination characteristic;
outputting the combination characteristic to a terminal for display at the terminal;
wherein the shifting and adding of the N intermediate data packets according to the preset shift matrix to obtain the N synthesis matrices comprises:
determining the preset shift matrix according to the values of N and K to obtain a target shift matrix;
intercepting the data of each row in the N intermediate data packets to obtain a matrix to be shifted;
shifting the matrix to be shifted according to the data of the corresponding row in the target shift matrix to obtain shift data;
filling the vacant positions of the shift data with zeros and superimposing the shift data located in the same column to obtain the N synthesis matrices;
the preset shift matrix includes: [shift matrix image not reproduced] for (N, K) = (4, 2), or [shift matrix image not reproduced] for (N, K) = (8, 4); K is the number of equal parts into which the matrix to be encoded is divided in the vertical direction; N is the number of data packets obtained after the matrix to be encoded is divided into K equal parts in the vertical direction; (N, K) = (4, 2) means that N is 4 and K is 2; (N, K) = (6, 3) means that N is 6 and K is 3; (N, K) = (8, 4) means that N is 8 and K is 4.
2. The coded distributed computing method according to claim 1, wherein the preset shift matrix includes: [shift matrix image not reproduced].
3. The coded distributed computing method according to claim 1, wherein the preset shift matrix includes: [shift matrix image not reproduced], where N = M + K, K is a given number, and M corresponds to the number of synthesis matrices.
4. The coded distributed computing method according to claim 1, wherein after outputting the combination characteristic to the terminal for display at the terminal, the method further comprises:
when decoding is carried out, a target shift matrix is obtained;
element recovery is carried out on the combination characteristics according to the target shift matrix so as to obtain a decoding result;
and feeding back the decoding result to the terminal so as to display the decoding result on the terminal.
5. A coded distributed computing device, comprising:
an acquisition unit, configured to acquire a matrix and a vector to be encoded when encoding is performed to obtain a combination characteristic;
an equally dividing unit, configured to divide the matrix to be encoded into K equal parts in the vertical direction to obtain N data packets, wherein N is greater than or equal to K;
a data packet transposition unit, configured to perform a transposition operation on each of the N data packets to obtain N intermediate data packets;
a shift addition unit, configured to shift and add the N intermediate data packets according to a preset shift matrix to obtain N synthesis matrices;
a feature forming unit, configured to transpose the vector and then calculate the transposed vector with each of the N synthesis matrices to obtain the combination characteristic;
a feature output unit, configured to output the combination characteristic to a terminal for display at the terminal;
wherein the shift addition unit comprises:
a shift matrix determining subunit, configured to determine the preset shift matrix according to the values of N and K to obtain a target shift matrix;
an intercepting subunit, configured to intercept the data of each row in the N intermediate data packets to obtain a matrix to be shifted;
a shift subunit, configured to shift the matrix to be shifted according to the data of the corresponding row in the target shift matrix to obtain shift data;
a superposition subunit, configured to fill the vacant positions of the shift data with zeros and superimpose the shift data located in the same column to obtain the N synthesis matrices;
the preset shift matrix includes: [shift matrix image not reproduced] for (N, K) = (4, 2), or [shift matrix image not reproduced] for (N, K) = (8, 4); K is the number of equal parts into which the matrix to be encoded is divided in the vertical direction; N is the number of data packets obtained after the matrix to be encoded is divided into K equal parts in the vertical direction; (N, K) = (4, 2) means that N is 4 and K is 2; (N, K) = (6, 3) means that N is 6 and K is 3; (N, K) = (8, 4) means that N is 8 and K is 4.
6. A computer device, characterized in that it comprises a memory on which a computer program is stored and a processor which, when executing the computer program, implements the method according to any of claims 1-4.
7. A storage medium storing a computer program which, when executed by a processor, performs the method of any one of claims 1 to 4.
CN201911201661.XA 2019-11-29 2019-11-29 Coding distributed computing method, device, computer equipment and storage medium Active CN110990776B (en)

Publications (2)

Publication Number Publication Date
CN110990776A CN110990776A (en) 2020-04-10
CN110990776B true CN110990776B (en) 2023-12-29

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105703782A (en) * 2016-03-11 2016-06-22 深圳大学 Incremental shift matrix construction method, network coding method and system
CN105915233A (en) * 2016-03-28 2016-08-31 联想(北京)有限公司 Encoding method and apparatus, and decoding method and apparatus
CN105955839A (en) * 2016-05-09 2016-09-21 东南大学 Finite field binary addition and shift-based regeneration code fault tolerance method
CN109067446A (en) * 2018-10-24 2018-12-21 北京科技大学 A kind of mixing method for precoding of the extensive antenna of multi-antenna multi-user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant