CN109255429A - Parameter decompression method for sparse neural network model - Google Patents


Info

Publication number
CN109255429A
CN109255429A (application CN201810845949.XA)
Authority
CN
China
Prior art keywords
matrix
weight
boundary
mark
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810845949.XA
Other languages
Chinese (zh)
Other versions
CN109255429B (en)
Inventor
刘必慰
陈胜刚
彭瑾
刘畅
郭阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201810845949.XA priority Critical patent/CN109255429B/en
Publication of CN109255429A publication Critical patent/CN109255429A/en
Application granted granted Critical
Publication of CN109255429B publication Critical patent/CN109255429B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks

Abstract

The invention discloses a parameter decompression method for a sparse neural network model, comprising the following steps. S1: store the required sparse matrix at a specified location; for each non-zero element, store a relative index and a quantized weight value, and if the number of zeros between two non-zero elements exceeds a preset threshold, store an explicit zero value. S2: read the stored data of the matrix to be decompressed, extract the relative indexes and quantized weight values, restore the relative indexes to absolute indexes, determine from the restored absolute indexes the positions of the non-zero and zero elements in the dense matrix together with their quantized weight values, rebuild the weight vector table from the non-zero positions, and restore the weight values in the table to complete the decompression of the dense matrix. The method is simple to implement, achieves high decompression efficiency and resource utilization, and is widely and flexibly applicable.

Description

Parameter decompression method for a sparse neural network model
Technical field
The present invention relates to the field of neural network technology, and in particular to a parameter decompression method for a sparse neural network model.
Background art
With the rapid development of deep learning, models such as DNNs and RNNs are widely used. These models overcome the limitations of traditional techniques and have achieved remarkable results in many fields such as speech recognition and image recognition. Whether a DNN or an RNN, a neural network always contains fully connected layers, whose computation is the multiplication of a weight matrix by a corresponding vector. When training a neural network, a suitable compression algorithm must be applied to the model, mainly to compress the fully connected layers, i.e., to compress the weight-matrix parameters; without affecting accuracy, this can greatly improve the inference speed of the network. After training, the compressed parameters are stored in memory and decompressed back into weight matrices when needed.
A typical model compression algorithm, shown in Fig. 1, consists mainly of pruning, quantized training, and variable-length coding. After the initial training stage, pruning removes connections whose weights fall below a threshold, turning dense layers into sparse ones; this first stage learns the network topology, keeping the important connections and removing the unimportant ones. Quantized training is a weight-sharing process in which multiple connections share the same weight value. Pruning and quantized training do not interfere with each other and together yield a very high compression ratio, reducing the storage requirement. After training, the trained parameters must be decompressed into exact weight matrices without loss of precision.
At present there are many hardware architectures that can operate directly on compressed model parameters, but they require complex architectural designs that are difficult and costly to implement. How to decompress sparse neural network model parameters correctly without resorting to a complex hardware architecture is a problem to be solved. Achieving correct decompression raises several questions: how the compressed parameters should be organized and stored, and how the stored parameters can be decompressed into a complete dense matrix while preserving accuracy. Moreover, compression usually processes multiple matrices at once, so it must also be possible to decompress multiple matrices in one pass.
Summary of the invention
In view of the technical problems of the prior art, the present invention provides a parameter decompression method for sparse neural network models that is simple to implement, achieves high decompression efficiency and resource utilization, and is widely and flexibly applicable.
To solve the above technical problems, the technical solution proposed by the present invention is as follows:
A parameter decompression method for a sparse neural network model, characterized in that it comprises the following steps:
S1. Compressed parameter storage: store the required sparse matrix at a specified location. When storing the non-zero elements of the matrix, store for each non-zero element its corresponding relative index and quantized weight value; the relative index records the number of zeros between two non-zero elements, and if that number exceeds a preset threshold, an explicit zero value is stored.
S2. Decompression: read the data stored for the matrix to be decompressed according to step S1, extract the relative indexes and quantized weight values, and restore the relative indexes to absolute indexes in one-to-one correspondence with element positions. From the restored absolute indexes, determine the positions of the non-zero and zero elements in the dense matrix and the quantized weight values at those positions, rebuild the weight vector table from the non-zero positions, and restore the weight values in the table to complete the decompression of the dense matrix.
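The S1 storage scheme above can be sketched in a few lines of Python. This is a hypothetical illustration, not the patented hardware flow: the function name and `quantize` callback are assumptions, while the threshold of 15 is the value given in the embodiment.

```python
THRESHOLD = 15  # the embodiment caps the relative index at 15 (4 bits)

def compress_row(row, quantize):
    """Encode one matrix row as (relative_index, quantized_value) pairs."""
    pairs = []
    zeros = 0
    for v in row:
        if v == 0:
            zeros += 1
            if zeros > THRESHOLD:
                # gap too long for the index field: store an explicit zero
                pairs.append((THRESHOLD, quantize(0)))
                zeros = 0
            continue
        pairs.append((zeros, quantize(v)))
        zeros = 0
    return pairs
```

For example, a row beginning with 20 zeros is encoded as an explicit zero with relative index 15 (covering 16 positions) followed by the next non-zero element with relative index 4.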
As a further improvement of the present invention, step S2 includes an instruction-decoding and data-acquisition step S21, specifically: receive and decode a decompression instruction; from the decoded information determine the decompression length of the matrix, the source address, and the destination address at which the decompressed weight matrix is stored; fetch the stored data of the matrix to be decompressed from the source address and extract the relative indexes.
As a further improvement of the present invention, step S2 includes an index-recovery step S22, specifically: restore the relative indexes to absolute indexes by accumulation.
As a further improvement of the present invention, step S2 includes a step S23 of rebuilding the weight quantization table, specifically: determine the position of each non-zero element from the restored absolute indexes and rebuild the weight quantization table from those positions; in the table, the positions corresponding to non-zero elements hold valid weight values.
As a further improvement of the present invention, step S2 includes an inverse-quantization step S24, specifically: restore the valid weight values in the weight quantization table to obtain the complete dense matrix.
As a further improvement of the present invention, in step S24 the valid weight values in the weight quantization table are restored by means of a lookup table.
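Steps S22 through S24 can likewise be sketched as a minimal Python model under assumed names; the actual method targets vector memories and hardware signals rather than Python lists, and the `codebook` stands in for the shared-weight lookup table.

```python
def decompress(pairs, length, codebook):
    """pairs: (relative_index, quantized_value); codebook: shared weights."""
    # S22: relative -> absolute indexes by accumulation
    abs_idx, pos = [], -1
    for rel, _ in pairs:
        pos += rel + 1            # skip `rel` zeros, land on the element
        abs_idx.append(pos)
    # S23: rebuild the weight quantization table (0 where nothing was stored)
    qtable = [0] * length
    for i, (_, q) in zip(abs_idx, pairs):
        qtable[i] = q
    # S24: inverse quantization by table lookup
    return [codebook[q] for q in qtable]
```

The entry at codebook index 0 maps back to weight 0, so positions with no stored element decode to zero, as required for the dense matrix.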
As a further improvement of the present invention, multiple required matrices are stored in a cross-boundary manner, and boundary markers are configured to identify the state of cross-row storage. The boundary markers comprise a first marker identifying that the vector contains no matrix boundary; a second marker identifying that the vector contains a matrix boundary not at the end of the vector; a third marker identifying that the vector contains a matrix boundary at the end of the vector; and a fourth marker identifying that the vector contains a matrix boundary and that the data after the boundary is discarded.
As a further improvement of the present invention, when decompressing multiple matrices, the specific steps in step S2 of restoring the relative indexes to absolute indexes in one-to-one correspondence with element positions are:
convert all stored relative indexes into absolute indexes;
examine each boundary marker as it is read: if it is the second or the fourth marker, determine the matrix boundary from the absolute indexes by XOR-ing the high two bits of adjacent absolute indexes; if the result is 1, the former position is the boundary position of the first matrix and the latter position is the starting position of the next matrix, yielding the matrix boundary;
generate an index-valid signal in one-to-one correspondence with each relative index and a signal indicating whether the matrix has ended, and set the index-valid signals and the matrix-end signal according to the boundary marker;
output the index-valid signals, the absolute indexes, and the matrix-end signal.
As a further improvement of the present invention, the specific steps of setting the index-valid signals and the matrix-end signal according to the boundary marker are:
when the boundary marker is the first marker, determine that there is no matrix boundary and the matrix has not ended; set every index-valid signal to 1 and the matrix-end signal to 0;
when the boundary marker is the third marker, determine that the matrix boundary is at the end of the vector; set every index-valid signal to 1 and also set the matrix-end signal to 1;
when the boundary marker is the second marker, determine that the matrix boundary lies in the middle of the vector and process the determined boundary in two beats: in the first beat, set the index-valid signals before the boundary to 1 and the matrix-end signal to 1; in the second beat, set the index-valid signals after the boundary to 1 and the matrix-end signal to 0;
when the boundary marker is the fourth marker, determine that the last row of the vector has been reached; according to the determined boundary of the last matrix, set the index-valid signals within the boundary to 1 and discard the unneeded data.
As a further improvement of the present invention, the specific steps in step S2 of restoring the weight values in the weight vector table are:
assign each quantized weight value to its corresponding position according to the absolute indexes and the index-valid signals;
judge whether the matrix boundary has been reached: if not, store the quantized weight value at the corresponding position; if the boundary has been reached, output the weight quantization table of the matrix.
Compared with the prior art, the advantages of the present invention are as follows:
1. By storing the compressed parameters as a relative index and a quantized weight value per non-zero element, the present invention reduces the memory needed to store the parameters; by restoring the relative indexes to absolute indexes and rebuilding the weight vector table from them, the correct weight matrix is finally decompressed. After model compression is complete, the compressed model can be decompressed and the weight matrix rebuilt without dedicated hardware design; decompression is fast, efficient, and flexible.
2. The present invention supports both sparse matrices and weight sharing in model compression, handling the decompression of RNN-type as well as DNN-type model compression algorithms. It offers considerable flexibility and scalability, is simple and flexible to implement, and imposes no requirement on the dimensions of the decompressed weight matrix.
3. By storing multiple compressed matrices contiguously with boundary markers between them, the present invention needs only the corresponding decompression instruction when multiple matrices are to be decompressed; the boundary markers determine the boundaries between matrices, so multiple weight matrices can be decompressed correctly in one pass, effectively improving decompression efficiency and speed. Each spv row needs only 2 matrix-boundary marker bits, so little auxiliary data is added, saving storage space and memory bandwidth; storing the compressed matrices contiguously saves further space and bandwidth. The maximum number of matrices that can be decompressed is determined by the number of parameters stored in spv and the decoded length.
Detailed description of the invention
Fig. 1 is a schematic diagram of the principle of a typical model compression algorithm.
Fig. 2 is a schematic diagram of the principle by which this embodiment decompresses the parameters of a sparse neural network model.
Fig. 3 is a schematic flowchart of the parameter decompression method for a sparse neural network model in this embodiment.
Fig. 4 is a schematic diagram of how this embodiment stores the non-zero elements of a compressed matrix in SPV.
Fig. 5 is a schematic diagram of the cross-boundary storage of three matrices to be decompressed in spv, in a concrete application embodiment of the present invention.
Fig. 6 is a schematic flowchart of restoring absolute indexes when this embodiment decompresses multiple matrices.
Fig. 7 is a schematic flowchart of rebuilding the weight quantization table when this embodiment decompresses multiple matrices.
Fig. 8 is a schematic flowchart of inverse quantization in this embodiment.
Specific embodiment
The invention is further described below with reference to the accompanying drawings and specific preferred embodiments, without thereby limiting the scope of the invention.
As shown in Figs. 2 and 3, the parameter decompression method of this embodiment for a sparse neural network model comprises the following steps:
S1. Compressed parameter storage: store the required sparse matrix at a specified location. When storing the non-zero elements of the matrix, store for each non-zero element its corresponding relative index and quantized weight value; the relative index records the number of zeros between two non-zero elements, and if that number exceeds a preset threshold, an explicit zero value is stored.
S2. Decompression: read the data stored for the matrix to be decompressed according to step S1, extract the relative indexes and quantized weight values, and restore the relative indexes to absolute indexes in one-to-one correspondence with element positions. From the restored absolute indexes, determine the positions of the non-zero and zero elements in the dense matrix and the quantized weight values at those positions, rebuild the weight vector table from the non-zero positions, and restore the weight values in the table to complete the decompression of the dense matrix; the dense matrix is the weight matrix of the neural network.
With the above method, after model compression is complete, this embodiment can decompress the compressed model and rebuild the weight matrix without dedicated hardware design. Storing the compressed parameters as a relative index and a quantized weight value per non-zero element reduces the memory needed to store the parameters; restoring the relative indexes to absolute indexes and rebuilding the weight vector table from them finally yields the correct decompressed weight matrix.
Specifically, this embodiment stores the compressed parameters in the vector memory spv, whose width is 32 words; the decompressed weight matrix is stored in the memory spm, whose width is 1024 words. Reusing the vector memory spv to store the compressed matrix keeps hardware resource utilization high. The decompression method of this embodiment supports two compression techniques: sparse matrices and weight sharing. A non-zero element of a sparse matrix is stored as one relative index and one quantized weight value, and if the number of zeros between two non-zero elements exceeds a preset threshold (specifically 15), an explicit zero value is stored; the relative index thus records the number of zeros preceding a non-zero element. As for the quantized value, weight sharing in the compression algorithm reduces the quantized weight from 16 bits to 5 bits, so only 9 bits are needed to store one non-zero element. The compressed parameters are stored in the vector memory spv in the manner described above; one spv address can store 56 non-zero elements.
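The 9-bit element layout described here might be modeled as follows. This is only a sketch: the bit ordering (quantized weight in the low 5 bits, relative index in the high 4 bits) follows the Fig. 4 description, while the function names are assumptions.

```python
WEIGHT_BITS, INDEX_BITS = 5, 4   # field widths given in the text

def pack(rel_index, qweight):
    """Pack one non-zero element into 9 bits: index high, weight low."""
    assert 0 <= rel_index < (1 << INDEX_BITS)
    assert 0 <= qweight < (1 << WEIGHT_BITS)
    return (rel_index << WEIGHT_BITS) | qweight

def unpack(entry):
    """Split a 9-bit entry back into (relative_index, quantized_weight)."""
    return entry >> WEIGHT_BITS, entry & ((1 << WEIGHT_BITS) - 1)
```

At 9 bits per element, 56 elements occupy 504 bits of a row, leaving room for the two boundary-marker bits.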
The decompression method of this embodiment supports both sparse matrices and weight sharing in model compression, handling the decompression of RNN-type as well as DNN-type model compression algorithms. It offers considerable flexibility and scalability, is simple and flexible to implement, and imposes no requirement on the dimensions of the decompressed weight matrix.
As shown in Fig. 3, when this embodiment performs decompression, a decompression instruction is sent; upon receiving it, the parameters stored in spv undergo decompression processing comprising index recovery, weight-quantization-table determination, and inverse quantization, and the decompressed weight matrix is stored in the memory spm.
Step S2 of this embodiment includes an instruction-decoding and data-acquisition step S21, specifically: receive and decode a decompression instruction; from the decoded information determine the length of the matrix to decompress, the source address, and the destination address at which the decompressed weight matrix is stored; fetch the stored data of the matrix to be decompressed from the source address and extract the relative indexes and quantized weight values.
Step S2 of this embodiment includes a step S22 of restoring the relative indexes to absolute indexes; step S22 does so by accumulation.
Step S2 of this embodiment includes a step S23 of rebuilding the weight vector table, specifically: determine the position of each non-zero element from the restored absolute indexes and rebuild the weight quantization table from those positions; in the table, the positions corresponding to non-zero elements hold valid weight values. Concretely, for each position in the weight quantization table, if a corresponding absolute index exists, the position corresponds to a non-zero element and is assigned the corresponding quantized weight value; otherwise the quantized value at that position is set to 0.
Step S2 of this embodiment includes an inverse-quantization step S24, specifically: restore the valid weight values in the weight quantization table to obtain the complete dense matrix; the valid weight values are restored by means of a lookup table.
The detailed decompression procedure of this embodiment is as follows:
S21. Instruction decoding: different decompression instructions are sent according to the matrices to be decompressed; a decompression instruction contains the spv source address, the spm destination address, and the length len of the spv vector to decompress. Upon receipt, the instruction is decoded to determine the length of the vector to decompress, the source address, and the destination address for storing the weight matrix. The stored parameters are fetched from the memory spv at the source address, and the relative indexes and quantized weight values are extracted;
S22. Index recovery: the relative indexes are restored to absolute indexes by a running sum; the positions of the zero and non-zero elements in the dense matrix are determined from the absolute indexes, and the absolute indexes and quantized weight values are output;
S23. Rebuilding the weight vector table: the position of each non-zero element in the weight quantization table is determined from its absolute index, and the quantized weight value is placed at the non-zero position established by that index; if a corresponding absolute index exists, the position is assigned the corresponding quantized weight value, otherwise the quantized value at that position is 0. This rebuilds the weight quantization table of the matrix;
S24. Inverse quantization: the valid weight values corresponding to the weight quantization table are restored by table lookup, yielding a complete dense matrix.
Each time a complete dense matrix has been decompressed by the above method, it is stored into spm at the destination address obtained from instruction decoding.
In this embodiment, multiple required matrices are stored in a cross-boundary manner, and boundary markers are configured to identify the state of cross-row storage; different boundary markers represent different cross-row storage situations. The boundary markers comprise a first marker identifying that the vector contains no matrix boundary; a second marker identifying that the vector contains a matrix boundary not at the end of the vector; a third marker identifying that the vector contains a matrix boundary at the end of the vector; and a fourth marker identifying that the vector contains a matrix boundary and that the data after the boundary is discarded. One spv address can store 56 non-zero elements, and the elements of a compressed weight matrix usually cannot all fit in one row, so cross-row storage in spv can occur.
Model compression never compresses just one weight matrix at a time. This embodiment stores multiple compressed matrices contiguously, with boundary markers set between matrices; to decompress multiple matrices, only the corresponding decompression instruction need be sent, each matrix is decompressed by the above method, and the boundary markers determine the boundaries between matrices, so multiple weight matrices can be decompressed correctly in one pass, effectively improving decompression efficiency and speed. Each spv row needs only 2 matrix-boundary marker bits, so little auxiliary data is added, saving storage space and memory bandwidth; contiguous storage of the compressed matrices saves further space and bandwidth. The maximum number of matrices that can be decompressed is determined by the number of parameters stored in spv and the decoded length. The compressed parameters can also be stored in spv across matrix boundaries in a fixed layout, with no need to distinguish boundaries at storage time, further saving a large amount of space; only the corresponding instruction need be sent to fetch the parameters at the given spv addresses and obtain the decompressed weight matrices.
This embodiment stores the non-zero elements of the compressed matrix in spv as shown in Fig. 4: the low bits hold the quantized weight of the compressed non-zero element (5 bits), the high bits hold the relative index (4 bits), and the two highest bits of the row hold the boundary marker. One spv address can store 56 non-zero elements; since the elements of a compressed weight matrix usually cannot all fit in one row, cross-row storage can occur. The index of the 29th element in spv is stored as an absolute index, so the recovery of the two segments 0-28 and 30-56 can proceed in parallel.
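The parallelism enabled by the mid-row absolute index can be illustrated with a simplified sketch (sequential Python with assumed names; in hardware the two prefix sums run concurrently, since neither half depends on the other):

```python
def recover_indexes(rels, anchor_pos):
    """rels: per-element index fields; the field at anchor_pos is absolute."""
    anchor_abs = rels[anchor_pos]     # this field is already an absolute index

    def prefix(segment, start):
        out, pos = [], start
        for r in segment:
            pos += r + 1              # skip r zeros, land on the element
            out.append(pos)
        return out

    # the two halves carry no dependency on each other, so in hardware
    # their running sums can be computed in parallel
    first = prefix(rels[:anchor_pos], -1)
    second = prefix(rels[anchor_pos + 1:], anchor_abs)
    return first + [anchor_abs] + second
```

In the embodiment `anchor_pos` would be 29; a shorter row is used below only to keep the illustration small.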
As shown in Fig. 5, in a concrete application embodiment of the present invention, three matrices to be decompressed are stored across boundaries in spv; different boundary markers represent different cross-row storage situations. The first marker, 00, means the vector contains no matrix boundary; the second marker, 01, means the vector contains a matrix boundary not at its end; the third marker, 10, means the vector contains a matrix boundary at its end; the fourth marker, 11, means the vector contains a matrix boundary and the data after the boundary is discarded.
As shown in Fig. 6, when this embodiment decompresses multiple matrices, the specific steps in step S2 of restoring the relative indexes to absolute indexes in one-to-one correspondence with element positions are:
1. Convert all stored relative indexes (specifically 56 of them) into absolute indexes by accumulation, i.e., sum them item by item; this embodiment provides one extra absolute index to handle the matrix boundary.
2. Generate the weight quantization table.
3. Configure a boundary marker between the matrices in spv. When reading the parameters from spv, examine the boundary marker: if it is the second marker (01) or the fourth marker (11), go to step 4; otherwise go to step 5.
4. Determine the matrix boundary from the absolute indexes: XOR the high two bits of adjacent absolute indexes; if the result is 1, the former position is the boundary position of the first matrix and the latter position is the starting position of the next matrix, yielding the matrix boundary.
5. Generate index-valid signals (specifically 56 of them) in one-to-one correspondence with the relative indexes and a signal indicating whether the matrix has ended, and set the index-valid signals and the matrix-end signal according to the boundary marker.
6. Output the index-valid signals, the absolute indexes, the weight vector table, and the matrix-end signal.
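The XOR boundary test of step 4 can be sketched as follows. This is an assumption-laden illustration: the index width is hypothetical, and the test relies on absolute indexes growing monotonically within a matrix and dropping back toward zero when a new matrix starts, which flips the top bits.

```python
IDX_BITS = 10   # assumed width of an absolute-index register

def top_two(x):
    """High two bits of an absolute index."""
    return (x >> (IDX_BITS - 2)) & 0b11

def find_boundary(abs_indexes):
    """Return the position whose successor starts a new matrix, or None."""
    for i in range(len(abs_indexes) - 1):
        if top_two(abs_indexes[i]) ^ top_two(abs_indexes[i + 1]):
            return i              # i ends one matrix; i+1 starts the next
    return None
```

Note the heuristic only fires when the index reset actually changes the top bits, consistent with the text's condition that the XOR result be nonzero.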
In step 5 above, the specific steps of setting the index-valid signals and the matrix-end signal according to the boundary marker are:
5.1) When the boundary marker is the first marker (00), determine that there is no matrix boundary and the matrix has not ended; set all 56 index-valid signals to 1 and the matrix-end signal to 0;
5.2) When the boundary marker is the third marker (10), determine that the matrix boundary is at the end of the vector; set all 56 index-valid signals to 1 and also set the matrix-end signal to 1;
5.3) When the boundary marker is the second marker (01), determine that the matrix boundary lies in the middle of the vector and process the determined boundary in two beats: in the first beat, set the index-valid signals before the boundary to 1 and the matrix-end signal to 1; in the second beat, set the index-valid signals after the boundary to 1 and the matrix-end signal to 0;
5.4) When the boundary marker is the fourth marker (11), determine that the last row of the vector has been reached; according to the determined boundary of the last matrix, set the index-valid signals within the boundary to 1 and discard the unneeded data.
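Steps 5.1 through 5.4 can be modeled as a small dispatch on the two-bit marker. This is a behavioral sketch with assumed names; the real design drives 56 hardware signals per beat rather than returning Python lists.

```python
NO_BOUNDARY, MID_BOUNDARY, END_BOUNDARY, LAST_ROW = 0b00, 0b01, 0b10, 0b11
ROW_LEN = 56    # non-zero elements per spv row

def row_signals(marker, boundary_pos=None):
    """Yield (index_valid_mask, matrix_end) beats for one spv row.
    boundary_pos: position of the last element of the ending matrix."""
    if marker == NO_BOUNDARY:                       # step 5.1
        yield [1] * ROW_LEN, 0
    elif marker == END_BOUNDARY:                    # step 5.2 (marker 10)
        yield [1] * ROW_LEN, 1
    elif marker == MID_BOUNDARY:                    # step 5.3: two beats
        yield [int(i <= boundary_pos) for i in range(ROW_LEN)], 1
        yield [int(i > boundary_pos) for i in range(ROW_LEN)], 0
    else:                                           # step 5.4: drop the tail
        yield [int(i <= boundary_pos) for i in range(ROW_LEN)], 1
```

For a mid-row boundary the first beat validates the elements of the ending matrix and asserts matrix-end; the second beat validates the remainder without asserting it.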
As shown in Fig. 7, when this embodiment decompresses multiple matrices, the specific steps in step S2 of restoring the weight values in the weight vector table are:
1. Assign each quantized weight value to its corresponding position according to the absolute indexes and the index-valid signals;
2. Judge whether the matrix boundary has been reached: if not, store the quantized weight value at the corresponding position; if the boundary has been reached, output the weight quantization table of the matrix.
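The Fig. 7 flow might be modeled as follows, under assumed signal names; boundary positions are taken as given here, whereas the method derives them from the boundary markers.

```python
def rebuild_tables(entries, boundary_set, length):
    """entries: (abs_index, qval, valid) per element, in stream order;
    boundary_set: stream positions at which a matrix ends."""
    tables, current = [], [0] * length
    for n, (i, q, valid) in enumerate(entries):
        if valid:
            current[i] = q        # deposit the value at its position
        if n in boundary_set:     # boundary reached: emit the finished table
            tables.append(current)
            current = [0] * length
    return tables
```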
As shown in Fig. 8, when this embodiment performs inverse quantization, the valid weight corresponding to each quantized weight value is restored by table lookup: judge whether the corresponding position in the matrix has a weight value; if so, restore the weight; if not, assign 0 to the position. A complete dense matrix is finally obtained.
With the above decompression method, the output of a neural network model compression algorithm can be decompressed. Organizing the compressed data for storage in the manner described above reduces the memory needed to store the parameters; multiple matrices can be decompressed at once; weight sharing and sparse matrices are supported; and decompression is fast, efficient, and flexible.
The above are merely preferred embodiments of the present invention and do not limit the present invention in any form. Although the present invention has been disclosed above by way of preferred embodiments, it is not thereby limited. Any simple modifications, equivalent changes, and variations made to the above embodiments in accordance with the technical essence of the present invention, without departing from the content of the technical solution of the present invention, shall fall within the scope of protection of the technical solution of the present invention.

Claims (10)

1. A parameter decompression method for a sparse neural network model, characterized in that the steps include:
S1. compression parameter storage: store the required sparse matrix to a designated location, wherein when the non-zero elements in the matrix are stored, the relative index and the weight quantized value corresponding to each non-zero element are stored, the relative index identifying the number of zeros between two non-zero elements; if the number of zeros between two non-zero elements is greater than a preset threshold, a zero is stored;
S2. decompression: obtain the stored data of the matrix to be decompressed according to step S1 and extract the relative indices and weight quantized values therein; restore the relative indices to absolute indices in one-to-one correspondence with element positions; determine, according to the restored absolute indices, the positions of the non-zero and zero elements in the dense matrix and the weight quantized values at the determined positions; rebuild the weight vector table according to the positions of the non-zero elements; and complete the decompression of the dense matrix after the weight values in the weight vector table are recovered.
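The storage scheme of step S1 can be sketched as follows. This is a minimal illustration under one plausible reading of the claim: each relative index counts the zeros preceding a stored element, and a zero run longer than the preset threshold is broken up by storing an explicit zero as a placeholder element. The function name and run-encoding convention are assumptions, not the patent's exact encoding.

```python
# Sketch of step S1: run-length relative indexing with a zero-run threshold.
# A run of zeros longer than `threshold` cannot be encoded in one relative
# index, so a placeholder zero element is stored to split the run.

def compress(dense, threshold):
    """Return (relative_indices, values) for the stored elements of `dense`."""
    rel, vals = [], []
    run = 0                              # zeros seen since the last stored element
    for x in dense:
        if x != 0:
            rel.append(run)              # zeros before this non-zero element
            vals.append(x)
            run = 0
        else:
            run += 1
            if run > threshold:          # run too long: store an explicit zero
                rel.append(threshold)
                vals.append(0)
                run = 0
    return rel, vals
```

For example, with a threshold of 3, the row `[1, 0, 0, 0, 0, 2]` is stored as relative indices `[0, 3, 0]` and values `[1, 0, 2]`: the four-zero run is split by a placeholder zero after three skipped zeros.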
2. The parameter decompression method for a sparse neural network model according to claim 1, characterized in that step S2 includes an instruction decoding and data acquisition step S21, whose specific steps are: receive and decode a decompression instruction; determine, according to the decoded information, the length of the matrix to be decompressed, the source address, and the destination address for storing the decompressed weight matrix; fetch the stored data of the matrix to be decompressed according to the obtained source address; and extract the relative indices therein.
3. The parameter decompression method for a sparse neural network model according to claim 2, characterized in that step S2 includes an index recovery step S22, whose specific steps are: restore the relative indices to absolute indices by accumulation.
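The accumulation of step S22 can be sketched as follows, under the same assumed convention as above (each relative index counts the zeros before its element, so the absolute position is a running sum of relative index plus one, zero-based):

```python
# Sketch of step S22: recover absolute positions from relative indices by
# accumulation. Convention (an assumption): relative index r means "skip r
# zeros, then the element".

def to_absolute(relative_indices):
    absolute, pos = [], -1
    for r in relative_indices:
        pos += r + 1          # skip r zeros, then land on the stored element
        absolute.append(pos)
    return absolute
```

Continuing the earlier example, relative indices `[0, 3, 0]` accumulate to absolute positions `[0, 4, 5]`.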
4. The parameter decompression method for a sparse neural network model according to claim 3, characterized in that step S2 includes a weight quantization table rebuilding step S23, whose specific steps are: determine the position of each non-zero element according to the restored absolute indices and rebuild the weight quantization table from the positions of the non-zero elements, the positions corresponding to non-zero elements in the weight quantization table holding the effective weight values.
5. The parameter decompression method for a sparse neural network model according to claim 4, characterized in that step S2 includes an inverse quantization step S24, whose specific steps are: recover the effective weight values in the weight quantization table to obtain a complete dense matrix.
6. The parameter decompression method for a sparse neural network model according to claim 5, characterized in that in step S24 the effective weight values in the weight quantization table are recovered specifically by table lookup.
7. The parameter decompression method for a sparse neural network model according to any one of claims 1 to 6, characterized by further including storing the multiple required matrices in a cross-boundary manner and configuring a boundary mark to identify the state of the cross-line storage, the boundary mark comprising: a first mark indicating that there is no matrix boundary in the mark vector; a second mark indicating that there is a matrix boundary in the mark vector but not at the end of the vector; a third mark indicating that there is a matrix boundary in the mark vector at the end of the vector; and a fourth mark indicating that there is a matrix boundary in the mark vector and that the data after the boundary are discarded.
8. The parameter decompression method for a sparse neural network model according to claim 7, characterized in that when multiple matrices are decompressed, the specific steps in step S2 of restoring the relative indices to absolute indices in one-to-one correspondence with element positions are:
convert all stored relative indices into absolute indices;
judge upon obtaining the boundary mark: if it is the second mark or the fourth mark, determine the matrix boundary according to the absolute indices by XOR-ing the upper two bits of adjacent absolute indices; if the result is 1, determine that the former position is the boundary position of the first matrix and the latter position is the starting position of the next matrix, thereby obtaining the matrix boundary;
generate an index valid signal in one-to-one correspondence with each relative index and a signal indicating whether the matrix ends, and set the index valid signal and the matrix-end signal according to the boundary mark;
output the index valid signals, the absolute indices, and the signal indicating whether the matrix ends.
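The boundary test of claim 8 can be sketched roughly as follows, under the assumption that absolute indices restart within each matrix, so the upper bits of adjacent indices differ (wrap around) exactly at a matrix boundary. The bit width and function names are illustrative, not from the patent.

```python
# Rough sketch of the claim-8 boundary test: XOR the upper two bits of
# adjacent absolute indices; a nonzero result marks a matrix boundary.
# INDEX_BITS is an assumed index width for illustration.

INDEX_BITS = 8

def high_two_bits(idx):
    return (idx >> (INDEX_BITS - 2)) & 0b11

def find_boundary(absolute_indices):
    """Return the position that ends the first matrix, or None if no boundary."""
    for i in range(len(absolute_indices) - 1):
        # A nonzero XOR means the index wrapped: element i ends one matrix
        # and element i+1 starts the next one.
        if high_two_bits(absolute_indices[i]) ^ high_two_bits(absolute_indices[i + 1]):
            return i
    return None
```

For instance, in the index stream `[250, 253, 2, 5]` the upper bits drop from `0b11` to `0b00` between positions 1 and 2, so position 1 is detected as the boundary of the first matrix and position 2 as the start of the next.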
9. The parameter decompression method for a sparse neural network model according to claim 8, characterized in that the specific steps of setting the index valid signal and the matrix-end signal according to the boundary mark are:
when the boundary mark is the first mark, determine that there is no matrix boundary and the matrix has not ended, assign 1 to each index valid signal, and set the matrix-end signal to 0;
when the boundary mark is the third mark, determine that the current boundary is at the end of the vector, assign 1 to each index valid signal, and also set the matrix-end signal to 1;
when the boundary mark is the second mark, determine that the current boundary is in the middle of the vector and process in two beats according to the determined matrix boundary: in the first beat, set the index valid signals before the matrix boundary to 1 and set the matrix-end signal to 1; in the second beat, set the index valid signals after the matrix boundary to 1 and set the matrix-end signal to 0;
when the boundary mark is the fourth mark, determine that the last line of the vector has been reached, set the index valid signals within the determined boundary of the last matrix to 1, and discard the unwanted data.
10. The parameter decompression method for a sparse neural network model according to claim 9, characterized in that the specific steps in step S2 of recovering the weight values in the weight vector table are:
assign the weight quantized values to the corresponding positions according to the absolute indices and the index valid signals;
judge whether the matrix boundary has been reached: if the matrix has not reached its boundary, store the weight quantized value at the corresponding position; if the matrix has reached its boundary, output the weight quantization table of the matrix.
CN201810845949.XA 2018-07-27 2018-07-27 Parameter decompression method for sparse neural network model Active CN109255429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810845949.XA CN109255429B (en) 2018-07-27 2018-07-27 Parameter decompression method for sparse neural network model


Publications (2)

Publication Number Publication Date
CN109255429A true CN109255429A (en) 2019-01-22
CN109255429B CN109255429B (en) 2020-11-20

Family

ID=65049925


Country Status (1)

Country Link
CN (1) CN109255429B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007323401A (en) * 2006-06-01 2007-12-13 Kagawa Univ Data processor, data restoration device, data processing method and data restoration method
CN105260776A (en) * 2015-09-10 2016-01-20 华为技术有限公司 Neural network processor and convolutional neural network processor
CN107229967A (en) * 2016-08-22 2017-10-03 北京深鉴智能科技有限公司 A kind of hardware accelerator and method that rarefaction GRU neutral nets are realized based on FPGA
CN107944555A (en) * 2017-12-07 2018-04-20 广州华多网络科技有限公司 Method, storage device and the terminal that neutral net is compressed and accelerated
CN108111863A (en) * 2017-12-22 2018-06-01 洛阳中科信息产业研究院(中科院计算技术研究所洛阳分所) A kind of online real-time three-dimensional model video coding-decoding method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SONG HAN 等: "DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS WITH PRUNING, TRAINED QUANTIZATION AND HUFFMAN CODING", 《ARXIV:1510.00149V5 [CS.CV]》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10872186B2 (en) 2017-01-04 2020-12-22 Stmicroelectronics S.R.L. Tool to create a reconfigurable interconnect framework
US11227086B2 (en) 2017-01-04 2022-01-18 Stmicroelectronics S.R.L. Reconfigurable interconnect
US11562115B2 (en) 2017-01-04 2023-01-24 Stmicroelectronics S.R.L. Configurable accelerator framework including a stream switch having a plurality of unidirectional stream links
US11675943B2 (en) 2017-01-04 2023-06-13 Stmicroelectronics S.R.L. Tool to create a reconfigurable interconnect framework
CN110766136A (en) * 2019-10-16 2020-02-07 北京航空航天大学 Compression method of sparse matrix and vector
CN110766136B (en) * 2019-10-16 2022-09-09 北京航空航天大学 Compression method of sparse matrix and vector
US11295199B2 (en) 2019-12-09 2022-04-05 UMNAI Limited XAI and XNN conversion
US11593609B2 (en) 2020-02-18 2023-02-28 Stmicroelectronics S.R.L. Vector quantization decoding hardware unit for real-time dynamic decompression for parameters of neural networks
US11880759B2 (en) 2020-02-18 2024-01-23 Stmicroelectronics S.R.L. Vector quantization decoding hardware unit for real-time dynamic decompression for parameters of neural networks
US11531873B2 (en) 2020-06-23 2022-12-20 Stmicroelectronics S.R.L. Convolution acceleration with embedded vector decompression
US11836608B2 (en) 2020-06-23 2023-12-05 Stmicroelectronics S.R.L. Convolution acceleration with embedded vector decompression



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant