CN109255429A - Parameter decompression method for sparse neural network model - Google Patents
Parameter decompression method for sparse neural network model
- Publication number
- CN109255429A CN109255429A CN201810845949.XA CN201810845949A CN109255429A CN 109255429 A CN109255429 A CN 109255429A CN 201810845949 A CN201810845949 A CN 201810845949A CN 109255429 A CN109255429 A CN 109255429A
- Authority
- CN
- China
- Prior art keywords
- matrix
- weight
- boundary
- mark
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
Abstract
The invention discloses a parameter decompression method for a sparse neural network model, comprising the following steps: S1, storing a required sparse matrix at a specified location, where each nonzero element of the matrix is stored together with a corresponding relative index and quantized weight value, and an explicit zero value is stored whenever the number of zeros between two nonzero elements exceeds a preset threshold; S2, obtaining the data stored for the matrix to be decompressed, extracting the relative indexes and quantized weight values, restoring the relative indexes to absolute indexes, determining from the restored absolute indexes the positions of the nonzero and zero elements in the dense matrix together with the corresponding quantized weight values, rebuilding a weight vector table from the positions of the nonzero elements, and completing the decompression of the dense matrix after the weight values in the weight vector table are restored. The invention has the advantages of simple implementation, high decompression efficiency and resource utilization, and a wide and flexible range of application.
Description
Technical field
The present invention relates to the field of neural network technology, and more particularly to a parameter decompression method for a sparse neural network model.
Background technique
With the rapid development of deep learning, models such as DNNs and RNNs are widely used. These models have overcome the obstacles of traditional techniques and achieved great success in many fields such as speech recognition and image recognition. Both DNN and RNN neural networks contain fully connected layers, and the computation of a fully connected layer is the multiplication of a weight matrix with a corresponding vector. When training a neural network, a suitable compression algorithm must be applied to the model, mainly compressing the fully connected layers, i.e., compressing the parameters of the weight matrices. Without affecting accuracy, this can greatly improve the execution speed of neural network inference. After training, the compressed parameters are stored in memory and decompressed back into weight matrices when needed.
A typical model compression algorithm, shown in Fig. 1, consists mainly of pruning, quantization training, and variable-length coding. After an initial training stage, the model is pruned by removing connections whose weights fall below a threshold, converting dense layers into sparse layers; this first stage learns the topology of the network, retaining important connections and removing unimportant ones. Quantization training is a weight-sharing process in which multiple connections share the same weight. Pruning and quantization training do not interfere with each other, and together they yield a very high compression ratio, reducing the demand for storage space. After training, the trained parameters must be decompressed into accurate weight matrices without affecting their precision.
At present there are many hardware architectures that can operate directly on compressed model parameters, but they require complex hardware designs that are difficult and costly to implement. How to decompress sparse neural network model parameters correctly without resorting to a complex hardware architecture is a problem to be solved. Correct decompression of sparse neural network model parameters raises the questions of how the compressed parameters should be organized and stored, and how the stored parameters can be decompressed into a complete dense matrix while guaranteeing accuracy. Moreover, since compression usually processes several matrices at once, it must also be possible to decompress several matrices in one pass.
Summary of the invention
To address the above technical problems of the prior art, the present invention provides a parameter decompression method for sparse neural network models that is simple to implement, achieves high decompression efficiency and resource utilization, and is widely and flexibly applicable.
To solve the above technical problems, the technical solution proposed by the present invention is as follows:
A parameter decompression method for a sparse neural network model, characterized in that the steps include:
S1. Compressed-parameter storage: store the required sparse matrix at a designated location, where each nonzero element of the matrix is stored together with a corresponding relative index and quantized weight value; the relative index identifies the number of zeros between two nonzero elements, and if the number of zeros between two nonzero elements exceeds a preset threshold, an explicit zero value is stored;
S2. Decompression: obtain the data stored in step S1 for the matrix to be decompressed, extract the relative indexes and quantized weight values, restore the relative indexes to absolute indexes in one-to-one correspondence with element positions, use the restored absolute indexes to determine the positions of the nonzero and zero elements in the dense matrix and the quantized weight values at those positions, rebuild a weight vector table from the positions of the nonzero elements, and complete the decompression of the dense matrix after restoring the weight values in the weight vector table.
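The storage scheme of step S1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name is hypothetical, and the threshold of 15 is taken from the embodiment described later.

```python
# Hypothetical sketch of step S1: each nonzero element is stored as a
# (relative_index, quantized_weight) pair, where relative_index counts the
# zeros since the previous stored element. If a run of zeros exceeds the
# threshold, an explicit zero entry is emitted so the index never overflows.

def compress_row(row, threshold=15):
    """Compress a 1-D list of quantized weights into (rel_index, value) pairs."""
    entries = []
    zeros = 0
    for value in row:
        if value == 0:
            zeros += 1
            if zeros > threshold:
                # Run too long for the index field: store a placeholder zero.
                entries.append((threshold, 0))
                zeros = 0
        else:
            entries.append((zeros, value))
            zeros = 0
    return entries
```

For example, the row `[0, 0, 3, 0, 7]` would be stored as `[(2, 3), (1, 7)]`: two zeros precede the 3, and one zero precedes the 7.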
As a further improvement of the present invention, step S2 includes an instruction decoding and data acquisition step S21, specifically: receive and decode a decompression instruction; from the decoded information determine the decompression length of the matrix, the source address, and the destination address at which the decompressed weight matrix is stored; fetch the stored data of the matrix to be decompressed from the obtained source address and extract the relative indexes.
As a further improvement of the present invention, step S2 includes an index recovery step S22, specifically: restore the relative indexes to absolute indexes by accumulation.
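The accumulation of step S22 can be sketched as a running sum. This is an illustrative assumption about the exact convention: since each relative index counts the zeros before its element, the absolute position advances by the relative index plus one.

```python
# Minimal sketch of step S22: restore absolute indexes from relative ones by
# accumulation. Each relative index counts the zeros preceding its element,
# so the position advances by (rel + 1) per stored element.

def to_absolute(rel_indices):
    abs_indices, pos = [], -1
    for rel in rel_indices:
        pos += rel + 1          # skip `rel` zeros, then land on the element
        abs_indices.append(pos)
    return abs_indices
```

Under this convention the relative indexes `[2, 1]` restore to the absolute positions `[2, 4]`.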
As a further improvement of the present invention, step S2 includes a weight-quantization-table rebuilding step S23, specifically: determine the position of each nonzero element from the restored absolute indexes and rebuild the weight quantization table from the positions of the nonzero elements; in the weight quantization table, the positions corresponding to nonzero elements hold the valid weight values.
As a further improvement of the present invention, step S2 includes an inverse quantization step S24, specifically: restore the valid weight values in the weight quantization table to obtain a complete dense matrix.
As a further improvement of the present invention, in step S24 the valid weight values in the weight quantization table are restored by means of a lookup table.
As a further improvement of the present invention, the method further includes storing a plurality of required matrices in a cross-boundary manner and configuring boundary marks to identify the state of cross-row storage. The boundary marks include a first mark indicating that the vector contains no matrix boundary; a second mark indicating that the vector contains a matrix boundary, not at the end of the vector; a third mark indicating that the vector contains a matrix boundary at the end of the vector; and a fourth mark indicating that the vector contains a matrix boundary and that the data after the boundary are discarded.
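The four boundary marks can be written out as 2-bit codes, matching the values 00/01/10/11 given in the embodiment (Fig. 5). The constant names below are illustrative, not from the patent.

```python
# The four 2-bit boundary marks of the embodiment, as named constants.
# Names are assumptions for readability; the codes are from the patent text.

NO_BOUNDARY      = 0b00  # first mark: no matrix boundary in this vector
BOUNDARY_MID     = 0b01  # second mark: boundary present, not at vector end
BOUNDARY_AT_END  = 0b10  # third mark: boundary present, at the vector end
BOUNDARY_DISCARD = 0b11  # fourth mark: boundary present; data after it dropped
```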
As a further improvement of the present invention, when a plurality of matrices are decompressed, the specific steps of restoring the relative indexes in step S2 to absolute indexes in one-to-one correspondence with element positions are:
convert all stored relative indexes into absolute indexes;
when a boundary mark is obtained, examine it; if it is the second mark or the fourth mark, determine the matrix boundary from the absolute indexes by XOR-ing the high two bits of adjacent absolute indexes: if the result is 1, the former position is the boundary position of the first matrix and the latter position is the starting position of the next matrix, thereby obtaining the matrix boundary;
generate an index-valid signal in one-to-one correspondence with each relative index, together with a signal indicating whether the matrix ends, and set the index-valid signals and the matrix-end signal according to the boundary mark;
output the index-valid signals, the absolute indexes, and the matrix-end signal.
As a further improvement of the present invention, the specific steps of setting the index-valid signals and the matrix-end signal according to the boundary mark are:
when the boundary mark is the first mark, determine that there is no matrix boundary and the matrix is not finished; set every index-valid signal to 1 and set the matrix-end signal to 0;
when the boundary mark is the third mark, determine that the matrix boundary is at the end of the vector; set every index-valid signal to 1 and also set the matrix-end signal to 1;
when the boundary mark is the second mark, determine that the matrix boundary lies inside the vector, and process the determined matrix boundary in two beats: in the first beat, set the index-valid signals before the matrix boundary to 1 and set the matrix-end signal to 1; in the second beat, set the index-valid signals after the matrix boundary to 1 and set the matrix-end signal to 0;
when the boundary mark is the fourth mark, determine that the last row of the vector has been reached; according to the determined boundary of the last matrix, set the index-valid signals within the matrix boundary to 1 and discard the unneeded data.
As a further improvement of the present invention, the specific steps of restoring the weight values in the weight vector table in step S2 are:
assign each quantized weight value to its corresponding position according to the absolute indexes and the index-valid signals;
judge whether the matrix boundary has been reached: if not, store the quantized weight value at the corresponding position; if the matrix boundary has been reached, output the weight quantization table of the matrix.
Compared with the prior art, the advantages of the present invention are as follows:
1. By storing the compressed parameters as a relative index and a quantized weight value per nonzero element, the present invention reduces the memory needed to store the parameters; by restoring the relative indexes to absolute indexes and rebuilding the weight vector table from the absolute indexes, it can finally decompress the correct weight matrix. After model compression is complete, the compressed model can be decompressed and the weight matrix rebuilt without dedicated hardware design, and the decompression is fast, efficient, and flexible.
2. The present invention supports both the sparse-matrix and the weight-sharing aspects of model compression, and supports the decompression of both RNN-type and DNN-type model compression algorithms. It therefore has considerable flexibility and scalability, is simple and flexible to implement, and imposes no requirement on the dimensions of the decompressed weight matrix.
3. By storing a plurality of compressed matrices contiguously and placing boundary marks between the matrices, the present invention needs only the corresponding decompression instruction to decompress several matrices: the boundary marks determine the boundary between the matrices, so multiple weight matrices can be decompressed correctly in one pass, effectively improving decompression efficiency and speed. Each row of the spv needs only a 2-bit matrix boundary mark, so little auxiliary data is added, saving storage space and memory bandwidth; contiguous storage of the compressed matrices saves further storage space and bandwidth. The maximum number of matrices that can be decompressed is determined by the number of parameters stored in the spv and the decoded length.
Detailed description of the invention
Fig. 1 is a schematic diagram of the implementation principle of a typical model compression algorithm.
Fig. 2 is a schematic diagram of the implementation principle of parameter decompression for a sparse neural network model in this embodiment.
Fig. 3 is a schematic flowchart of the parameter decompression method for a sparse neural network model in this embodiment.
Fig. 4 is a schematic diagram of the principle by which this embodiment stores the nonzero elements of a compressed matrix in the SPV.
Fig. 5 is a schematic diagram of the cross-boundary storage of three matrices to be decompressed in the spv in a specific application embodiment of the present invention.
Fig. 6 is a schematic flowchart of restoring absolute indexes during multi-matrix decompression in this embodiment.
Fig. 7 is a schematic flowchart of rebuilding the weight quantization table during multi-matrix decompression in this embodiment.
Fig. 8 is a schematic flowchart of inverse quantization in this embodiment.
Specific embodiment
The invention will be further described below with reference to the accompanying drawings and specific preferred embodiments, which however do not limit the scope of the invention.
As shown in Figs. 2 and 3, the parameter decompression method for a sparse neural network model of this embodiment comprises the steps of:
S1. Compressed-parameter storage: store the required sparse matrix at a designated location, where each nonzero element of the matrix is stored together with a corresponding relative index and quantized weight value; the relative index identifies the number of zeros between two nonzero elements, and if the number of zeros between two nonzero elements exceeds a preset threshold, an explicit zero value is stored;
S2. Decompression: obtain the data stored in step S1 for the matrix to be decompressed, extract the relative indexes and quantized weight values, restore the relative indexes to absolute indexes in one-to-one correspondence with element positions, use the restored absolute indexes to determine the positions of the nonzero and zero elements in the dense matrix and the quantized weight values at those positions, rebuild a weight vector table from the positions of the nonzero elements, and complete the decompression of the dense matrix after restoring the weight values in the weight vector table. The dense matrix is the weight matrix in the neural network.
By the above method, after model compression is complete this embodiment can decompress the compressed model and rebuild the weight matrix without dedicated hardware design. Storing the compressed parameters as a relative index and a quantized weight value per nonzero element reduces the memory needed to store the parameters, and restoring the relative indexes to absolute indexes and rebuilding the weight vector table from the absolute indexes finally yields the correct decompressed weight matrix.
In this embodiment, the compressed parameters are stored in a vector memory spv whose width is 32 words, and the decompressed weight matrix is stored in a memory spm whose width is 1024 words; reusing the vector memory spv to store the compressed matrices yields high hardware resource utilization. The decompression method of this embodiment supports two compression techniques: sparse matrices and weight sharing. A nonzero element of a sparse matrix is stored as one relative index plus one quantized weight value; if the number of zeros between two nonzero elements exceeds a preset threshold (specifically 15), an explicit zero value is stored, the relative index indicating the number of zeros preceding the nonzero element. As for the quantized value, weight sharing in the compression algorithm reduces the quantized weight from 16 bits to 5 bits, so only 9 bits are needed to store one nonzero element. The compressed parameters are stored in the vector memory spv in the above manner; one spv address can specifically store 56 nonzero elements.
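The 9-bit storage of one nonzero element can be sketched as bit packing. Following the Fig. 4 description (5-bit weight code in the low bits, 4-bit relative index above it), the helper names below are illustrative assumptions.

```python
# Sketch of packing one nonzero element into 9 bits, per the embodiment:
# a 5-bit quantized-weight code in the low bits and a 4-bit relative index
# in the bits above it. Helper names are illustrative.

def pack_entry(rel_index, weight_code):
    assert 0 <= rel_index < 16 and 0 <= weight_code < 32
    return (rel_index << 5) | weight_code

def unpack_entry(entry):
    return entry >> 5, entry & 0x1F   # (relative index, weight code)
```

The 4-bit relative index field also explains the threshold of 15: a longer run of zeros cannot be encoded in 4 bits, so a placeholder zero entry must be stored.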
The decompression method of this embodiment supports the sparse-matrix and weight-sharing aspects of model compression, and supports the decompression of both RNN-type and DNN-type model compression algorithms. It therefore has considerable flexibility and scalability, is simple and flexible to implement, and imposes no requirement on the dimensions of the decompressed weight matrix.
As shown in Fig. 3, when this embodiment performs decompression, a decompression instruction is sent; after the instruction is received, the parameters stored in the spv undergo decompression processing comprising index recovery, weight-quantization-table determination, and inverse quantization, and the decompressed weight matrix is stored in the memory spm.
Step S2 of this embodiment includes an instruction decoding and data acquisition step S21, specifically: receive and decode a decompression instruction; from the decoded information determine the length of the matrix to be decompressed, the source address, and the destination address at which the decompressed weight matrix is stored; fetch the stored data of the matrix to be decompressed from the obtained source address and extract the relative indexes and quantized weight values.
Step S2 of this embodiment includes a step S22 of restoring the relative indexes to absolute indexes; step S22 restores the relative indexes to absolute indexes by accumulation.
Step S2 of this embodiment includes a weight-vector-table rebuilding step S23, specifically: determine the position of each nonzero element from the restored absolute indexes and rebuild the weight quantization table from the positions of the nonzero elements; positions in the weight quantization table that correspond to nonzero elements hold valid weight values. Specifically, if a position in the weight quantization table has a corresponding absolute index, it corresponds to a nonzero element and is assigned the corresponding quantized weight value; otherwise the quantized value at that position is assigned 0.
Step S2 of this embodiment includes an inverse quantization step S24, specifically: restore the valid weight values in the weight quantization table to obtain a complete dense matrix; the valid weight values in the weight quantization table are restored by means of a lookup table.
The detailed decompression process of this embodiment includes:
S21. Instruction decoding: a different decompression instruction is sent according to the matrix to be decompressed; the decompression instruction contains the spv source address, the spm destination address, and the length len of the spv vector to be decompressed. After the decompression instruction is received it is decoded, determining the length of the vector to be decompressed, the decompression source address, and the destination address for storing the weight matrix. The stored parameters are fetched from the memory spv according to the source address, and the relative indexes and quantized weight values are extracted;
S22. Index recovery: the relative indexes are restored to absolute indexes by a running sum; the positions of the zero and nonzero elements in the dense matrix are determined from the absolute indexes, and the absolute indexes and quantized weight values are output;
S23. Weight-vector-table rebuilding: the position of each nonzero element in the weight quantization table is determined from its absolute index, and the quantized weight value at each nonzero-element position is established from the absolute index of that nonzero element: a position with a corresponding absolute index is assigned the corresponding quantized weight value, and a position without a corresponding absolute index is assigned 0, rebuilding the weight quantization table of the matrix;
S24. Inverse quantization: the valid weight values in the weight quantization table are restored through a lookup table, giving a complete dense matrix.
Each time a complete dense matrix has been decompressed by the above method, it is stored at the corresponding destination address in the spm according to the destination address obtained from instruction decoding.
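For a single matrix, steps S22 through S24 can be sketched end to end. This is a hedged illustration, not the hardware flow: `length` is the dense row length, `lut` stands in for the weight lookup table, and a stored code of 0 is treated as the placeholder zero entry.

```python
# Compact sketch of S22-S24 for one matrix: restore absolute indexes by a
# running sum, then place looked-up weights at nonzero positions, leaving
# everything else zero. Names and conventions are illustrative assumptions.

def decompress(entries, lut, length):
    dense, pos = [0.0] * length, -1
    for rel, code in entries:
        pos += rel + 1                  # S22: relative -> absolute index
        if code != 0:                   # placeholder zeros carry no weight
            dense[pos] = lut[code]      # S23/S24: table lookup restores weight
    return dense
```

For example, with entries `[(2, 3), (1, 7)]` and a lookup table mapping code 3 to 0.5 and code 7 to -1.0, a length-6 row decompresses to `[0, 0, 0.5, 0, -1.0, 0]`.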
In this embodiment, a plurality of required matrices are stored in a cross-boundary manner, and boundary marks are configured to identify the state of cross-row storage; different boundary marks represent different cross-row storage situations. The boundary marks include a first mark indicating that the vector contains no matrix boundary; a second mark indicating that the vector contains a matrix boundary, not at the end of the vector; a third mark indicating that the vector contains a matrix boundary at the end of the vector; and a fourth mark indicating that the vector contains a matrix boundary and that the data after the boundary are discarded. One spv address can store 56 nonzero elements; usually the elements of a compressed weight matrix cannot all be stored within one row, so cross-row storage in the spv occurs.
Model compression usually compresses more than one weight matrix at a time. This embodiment stores the compressed matrices contiguously with boundary marks between them; to decompress several matrices it suffices to send the corresponding decompression instruction, decompress each matrix with the above method, and determine the boundary between the matrices from the boundary marks, so multiple weight matrices can be decompressed correctly in one pass, effectively improving decompression efficiency and speed. Each row of the spv needs only a 2-bit matrix boundary mark, so little auxiliary data is added, saving storage space and memory bandwidth; contiguous storage of the compressed matrices saves further storage space and bandwidth. The maximum number of matrices that can be decompressed is determined by the number of parameters stored in the spv and the decoded length. The compressed parameters can also be stored in the spv across matrix boundaries in a certain manner, with no need to distinguish matrix boundaries when storing, which further saves a large amount of storage space; it suffices to send the corresponding instruction to fetch the parameters at the corresponding spv addresses and obtain the decompressed weight matrices.
As shown in Fig. 4, this embodiment stores the nonzero elements of a compressed matrix in the spv: the low bits of each entry store the weight-table code of the compressed nonzero element (5 bits), the high bits store the relative index (4 bits), and the two highest bits of the row store the boundary mark. One spv address can store 56 nonzero elements; usually the elements of a compressed weight matrix cannot all be stored within one row, so cross-row storage occurs. In the spv, the index of the 29th element is an absolute index, and because that index is absolute, the recovery of the absolute indexes of the two segments, elements 0-28 and 30-56, can proceed in parallel.
As shown in Fig. 5, in a specific application embodiment of the present invention three matrices to be decompressed are stored across boundaries in the spv, and different boundary marks represent different cross-row storage situations. The first mark, 00, means the vector contains no matrix boundary; the second mark, 01, means the vector contains a matrix boundary not at the end of the vector; the third mark, 10, means the vector contains a matrix boundary at the end of the vector; and the fourth mark, 11, means the vector contains a matrix boundary and the data after the boundary are discarded.
As shown in Fig. 6, when this embodiment decompresses a plurality of matrices, the specific steps of restoring the relative indexes in step S2 to absolute indexes in one-to-one correspondence with element positions are:
1. Convert all stored relative indexes (specifically 56) into absolute indexes by accumulation, i.e., add them up term by term to obtain the absolute indexes; to handle matrix boundaries, this embodiment provides one extra absolute index;
2. Generate the weight quantization table;
3. Configure a boundary mark between the matrices in the spv; when fetching the parameters in the spv, examine each boundary mark obtained: if it is the second mark (01) or the fourth mark (11), go to step 4, otherwise go to step 5;
4. Determine the matrix boundary from the absolute indexes by XOR-ing the high two bits of adjacent absolute indexes: if the result is 1, the former position is the boundary position of the first matrix and the latter position is the starting position of the next matrix, giving the matrix boundary;
5. Generate an index-valid signal (specifically 56 of them) in one-to-one correspondence with each relative index, together with a signal indicating whether the matrix ends, and set the index-valid signals and the matrix-end signal according to the boundary mark;
6. Output the index-valid signals, the absolute indexes, the weight vector table, and the matrix-end signal.
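The boundary search in step 4 relies on the fact that absolute indexes restart at the beginning of each matrix, so a boundary appears where the running index drops. The hardware detects this cheaply by XOR-ing high-order bits of adjacent absolute indexes; the sketch below checks for the drop directly, which is a simplification, and the function name is an assumption.

```python
# Hedged sketch of the step-4 boundary search: absolute indexes are
# monotonically increasing within a matrix and reset at the start of the
# next one, so the boundary is where the sequence stops increasing.

def find_boundary(abs_indices):
    for i in range(1, len(abs_indices)):
        if abs_indices[i] <= abs_indices[i - 1]:
            return i    # element i-1 ends one matrix; element i starts the next
    return None         # no boundary in this vector
```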
In step 5 above, the specific steps of setting the index-valid signals and the matrix-end signal according to the boundary mark are:
5.1) When the boundary mark is the first mark (00), determine that there is no matrix boundary and the matrix is not finished; set the 56 index-valid signals to 1 and set the matrix-end signal to 0;
5.2) When the boundary mark is the third mark (10), determine that the matrix boundary is at the end of the vector; set the 56 index-valid signals to 1 and also set the matrix-end signal to 1;
5.3) When the boundary mark is the second mark (01), determine that the matrix boundary lies inside the vector, and process the determined matrix boundary in two beats: in the first beat, set the index-valid signals before the matrix boundary to 1 and set the matrix-end signal to 1; in the second beat, set the index-valid signals after the matrix boundary to 1 and set the matrix-end signal to 0;
5.4) When the boundary mark is the fourth mark (11), determine that the last row of the vector has been reached; according to the determined boundary of the last matrix, set the index-valid signals within the matrix boundary to 1 and discard the unneeded data.
As shown in Fig. 7, when this embodiment decompresses a plurality of matrices, the specific steps of restoring the weight values in the weight vector table in step S2 are:
1. Assign each quantized weight value to its corresponding position according to the absolute indexes and the index-valid signals;
2. Judge whether the matrix boundary has been reached: if not, store the quantized weight value at the corresponding position; if the matrix boundary has been reached, output the weight quantization table of the matrix.
As shown in Fig. 8, when this embodiment performs inverse quantization, each quantized weight value is restored to its corresponding valid weight by table lookup: if a position of the matrix has a corresponding weight value, the weight is restored; if not, the position is assigned 0, finally giving a complete dense matrix.
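The Fig. 8 inverse quantization step can be sketched as a per-position codebook lookup. Here `codebook` stands in for the lookup table that maps 5-bit weight codes back to full-precision weights; the representation of "no corresponding weight" as `None` and all names are illustrative assumptions.

```python
# Sketch of inverse quantization: positions holding a valid weight code are
# restored through the shared-weight codebook; positions without one become 0.

def dequantize(quant_table, codebook):
    return [codebook[c] if c is not None else 0.0 for c in quant_table]
```

For instance, `dequantize([None, 2, None], {2: 0.25})` yields `[0.0, 0.25, 0.0]`.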
Through the above decompression method, the output of a neural network model compression algorithm can be decompressed. Organizing the compressed data for storage in the above manner reduces the memory needed to store the parameters, allows several matrices to be decompressed in one pass, supports weight sharing and sparse matrices, and makes decompression fast, efficient, and flexible.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention in any form. Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Any simple modifications, equivalent changes, and variations made to the above embodiments according to the technical spirit of the present invention, without departing from the content of the technical solution of the present invention, shall fall within the scope of protection of the technical solution of the present invention.
Claims (10)
1. A parameter decompression method for a sparse neural network model, characterized in that the steps include:
S1. Compressed-parameter storage: storing a required sparse matrix at a designated position, wherein, when the non-zero elements of the matrix are stored, the relative index and the weight quantized value corresponding to each non-zero element are stored, the relative index identifying the number of zeros between two non-zero elements; if the number of zeros between two non-zero elements is greater than a preset threshold, a zero value is stored;
S2. Decompression: obtaining the data stored in step S1 for the matrix to be decompressed, extracting the relative indexes and weight quantized values therein, restoring the relative indexes to absolute indexes in one-to-one correspondence with element positions, determining, according to the restored absolute indexes, the positions of the non-zero and zero elements in the dense matrix and the weight quantized values at those positions, rebuilding a weight vector table according to the positions of the non-zero elements, and completing the decompression of the dense matrix after restoring the weight values in the weight vector table.
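The storage scheme of step S1 can be illustrated with a minimal software sketch. This is not the patented implementation: the threshold value, the padding convention, and the pair-of-tuples layout are assumptions chosen for clarity.

```python
THRESHOLD = 7  # assumed threshold, e.g. what a 3-bit relative-index field can hold

def compress(row, threshold=THRESHOLD):
    """Encode a row as (relative_index, value) pairs; the relative index
    counts zeros since the previous stored element. If a run of zeros
    exceeds the threshold, a padding zero value is stored."""
    out, zeros = [], 0
    for v in row:
        if v == 0:
            zeros += 1
            if zeros > threshold:          # gap too long: emit a padding zero
                out.append((threshold, 0))
                zeros = 0
        else:
            out.append((zeros, v))
            zeros = 0
    return out

def decompress(pairs, length):
    """Rebuild the dense row from the stored pairs."""
    row, pos = [0] * length, 0
    for rel, v in pairs:
        pos += rel                         # skip the zeros counted by the index
        row[pos] = v                       # a padding entry simply writes 0
        pos += 1
    return row

row = [0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 5]
enc = compress(row)
```

Here the 8-zero gap before the 5 exceeds the threshold, so a padding `(7, 0)` entry is stored, exactly as the claim's "a zero value is stored" rule requires.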
2. The parameter decompression method for a sparse neural network model according to claim 1, characterized in that step S2 includes an instruction-decoding and data-acquisition step S21, the specific steps being: receiving and decoding a decompression instruction; determining, from the decoded information, the length of the matrix to be decompressed, its source address, and the destination address at which the decompressed weight matrix is to be stored; fetching, from the obtained source address, the data stored for the matrix to be decompressed; and extracting the relative indexes therein.
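A decompression instruction of the kind step S21 decodes might be laid out as below. The field widths and bit positions are purely illustrative assumptions; the patent does not specify an instruction format.

```python
def decode_instruction(instr):
    """Split an assumed 64-bit decompression instruction into the three
    fields named in step S21 (widths are illustrative, not from the patent)."""
    length = (instr >> 48) & 0xFFFF     # length of the matrix to decompress
    src    = (instr >> 24) & 0xFFFFFF   # source address of the compressed data
    dst    =  instr        & 0xFFFFFF   # destination for the decompressed matrix
    return length, src, dst

length, src, dst = decode_instruction((16 << 48) | (0x001000 << 24) | 0x002000)
```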
3. The parameter decompression method for a sparse neural network model according to claim 2, characterized in that step S2 includes an index-restoration step S22, the specific step being: restoring the relative indexes to absolute indexes by accumulation.
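The accumulation of step S22 amounts to a running sum. A minimal sketch (the one-slot-per-element convention matches the relative-index encoding, but is our assumption about the exact layout):

```python
def to_absolute(relative):
    """The absolute index of the k-th stored element is the running sum of
    all relative indexes so far, plus one slot per previously stored element."""
    absolute, pos = [], 0
    for rel in relative:
        pos += rel            # skip the zeros encoded by the relative index
        absolute.append(pos)  # this element occupies the next position
        pos += 1
    return absolute
```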
4. The parameter decompression method for a sparse neural network model according to claim 3, characterized in that step S2 includes a step S23 of rebuilding a weight quantization table, the specific steps being: determining the position of each non-zero element according to each restored absolute index, and rebuilding the weight quantization table from the positions of the non-zero elements, the positions corresponding to non-zero elements in the weight quantization table holding the effective weight values.
5. The parameter decompression method for a sparse neural network model according to claim 4, characterized in that step S2 includes an inverse-quantization step S24, the specific step being: restoring the effective weight values in the weight quantization table to obtain a complete dense matrix.
6. The parameter decompression method for a sparse neural network model according to claim 5, characterized in that, in step S24, the effective weight values in the weight quantization table are restored specifically by table lookup.
7. The parameter decompression method for a sparse neural network model according to any one of claims 1 to 6, characterized by further including storing multiple required matrices in a cross-row manner and configuring boundary marks to identify the state of the cross-row storage, the boundary marks including: a first mark indicating that there is no matrix boundary in the mark vector; a second mark indicating that there is a matrix boundary in the mark vector but not at the end of the vector; a third mark indicating that there is a matrix boundary at the end of the vector; and a fourth mark indicating that there is a matrix boundary in the mark vector and that the data after the boundary is discarded.
8. The parameter decompression method for a sparse neural network model according to claim 7, characterized in that, when multiple matrices are decompressed, the specific steps of restoring the relative indexes to absolute indexes in one-to-one correspondence with element positions in step S2 are:
converting all stored relative indexes into absolute indexes;
making a judgment when a boundary mark is obtained: if it is the second mark or the fourth mark, judging the matrix boundary according to the absolute indexes by XOR-ing the high two bits of adjacent absolute indexes; if the result is 1, determining that the former position is the boundary position of the preceding matrix and the latter position is the starting position of the next matrix, thereby obtaining the matrix boundary;
generating an index-valid signal and a matrix-end signal in one-to-one correspondence with each relative index, and setting the index-valid signals and the matrix-end signal according to the boundary mark;
outputting the index-valid signals, the absolute indexes, and the matrix-end signal.
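The high-bit XOR test of claim 8 can be sketched as below: when the index counter wraps into a new matrix, the top bits of consecutive absolute indexes differ, so XOR-ing them is non-zero at a boundary. The 8-bit index width is an assumption for illustration only.

```python
INDEX_BITS = 8  # assumed width of the absolute-index field

def is_boundary(prev_idx, next_idx, bits=INDEX_BITS):
    """True when the high two bits of adjacent absolute indexes differ,
    i.e. a matrix boundary lies between the two positions."""
    top2 = lambda x: (x >> (bits - 2)) & 0b11
    return (top2(prev_idx) ^ top2(next_idx)) != 0
```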
9. The parameter decompression method for a sparse neural network model according to claim 8, characterized in that the specific steps of setting the index-valid signals and the matrix-end signal according to the boundary mark are:
when the boundary mark is the first mark, determining that there is no matrix boundary and the matrix has not yet ended, assigning each index-valid signal the value 1, and setting the matrix-end signal to 0;
when the boundary mark is the third mark, determining that the matrix boundary is at the end of the vector, assigning each index-valid signal the value 1, and also setting the matrix-end signal to 1;
when the boundary mark is the second mark, determining that the matrix boundary is in the middle of the vector and proceeding in two beats according to the judged matrix boundary: in the first beat, setting the index-valid signals before the matrix boundary to 1 and the matrix-end signal to 1; in the second beat, setting the index-valid signals after the matrix boundary to 1 and the matrix-end signal to 0;
when the boundary mark is the fourth mark, determining that the last row of the vector has been reached, setting the index-valid signals within the boundary of the last matrix to 1 according to the judged boundary of that matrix, and discarding the unwanted data.
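The per-mark signal settings of claim 9 can be summarized as a small decision table. This is an assumed software rendering (mark names, beat representation, and the end-signal value for the fourth mark are our choices, not fixed by the patent).

```python
def set_signals(mark, n, b=None):
    """n: number of indexes in the mark vector; b: judged boundary position
    (count of indexes belonging to the earlier matrix). Returns a list of
    (index_valid, matrix_end) pairs, one per beat."""
    if mark == "first":                        # no boundary: all valid, not ended
        return [([1] * n, 0)]
    if mark == "third":                        # boundary at vector end: all valid, ended
        return [([1] * n, 1)]
    if mark == "second":                       # boundary mid-vector: two beats
        return [([1] * b + [0] * (n - b), 1),  # beat 1: before boundary, matrix ends
                ([0] * b + [1] * (n - b), 0)]  # beat 2: after boundary, new matrix
    if mark == "fourth":                       # last row: keep data up to boundary,
        return [([1] * b + [0] * (n - b), 1)]  # discard the rest (end=1 assumed)
```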
10. The parameter decompression method for a sparse neural network model according to claim 9, characterized in that the specific steps of restoring the weight values in the weight vector table in step S2 are:
assigning each weight quantized value to its corresponding position according to the absolute indexes and the index-valid signals;
judging whether the matrix boundary has been reached: if the matrix has not reached its boundary, storing the weight quantized value at the corresponding position; if the matrix boundary has been reached, outputting the weight quantization table of the matrix.
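The placement-then-emit loop of claim 10 can be sketched as follows. The stream tuple layout is an assumption; the point is only that values accumulate until the matrix-end signal fires, at which point one matrix's quantization table is output.

```python
def rebuild_tables(stream):
    """stream: (absolute_index, index_valid, quant_value, matrix_end) tuples
    for a stream that may span several matrices stored back to back."""
    tables, current = [], {}
    for idx, valid, q, end in stream:
        if valid:
            current[idx] = q          # place value at its absolute position
        if end:                       # matrix boundary reached
            tables.append(current)    # output this matrix's quantization table
            current = {}
    return tables

tables = rebuild_tables([(0, 1, 7, 0), (3, 1, 9, 1)])  # one matrix, two values
```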
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810845949.XA CN109255429B (en) | 2018-07-27 | 2018-07-27 | Parameter decompression method for sparse neural network model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109255429A true CN109255429A (en) | 2019-01-22 |
CN109255429B CN109255429B (en) | 2020-11-20 |
Family
ID=65049925
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810845949.XA Active CN109255429B (en) | 2018-07-27 | 2018-07-27 | Parameter decompression method for sparse neural network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109255429B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007323401A (en) * | 2006-06-01 | 2007-12-13 | Kagawa Univ | Data processor, data restoration device, data processing method and data restoration method |
CN105260776A (en) * | 2015-09-10 | 2016-01-20 | 华为技术有限公司 | Neural network processor and convolutional neural network processor |
CN107229967A (en) * | 2016-08-22 | 2017-10-03 | 北京深鉴智能科技有限公司 | FPGA-based hardware accelerator and method for implementing sparse GRU neural networks |
CN107944555A (en) * | 2017-12-07 | 2018-04-20 | 广州华多网络科技有限公司 | Method, storage device and terminal for compressing and accelerating a neural network |
CN108111863A (en) * | 2017-12-22 | 2018-06-01 | 洛阳中科信息产业研究院(中科院计算技术研究所洛阳分所) | Online real-time three-dimensional model video encoding and decoding method |
Non-Patent Citations (1)
Title |
---|
SONG HAN et al.: "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding", arXiv:1510.00149v5 [cs.CV] *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10872186B2 (en) | 2017-01-04 | 2020-12-22 | Stmicroelectronics S.R.L. | Tool to create a reconfigurable interconnect framework |
US11227086B2 (en) | 2017-01-04 | 2022-01-18 | Stmicroelectronics S.R.L. | Reconfigurable interconnect |
US11562115B2 (en) | 2017-01-04 | 2023-01-24 | Stmicroelectronics S.R.L. | Configurable accelerator framework including a stream switch having a plurality of unidirectional stream links |
US11675943B2 (en) | 2017-01-04 | 2023-06-13 | Stmicroelectronics S.R.L. | Tool to create a reconfigurable interconnect framework |
CN110766136A (en) * | 2019-10-16 | 2020-02-07 | 北京航空航天大学 | Compression method of sparse matrix and vector |
CN110766136B (en) * | 2019-10-16 | 2022-09-09 | 北京航空航天大学 | Compression method of sparse matrix and vector |
US11295199B2 (en) | 2019-12-09 | 2022-04-05 | UMNAI Limited | XAI and XNN conversion |
US11593609B2 (en) | 2020-02-18 | 2023-02-28 | Stmicroelectronics S.R.L. | Vector quantization decoding hardware unit for real-time dynamic decompression for parameters of neural networks |
US11880759B2 (en) | 2020-02-18 | 2024-01-23 | Stmicroelectronics S.R.L. | Vector quantization decoding hardware unit for real-time dynamic decompression for parameters of neural networks |
US11531873B2 (en) | 2020-06-23 | 2022-12-20 | Stmicroelectronics S.R.L. | Convolution acceleration with embedded vector decompression |
US11836608B2 (en) | 2020-06-23 | 2023-12-05 | Stmicroelectronics S.R.L. | Convolution acceleration with embedded vector decompression |
Also Published As
Publication number | Publication date |
---|---|
CN109255429B (en) | 2020-11-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109255429A (en) | Parameter decompression method for sparse neural network model | |
CN107832837B (en) | Convolutional neural network compression method and decompression method based on compressed sensing principle | |
CN101183873B (en) | BP neural network based embedded system data compression/decompression method | |
CN106407285B (en) | Optimized bit-file compression and decompression method based on RLE and LZW | |
CN103295198B (en) | Non-convex compressed sensing image reconstruction method based on redundant dictionaries and structured sparsity | |
CN110223234A (en) | Deep residual network image super-resolution reconstruction method based on cascaded shrinkage and expansion | |
CN110069644A (en) | Compressed-domain large-scale image retrieval method based on deep learning | |
CN101430881A (en) | Encoding, decoding and encoding/decoding methods, encoding/decoding system and related apparatus | |
CN105024702A (en) | Lossless compression method for floating-point data in scientific computing | |
Li et al. | A dual graph approach to 3D triangular mesh compression | |
CN101924562B (en) | Compression-type coding scheme of curve vector data based on integer wavelet transformation | |
CN116051156B (en) | New energy dynamic electricity price data management system based on digital twin | |
CN107025273A (en) | The optimization method and device of a kind of data query | |
CN104300988B (en) | Signal processing method and equipment based on compressed sensing | |
CN113595993A (en) | Joint learning method for vehicle-mounted sensing devices with model structure optimization under edge computing | |
CN116743182B (en) | Lossless data compression method | |
CN103456148B (en) | Method and apparatus for signal reconstruction | |
CN117353754A (en) | Coding and decoding method, system, equipment and medium of Gaussian mixture model information source | |
CN104376585A (en) | Non-convex compressed sensing image reconstruction method based on an image-block structural attribute strategy | |
CN103701468A (en) | Data compression and decompression method based on orthogonal wavelet packet transform and the swinging door algorithm | |
CN107343201A (en) | CABAC coding methods and system | |
CN103746701A (en) | Rapid encoding option selecting method applied to Rice lossless data compression | |
CN115866252A (en) | Image compression method, device, equipment and storage medium | |
CN111797991A (en) | Deep network model compression system, method and device | |
KR101603467B1 (en) | Method and device for compression of vertex data in three-dimensional image data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||