CN106847268B - Neural network acoustic model compression and voice recognition method - Google Patents

Neural network acoustic model compression and voice recognition method

Info

Publication number
CN106847268B
CN106847268B (application number CN201510881044.4A)
Authority
CN
China
Prior art keywords
matrix
vector
codebook
vectors
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510881044.4A
Other languages
Chinese (zh)
Other versions
CN106847268A (en)
Inventor
张鹏远
邢安昊
潘接林
颜永红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Original Assignee
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS and Beijing Kexin Technology Co Ltd
Priority to CN201510881044.4A
Publication of CN106847268A
Application granted
Publication of CN106847268B
Legal status: Active

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/08 — Speech classification or search
    • G10L15/16 — Speech classification or search using artificial neural networks
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/06 — Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 — Training
    • G10L2015/0631 — Creating reference templates; Clustering

Abstract

The invention provides a compression method for a neural network acoustic model, comprising the following steps: dividing the row vectors of the output-layer weight matrix W of the neural network acoustic model into a plurality of subvectors of a specified dimension; performing first-stage vector quantization on the subvectors to obtain a first-stage codebook, and replacing the subvectors of W with first-stage codebook vectors to obtain a matrix W*; using the matrices W and W* to calculate a residual matrix R, and performing second-stage vector quantization on the vectors of R to obtain a second-stage codebook, replacing the vectors of R with second-stage codebook vectors to obtain a matrix R*; and finally representing the weight matrix W by W* and R*. The method reduces the storage space of the neural network acoustic model, greatly reduces the quantization error, and avoids exponential growth of the codebook size.

Description

Neural network acoustic model compression and voice recognition method
Technical Field
The invention relates to the field of speech recognition, and in particular to a neural network acoustic model compression method and a speech recognition method.
Background
In the field of speech recognition, acoustic modeling with deep neural networks (DNNs) has proven highly effective. The deep structure of a DNN gives the model strong learning capacity but also results in a huge number of parameters, which makes it difficult to apply DNN acoustic models to speech recognition on mobile devices with limited computing power: the main obstacles are the large storage requirement and the high computational complexity.
Vector-quantization-based methods can compress a DNN model, saving both storage space and computation. The principle is as follows:

For a DNN weight matrix $W \in \mathbb{R}^{M \times N}$, each of its row vectors is split into $J = N/d$ subvectors of dimension $d$:

$$W = \begin{bmatrix} w_1^{(1)T} & \cdots & w_1^{(J)T} \\ \vdots & & \vdots \\ w_M^{(1)T} & \cdots & w_M^{(J)T} \end{bmatrix},$$

where $w_i^{(j)} \in \mathbb{R}^d$ is the $j$-th subvector of the $i$-th row of the weight matrix $W$ and the superscript $T$ denotes transposition.
Thereafter, all the subvectors are quantized into K codebook vectors with a vector quantization method. The original M × N matrix can then be represented by a codebook of K d-dimensional vectors, plus $(\log_2 K) \times (M \times J)$ bits to record the index of each subvector in the codebook. In the forward computation of the DNN, subvectors in the same column are multiplied by the same segment of the activation vector, so if several subvectors in one column are quantized to the same codebook vector, their products with that activation segment can be shared, reducing the number of multiplications.
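As a minimal illustration of this baseline single-stage scheme, the following Python sketch splits the rows of a weight matrix into d-dimensional subvectors and quantizes them with a small hand-rolled k-means. The function names, the NumPy implementation, and the toy sizes are assumptions for illustration, not the patent's reference implementation:

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Plain k-means: returns (codebook [k, d], index for each row of data)."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # Assign each subvector to its nearest codebook vector.
        dists = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        idx = dists.argmin(1)
        # Recompute each codebook vector as the mean of its members.
        for c in range(k):
            members = data[idx == c]
            if len(members):
                codebook[c] = members.mean(0)
    return codebook, idx

def vq_compress(W, d, K):
    """Single-stage VQ: split the rows of W (M x N) into J = N/d subvectors,
    quantize them into K codebook vectors, return codebook and index table."""
    M, N = W.shape
    assert N % d == 0
    subvecs = W.reshape(M * (N // d), d)      # all subvectors, row-major
    codebook, idx = kmeans(subvecs, K)
    return codebook, idx.reshape(M, N // d)   # id(i, j) table

W = np.random.randn(200, 40).astype(np.float32)
C, ids = vq_compress(W, d=4, K=64)
W_star = C[ids].reshape(W.shape)              # the quantized matrix W*
print("quantization MSE:", float(((W - W_star) ** 2).mean()))
```

The sketch uses brute-force distance computation, which is fine at toy scale; a production quantizer for a 20000 × 500 matrix would batch or approximate the assignment step.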
Compressing a DNN with vector quantization degrades its performance, and the degree of degradation depends on the quantization error. Conventional vector quantization, however, uses only a single-stage codebook: when the codebook is small (i.e., it contains few codebook vectors), the quantization error is high, and reducing that error forces the codebook size to grow exponentially, which greatly increases the amount of computation and defeats the purpose of saving space and computation.
Disclosure of Invention
The invention aims to solve the problem of the large quantization error incurred when compressing a DNN with vector quantization. It provides a method that compresses the DNN with multi-stage vector quantization: a second quantization stage is added to quantize the residual of the first stage, and the original weight matrix is finally replaced by the two-stage codebooks. This greatly reduces the quantization error while avoiding exponential growth of the codebook size.
To achieve the above object, the invention provides a compression method for a neural network acoustic model, the method comprising: dividing the row vectors of the output-layer weight matrix W of the neural network acoustic model into a plurality of subvectors of a specified dimension; performing first-stage vector quantization on the subvectors to obtain a first-stage codebook, and replacing the subvectors of the matrix W with first-stage codebook vectors to obtain a matrix W*; using the matrices W and W* to calculate a residual matrix R, and performing second-stage vector quantization on the vectors of R to obtain a second-stage codebook, replacing the vectors of R with second-stage codebook vectors to obtain a matrix R*; and finally using the matrices W* and R* to represent the weight matrix W.
In the above technical solution, the method specifically includes:
Step S1) Split the row vectors of the output-layer weight matrix W of the neural network acoustic model into subvectors of dimension d:

$$W = \begin{bmatrix} w_1^{(1)T} & \cdots & w_1^{(J)T} \\ \vdots & & \vdots \\ w_M^{(1)T} & \cdots & w_M^{(J)T} \end{bmatrix}, \qquad w_i^{(j)} \in \mathbb{R}^d, \quad J = N/d,$$

where W is an M × N matrix.

Step S2) Perform first-stage vector quantization on the subvectors obtained in step S1) to obtain a first-stage codebook, and replace the subvectors of the matrix W with first-stage codebook vectors to obtain the matrix W*.

The first-stage vector quantization of the subvectors yields the first-stage codebook

$$C^{(1)} = \{c_1^{(1)}, \ldots, c_{K_1}^{(1)}\}, \qquad c_k^{(1)} \in \mathbb{R}^d,$$

which contains $K_1$ codebook vectors. Let the index in $C^{(1)}$ of the first-stage codebook vector corresponding to the $j$-th subvector of the $i$-th row of W be $id^{(1)}(i,j) \in \{1, \ldots, K_1\}$; the corresponding codebook vector is then $c^{(1)}_{id^{(1)}(i,j)}$.

Replacing each subvector $w_i^{(j)}$ of the matrix W with the codebook vector $c^{(1)}_{id^{(1)}(i,j)}$ gives the matrix W*:

$$W^* = \begin{bmatrix} c_{id^{(1)}(1,1)}^{(1)T} & \cdots & c_{id^{(1)}(1,J)}^{(1)T} \\ \vdots & & \vdots \\ c_{id^{(1)}(M,1)}^{(1)T} & \cdots & c_{id^{(1)}(M,J)}^{(1)T} \end{bmatrix}.$$
Step S3) using the matrices W and W*Calculating a residual error matrix R, and performing two-stage vector quantization on a vector of R; obtaining a secondary codebook, and replacing the vector of the matrix R with the vector of the secondary codebook to obtain the matrix R*
Calculating a residual matrix R:
Figure BDA0000866448360000031
wherein the content of the first and second substances,
Figure BDA0000866448360000032
for vector
Figure BDA0000866448360000033
Performing two-stage vector quantization to obtain a two-stage codebook
Figure BDA0000866448360000034
The codebook contains K2A codebook vector corresponding to the jth sub-vector in the ith row of the weight matrix R is set as C(2)Index value in is id(2)(i,j)∈{1,…,K2Is the corresponding codebook vector of
Figure BDA0000866448360000035
Replacement of the corresponding sub-vectors of the matrix R by codebook vectors
Figure BDA0000866448360000036
Obtain a matrix R*
Figure BDA0000866448360000037
Step S4) uses the matrix W*And R*Representing the weight matrix W:
Figure BDA0000866448360000038
subvectors in the matrix W
Figure BDA0000866448360000039
Index in the two-level codebook is id(1)(i, j) and id(2)(i, j); thus storage W is converted to storage id(1)(i, j) and id(2)(i,j)。
In the above technical solution, the value of d in step S1) satisfies the following condition: the number of columns N of the matrix W is divisible by d.
Based on the above compression method for a neural network acoustic model, the invention also provides a speech recognition method, comprising the following steps:
Step T1) For an input speech feature vector, after the forward computation of the input layer and the hidden layers, obtain the activation vector $x \in \mathbb{R}^N$ feeding the output layer, and split it into subvectors of dimension d:

$$x = [x^{(1)T}, \ldots, x^{(J)T}]^T, \qquad x^{(j)} \in \mathbb{R}^d, \quad J = N/d.$$

Step T2) Compute the output layer $y = W \cdot x$. This specifically comprises the following:

The weight matrix W is represented by the two codebooks $C^{(1)}$ and $C^{(2)}$ and the corresponding indices $id^{(1)}(i,j)$ and $id^{(2)}(i,j)$, where $i \in \{1, 2, \ldots, M\}$ and $j \in \{1, 2, \ldots, J\}$.

Traverse $j = 1, \ldots, J$; for $i = 1, 2, \ldots, M$, compute in turn $c^{(1)T}_{id^{(1)}(i,j)} x^{(j)}$ and $c^{(2)T}_{id^{(2)}(i,j)} x^{(j)}$. If during this process $id^{(k)}(i,j) = id^{(k)}(i',j)$ for some $k \in \{1,2\}$ and $i' > i$, then when computing $c^{(k)T}_{id^{(k)}(i',j)} x^{(j)}$, directly reuse the result of $c^{(k)T}_{id^{(k)}(i,j)} x^{(j)}$. Then compute:

$$y_i = \sum_{j=1}^{J} \left( c^{(1)}_{id^{(1)}(i,j)} + c^{(2)}_{id^{(2)}(i,j)} \right)^T x^{(j)},$$

obtaining the output $y = [y_1, \ldots, y_i, \ldots, y_M]$.

Step T3) Apply the softmax function to y to obtain the likelihood values $a = [a_1, \ldots, a_M]$, where

$$a_i = \frac{\exp(y_i)}{\sum_{m=1}^{M} \exp(y_m)}.$$

Step T4) Send a to the decoder for decoding, obtaining a recognition result in text form.
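A minimal sketch of the output-layer computation in steps T2)–T3) follows, caching the per-column dot products so that rows sharing a codebook index reuse them. Variable names and the toy dimensions are assumptions; for clarity the cache is keyed per column j and codebook entry rather than scanning for repeated indices as the prose describes:

```python
import numpy as np

def output_layer(C1, id1, C2, id2, x, d):
    """y_i = sum_j (c1[id1(i,j)] + c2[id2(i,j)]) . x_j, with the dot
    product of each codebook vector and each activation segment x_j
    computed at most once and shared across rows (step T2)."""
    M, J = id1.shape
    y = np.zeros(M, dtype=x.dtype)
    for j in range(J):
        xj = x[j * d:(j + 1) * d]
        cache1, cache2 = {}, {}               # per-column shared products
        for i in range(M):
            k1, k2 = id1[i, j], id2[i, j]
            if k1 not in cache1:
                cache1[k1] = C1[k1] @ xj
            if k2 not in cache2:
                cache2[k2] = C2[k2] @ xj
            y[i] += cache1[k1] + cache2[k2]
    return y

def softmax(y):
    e = np.exp(y - y.max())                   # step T3, numerically stable
    return e / e.sum()

# Toy usage with random codebooks and index tables (illustrative only).
M, N, d, K1, K2 = 50, 40, 4, 16, 16
J = N // d
rng = np.random.default_rng(0)
C1 = rng.standard_normal((K1, d)); C2 = rng.standard_normal((K2, d))
id1 = rng.integers(0, K1, (M, J)); id2 = rng.integers(0, K2, (M, J))
x = rng.standard_normal(N)
a = softmax(output_layer(C1, id1, C2, id2, x, d))  # likelihoods for decoding
```

Keying the cache per column means each distinct codebook index in a column costs exactly one d-dimensional dot product, which is the saving the patent attributes to shared subvectors.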
The invention has the following advantages: the method reduces the storage space of the neural network acoustic model, greatly reduces the quantization error, and avoids exponential growth of the codebook size.
Drawings
FIG. 1 is a flow chart of a neural network acoustic model compression method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in FIG. 1, the method for compressing a neural network acoustic model comprises:
Step S1) Split the row vectors of the output-layer weight matrix W of the neural network acoustic model (DNN) into subvectors of dimension d:

$$W = \begin{bmatrix} w_1^{(1)T} & \cdots & w_1^{(J)T} \\ \vdots & & \vdots \\ w_M^{(1)T} & \cdots & w_M^{(J)T} \end{bmatrix}, \qquad w_i^{(j)} \in \mathbb{R}^d, \quad J = N/d,$$

where W is an M × N matrix.
in this embodiment, the DNN model has 7 layers, wherein the scale of the weight matrix of 5 hidden layers is 5000 × 500, the scale of the weight matrix of the input layer is 5000 × 360, and the scale of the weight matrix of the output layer is 20000 × 500. The dimension of the input observation vector is 360, specifically, the dimension of the input observation vector is 40-dimensional features obtained by performing expansion, Linear Discriminant Analysis (LDA), Maximum Likelihood Linear Transformation (MLLT) and maximum likelihood linear regression (FMLLR) on 13-dimensional mel-domain cepstrum coefficient (MFCC) features, and then the input observation vector is subjected to expansion of 4 frames of context to obtain input features of (4+1+4) × 40 ═ 360 dimensions. The adopted data set is a standard English data set Switchboard, the training data is 286 hours, and the testing data is 3 hours; the output layer parameters account for about half of the total model parameters.
In this embodiment, d = 4 is taken, so each row of W is split into J = 500/4 = 125 subvectors.
Step S2) Perform first-stage vector quantization on the subvectors obtained in step S1) using a codebook of size 1024, obtaining the first-stage codebook

$$C^{(1)} = \{c_1^{(1)}, \ldots, c_{K_1}^{(1)}\}, \qquad c_k^{(1)} \in \mathbb{R}^d, \quad K_1 = 1024.$$

The codebook contains $K_1$ codebook vectors. Let the index in $C^{(1)}$ of the codebook vector corresponding to the $j$-th subvector of the $i$-th row of W be $id^{(1)}(i,j) \in \{1, \ldots, K_1\}$; the corresponding codebook vector is then $c^{(1)}_{id^{(1)}(i,j)}$.

Replacing each subvector of the matrix W with its codebook vector gives the matrix W*:

$$W^* = \begin{bmatrix} c_{id^{(1)}(1,1)}^{(1)T} & \cdots & c_{id^{(1)}(1,J)}^{(1)T} \\ \vdots & & \vdots \\ c_{id^{(1)}(M,1)}^{(1)T} & \cdots & c_{id^{(1)}(M,J)}^{(1)T} \end{bmatrix}.$$
Step S3) using the matrices W and W*Calculating a residual error matrix R, and performing two-stage vector quantization on a vector of R; obtaining a secondary codebook, and replacing the vector of the matrix R with the vector of the secondary codebook to obtain the matrix R*
Calculating the residual error of the first-stage quantization to obtain a residual error matrix R:
Figure BDA0000866448360000054
wherein the content of the first and second substances,
Figure BDA0000866448360000055
performing secondary vector quantization on the residual vector by adopting a 1024-scale codebook to obtain a secondary codebook
Figure BDA0000866448360000056
The codebook contains K2A codebook vector corresponding to the jth sub-vector in the ith row of the weight matrix R is set as C(2)Index value in is id(2)(i,j)∈{1,…,K2Is the corresponding codebook vector of
Figure BDA0000866448360000057
Using codebook vectors
Figure BDA0000866448360000058
Sub-vectors replacing the corresponding matrix R
Figure BDA0000866448360000059
Obtain a matrix R*
Figure BDA00008664483600000510
Step S4) uses the matrix W*And R*Representing the weight matrix W:
Figure BDA00008664483600000511
subvectors in the matrix W
Figure BDA00008664483600000512
Index in the two-level codebook is id(1)(i, j) and id(2)(i, j); thus storage W is converted to storage id(1)(i, j) and id(2)(i,j);
The method of the invention retains the computation-saving property of the conventional method. Here a subvector is quantized into the sum of codebook vectors drawn from two different codebooks, so in the DNN forward computation the product of a single subvector and an activation segment decomposes into two separately computed parts:

$$\left( c^{(1)}_{id^{(1)}(i,j)} + c^{(2)}_{id^{(2)}(i,j)} \right)^T x^{(j)} = c^{(1)T}_{id^{(1)}(i,j)} x^{(j)} + c^{(2)T}_{id^{(2)}(i,j)} x^{(j)}.$$

Whenever subvectors in the same column share a codebook vector in either the first- or the second-stage quantization, the corresponding term can be reused and the computation simplified.
Based on the above neural network acoustic model compression method, the invention also provides a speech recognition method, comprising the following steps:
Step T1) For an input speech feature vector, after the forward computation of the input layer and the hidden layers, obtain the activation vector $x \in \mathbb{R}^N$ feeding the output layer, and split it into subvectors of dimension d:

$$x = [x^{(1)T}, \ldots, x^{(J)T}]^T, \qquad x^{(j)} \in \mathbb{R}^d, \quad J = N/d.$$

In this embodiment, corresponding to the output-layer weight matrix $W \in \mathbb{R}^{M \times N}$, M = 20000, N = 500, and d = 4.
Step T2) computing the output layer
Figure BDA0000866448360000065
Since the weight matrix W can be composed of two codebooks C(1)And C(2)And corresponding index id(1)(i, j) and id(2)(i, j) where i ∈ {1,2, …, M },
Figure BDA0000866448360000066
go through
Figure BDA0000866448360000067
For i ═ 1,2, …, M, calculated sequentially
Figure BDA0000866448360000068
And
Figure BDA0000866448360000069
if in the process there is an id(k)(i,j)=id(k)(i′,j),k∈{1,2},i′>i, then calculating
Figure BDA00008664483600000610
When it is used directly
Figure BDA00008664483600000611
Thereby saving the amount of calculation;
and (3) calculating:
Figure BDA00008664483600000612
obtaining an output: y ═ y1,…,yi,…,yM];
Step T3) carrying out softmax warping on y to obtain a likelihood value
Figure BDA00008664483600000613
Wherein
Figure BDA00008664483600000614
Step T4) sending a to a decoder for decoding; a recognition result in text form is obtained.
The performance of this embodiment is analyzed below.
The word error rate (WER) of each model is measured on the test set: the uncompressed model, models compressed with single-stage vector quantization (codebooks of size 1024 and 8192), and the model compressed with multi-stage vector quantization (a 1024-entry codebook for the first stage and a 1024-entry codebook for the second stage).
The word error rate is calculated as follows:

$$\mathrm{WER} = \frac{S + D + I}{N_{\mathrm{ref}}} \times 100\%,$$

where S, D, and I are the numbers of substituted, deleted, and inserted words, and $N_{\mathrm{ref}}$ is the number of words in the reference transcription.
The compression ratio is the ratio of the storage space required after compression to that required before compression; for the two-stage scheme it is calculated as:

$$\text{compression ratio} = \frac{\mathrm{sizeof(data)} \times d \times (K_1 + K_2) + \log_2(K_1 \times K_2) \times M \times J}{\mathrm{sizeof(data)} \times M \times N},$$
where M and N are the numbers of rows and columns of the matrix, equal to 20000 and 500 respectively; J is the number of subvectors per row, J = 500/4 = 125; $K_1$ and $K_2$ are the sizes of the two-stage codebooks; and sizeof(data) is the number of bits required to store a single value, e.g., 32 bits for floating-point data.
The storage space required by the weight matrix after the two-stage vector quantization compression is as follows:
sizeof(data)×d×(K1+K2)+log2(K1×K2)×M×J。
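Plugging the embodiment's numbers into these formulas gives a quick sanity check; this is a sketch, with the 32-bit float assumption following the sizeof(data) example above:

```python
import math

M, N, d = 20000, 500, 4
J = N // d                      # 125 subvectors per row
bits = 32                       # sizeof(data) for 32-bit floats

def storage_bits(codebook_sizes):
    """Codebook storage + per-subvector index storage, in bits."""
    idx_bits = sum(math.log2(K) for K in codebook_sizes)
    return bits * d * sum(codebook_sizes) + idx_bits * M * J

original = bits * M * N                          # 320,000,000 bits
for name, sizes in [("single 1024", [1024]),
                    ("single 8192", [8192]),
                    ("two-stage 1024+1024", [1024, 1024])]:
    print(name, round(storage_bits(sizes) / original, 3))
# single 1024: ~0.079, single 8192: ~0.105, two-stage: ~0.157 --
# the two-stage model is larger than the single-stage 8192 model because
# each subvector now stores 20 index bits instead of 13, which matches
# the compression-ratio comparison discussed below.
```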
the results are shown in Table 1:
TABLE 1
[Table 1 in the original is an image; it lists the WER and compression ratio of the uncompressed baseline, the single-stage VQ models (1024- and 8192-entry codebooks), and the two-stage VQ model (1024 + 1024). The numeric values are not recoverable from this text.]
The experimental results show that with single-stage vector quantization the quantization error is large and the performance of the compressed DNN is clearly degraded, whereas compressing the DNN with multi-stage vector quantization requires only two small codebooks, greatly reduces the quantization error, and leaves the model's recognition performance nearly lossless. Comparing the last two rows of the table ("8192" versus "1024 + 1024"): the compression ratio of the multi-stage model is higher than that of the single-stage model, because the added second-stage codebook requires extra space to record indices. However, since the total codebook size is smaller, the multi-stage method reduces the amount of computation more effectively than the single-stage method, achieving performance-lossless compression of the DNN while avoiding exponential growth of the codebook size.

Claims (3)

1. A method for compressing a neural network acoustic model, the method comprising: dividing the row vectors of an output-layer weight matrix W of the neural network acoustic model into a plurality of subvectors of a specified dimension; performing first-stage vector quantization on the subvectors to obtain a first-stage codebook, and replacing the subvectors of the matrix W with first-stage codebook vectors to obtain a matrix W*; using the matrices W and W* to calculate a residual matrix R, and performing second-stage vector quantization on the vectors of R to obtain a second-stage codebook, replacing the vectors of the matrix R with second-stage codebook vectors to obtain a matrix R*; and finally using the matrices W* and R* to represent the weight matrix W;
the method specifically comprises the following steps:
step S1) split the row vectors of the output-layer weight matrix W of the neural network acoustic model into subvectors of dimension d:

$$W = \begin{bmatrix} w_1^{(1)T} & \cdots & w_1^{(J)T} \\ \vdots & & \vdots \\ w_M^{(1)T} & \cdots & w_M^{(J)T} \end{bmatrix}, \qquad w_i^{(j)} \in \mathbb{R}^d, \quad J = N/d,$$

wherein W is an M × N matrix;

step S2) perform first-stage vector quantization on the subvectors obtained in step S1) to obtain a first-stage codebook, and replace the subvectors of the matrix W with first-stage codebook vectors to obtain the matrix W*:

the first-stage vector quantization of the subvectors yields the first-stage codebook

$$C^{(1)} = \{c_1^{(1)}, \ldots, c_{K_1}^{(1)}\}, \qquad c_k^{(1)} \in \mathbb{R}^d,$$

which contains $K_1$ codebook vectors; the index in $C^{(1)}$ of the first-stage codebook vector corresponding to the $j$-th subvector of the $i$-th row of the weight matrix W is $id^{(1)}(i,j) \in \{1, \ldots, K_1\}$, and the corresponding codebook vector is $c^{(1)}_{id^{(1)}(i,j)}$;

replacing each subvector $w_i^{(j)}$ of the matrix W with the codebook vector $c^{(1)}_{id^{(1)}(i,j)}$ gives the matrix W*:

$$W^* = \begin{bmatrix} c_{id^{(1)}(1,1)}^{(1)T} & \cdots & c_{id^{(1)}(1,J)}^{(1)T} \\ \vdots & & \vdots \\ c_{id^{(1)}(M,1)}^{(1)T} & \cdots & c_{id^{(1)}(M,J)}^{(1)T} \end{bmatrix};$$
Step S3) using the matrices W and W*Calculating a residual error matrix R, and performing two-stage vector quantization on a vector of R; obtaining a secondary codebook, and replacing the vector of the matrix R with the vector of the secondary codebook to obtain the matrix R*
Calculating a residual matrix R:
Figure FDA0002358137680000018
wherein the content of the first and second substances,
Figure FDA0002358137680000019
for vector
Figure FDA00023581376800000110
Performing two-stage vector quantization to obtain a two-stage codebook
Figure FDA00023581376800000111
The codebook contains K2A codebook vector corresponding to the jth sub-vector in the ith row of the weight matrix R is set as C(2)Index value in is id(2)(i,j)∈{1,…,K2Is the corresponding codebook vector of
Figure FDA0002358137680000021
Replacement of the corresponding sub-vectors of the matrix R by codebook vectors
Figure FDA0002358137680000022
Obtain a matrix R*
Figure FDA0002358137680000023
Step S4) uses the matrix W*And R*Representing the weight matrix:
Figure FDA0002358137680000024
subvectors in the matrix W
Figure FDA0002358137680000025
Index in the two-level codebook is id(1)(i, j) and id(2)(i, j); thus storage W is converted to storage id(1)(i, j) and id(2)(i,j)。
2. The method for compressing a neural network acoustic model according to claim 1, wherein the value of d in step S1) satisfies the following condition: the number of columns N of the matrix W is divisible by d.
3. A speech recognition method implemented on the basis of the compression method of the neural network acoustic model of claim 2, the method comprising:
step T1) for an input speech feature vector, after the forward computation of the input layer and the hidden layers, obtain the activation vector $x \in \mathbb{R}^N$ feeding the output layer, and split it into subvectors of dimension d:

$$x = [x^{(1)T}, \ldots, x^{(J)T}]^T, \qquad x^{(j)} \in \mathbb{R}^d, \quad J = N/d;$$

step T2) compute the output layer $y = W \cdot x$, which specifically comprises:

the weight matrix W is represented by the two codebooks $C^{(1)}$ and $C^{(2)}$ and the corresponding indices $id^{(1)}(i,j)$ and $id^{(2)}(i,j)$, where $i \in \{1, 2, \ldots, M\}$ and $j \in \{1, 2, \ldots, J\}$;

traverse $j = 1, \ldots, J$; for $i = 1, 2, \ldots, M$, compute in turn $c^{(1)T}_{id^{(1)}(i,j)} x^{(j)}$ and $c^{(2)T}_{id^{(2)}(i,j)} x^{(j)}$; if during this process $id^{(k)}(i,j) = id^{(k)}(i',j)$ for some $k \in \{1,2\}$ and $i' > i$, then when computing $c^{(k)T}_{id^{(k)}(i',j)} x^{(j)}$, directly reuse the result of $c^{(k)T}_{id^{(k)}(i,j)} x^{(j)}$; then compute:

$$y_i = \sum_{j=1}^{J} \left( c^{(1)}_{id^{(1)}(i,j)} + c^{(2)}_{id^{(2)}(i,j)} \right)^T x^{(j)},$$

obtaining the output $y = [y_1, \ldots, y_i, \ldots, y_M]$;

step T3) apply the softmax function to y to obtain the likelihood values $a = [a_1, \ldots, a_M]$, where

$$a_i = \frac{\exp(y_i)}{\sum_{m=1}^{M} \exp(y_m)};$$

step T4) send a to a decoder for decoding, obtaining a recognition result in text form.
CN201510881044.4A 2015-12-03 2015-12-03 Neural network acoustic model compression and voice recognition method Active CN106847268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510881044.4A CN106847268B (en) 2015-12-03 2015-12-03 Neural network acoustic model compression and voice recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510881044.4A CN106847268B (en) 2015-12-03 2015-12-03 Neural network acoustic model compression and voice recognition method

Publications (2)

Publication Number Publication Date
CN106847268A CN106847268A (en) 2017-06-13
CN106847268B true CN106847268B (en) 2020-04-24

Family

ID=59149498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510881044.4A Active CN106847268B (en) 2015-12-03 2015-12-03 Neural network acoustic model compression and voice recognition method

Country Status (1)

Country Link
CN (1) CN106847268B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109147773B (en) * 2017-06-16 2021-10-26 上海寒武纪信息科技有限公司 Voice recognition device and method
CN110809771A (en) * 2017-07-06 2020-02-18 谷歌有限责任公司 System and method for compression and distribution of machine learning models

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982803A (en) * 2012-12-11 2013-03-20 华南师范大学 Isolated word speech recognition method based on HRSF and improved DTW algorithm

Also Published As

Publication number Publication date
CN106847268A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
Tang et al. Deep speaker embedding learning with multi-level pooling for text-independent speaker verification
US20100217753A1 (en) Multi-stage quantization method and device
US10115393B1 (en) Reduced size computerized speech model speaker adaptation
Chang et al. A Segment-based Speech Recognition System for Isolated Mandarin Syllables
JP2004341532A (en) Adaptation of compressed acoustic model
Senior et al. Fine context, low-rank, softplus deep neural networks for mobile speech recognition
Hong et al. Statistics pooling time delay neural network based on x-vector for speaker verification
US8386249B2 (en) Compressing feature space transforms
CN111008517A (en) Tensor decomposition technology-based neural language model compression method
CN106847268B (en) Neural network acoustic model compression and voice recognition method
CN111814448A (en) Method and device for quantizing pre-training language model
CN114418088A (en) Model training method
US9792910B2 (en) Method and apparatus for improving speech recognition processing performance
CN112652299B (en) Quantification method and device of time series speech recognition deep learning model
US20180165578A1 (en) Deep neural network compression apparatus and method
Sakthi et al. Speech Recognition model compression
US20220092382A1 (en) Quantization for neural network computation
Marcheret et al. Optimal quantization and bit allocation for compressing large discriminative feature space transforms
CN111368976B (en) Data compression method based on neural network feature recognition
CN117133275B (en) Parallelization voice recognition model establishment method based on unit dot product similarity characteristics
Paliwal et al. Scalable distributed speech recognition using multi-frame GMM-based block quantization.
Pereira et al. Evaluating Robustness to Noise and Compression of Deep Neural Networks for Keyword Spotting
Sun et al. Combination of sparse classification and multilayer perceptron for noise-robust ASR
Hamid Speaker Sound Coding Using Vector Quantization Technique (Vq)
Tan et al. Quantization of speech features: source coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant