CN105632505B - Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model - Google Patents
- Publication number
- CN105632505B (application CN201410710991.2A)
- Authority
- CN
- China
- Prior art keywords
- frequency band
- vector
- coefficient
- mapping
- mapping matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
- G10L19/032—Quantisation or dequantisation of spectral components
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to an encoding and decoding method and device for a Principal Component Analysis (PCA) mapping model. The encoding method comprises the following steps: performing frequency band combination processing on the frequency bands obtained by frequency band division to obtain frequency band groups; determining a first mapping matrix for each of the band groups, the first mapping matrix being the mapping matrix of a PCA mapping model shared by the bands in that band group; and carrying out quantization coding on the first mapping matrix. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every frequency band obtained by the frequency band division; instead, the frequency band combination processing reduces the number of mapping matrices to be encoded from one per frequency band to one per frequency band group, which effectively reduces the coding rate.
Description
Technical Field
The present invention relates to the field of audio processing technologies, and in particular, to a coding and decoding method and apparatus for a Principal Component Analysis (PCA) mapping model.
Background
With the development of technology, a variety of coding techniques for sound signals have appeared, and the sound signals are generally digital sounds including signals perceivable to human ears, such as speech, music, natural sounds, and artificially synthesized sounds. When a multi-channel sound signal is encoded, encoding of a PCA mapping model is usually involved.
In the prior art, when a multi-channel sound signal is encoded, the signal is first divided into frequency bands. Accordingly, when the PCA mapping model is encoded, the mapping matrix corresponding to each divided frequency band is quantized and encoded. Because the number of mapping matrices to be encoded is large, the coding rate of the PCA mapping model is too high.
Disclosure of Invention
The invention provides a coding and decoding method and device of a PCA mapping model, which effectively reduce the coding code rate of the PCA mapping model.
In order to achieve the above object, in a first aspect, the present invention provides a coding method for a PCA mapping model, the method including:
performing frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
determining a first mapping matrix for each of the band groups, the first mapping matrix being a mapping matrix of a set of PCA mapping models common to the bands in the band group;
and carrying out quantization coding on the first mapping matrix.
In a second aspect, the present invention provides a method for decoding a PCA mapping model, where the method includes:
determining a vector encoded in the encoded mapping matrix;
decoding the coded coefficient in the vector to obtain a reconstruction value of the coefficient;
reconstructing the vector from the reconstructed values of the coefficients;
and reconstructing the mapping matrix according to the vector, wherein the mapping matrix is the one determined for each band group after band combination processing has been performed on the bands obtained by band division to obtain the band groups.
In a third aspect, the present invention provides an apparatus for coding a PCA mapping model, the apparatus comprising:
a frequency band combination unit, configured to perform frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
a matrix determining unit, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit, where the first mapping matrix is a mapping matrix of a group of PCA mapping models shared by the bands in the band group;
and the coding unit is used for carrying out quantization coding on the first mapping matrix determined by the matrix determination unit.
In a fourth aspect, the present invention provides an apparatus for decoding a PCA mapping model, the apparatus comprising:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
and a matrix reconstruction unit, configured to reconstruct the mapping matrix according to the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each frequency band group after frequency band combination processing has been performed on the frequency bands obtained by frequency band division to obtain the frequency band groups.
The coding method of the PCA mapping model according to the embodiment of the present invention performs band combination processing on the frequency bands obtained by frequency band division to obtain frequency band groups, then determines a first mapping matrix for each frequency band group, where the first mapping matrix is the mapping matrix of a PCA mapping model shared by the frequency bands in that group, and then performs quantization coding on the first mapping matrix. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every frequency band obtained by the frequency band division; instead, the frequency band combination processing reduces the number of mapping matrices to be encoded from one per frequency band to one per frequency band group, which effectively reduces the coding rate.
Drawings
FIG. 1 is a flow chart of a method for encoding a PCA mapping model according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an encoding apparatus of a PCA mapping model according to another embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Fig. 1 is a flowchart of a coding method of a PCA mapping model in an embodiment of the present invention, in which the frequency bands obtained by band division are first subjected to band combination processing, and the mapping matrix selected for coding is then subjected to quantization coding. The method includes:
step 101, performing band combination processing on each band after band division to obtain each band group.
The frequency band groups can be obtained by performing frequency band combination processing on each frequency band after frequency band division according to the characteristics of the frequency band signals and/or the psychoacoustic model and/or the similarity of model parameters.
In the embodiment of the invention, the frequency band combination processing may use any one of the following modes or any combination of them:
mode 1, comparing the energies of two adjacent frequency bands, and when the energy of one frequency band is lower than an energy threshold calculated from the energy of the adjacent frequency band, combining the two frequency bands into one frequency band group;
mode 2, calculating a masking threshold of a frequency band according to a psychoacoustic model, and when the energy of the frequency band is lower than the masking threshold, combining the frequency band with an adjacent frequency band into one frequency band group;
mode 3, calculating the distances between the mapping matrices of two or more adjacent frequency bands, and when the maximum distance is smaller than a distance threshold, combining those frequency bands into one frequency band group.
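The first mode can be illustrated with a minimal Python sketch. The function name `group_bands_by_energy`, the parameter `ratio`, and the rule "threshold = ratio × neighbour energy" are assumptions made for illustration; the patent only states that the threshold is calculated from the energy of the adjacent band.

```python
def group_bands_by_energy(energies, ratio=0.1):
    """Group adjacent bands using an energy comparison (first mode).

    `ratio` is a hypothetical tuning parameter: a band is merged with its
    left neighbour when either band's energy falls below `ratio` times
    the other band's energy.
    """
    if not energies:
        return []
    groups = [[0]]
    for k in range(1, len(energies)):
        prev, cur = energies[k - 1], energies[k]
        if cur < ratio * prev or prev < ratio * cur:
            groups[-1].append(k)   # weak band joins its neighbour's group
        else:
            groups.append([k])     # comparable energies: start a new group
    return groups


band_groups = group_bands_by_energy([10.0, 0.5, 8.0, 7.0, 0.1])
print(band_groups)
```

In this sketch merging is chained, so a weak band can pull both of its neighbours into the same group; whether that is desirable depends on the codec design.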
Step 102, determining a first mapping matrix for each of the band groups, where the first mapping matrix is a mapping matrix of a set of PCA mapping models shared by the bands in the band group.
When determining the first mapping matrix for the frequency band group, one mapping matrix may be selected from the mapping matrices corresponding to each frequency band in the frequency band group as the first mapping matrix, for example, the mapping matrix corresponding to the frequency band with the highest frequency band energy may be selected as the first mapping matrix; the mapping matrix may also be recalculated for the band group. In the embodiment of the present invention, the first mapping matrix may be determined for each frequency band group in various ways.
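As a small illustration of the selection option, the sketch below picks the mapping matrix of the highest-energy band in a group. The function name and the list-based data layout are assumptions; the matrices are represented by placeholder strings.

```python
def select_first_mapping_matrix(group, energies, matrices):
    """Return the mapping matrix of the highest-energy band in `group`.

    `group` is a list of band indices; `energies` and `matrices` are
    indexed by band number (a hypothetical layout).
    """
    strongest = max(group, key=lambda k: energies[k])
    return matrices[strongest]


energies = [10.0, 0.5, 8.0]
matrices = ["W0", "W1", "W2"]   # placeholders standing in for real matrices
chosen = select_first_mapping_matrix([0, 1, 2], energies, matrices)
```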
Step 103, performing quantization coding on the first mapping matrix.
In order to further reduce the coding rate, in the embodiment of the invention, all the coefficients in the first mapping matrix are not subjected to quantization coding, but part of the coefficients are selected from the first mapping matrix for quantization coding according to the characteristics of a PCA mapping model.
Specifically, the coefficients to be encoded may be selected from the first mapping matrix according to the dimensionality of the PCA analysis and the number of groups of the multi-channel sound signal to be encoded, and quantized and encoded.
Further, the vector needing to be coded in the first mapping matrix can be determined according to the PCA grouping number and the grouping condition of the multi-channel sound signal selected for coding; and carrying out quantization coding on the coefficient to be coded in the vector.
The following describes the quantization coding of the mapping matrix in detail.
Because the mapping matrix is composed of a series of coefficients, the coefficients to be coded can be selected from the mapping matrix and quantized and coded according to the dimensionality of PCA analysis and the grouping number of the multi-channel sound signals for coding in the embodiment of the invention. According to the relation among the coefficients of the mapping matrix, it can be known that not all matrix coefficients need to be quantized and encoded, and some matrix coefficients do not need to be encoded, and can be obtained by operation according to the encoded coefficient values, and some matrix coefficients only need to be encoded with sign bits. By organizing and selecting the coefficients, the purpose of reducing the coding rate can be achieved.
When performing PCA analysis on two channel signals, the mapping matrix W(t, k) is a 2 × 2 matrix with 4 coefficients, where t is the frame (or sub-frame) number and k is the frequency number.
W(t, k) can be represented by the following formula:
$$W(t,k) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
W(t, k) is a unit orthogonal matrix and satisfies the following condition:
$$W(t,k)\,W(t,k)^{T} = I$$
Thus, W(t, k) can be expressed, up to the sign convention of its rows, as
$$W(t,k) = \begin{bmatrix} \cos\beta & \sin\beta \\ -\sin\beta & \cos\beta \end{bmatrix}$$
From the above, it can be seen that only β, or a transformed form of it such as cos β or sin β, needs to be encoded.
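A minimal sketch of coding only β, assuming the rotation form W = [[cos β, sin β], [−sin β, cos β]] and a uniform scalar quantizer. The function names and the 6-bit resolution are illustrative assumptions, not taken from the patent.

```python
import math


def encode_beta(w, bits=6):
    """Quantize the rotation angle of a 2x2 matrix [[c, s], [-s, c]]."""
    beta = math.atan2(w[0][1], w[0][0])      # angle in (-pi, pi]
    levels = 1 << bits
    step = 2 * math.pi / levels
    return int(round(beta / step)) % levels  # wrap negative angles


def decode_beta(index, bits=6):
    """Rebuild the full 2x2 matrix from the quantized angle index."""
    step = 2 * math.pi / (1 << bits)
    beta = index * step
    c, s = math.cos(beta), math.sin(beta)
    return [[c, s], [-s, c]]


beta = 0.7
w = [[math.cos(beta), math.sin(beta)], [-math.sin(beta), math.cos(beta)]]
w_hat = decode_beta(encode_beta(w))
```

One index replaces four coefficients, and the decoded matrix is orthogonal by construction regardless of the quantization error in β.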
When performing PCA analysis on four channel signals, the mapping matrix W(t, k) is a 4 × 4 matrix with 16 coefficients. W(t, k) can be represented by:
$$W(t,k) = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$$
W(t, k) is a unit orthogonal matrix and satisfies the following condition:
$$W(t,k)\,W(t,k)^{T} = I$$
When only the first principal component in a multi-channel sound signal is encoded, only a11, a12, a13, a14 need to be encoded. Because they satisfy
$$a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1,$$
three of the four coefficients a11, a12, a13 and a14 can be selected for quantization coding, while for the fourth coefficient only the sign bit is coded, or nothing is coded, and its absolute value is obtained from the other three coefficients. The selection may be based on the absolute values of the coefficients, their positions, or the like. For example, if a14 is selected to have only its sign bit coded and the remaining coefficients are quantized and coded, the absolute value of a14 can be calculated by the formula
$$|a_{14}| = \sqrt{1 - a_{11}^2 - a_{12}^2 - a_{13}^2}.$$
As another example, the coefficient with the maximum absolute value may be selected for sign-only coding while the remaining coefficients are quantized and coded; if the process of solving W(t, k) guarantees that the coefficient with the maximum absolute value in each vector always has a known sign (always positive or always negative), that coefficient is not coded at all and only the remaining coefficients are quantized and coded.
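The sign-only coding of one coefficient can be sketched as follows, assuming a uniform scalar quantizer with a hypothetical step of 1/64 and choosing the last coefficient for sign-only coding. The function names are illustrative.

```python
import math


def quantize(x, step=1 / 64):
    """Uniform scalar quantizer (hypothetical step size)."""
    return round(x / step) * step


def encode_unit_row(row, step=1 / 64):
    """Quantize all but the last coefficient; keep only the last one's sign."""
    q = [quantize(v, step) for v in row[:-1]]
    sign = 1 if row[-1] >= 0 else 0
    return q, sign


def decode_unit_row(q, sign):
    """Recover the last coefficient from the unit-norm condition."""
    rest = 1.0 - sum(v * v for v in q)
    mag = math.sqrt(max(rest, 0.0))  # clamp: quantization may overshoot 1
    return q + [mag if sign == 1 else -mag]


row = [0.5, 0.5, 0.5, -0.5]          # a11..a14, unit norm
q, sign14 = encode_unit_row(row)
rec = decode_unit_row(q, sign14)
```

The clamp in `decode_unit_row` guards against the case where quantization error makes the sum of squares of the coded coefficients slightly exceed 1.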
When the first and second principal components in the multi-channel sound signal are encoded, it is necessary to encode a11, a12, a13, a14, a21, a22, a23 and a24. Because they satisfy
$$W(t,k)\,W(t,k)^{T} = I,$$
it follows that
$$\begin{cases} a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1 \\ a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 = 1 \\ a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} + a_{14}a_{24} = 0 \end{cases}$$
Therefore, one coefficient from a11, a12, a13 and a14 can be selected to have only its sign bit coded, or not to be coded, and the other 3 coefficients are quantized and coded; for a21, a22, a23 and a24, 2 coefficients can be selected for quantization coding, and the other 2 coefficients are derived by using the above relations. For example, if a21 and a22 are selected for quantization coding, then a23 and a24 satisfy:
$$\begin{cases} a_{23}^2 + a_{24}^2 = 1 - a_{21}^2 - a_{22}^2 \\ a_{13}a_{23} + a_{14}a_{24} = -(a_{11}a_{21} + a_{12}a_{22}) \end{cases}$$
Solving this equation set yields one or two groups of solutions, $\{a'_{23}, a'_{24}\}$ and $\{a''_{23}, a''_{24}\}$. When two groups of solutions are obtained, it is necessary to determine which group matches the original data a23 and a24: if $\{a'_{23}, a'_{24}\}$ matches, selectflag is set to 0; otherwise selectflag is set to 1. selectflag also needs to be coded. Due to errors in the quantization of the coefficients, the equation set may sometimes have no solution, or the solved $\{a_{23}, a_{24}\}$ may differ greatly from the original data. In that case, the orthogonality condition
$$a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} + a_{14}a_{24} = 0$$
may be left unused, and only the unit-norm conditions
$$\begin{cases} a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1 \\ a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 = 1 \end{cases}$$
are utilized. One coefficient is then selected from each of {a11, a12, a13, a14} and {a21, a22, a23, a24} for sign-bit-only coding, and the other 6 coefficients are quantized and coded. For example, when a14 and a24 are selected for sign-bit-only coding, the absolute value of a14 is obtained from a11, a12 and a13, and the absolute value of a24 is obtained from a21, a22 and a23.
Generally, when performing PCA on M channel signals, W(t, k) is an M × M matrix with M × M coefficients, and W(t, k) is a unit orthogonal matrix, which can be expressed as:
$$W(t,k) = \begin{bmatrix} a_{11} & \cdots & a_{1M} \\ \vdots & \ddots & \vdots \\ a_{M1} & \cdots & a_{MM} \end{bmatrix}$$
When u groups of signals in the multi-channel sound signal are coded, only the coefficients $a_{ij}$ ($i = 1, \dots, u$; $j = 1, \dots, M$) need to be coded. The matrix coefficients satisfy the condition
$$\sum_{j=1}^{M} a_{ij}\,a_{lj} = \begin{cases} 1, & i = l \\ 0, & i \neq l \end{cases}$$
which reduces the number of coefficients that need to be quantized; some coefficients are coded with only a sign bit or are not coded at all.
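The saving can be made concrete with a counting sketch. The formulas below are a standard degrees-of-freedom count for u orthonormal vectors of length M, not a formula quoted from the patent, and the function names are illustrative: using both the unit-norm and the orthogonality conditions leaves u·M − u(u+1)/2 free coefficients, while using only the unit-norm conditions leaves u·(M − 1).

```python
def coefficients_with_orthogonality(M, u):
    """Free coefficients when unit-norm and orthogonality are both used."""
    return u * M - u * (u + 1) // 2


def coefficients_unit_norm_only(M, u):
    """Free coefficients when only the unit-norm conditions are used."""
    return u * (M - 1)


counts = {
    "M=2,u=1": coefficients_with_orthogonality(2, 1),
    "M=4,u=1": coefficients_with_orthogonality(4, 1),
    "M=4,u=2": coefficients_with_orthogonality(4, 2),
    "M=4,u=2,unit-only": coefficients_unit_norm_only(4, 2),
}
```

These counts agree with the cases in the text: one value (β) for two channels, three coefficients for the first principal component of four channels, 3 + 2 for two principal components, and 6 when orthogonality is not exploited.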
In the embodiment of the present invention, the quantization coding of the coefficients of the mapping matrix W (t, k) may adopt a scalar coding mode, and may also adopt a vector coding method; the coefficients of W (t, k) may be encoded directly or in some transform form of W (t, k).
The step of quantization encoding the mapping matrix W (t, k) may include:
step 1, determining the vectors to be coded in the mapping matrix W(t, k), $\vec{a}_i = \{a_{i1}, a_{i2}, \dots, a_{iM}\}$, according to the PCA grouping number M and which groups of the multi-channel sound signal are selected for coding;
step 2, quantizing and coding the coefficients to be coded in the vectors $\vec{a}_i$.
In this embodiment of the present invention, when the PCA grouping number is 2, the encoding method may further include: determining a position identifier, which indicates the position in the vector of the coefficient to be coded; and, when that coefficient is quantized and coded, also quantizing and coding the position identifier. For example, when the PCA grouping number M is 2, the vector $\vec{a}_1 = \{a_{11}, a_{12}\}$ is selected for coding, and the specific steps are as follows:
step 1, determining a position identifier Dataposflag, wherein if the absolute value of a11 is less than the absolute value of a12, Dataposflag is 1 and the data aq to be quantized is a11; otherwise, Dataposflag is 0 and the data aq to be quantized is a12;
step 2, carrying out quantization coding on Dataposflag and aq.
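The two steps, together with the matching reconstruction, can be sketched as follows. The uniform quantizer step of 1/64 and the function names are assumptions; the sketch also assumes the convention that the coefficient with the larger absolute value is non-negative, so that no sign bit is needed for it.

```python
import math

STEP = 1 / 64  # hypothetical uniform quantizer step


def encode_m2(a11, a12):
    """Return (Dataposflag, quantization index of aq)."""
    if abs(a11) < abs(a12):
        flag, aq = 1, a11
    else:
        flag, aq = 0, a12
    return flag, round(aq / STEP)


def decode_m2(flag, index):
    """Rebuild (a11, a12); assumes the larger coefficient is non-negative."""
    aq = index * STEP
    other = math.sqrt(max(1.0 - aq * aq, 0.0))
    return (aq, other) if flag == 1 else (other, aq)


flag, index = encode_m2(0.6, 0.8)
rec = decode_m2(flag, index)
```

Coding the smaller coefficient is a sensible choice because the derived value sqrt(1 − aq²) changes by at most as much as aq does when |aq| ≤ 1/√2, so the quantization error is not amplified.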
In the embodiment of the present invention, when the number of PCA packets is 3, performing quantization coding on a coefficient to be coded in a vector may specifically include: determining first position information and second position information according to the magnitude relation of each coefficient in the vector, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value; and carrying out quantization coding on the coefficient with the minimum absolute value, the coefficient with the second minimum absolute value, the first position information and the second position information in the vector.
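For example, when the PCA grouping number M is 3 and the first and second principal components in the multi-channel sound signal are selected for coding, the specific steps are as follows:
step 1, carrying out quantization coding on a11 and a12 in $\vec{a}_1$ and on a21 in $\vec{a}_2$, and obtaining the reconstructed values $\hat{a}_{11}$, $\hat{a}_{12}$ and $\hat{a}_{21}$;
step 2, coding the sign bit sign13 of a13 and calculating the reconstructed value $\hat{a}_{13}$ of a13, wherein if a13 is a positive number, sign13 equals 1, otherwise sign13 equals 0; $\hat{a}_{13}$ is calculated as follows:
$$\hat{a}_{13} = \begin{cases} \sqrt{1 - \hat{a}_{11}^2 - \hat{a}_{12}^2}, & \text{sign13} = 1 \\ -\sqrt{1 - \hat{a}_{11}^2 - \hat{a}_{12}^2}, & \text{sign13} = 0 \end{cases}$$
step 3, solving the following equation set
$$\begin{cases} \hat{a}_{21}^2 + a_{22}^2 + a_{23}^2 = 1 \\ \hat{a}_{11}\hat{a}_{21} + \hat{a}_{12}a_{22} + \hat{a}_{13}a_{23} = 0 \end{cases}$$
to obtain two groups of solutions $\{a'_{22}, a'_{23}\}$ and $\{a''_{22}, a''_{23}\}$;
step 4, comparing {a22, a23} with the two groups of solutions: if $\{a'_{22}, a'_{23}\}$ is closer to {a22, a23}, selectflag is 0; otherwise, selectflag is 1. selectflag is then coded.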
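The four steps and the corresponding reconstruction can be sketched in Python as below. This is a sketch under stated assumptions: the quantizer step, all function names, and the ordering of the two solutions (which defines what selectflag 0 and 1 mean) are choices made here, not specified in the patent. The equation set is solved as the intersection of a line and a circle in the (a22, a23) plane.

```python
import math

STEP = 1 / 64  # hypothetical uniform quantizer step


def q(x):
    return round(x / STEP) * STEP


def solve_pair(n1, n2, c, r):
    """Solve n1*x + n2*y = c together with x*x + y*y = r.

    Returns [] when there is no real solution, else two (x, y) tuples
    (identical when the line is tangent to the circle).
    """
    nn = n1 * n1 + n2 * n2
    if nn == 0:
        return []
    h2 = r - c * c / nn
    if h2 < 0:
        return []
    h = math.sqrt(h2 / nn)
    x0, y0 = c * n1 / nn, c * n2 / nn      # closest point of the line
    return [(x0 - h * n2, y0 + h * n1), (x0 + h * n2, y0 - h * n1)]


def third_coeff(h11, h12, sign13):
    mag = math.sqrt(max(1.0 - h11 * h11 - h12 * h12, 0.0))
    return mag if sign13 == 1 else -mag


def encode_m3(a1, a2):
    h11, h12, h21 = q(a1[0]), q(a1[1]), q(a2[0])             # step 1
    sign13 = 1 if a1[2] >= 0 else 0                          # step 2
    h13 = third_coeff(h11, h12, sign13)
    sols = solve_pair(h12, h13, -h11 * h21, 1 - h21 * h21)   # step 3
    if not sols:
        return None  # caller would fall back to unit-norm-only coding
    d = [(x - a2[1]) ** 2 + (y - a2[2]) ** 2 for x, y in sols]
    selectflag = 0 if d[0] <= d[1] else 1                    # step 4
    return h11, h12, h21, sign13, selectflag


def decode_m3(h11, h12, h21, sign13, selectflag):
    h13 = third_coeff(h11, h12, sign13)
    sols = solve_pair(h12, h13, -h11 * h21, 1 - h21 * h21)
    a22, a23 = sols[selectflag]
    return [h11, h12, h13], [h21, a22, a23]


a1 = [2 / 3, 1 / 3, 2 / 3]
a2 = [-2 / 3, 2 / 3, 1 / 3]
code = encode_m3(a1, a2)
rec1, rec2 = decode_m3(*code)
```

Because a22 and a23 are derived from the reconstructed values rather than coded, the decoded rows are exactly unit-norm and mutually orthogonal, at the cost of a somewhat larger error on the derived coefficients.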
In the embodiment of the invention, the above process utilizes the property that the coefficient vectors $\vec{a}_1$ and $\vec{a}_2$ of the mapping matrix are unit vectors and are orthogonal to each other. Due to errors in the quantization process, this may leave the equation set without a solution, or may make the quantization error of the solved {a22, a23} large, which causes problems such as an unstable mapping matrix. It is therefore possible to use only the property that the coefficient vectors $\vec{a}_1$ and $\vec{a}_2$ are both unit vectors, without using the property that the vectors are mutually orthogonal. In this case, the specific encoding steps are as follows:
step 1, carrying out quantization coding on a11 and a12 in $\vec{a}_1$ and on a21 and a22 in $\vec{a}_2$;
step 2, coding the sign bits sign13 and sign23 of a13 and a23, wherein if a13 is a positive number, sign13 is equal to 1, otherwise sign13 is equal to 0; if a23 is a positive number, sign23 is equal to 1, otherwise sign23 is equal to 0.
When the PCA grouping number M is 3 and the first and second principal components in the multi-channel sound signal are selected for coding, the vectors $\vec{a}_1$ and $\vec{a}_2$ in the mapping matrix W(t, k) are selected for coding. The coding may also be performed by first ordering the coefficients and then quantizing them, and the specific steps are as follows:
step 1, determining position information miniindex11 and miniindex12 according to the magnitude relation of the coefficients in $\vec{a}_1$, wherein miniindex11 is the position of the coefficient with the smallest absolute value and miniindex12 is the position of the coefficient with the second smallest absolute value; determining position information miniindex21 and miniindex22 according to the magnitude relation of the coefficients in $\vec{a}_2$, wherein miniindex21 is the position of the coefficient with the smallest absolute value and miniindex22 is the position of the coefficient with the second smallest absolute value;
step 2, coding miniindex11, miniindex12, miniindex21 and miniindex22, and carrying out quantization coding on the coefficients with the smallest and the second smallest absolute values.
In order to improve the coding efficiency, two or more of miniindex11, miniindex12, miniindex21 and miniindex22 can be combined and quantized jointly, and an entropy coding method such as Huffman coding can be adopted to reduce the code rate.
In this case, the property that $\vec{a}_1$ and $\vec{a}_2$ are orthogonal can also be used to further reduce the number of coefficients to be quantized and coded; the specific process is similar to the process of coding in the original order and is not described here again.
As can be seen from the above processing procedure, the coding method of the PCA mapping model according to the embodiment of the present invention performs frequency band combination processing on the frequency bands obtained by frequency band division to obtain frequency band groups, then determines a first mapping matrix for each frequency band group, where the first mapping matrix is the mapping matrix of a PCA mapping model shared by the frequency bands in that group, and then performs quantization coding on the first mapping matrix. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every frequency band obtained by the frequency band division; instead, the frequency band combination processing reduces the number of mapping matrices to be encoded from one per frequency band to one per frequency band group, which effectively reduces the coding rate.
The embodiment of the present invention further provides a decoding method of the PCA mapping model, which may specifically include the following processing procedures:
step one, determining a vector coded in a coded mapping matrix;
step two, decoding the coded coefficient in the vector to obtain a reconstruction value of the coefficient;
step three, reconstructing the vector according to the reconstruction value of the coefficient;
step four, reconstructing the mapping matrix according to the vector, wherein the mapping matrix is the one determined for each frequency band group after frequency band combination processing has been performed on the frequency bands obtained by frequency band division to obtain the frequency band groups.
Preferably, when the number of PCA packets is 2, before reconstructing the vector from the reconstructed values of the coefficients, the decoding method may further include: decoding the coding of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector; the reconstructing the vector according to the reconstructed value of the coefficient may specifically include: and reconstructing the vector according to the position identification and the reconstruction value of the coefficient.
Preferably, when the number of PCA groups is 3, and the coefficients include a coefficient having a smallest absolute value and a coefficient having a second smallest absolute value in the vector, before reconstructing the vector from a reconstructed value of the coefficients, the method may further include: decoding the coding of the first position information and the coding of the second position information to obtain the first position information and the second position information, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value; the reconstructing the vector according to the reconstructed value of the coefficient may specifically include: determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the first position information and the second position information; and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
In particular, the decoding of the mapping matrix W (t, k) may comprise the steps of:
step 1, decoding the coded coefficients in the vectors $\vec{a}_i$;
step 2, reconstructing the mapping matrix W(t, k) according to the decoded vectors $\vec{a}_i$.
For example, when the PCA grouping number M is 2, the specific decoding steps are as follows:
step 1, decoding the position identifier Dataposflag and aq;
step 2, determining $\vec{a}_1$ according to Dataposflag and aq: if Dataposflag is 1, $\hat{a}_{11} = aq$ and $\hat{a}_{12} = \sqrt{1 - aq^2}$; otherwise, $\hat{a}_{12} = aq$ and $\hat{a}_{11} = \sqrt{1 - aq^2}$;
step 3, reconstructing W(t, k).
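When the PCA grouping number M is 3 and the first and second principal components in the multi-channel sound signal are selected for coding, the decoding method specifically comprises the following steps:
step 1, decoding to obtain $\hat{a}_{11}$ and $\hat{a}_{12}$ in $\vec{a}_1$, $\hat{a}_{21}$ in $\vec{a}_2$, the sign bit sign13 and the flag selectflag;
step 2, calculating $\hat{a}_{13}$ according to $\hat{a}_{11}$, $\hat{a}_{12}$ and sign13;
step 3, solving the following equation set to obtain two groups of solutions $\{a'_{22}, a'_{23}\}$ and $\{a''_{22}, a''_{23}\}$:
$$\begin{cases} \hat{a}_{21}^2 + a_{22}^2 + a_{23}^2 = 1 \\ \hat{a}_{11}\hat{a}_{21} + \hat{a}_{12}a_{22} + \hat{a}_{13}a_{23} = 0 \end{cases}$$
step 4, if selectflag is 1, using $\{a''_{22}, a''_{23}\}$ to replace $\{a'_{22}, a'_{23}\}$;
step 5, obtaining $\vec{a}_1$ and $\vec{a}_2$, and reconstructing W(t, k).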
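When only the property that the coefficient vectors $\vec{a}_1$ and $\vec{a}_2$ are both unit vectors is used for quantization coding, without using the property that the vectors are mutually orthogonal, the corresponding decoding steps are as follows:
step 1, decoding to obtain $\hat{a}_{11}$ and $\hat{a}_{12}$ in $\vec{a}_1$, $\hat{a}_{21}$ and $\hat{a}_{22}$ in $\vec{a}_2$, and the sign bits sign13 and sign23;
step 2, calculating $\hat{a}_{13}$ according to $\hat{a}_{11}$, $\hat{a}_{12}$ and sign13, calculating $\hat{a}_{23}$ according to $\hat{a}_{21}$, $\hat{a}_{22}$ and sign23, and reconstructing $\vec{a}_1$ and $\vec{a}_2$;
step 3, reconstructing W(t, k) according to $\vec{a}_1$ and $\vec{a}_2$.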
When the method of first ordering the coefficients and then quantizing and coding them is selected, the corresponding decoding steps are as follows:
step 1, decoding to obtain miniindex11, miniindex12, miniindex21 and miniindex22, and the reconstructed values aq11, aq12, aq21 and aq22 of the coefficients with the smallest and the second smallest absolute values;
step 2, calculating, according to the reconstructed values aq11, aq12, aq21 and aq22, the coefficients aq13 and aq23 with the maximum absolute value in $\vec{a}_1$ and $\vec{a}_2$ respectively;
step 3, reconstructing $\vec{a}_1$ and $\vec{a}_2$ according to the decoded position information miniindex11, miniindex12, miniindex21 and miniindex22 and the values aq11, aq12, aq21, aq22, aq13 and aq23;
step 4, reconstructing W(t, k) according to $\vec{a}_1$ and $\vec{a}_2$.
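The ordering-based scheme can be sketched for a single three-coefficient vector as follows, covering both the encoder side (positions plus the two smallest coefficients) and the decoder side described in the steps above. The quantizer step, the function names, and the assumption that the coefficient with the maximum absolute value is non-negative are illustrative choices, not taken from the patent.

```python
import math

STEP = 1 / 64  # hypothetical uniform quantizer step


def encode_sorted(vec):
    """Return positions and quantization indices of the two smallest |coeffs|."""
    order = sorted(range(3), key=lambda i: abs(vec[i]))
    i1, i2 = order[0], order[1]          # smallest, second smallest
    return i1, i2, round(vec[i1] / STEP), round(vec[i2] / STEP)


def decode_sorted(i1, i2, q1, q2):
    """Rebuild the vector; the largest coefficient is assumed non-negative."""
    aq1, aq2 = q1 * STEP, q2 * STEP
    aq3 = math.sqrt(max(1.0 - aq1 * aq1 - aq2 * aq2, 0.0))
    i3 = 3 - i1 - i2                     # the remaining position
    out = [0.0, 0.0, 0.0]
    out[i1], out[i2], out[i3] = aq1, aq2, aq3
    return out


vec = [0.36, -0.48, 0.8]                 # unit-norm example vector
i1, i2, q1, q2 = encode_sorted(vec)
rec = decode_sorted(i1, i2, q1, q2)
```

Deriving the largest coefficient from the two smallest keeps the reconstruction error of the derived value small, since the square root is least sensitive when the coded values are small.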
Fig. 2 is a schematic structural diagram of an encoding apparatus of a PCA mapping model according to an embodiment of the present invention, the apparatus including:
a band combining unit 201, configured to perform band combining processing on each band after band division to obtain each band group;
a matrix determining unit 202, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit 201, where the first mapping matrix is a mapping matrix of a set of PCA mapping models shared by the bands in the band group;
an encoding unit 203, configured to perform quantization encoding on the first mapping matrix determined by the matrix determination unit 202.
Preferably, the frequency band combining unit 201 is specifically configured to perform frequency band combining processing on each frequency band after frequency band division according to characteristics of the frequency band signal and/or a psychoacoustic model and/or a model parameter similarity, so as to obtain each frequency band group.
Preferably, the frequency band combining unit 201 specifically includes:
a first frequency band combining subunit, configured to compare the energy of two adjacent frequency bands and, when the energy of one frequency band is lower than an energy threshold calculated from the energy of the adjacent frequency band, to combine the two frequency bands into one frequency band group; and/or
a second frequency band combining subunit, configured to calculate a masking threshold of a frequency band according to a psychoacoustic model and, when the energy of the frequency band is lower than the masking threshold, to combine the frequency band with an adjacent frequency band into one frequency band group; and/or
a third frequency band combining subunit, configured to calculate the distance between the mapping matrices of two or more adjacent frequency bands and, when the maximum distance is smaller than a distance threshold, to combine the two or more frequency bands into one frequency band group.
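The energy-based criterion of the first subunit can be sketched as follows. The sketch assumes that the energy threshold is a fixed fraction `ratio` of the neighbouring band's energy and that bands are merged in a single left-to-right pass; the patent does not specify how the threshold is computed, so both choices, as well as the function name, are hypothetical.

```python
def group_bands_by_energy(energies, ratio=0.1):
    """Return band groups as lists of band indices."""
    if not energies:
        return []
    groups = [[0]]
    for k in range(1, len(energies)):
        prev, cur = energies[k - 1], energies[k]
        # Merge when either band falls below the threshold derived from
        # the energy of its neighbour (assumed here to be ratio * energy).
        if cur < ratio * prev or prev < ratio * cur:
            groups[-1].append(k)
        else:
            groups.append([k])
    return groups
```

With band energies `[10.0, 0.5, 0.4, 9.0, 8.0]` and the default ratio, the weak bands 1 and 2 are attached to a stronger neighbour and the result is three groups. A masking-threshold or matrix-distance criterion could replace the comparison inside the loop without changing the grouping logic.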
Preferably, the mapping matrix is composed of a series of coefficients, and the encoding unit 203 is specifically configured to select coefficients to be encoded from the first mapping matrix and perform quantization encoding according to the dimensionality of PCA analysis and the number of groups of the multi-channel sound signal to be encoded.
Preferably, the encoding unit 203 specifically includes:
a vector determining subunit, configured to determine the vector to be encoded in the mapping matrix W (t, k) according to the PCA grouping number M and the groups of the multi-channel sound signal selected for encoding;
and an encoding subunit, configured to perform quantization encoding on the coefficients to be encoded in the vector determined by the vector determining subunit.
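For the case of three PCA groups, the work of the encoding subunit on one vector might look like the sketch below: it locates the coefficients with the smallest and second smallest absolute values and quantizes only those two. The uniform scalar quantizer, its step size and the function names are assumptions made for illustration; the patent does not prescribe a particular quantizer here.

```python
def encode_vector3(vec, step=1.0 / 64):
    """Return (min_pos, second_pos, q_min, q_second) for a 3-element vector."""
    if len(vec) != 3:
        raise ValueError("expected a 3-element vector")
    # Sort positions by absolute value; the first two are the ones to encode.
    order = sorted(range(3), key=lambda i: abs(vec[i]))
    min_pos, second_pos = order[0], order[1]
    # Hypothetical uniform scalar quantizer: index = round(value / step).
    q_min = round(vec[min_pos] / step)
    q_second = round(vec[second_pos] / step)
    return min_pos, second_pos, q_min, q_second

def dequantize(index, step=1.0 / 64):
    """Map a quantization index back to its reconstruction value."""
    return index * step
```

Encoding the two smallest coefficients rather than the largest keeps the quantized values in a narrower range, which is one plausible reason for this choice; the largest coefficient is then left for the decoder to derive.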
Correspondingly, the embodiment of the invention also provides a decoding device of the PCA mapping model, and the device comprises:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
and a matrix reconstruction unit, configured to reconstruct the mapping matrix according to the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
Preferably, the decoding unit is further configured to: when the PCA grouping number is 2, before the vector reconstruction unit reconstructs the vector according to the reconstruction value of the coefficient, decoding the code of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector;
the vector reconstruction unit is specifically configured to: and reconstructing the vector according to the position identifier obtained by the decoding unit and the reconstruction value of the coefficient.
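A minimal sketch of the two-group case, assuming each vector has two elements and unit norm, so that the coefficient that was not encoded can be derived from the decoded one. The 0/1 encoding of the position identifier, the non-negative sign of the derived coefficient and the function name are illustrative assumptions.

```python
import math

def reconstruct_vector2(position, aq):
    """Rebuild a 2-element unit-norm vector from one decoded coefficient."""
    if position not in (0, 1):
        raise ValueError("position must be 0 or 1")
    # Unit-norm assumption: the other coefficient has magnitude sqrt(1 - aq^2)
    # (clamped at 0 to absorb quantization error).
    other = math.sqrt(max(0.0, 1.0 - aq * aq))
    vec = [0.0, 0.0]
    vec[position] = aq
    vec[1 - position] = other  # sign assumed non-negative in this sketch
    return vec
```

Here the position identifier tells the decoder which slot holds the transmitted value; the remaining slot is filled from the norm constraint.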
Preferably, the decoding unit is further configured to: when the number of PCA groups is 3 and the coefficients comprise the coefficient with the smallest absolute value and the coefficient with the second smallest absolute value in the vector, decode, before the vector reconstruction unit reconstructs the vector according to the reconstruction values of the coefficients, the coding of first position information and the coding of second position information to obtain the first position information and the second position information, wherein the first position information indicates the position of the coefficient with the smallest absolute value, and the second position information indicates the position of the coefficient with the second smallest absolute value;
the vector reconstruction unit is specifically configured to: determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector obtained by the decoding unit, the reconstruction value of the coefficient with the second smallest absolute value, the first position information and the second position information; and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
Those of skill in the art would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (13)
1. A coding method of a Principal Component Analysis (PCA) mapping model is characterized by comprising the following steps:
performing frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
determining a first mapping matrix for each of the band groups, the first mapping matrix being a mapping matrix of a set of PCA mapping models common to the bands in the band group;
performing quantization coding on the first mapping matrix, wherein the mapping matrix is composed of a series of coefficients, and the quantization coding comprises: determining a vector to be coded in the first mapping matrix according to the PCA grouping number and the groups of the multi-channel sound signal selected for coding, and performing quantization coding on the coefficients to be coded in the vector.
2. The method of claim 1, wherein the performing band combination processing on each frequency band after the frequency band division to obtain each frequency band group specifically comprises:
and performing frequency band combination processing on each frequency band after frequency band division according to the characteristics of the frequency band signals and/or the psychoacoustic model and/or the similarity of model parameters to obtain each frequency band group.
3. The method of claim 1, wherein the performing band combination processing on each frequency band after the frequency band division to obtain each frequency band group specifically comprises:
comparing the energy of two adjacent frequency bands, and, when the energy of one frequency band is lower than an energy threshold calculated from the energy of the adjacent frequency band, combining the two frequency bands into one frequency band group; and/or
calculating a masking threshold of a frequency band according to a psychoacoustic model, and, when the energy of the frequency band is lower than the masking threshold, combining the frequency band with an adjacent frequency band into one frequency band group; and/or
calculating the distance between the mapping matrices of two or more adjacent frequency bands, and, when the maximum distance is smaller than a distance threshold, combining the two or more frequency bands into one frequency band group.
4. The method according to claim 1, wherein the quantization encoding of the coefficients to be encoded in the vector specifically comprises:
selecting a first coefficient from the vector according to the property that the first mapping matrix is an orthonormal matrix or the property that the first mapping matrix is an identity matrix, performing quantization coding on the first coefficient, and either not coding the remaining coefficients in the vector or coding only their sign bits.
5. The method of claim 4, wherein the number of PCA groups is 2, the method further comprising:
determining a position identifier, the position identifier indicating the position of the first coefficient in the vector;
and, when the first coefficient is subjected to quantization coding, also performing quantization coding on the position identifier.
6. The method according to any of claims 1 to 4, wherein the number of PCA groups is 3, and the performing quantization coding on the coefficients to be coded in the vector specifically comprises:
determining first position information and second position information according to the magnitude relation of each coefficient in the vector, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value;
and carrying out quantization coding on the coefficient with the minimum absolute value, the coefficient with the second minimum absolute value, the first position information and the second position information in the vector.
7. A method for decoding a Principal Component Analysis (PCA) mapping model, the method comprising:
determining a vector encoded in the encoded mapping matrix;
decoding the coded coefficient in the vector to obtain a reconstruction value of the coefficient;
reconstructing the vector from the reconstructed values of the coefficients;
and reconstructing the mapping matrix according to the vector, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
8. The method of claim 7, wherein the number of PCA groupings is 2, and wherein prior to said reconstructing the vector from the reconstructed values of the coefficients, the method further comprises:
decoding the coding of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector;
reconstructing the vector according to the reconstruction value of the coefficient specifically includes: and reconstructing the vector according to the position identification and the reconstruction value of the coefficient.
9. The method of claim 7, wherein the number of PCA groups is 3, the coefficients include a coefficient having a smallest absolute value and a coefficient having a second smallest absolute value in the vector, and before reconstructing the vector from the reconstructed values of the coefficients, the method further comprises:
decoding the coding of the first position information and the coding of the second position information to obtain the first position information and the second position information, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value;
reconstructing the vector according to the reconstruction value of the coefficient specifically includes:
determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the first position information and the second position information;
and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
10. An apparatus for encoding a Principal Component Analysis (PCA) mapping model, the apparatus comprising:
a frequency band combination unit, configured to perform frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
a matrix determining unit, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit, where the first mapping matrix is a mapping matrix of a group of PCA mapping models shared by the bands in the band group;
and a coding unit, configured to perform quantization coding on the first mapping matrix determined by the matrix determining unit, wherein the mapping matrix is composed of a series of coefficients, and the coding unit determines a vector to be coded in the first mapping matrix according to the PCA grouping number and the groups of the multi-channel sound signal selected for coding, and performs quantization coding on the coefficients to be coded in the vector.
11. The apparatus according to claim 10, wherein the frequency band combining unit is specifically configured to perform frequency band combining processing on each frequency band after the frequency band division according to characteristics of the frequency band signal and/or a psychoacoustic model and/or a similarity of model parameters to obtain each frequency band group.
12. The apparatus as claimed in claim 10, wherein said band combining unit specifically comprises:
a first frequency band combining subunit, configured to compare the energy of two adjacent frequency bands and, when the energy of one frequency band is lower than an energy threshold calculated from the energy of the adjacent frequency band, to combine the two frequency bands into one frequency band group; and/or
a second frequency band combining subunit, configured to calculate a masking threshold of a frequency band according to a psychoacoustic model and, when the energy of the frequency band is lower than the masking threshold, to combine the frequency band with an adjacent frequency band into one frequency band group; and/or
a third frequency band combining subunit, configured to calculate the distance between the mapping matrices of two or more adjacent frequency bands and, when the maximum distance is smaller than a distance threshold, to combine the two or more frequency bands into one frequency band group.
13. An apparatus for decoding a Principal Component Analysis (PCA) mapping model, the apparatus comprising:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
and a matrix reconstruction unit, configured to reconstruct the mapping matrix according to the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410710991.2A CN105632505B (en) | 2014-11-28 | 2014-11-28 | Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model |
PCT/CN2014/095393 WO2016082278A1 (en) | 2014-11-28 | 2014-12-29 | Encoding/decoding method and apparatus for principal component analysis (pca) mapping module |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105632505A (en) | 2016-06-01 |
CN105632505B (en) | 2019-12-20 |
Also Published As
Publication number | Publication date |
---|---|
CN105632505A (en) | 2016-06-01 |
WO2016082278A1 (en) | 2016-06-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||