CN105632505B - Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model - Google Patents


Info

Publication number
CN105632505B
Authority
CN
China
Prior art keywords
frequency band
vector
coefficient
mapping
mapping matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410710991.2A
Other languages
Chinese (zh)
Other versions
CN105632505A (en)
Inventor
吴超刚
潘兴德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Original Assignee
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority to CN201410710991.2A
Priority to PCT/CN2014/095393 (WO2016082278A1)
Publication of CN105632505A
Application granted
Publication of CN105632505B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to an encoding and decoding method and device for a Principal Component Analysis (PCA) mapping model. The encoding method comprises the following steps: performing band combination processing on the frequency bands obtained by band division to obtain band groups; determining a first mapping matrix for each band group, the first mapping matrix being the mapping matrix of a set of PCA mapping models shared by the bands in that band group; and quantizing and encoding the first mapping matrix. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every divided frequency band. Instead, band combination reduces the number of mapping matrices to be encoded from one per frequency band to one per band group, which effectively reduces the coding bit rate.

Description

Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model
Technical Field
The present invention relates to the field of audio processing technologies, and in particular, to a coding and decoding method and apparatus for a Principal Component Analysis (PCA) mapping model.
Background
With the development of technology, a variety of coding techniques for sound signals have appeared, and the sound signals are generally digital sounds including signals perceivable to human ears, such as speech, music, natural sounds, and artificially synthesized sounds. When a multi-channel sound signal is encoded, encoding of a PCA mapping model is usually involved.
In the prior art, a multi-channel sound signal is first divided into frequency bands when it is encoded. Accordingly, when the PCA mapping model is encoded, the mapping matrix corresponding to every divided frequency band is quantized and encoded. Because the number of mapping matrices to be encoded is large, the coding bit rate of the PCA mapping model is too high.
Disclosure of Invention
The invention provides a coding and decoding method and device of a PCA mapping model, which effectively reduce the coding code rate of the PCA mapping model.
In order to achieve the above object, in a first aspect, the present invention provides a coding method for a PCA mapping model, the method including:
performing frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
determining a first mapping matrix for each of the band groups, the first mapping matrix being a mapping matrix of a set of PCA mapping models common to the bands in the band group;
and carrying out quantization coding on the first mapping matrix.
In a second aspect, the present invention provides a method for decoding a PCA mapping model, where the method includes:
determining a vector encoded in the encoded mapping matrix;
decoding the coded coefficient in the vector to obtain a reconstruction value of the coefficient;
reconstructing the vector from the reconstructed values of the coefficients;
and reconstructing the mapping matrix from the vector, wherein the mapping matrix is the one determined for each band group after band combination processing is performed on the divided frequency bands to obtain the band groups.
In a third aspect, the present invention provides an apparatus for coding a PCA mapping model, the apparatus comprising:
a frequency band combination unit, configured to perform frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
a matrix determining unit, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit, where the first mapping matrix is a mapping matrix of a group of PCA mapping models shared by the bands in the band group;
and the coding unit is used for carrying out quantization coding on the first mapping matrix determined by the matrix determination unit.
In a fourth aspect, the present invention provides an apparatus for decoding a PCA mapping model, the apparatus comprising:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
and a matrix reconstruction unit, configured to reconstruct the mapping matrix from the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each band group after band combination processing is performed on the divided frequency bands to obtain the band groups.
In the encoding method of the PCA mapping model according to the embodiment of the present invention, band combination processing is performed on the frequency bands obtained by band division to obtain band groups; a first mapping matrix is then determined for each band group, the first mapping matrix being the mapping matrix of a set of PCA mapping models shared by the bands in that band group; and the first mapping matrix is then quantized and encoded. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every divided frequency band. Instead, band combination reduces the number of mapping matrices to be encoded from one per frequency band to one per band group, which effectively reduces the coding bit rate.
Drawings
FIG. 1 is a flow chart of a method for encoding a PCA mapping model according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an encoding apparatus of a PCA mapping model according to another embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Fig. 1 is a flowchart of an encoding method of a PCA mapping model in an embodiment of the present invention. In this method, band combination processing is first performed on the divided frequency bands, and the mapping matrix selected for encoding is then quantized and encoded. The method includes:
step 101, performing band combination processing on each band after band division to obtain each band group.
The frequency band groups can be obtained by performing frequency band combination processing on each frequency band after frequency band division according to the characteristics of the frequency band signals and/or the psychoacoustic model and/or the similarity of model parameters.
In the embodiment of the invention, the band combination processing may be performed in any one of the following modes or in any combination of them:
Mode 1: compare the energy of two adjacent frequency bands; when the energy of one band is lower than an energy threshold calculated from the energy of the adjacent band, combine the two bands into one band group.
Mode 2: calculate the masking threshold of a frequency band according to a psychoacoustic model; when the energy of the band is lower than the masking threshold, combine the band with an adjacent band into one band group.
Mode 3: calculate the distance between the mapping matrices of two or more adjacent frequency bands; when the maximum distance is smaller than a distance threshold, combine those bands into one band group.
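The energy-comparison mode can be sketched as follows. This is only an illustration under stated assumptions: the function name `combine_bands`, the rule that the threshold is a fixed fraction `ratio` of the neighbouring band's energy, and the greedy left-to-right merge are all hypothetical choices, since the patent does not specify how the threshold is computed.

```python
def combine_bands(energies, ratio=0.1):
    """Group adjacent band indices into band groups.

    Assumption (not from the patent): the energy threshold is
    ratio * (energy of the neighbouring band). A band is merged into the
    current group when either of the two adjacent bands falls below the
    threshold derived from the other.
    """
    if not energies:
        return []
    groups = [[0]]
    for b in range(1, len(energies)):
        prev, cur = energies[b - 1], energies[b]
        if cur < ratio * prev or prev < ratio * cur:
            groups[-1].append(b)   # weak band shares the group of its neighbour
        else:
            groups.append([b])     # start a new band group
    return groups


groups = combine_bands([10.0, 0.5, 8.0, 7.0, 0.1])
```

With this rule each resulting group would carry a single mapping matrix, so five bands in the example call need fewer matrices than one per band.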
Step 102, determining a first mapping matrix for each of the band groups, where the first mapping matrix is a mapping matrix of a set of PCA mapping models shared by the bands in the band group.
When determining the first mapping matrix for the frequency band group, one mapping matrix may be selected from the mapping matrices corresponding to each frequency band in the frequency band group as the first mapping matrix, for example, the mapping matrix corresponding to the frequency band with the highest frequency band energy may be selected as the first mapping matrix; the mapping matrix may also be recalculated for the band group. In the embodiment of the present invention, the first mapping matrix may be determined for each frequency band group in various ways.
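As a small illustration of the selection option described above (taking the matrix of the band with the highest energy), the following sketch assumes that band energies and per-band mapping matrices are available as Python sequences indexed by band number; the function name `pick_first_matrix` is hypothetical.

```python
def pick_first_matrix(group, energies, matrices):
    """Return the mapping matrix of the highest-energy band in `group`.

    group    : list of band indices forming one band group
    energies : per-band energies, indexed by band number
    matrices : per-band mapping matrices, indexed by band number
    """
    best_band = max(group, key=lambda b: energies[b])
    return matrices[best_band]


first = pick_first_matrix([0, 1, 2], [10.0, 0.5, 8.0], ["W0", "W1", "W2"])
```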
Step 103, performing quantization coding on the first mapping matrix.
In order to further reduce the coding rate, in the embodiment of the invention, all the coefficients in the first mapping matrix are not subjected to quantization coding, but part of the coefficients are selected from the first mapping matrix for quantization coding according to the characteristics of a PCA mapping model.
Specifically, the coefficients to be encoded may be selected from the first mapping matrix according to the dimensionality of the PCA analysis and the number of groups of the multi-channel sound signal to be encoded, and quantized and encoded.
Further, the vector needing to be coded in the first mapping matrix can be determined according to the PCA grouping number and the grouping condition of the multi-channel sound signal selected for coding; and carrying out quantization coding on the coefficient to be coded in the vector.
The following describes the quantization coding of the mapping matrix in detail.
Because the mapping matrix is composed of a series of coefficients, the coefficients to be coded can be selected from the mapping matrix and quantized and coded according to the dimensionality of PCA analysis and the grouping number of the multi-channel sound signals for coding in the embodiment of the invention. According to the relation among the coefficients of the mapping matrix, it can be known that not all matrix coefficients need to be quantized and encoded, and some matrix coefficients do not need to be encoded, and can be obtained by operation according to the encoded coefficient values, and some matrix coefficients only need to be encoded with sign bits. By organizing and selecting the coefficients, the purpose of reducing the coding rate can be achieved.
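When PCA analysis is performed on two channel signals, the mapping matrix W(t, k) is a 2 × 2 matrix with 4 coefficients, where t is the frame (or sub-frame) index and k is the frequency index.
W(t, k) can be represented by the following formula:
$$W(t,k) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
W(t, k) is an orthonormal matrix and satisfies the following condition:
$$W(t,k)\,W(t,k)^{T} = I$$
Thus, W(t, k) can be expressed in terms of a single angle β, for example (up to the choice of sign convention):
$$W(t,k) = \begin{bmatrix} \cos\beta & \sin\beta \\ -\sin\beta & \cos\beta \end{bmatrix}$$
From the above, it can be seen that only β, or a transformed form of it such as cos β or sin β, needs to be encoded.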
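A minimal sketch of encoding only β for the two-channel case. It assumes the rotation form [[cos β, sin β], [−sin β, cos β]] and a uniform scalar quantizer with `bits` bits over a full turn; neither the sign convention nor the quantizer is specified in the patent, and the function names are hypothetical.

```python
import math


def encode_beta(w, bits=6):
    """Quantize the angle of a 2x2 rotation matrix to an integer index.

    Assumes w = [[cos b, sin b], [-sin b, cos b]] (illustrative convention).
    """
    beta = math.atan2(w[0][1], w[0][0])           # angle in (-pi, pi]
    levels = 1 << bits
    return round(beta / (2 * math.pi) * levels) % levels


def decode_beta(index, bits=6):
    """Rebuild the 2x2 matrix from the quantized angle index."""
    beta = 2 * math.pi * index / (1 << bits)
    c, s = math.cos(beta), math.sin(beta)
    return [[c, s], [-s, c]]


w = [[math.cos(0.3), math.sin(0.3)], [-math.sin(0.3), math.cos(0.3)]]
w_hat = decode_beta(encode_beta(w))
```

One index replaces four matrix coefficients, and the decoded matrix stays exactly orthonormal regardless of the quantization error in β.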
When PCA analysis is performed on four channel signals, the mapping matrix W(t, k) is a 4 × 4 matrix with 16 coefficients and can be represented by:
$$W(t,k) = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$$
W(t, k) is an orthonormal matrix and satisfies the following condition:
$$W(t,k)\,W(t,k)^{T} = I$$
When only the first principal component of the multi-channel sound signal is encoded, only a11, a12, a13, and a14 need to be encoded. Because they satisfy
$$a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1,$$
three of the four coefficients a11, a12, a13, and a14 can be selected for quantization coding, while for the fourth coefficient only the sign bit is encoded, or nothing is encoded at all; its absolute value is obtained by solving from the other three coefficients. The selection may be based on the absolute values of the coefficients, their positions, or the like. For example, if a14 is selected to have only its sign bit encoded and the remaining coefficients are quantized and encoded, the absolute value of a14 can be calculated by the formula
$$|a_{14}| = \sqrt{1 - a_{11}^2 - a_{12}^2 - a_{13}^2}.$$
As another example, the coefficient with the largest absolute value may be selected for sign-only coding while the remaining coefficients are quantized and encoded. If the procedure for solving W(t, k) guarantees that the coefficient with the largest absolute value in each vector always has a fixed sign (always positive or always negative), that coefficient is not encoded at all, and the remaining coefficients are quantized and encoded.
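The sign-only coding of one coefficient can be sketched as below for the case where a14 is the sign-coded coefficient. The uniform quantizer with step `STEP` and the function names are assumptions made for illustration; the sign convention (1 for positive, 0 otherwise) follows the one the document uses for sign13.

```python
import math

STEP = 1.0 / 32  # hypothetical uniform quantizer step


def encode_row4(row):
    """Quantize a11..a13 and keep only the sign bit of a14."""
    indices = [round(x / STEP) for x in row[:3]]
    sign14 = 1 if row[3] > 0 else 0
    return indices, sign14


def decode_row4(indices, sign14):
    """Rebuild the unit vector; |a14| follows from the unit-norm condition."""
    a = [q * STEP for q in indices]
    # Clamp at zero: quantization error can push the sum of squares above 1.
    rest = max(0.0, 1.0 - sum(x * x for x in a))
    a14 = math.sqrt(rest)
    return a + [a14 if sign14 == 1 else -a14]


indices, sign14 = encode_row4([0.5, 0.5, 0.5, -0.5])
row_hat = decode_row4(indices, sign14)
```

The clamp is a design choice of this sketch: without it, a slightly over-unit quantized triple would make the square root fail.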
When the first and second principal components of the multi-channel sound signal are encoded, a11, a12, a13, a14, a21, a22, a23, and a24 need to be encoded. Because W(t, k) satisfies
$$W(t,k)\,W(t,k)^{T} = I,$$
the following relations hold:
$$a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1,\qquad a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 = 1,\qquad a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} + a_{14}a_{24} = 0.$$
Therefore, one coefficient among a11, a12, a13, and a14 can be selected to have only its sign bit encoded, or not be encoded, while the other 3 coefficients are quantized and encoded. For a21, a22, a23, and a24, 2 coefficients can be selected for quantization coding, and the other 2 coefficients are derived using the relations above. For example, if a21 and a22 are selected for quantization coding, then a23 and a24 satisfy:
$$a_{23}^2 + a_{24}^2 = 1 - a_{21}^2 - a_{22}^2,\qquad a_{13}a_{23} + a_{14}a_{24} = -\left(a_{11}a_{21} + a_{12}a_{22}\right).$$
Solving this system yields one or two solutions, denoted here $\{\hat{a}_{23}, \hat{a}_{24}\}$ and $\{\tilde{a}_{23}, \tilde{a}_{24}\}$. When two solutions are obtained, it must be determined which of them matches the original data a23 and a24: if $\{\hat{a}_{23}, \hat{a}_{24}\}$ matches, selectflag is set to 0; otherwise selectflag is set to 1. selectflag also needs to be encoded. Because of errors introduced when the coefficients are quantized, the system may sometimes have no solution, or the solved values may differ greatly from the original data. In that case, the orthogonality condition
$$a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} + a_{14}a_{24} = 0$$
may be left unused, and only the unit-norm conditions
$$a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 = 1,\qquad a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 = 1$$
are used. One coefficient is then selected from each of {a11, a12, a13, a14} and {a21, a22, a23, a24} for sign-only coding, and the other 6 coefficients are quantized and encoded. For example, when a14 and a24 are selected for sign-only coding, the absolute value of a14 is obtained by solving from a11, a12, and a13, and the absolute value of a24 is obtained by solving from a21, a22, and a23.
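The derivation of a23 and a24 from the unit-norm and orthogonality relations amounts to intersecting a circle with a line. The sketch below shows one way to do this and to pick selectflag; the function names are hypothetical, and choosing the closer solution by squared Euclidean distance is an assumption, since the patent only says the solution must match the original data.

```python
import math


def solve_tail(a1, a21, a22):
    """Return candidate (a23, a24) pairs for the second row.

    Solves  a23^2 + a24^2 = 1 - a21^2 - a22^2
            a13*a23 + a14*a24 = -(a11*a21 + a12*a22)
    Returns [] when no real solution exists (e.g. due to quantization error).
    """
    a11, a12, a13, a14 = a1
    r2 = 1.0 - a21 * a21 - a22 * a22        # squared radius of the circle
    c = -(a11 * a21 + a12 * a22)            # right-hand side of the line
    n2 = a13 * a13 + a14 * a14
    if r2 < 0.0 or n2 == 0.0:
        return []
    disc = r2 * n2 - c * c
    if disc < 0.0:
        return []
    base = (c * a13 / n2, c * a14 / n2)     # point of the line closest to origin
    h = math.sqrt(disc) / n2                # offset along the line direction
    return [(base[0] - h * a14, base[1] + h * a13),
            (base[0] + h * a14, base[1] - h * a13)]


def select_flag(solutions, a23, a24):
    """0 if the first solution is closer to the original pair, else 1."""
    d = [(s[0] - a23) ** 2 + (s[1] - a24) ** 2 for s in solutions]
    return 0 if d[0] <= d[1] else 1


sols = solve_tail((0.5, 0.5, 0.5, 0.5), 0.5, -0.5)
flag = select_flag(sols, 0.5, -0.5)
```

An empty result from `solve_tail` corresponds to the no-solution case mentioned above, where an encoder would fall back to using only the unit-norm conditions.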
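Generally, when PCA is performed on M channel signals, W(t, k) is an M × M matrix with M × M coefficients. W(t, k) is an orthonormal matrix and can be expressed as:
$$W(t,k) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1M} \\ a_{21} & a_{22} & \cdots & a_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ a_{M1} & a_{M2} & \cdots & a_{MM} \end{bmatrix}$$
When u groups of signals in the multi-channel sound signal are encoded, only the coefficients $a_{ij}$ ($1 \le i \le u$, $1 \le j \le M$) need to be encoded. The conditions satisfied by the matrix coefficients,
$$\sum_{j=1}^{M} a_{ij}\,a_{lj} = \begin{cases} 1, & i = l \\ 0, & i \ne l, \end{cases}$$
can be used to reduce the number of coefficients that need to be quantized; some coefficients have only their sign bits encoded or are not encoded at all.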
In the embodiment of the present invention, the quantization coding of the coefficients of the mapping matrix W (t, k) may adopt a scalar coding mode, and may also adopt a vector coding method; the coefficients of W (t, k) may be encoded directly or in some transform form of W (t, k).
The step of quantization encoding the mapping matrix W (t, k) may include:
Step 1, determining the vectors to be encoded in the mapping matrix W(t, k) according to the PCA grouping number M and which groups of the multi-channel sound signal are selected for encoding, where the i-th vector is the i-th row of W(t, k), $\vec{a}_i = (a_{i1}, a_{i2}, \ldots, a_{iM})$;
Step 2, quantizing and encoding the coefficients to be encoded in each vector $\vec{a}_i$.
In this embodiment of the present invention, when the PCA grouping number is 2, the encoding method may further include: determining a position identifier, the position identifier indicating the position of the first coefficient, that is, the coefficient selected for quantization; and, when the first coefficient is quantized and encoded, also encoding the position identifier. For example, when the PCA grouping number M is 2, the vector $\vec{a}_1 = (a_{11}, a_{12})$ is selected for encoding, and the specific steps are as follows:
Step 1, determining the position identifier Dataposflag: if the absolute value of a11 is less than the absolute value of a12, Dataposflag is 1 and the data to be quantized is aq = a11; otherwise Dataposflag is 0 and the data to be quantized is aq = a12;
Step 2, quantizing and encoding Dataposflag and aq.
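The Dataposflag scheme for M = 2 can be sketched as a round trip. The uniform quantizer step `STEP`, the function names, and the assumption that the derived (larger-magnitude) coefficient is non-negative are all illustrative; the patent does not state how the sign of the derived coefficient is resolved.

```python
import math

STEP = 1.0 / 64  # hypothetical uniform quantizer step


def encode_m2(a11, a12):
    """Pick the smaller-magnitude coefficient, flag its position, quantize it."""
    if abs(a11) < abs(a12):
        dataposflag, aq = 1, a11
    else:
        dataposflag, aq = 0, a12
    return dataposflag, round(aq / STEP)


def decode_m2(dataposflag, index):
    """Rebuild (a11, a12); the other coefficient comes from the unit norm.

    Assumption: the derived coefficient is taken as non-negative.
    """
    aq = index * STEP
    other = math.sqrt(max(0.0, 1.0 - aq * aq))
    return (aq, other) if dataposflag == 1 else (other, aq)


flag, index = encode_m2(0.6, 0.8)
a11_hat, a12_hat = decode_m2(flag, index)
```

Quantizing the smaller coefficient keeps the quantized value within about ±0.71 and keeps the derived value away from the region where the square root is most sensitive to quantization error.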
In the embodiment of the present invention, when the number of PCA packets is 3, performing quantization coding on a coefficient to be coded in a vector may specifically include: determining first position information and second position information according to the magnitude relation of each coefficient in the vector, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value; and carrying out quantization coding on the coefficient with the minimum absolute value, the coefficient with the second minimum absolute value, the first position information and the second position information in the vector.
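For example, when the PCA grouping number M is 3 and the first and second principal components of the multi-channel sound signal are selected for encoding, the specific steps are as follows:
Step 1, quantizing and encoding a11 and a12 in $\vec{a}_1$ and a21 in $\vec{a}_2$, and obtaining the reconstructed values $\hat{a}_{11}$, $\hat{a}_{12}$, and $\hat{a}_{21}$;
Step 2, encoding the sign bit sign13 of a13 and calculating the reconstructed value $\hat{a}_{13}$ of a13; if a13 is a positive number, sign13 equals 1, otherwise sign13 equals 0; $\hat{a}_{13}$ is calculated as follows:
$$\hat{a}_{13} = \begin{cases} \sqrt{1 - \hat{a}_{11}^2 - \hat{a}_{12}^2}, & \text{sign13} = 1 \\ -\sqrt{1 - \hat{a}_{11}^2 - \hat{a}_{12}^2}, & \text{sign13} = 0 \end{cases}$$
Step 3, solving the following system of equations for a22 and a23:
$$\hat{a}_{11}\hat{a}_{21} + \hat{a}_{12}a_{22} + \hat{a}_{13}a_{23} = 0,\qquad \hat{a}_{21}^2 + a_{22}^2 + a_{23}^2 = 1,$$
to obtain two solutions $\{\hat{a}_{22}, \hat{a}_{23}\}$ and $\{\tilde{a}_{22}, \tilde{a}_{23}\}$;
Step 4, comparing {a22, a23} with the two solutions: if $\{\hat{a}_{22}, \hat{a}_{23}\}$ is closer to {a22, a23}, selectflag is 0; otherwise selectflag is 1. selectflag is then encoded.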
In the embodiment of the invention, the above process exploits the fact that the coefficient vectors $\vec{a}_1$ and $\vec{a}_2$ of the mapping matrix are unit vectors and are orthogonal to each other. Because of errors introduced by quantization, the system of equations may have no solution, or the error of the solved {a22, a23} may be large, which causes problems such as an unstable mapping matrix. It is therefore possible to use only the property that $\vec{a}_1$ and $\vec{a}_2$ are both unit vectors, without using their mutual orthogonality. In this case, the specific encoding steps are as follows:
Step 1, quantizing and encoding a11 and a12 in $\vec{a}_1$ and a21 and a22 in $\vec{a}_2$;
Step 2, encoding the sign bits sign13 and sign23 of a13 and a23: if a13 is positive, sign13 equals 1, otherwise sign13 equals 0; if a23 is positive, sign23 equals 1, otherwise sign23 equals 0.
When the PCA grouping number M is 3 and the first and second principal components of the multi-channel sound signal are selected for encoding, the vectors $\vec{a}_1$ and $\vec{a}_2$ in the mapping matrix W(t, k) may also be encoded by first sorting the coefficients and then quantizing them. The specific steps are as follows:
Step 1, determining position information miniindex11 and miniindex12 according to the magnitude relation of the coefficients of $\vec{a}_1$, where miniindex11 is the position of the coefficient with the smallest absolute value and miniindex12 is the position of the coefficient with the second smallest absolute value; determining position information miniindex21 and miniindex22 according to the magnitude relation of the coefficients of $\vec{a}_2$, where miniindex21 is the position of the coefficient with the smallest absolute value and miniindex22 is the position of the coefficient with the second smallest absolute value;
Step 2, encoding miniindex11, miniindex12, miniindex21, and miniindex22, and quantizing and encoding the coefficients with the smallest and second smallest absolute values.
To improve coding efficiency, two or more of miniindex11, miniindex12, miniindex21, and miniindex22 may be combined and quantized jointly, and an entropy coding method such as Huffman coding may be used to reduce the bit rate.
In this case, the orthogonality of $\vec{a}_1$ and $\vec{a}_2$ can also be exploited to further reduce the number of coefficients to be quantized and encoded; the specific process is similar to that of encoding in the original order and is not described again here.
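The sort-then-quantize step for one 3-coefficient vector can be sketched as follows. Zero-based positions, the uniform quantizer step, and the function name `encode_sorted` are assumptions for illustration only.

```python
def encode_sorted(vec, step=1.0 / 64):
    """Encode a 3-coefficient unit vector by its two smallest-magnitude entries.

    Returns (pos_min, pos_second, q_min, q_second), where the positions are
    zero-based indices into `vec` and the q values are quantizer indices
    of a hypothetical uniform quantizer with the given step.
    """
    order = sorted(range(3), key=lambda i: abs(vec[i]))
    pos_min, pos_second = order[0], order[1]
    return (pos_min, pos_second,
            round(vec[pos_min] / step), round(vec[pos_second] / step))


code = encode_sorted((0.6, 0.0, 0.8))
```

For a unit vector of three coefficients, the smallest magnitude cannot exceed 1/√3 and the second smallest cannot exceed 1/√2, so the two quantized values occupy a narrower range than an arbitrary coefficient would.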
As can be seen from the above processing procedure, in the encoding method of the PCA mapping model according to the embodiment of the present invention, band combination processing is performed on the frequency bands obtained by band division to obtain band groups; a first mapping matrix is then determined for each band group, the first mapping matrix being the mapping matrix of a set of PCA mapping models shared by the bands in that band group; and the first mapping matrix is then quantized and encoded. Thus, when the PCA mapping model is encoded, a mapping matrix is not encoded for every divided frequency band. Instead, band combination reduces the number of mapping matrices to be encoded from one per frequency band to one per band group, which effectively reduces the coding bit rate.
The embodiment of the present invention further provides a decoding method of the PCA mapping model, which may specifically include the following processing procedures:
Step 1, determining a vector encoded in the encoded mapping matrix;
Step 2, decoding the encoded coefficients in the vector to obtain reconstructed values of the coefficients;
Step 3, reconstructing the vector from the reconstructed values of the coefficients;
Step 4, reconstructing the mapping matrix from the vector, wherein the mapping matrix is the one determined for each band group after band combination processing is performed on the divided frequency bands to obtain the band groups.
Preferably, when the number of PCA packets is 2, before reconstructing the vector from the reconstructed values of the coefficients, the decoding method may further include: decoding the coding of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector; the reconstructing the vector according to the reconstructed value of the coefficient may specifically include: and reconstructing the vector according to the position identification and the reconstruction value of the coefficient.
Preferably, when the number of PCA groups is 3, and the coefficients include a coefficient having a smallest absolute value and a coefficient having a second smallest absolute value in the vector, before reconstructing the vector from a reconstructed value of the coefficients, the method may further include: decoding the coding of the first position information and the coding of the second position information to obtain the first position information and the second position information, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value; the reconstructing the vector according to the reconstructed value of the coefficient may specifically include: determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the first position information and the second position information; and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
In particular, the decoding of the mapping matrix W (t, k) may comprise the steps of:
Step 1, decoding the encoded coefficients in each vector $\vec{a}_i$;
Step 2, reconstructing the mapping matrix W(t, k) from the decoded vectors $\vec{a}_i$.
For example, when the PCA grouping number M is 2, the specific decoding steps are as follows:
Step 1, decoding the position identifier Dataposflag and aq;
Step 2, determining $\vec{a}_1 = (a_{11}, a_{12})$ from Dataposflag and aq: if Dataposflag is 1, then $a_{11} = aq$ and $|a_{12}| = \sqrt{1 - aq^2}$; otherwise $a_{12} = aq$ and $|a_{11}| = \sqrt{1 - aq^2}$;
Step 3, reconstructing W(t, k).
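When the PCA grouping number M is 3 and the first and second principal components of the multi-channel sound signal were selected for encoding, the decoding method specifically comprises the following steps:
Step 1, decoding to obtain $\hat{a}_{11}$ and $\hat{a}_{12}$ in $\vec{a}_1$, $\hat{a}_{21}$ in $\vec{a}_2$, the sign bit sign13, and selectflag;
Step 2, calculating $\hat{a}_{13}$ from $\hat{a}_{11}$, $\hat{a}_{12}$, and sign13;
Step 3, solving the following system of equations for a22 and a23:
$$\hat{a}_{11}\hat{a}_{21} + \hat{a}_{12}a_{22} + \hat{a}_{13}a_{23} = 0,\qquad \hat{a}_{21}^2 + a_{22}^2 + a_{23}^2 = 1,$$
to obtain two solutions $\{\hat{a}_{22}, \hat{a}_{23}\}$ and $\{\tilde{a}_{22}, \tilde{a}_{23}\}$;
Step 4, if selectflag is 1, replacing $\{\hat{a}_{22}, \hat{a}_{23}\}$ with $\{\tilde{a}_{22}, \tilde{a}_{23}\}$;
Step 5, obtaining $\vec{a}_1 = (\hat{a}_{11}, \hat{a}_{12}, \hat{a}_{13})$ and $\vec{a}_2 = (\hat{a}_{21}, \hat{a}_{22}, \hat{a}_{23})$, and reconstructing W(t, k).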
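When quantization coding used only the property that the coefficient vectors $\vec{a}_1$ and $\vec{a}_2$ are both unit vectors, without using their mutual orthogonality, the corresponding decoding steps are as follows:
Step 1, decoding to obtain $\hat{a}_{11}$ and $\hat{a}_{12}$ in $\vec{a}_1$, $\hat{a}_{21}$ and $\hat{a}_{22}$ in $\vec{a}_2$, and the sign bits sign13 and sign23;
Step 2, calculating $\hat{a}_{13}$ and $\hat{a}_{23}$ from $\hat{a}_{11}$, $\hat{a}_{12}$, $\hat{a}_{21}$, $\hat{a}_{22}$, sign13, and sign23, and reconstructing $\vec{a}_1$ and $\vec{a}_2$;
Step 3, reconstructing W(t, k) from $\vec{a}_1$ and $\vec{a}_2$.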
When the method of sorting the coefficients before quantization coding was selected, the corresponding decoding steps are as follows:
Step 1, decoding to obtain miniindex11, miniindex12, miniindex21, and miniindex22, and the reconstructed values aq11, aq12, aq21, and aq22 of the coefficients with the smallest and second smallest absolute values;
Step 2, calculating, from the reconstructed values aq11, aq12, aq21, and aq22, the coefficients aq13 and aq23 with the largest absolute value in $\vec{a}_1$ and $\vec{a}_2$ respectively;
Step 3, reconstructing $\vec{a}_1$ and $\vec{a}_2$ from the decoded position information miniindex11, miniindex12, miniindex21, and miniindex22 and from aq11, aq12, aq21, aq22, aq13, and aq23;
Step 4, reconstructing W(t, k) from $\vec{a}_1$ and $\vec{a}_2$.
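The decoding steps for the sorted variant can be sketched for a single 3-coefficient vector as below. Zero-based positions, the uniform quantizer step, the function name `decode_sorted`, and the choice of a non-negative sign for the largest coefficient are assumptions; the patent does not state how that sign is determined.

```python
import math


def decode_sorted(pos_min, pos_second, q_min, q_second, step=1.0 / 64):
    """Rebuild a 3-coefficient unit vector from its two smallest entries.

    pos_min, pos_second : zero-based positions of the smallest and second
                          smallest magnitude coefficients
    q_min, q_second     : their quantizer indices (hypothetical uniform step)
    Assumption: the largest-magnitude coefficient is non-negative.
    """
    aq1, aq2 = q_min * step, q_second * step
    # The largest coefficient follows from the unit-norm condition.
    aq3 = math.sqrt(max(0.0, 1.0 - aq1 * aq1 - aq2 * aq2))
    pos_max = 3 - pos_min - pos_second       # the remaining position of 0, 1, 2
    vec = [0.0, 0.0, 0.0]
    vec[pos_min], vec[pos_second], vec[pos_max] = aq1, aq2, aq3
    return vec


vec_hat = decode_sorted(1, 0, 0, 38)
```

Applying the function once per encoded vector gives the rows from which W(t, k) is then rebuilt.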
Fig. 2 is a schematic structural diagram of an encoding apparatus of a PCA mapping model according to an embodiment of the present invention, the apparatus including:
a band combining unit 201, configured to perform band combining processing on each band after band division to obtain each band group;
a matrix determining unit 202, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit 201, where the first mapping matrix is a mapping matrix of a set of PCA mapping models shared by the bands in the band group;
an encoding unit 203, configured to perform quantization encoding on the first mapping matrix determined by the matrix determination unit 202.
Preferably, the frequency band combining unit 201 is specifically configured to perform frequency band combining processing on each frequency band after frequency band division according to characteristics of the frequency band signal and/or a psychoacoustic model and/or a model parameter similarity, so as to obtain each frequency band group.
Preferably, the frequency band combining unit 201 specifically includes:
a first frequency band combining subunit, configured to compare the energies of two adjacent frequency bands and, when the energy of one band is lower than an energy threshold calculated from the energy of the adjacent band, combine the two bands into one frequency band group; and/or
a second frequency band combining subunit, configured to calculate a masking threshold of a frequency band according to a psychoacoustic model and, when the energy of the band is lower than the masking threshold, combine the band with an adjacent band into one frequency band group; and/or
a third frequency band combining subunit, configured to calculate the distances between the mapping matrices of two or more adjacent frequency bands and, when the maximum distance is smaller than a distance threshold, combine the two or more bands into one frequency band group.
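The energy-based rule of the first subunit can be sketched as follows. The text does not say how the energy threshold is derived from the neighbouring band, so the sketch assumes a fixed fraction `ratio` of the neighbour's energy. The function name `group_bands_by_energy`, the default `ratio` and the chaining of successive merges into groups larger than two bands are all illustrative assumptions.

```python
def group_bands_by_energy(energies, ratio=0.1):
    """Group adjacent bands when one is much weaker than its neighbour.

    energies: per-band energies, in band order.
    ratio: hypothetical factor; the threshold for a band is ratio * neighbour energy.
    Returns a list of groups, each a list of band indices.
    """
    if not energies:
        return []
    groups = [[0]]
    for k in range(1, len(energies)):
        prev, cur = energies[k - 1], energies[k]
        # Merge if either band falls below the threshold derived from the other.
        if cur < ratio * prev or prev < ratio * cur:
            groups[-1].append(k)
        else:
            groups.append([k])
    return groups


groups = group_bands_by_energy([1.0, 0.05, 0.8, 0.9, 0.01])
```

Each resulting group would then share a single mapping matrix, as described for the matrix determining unit 202.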
Preferably, the mapping matrix is composed of a series of coefficients, and the encoding unit 203 is specifically configured to select coefficients to be encoded from the first mapping matrix and perform quantization encoding according to the dimensionality of PCA analysis and the number of groups of the multi-channel sound signal to be encoded.
Preferably, the encoding unit 203 specifically includes:
a vector determining subunit, configured to determine the vector to be encoded in the mapping matrix W(t, k) according to the PCA grouping number M and the grouping of the multi-channel sound signal selected for encoding;
an encoding subunit, configured to perform quantization encoding on the coefficients to be encoded in the vector determined by the vector determining subunit.
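For the case where the PCA grouping number is 3, the coefficient selection performed by the encoding subunit might look like the sketch below: it locates the coefficients with the smallest and second-smallest absolute values and quantizes only those two, together with their positions. The uniform scalar quantizer, its step size `step` and the name `encode_vector` are assumptions made for illustration; the patent text here does not specify the quantizer.

```python
def encode_vector(vec, step=1.0 / 64):
    """Pick the two smallest-magnitude coefficients of a 3-element vector and quantize them.

    step: hypothetical uniform quantizer step size.
    Returns the two positions (0-based) and the two integer quantization indices.
    """
    if len(vec) != 3:
        raise ValueError("expected a 3-element vector")
    # Positions sorted by increasing absolute value.
    order = sorted(range(3), key=lambda i: abs(vec[i]))
    min_idx, second_idx = order[0], order[1]
    return {
        "min_idx": min_idx,
        "second_idx": second_idx,
        "q_min": round(vec[min_idx] / step),
        "q_second": round(vec[second_idx] / step),
    }


code = encode_vector([0.9, -0.1, 0.42])
```

A decoder using the same assumed step size would recover the two values as `q * step` and derive the third coefficient from the other two.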
Correspondingly, the embodiment of the invention also provides a decoding device of the PCA mapping model, and the device comprises:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
a matrix reconstruction unit, configured to reconstruct the mapping matrix according to the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
Preferably, the decoding unit is further configured to: when the PCA grouping number is 2, before the vector reconstruction unit reconstructs the vector according to the reconstruction value of the coefficient, decoding the code of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector;
the vector reconstruction unit is specifically configured to reconstruct the vector according to the position identifier obtained by the decoding unit and the reconstruction value of the coefficient.
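A minimal sketch of this reconstruction for a PCA grouping number of 2 is given below. It assumes a two-element unit-norm vector, so the magnitude of the coefficient that was not encoded follows from the encoded one. The name `rebuild_pair`, the 0-based position and the optional `other_negative` flag (standing in for a transmitted sign bit) are illustrative assumptions.

```python
import math


def rebuild_pair(position, a_coded, other_negative=False):
    """Rebuild a 2-element unit-norm vector from one decoded coefficient.

    position: decoded position identifier of the encoded coefficient (0 or 1).
    a_coded: reconstructed value of the encoded coefficient.
    other_negative: hypothetical sign flag for the coefficient that was not encoded.
    """
    if position not in (0, 1):
        raise ValueError("position must be 0 or 1")
    # Unit norm: a_coded^2 + other^2 = 1; clamp against quantization error.
    magnitude = math.sqrt(max(0.0, 1.0 - a_coded ** 2))
    vec = [0.0, 0.0]
    vec[position] = a_coded
    vec[1 - position] = -magnitude if other_negative else magnitude
    return vec


vec = rebuild_pair(1, 0.6)
```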
Preferably, the decoding unit is further configured to: when the PCA grouping number is 3 and the coefficients comprise the coefficient with the smallest absolute value and the coefficient with the second-smallest absolute value in the vector, decode, before the vector reconstruction unit reconstructs the vector according to the reconstruction values of the coefficients, the encoded first position information and the encoded second position information to obtain the first position information and the second position information, wherein the first position information indicates the position of the coefficient with the smallest absolute value and the second position information indicates the position of the coefficient with the second-smallest absolute value;
the vector reconstruction unit is specifically configured to: determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector obtained by the decoding unit, the reconstruction value of the coefficient with the second smallest absolute value, the first position information and the second position information; and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (13)

1. An encoding method for a Principal Component Analysis (PCA) mapping model, characterized by comprising the following steps:
performing frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
determining a first mapping matrix for each of the band groups, the first mapping matrix being a mapping matrix of a set of PCA mapping models common to the bands in the band group;
performing quantization encoding on the first mapping matrix, the mapping matrix being composed of a series of coefficients, wherein a vector to be encoded in the first mapping matrix is determined according to the PCA grouping number and the grouping of the multi-channel sound signal selected for encoding, and the coefficients to be encoded in the vector are quantization-encoded.
2. The method of claim 1, wherein the performing band combination processing on each frequency band after the frequency band division to obtain each frequency band group specifically comprises:
and performing frequency band combination processing on each frequency band after frequency band division according to the characteristics of the frequency band signals and/or the psychoacoustic model and/or the similarity of model parameters to obtain each frequency band group.
3. The method of claim 1, wherein the performing band combination processing on each frequency band after the frequency band division to obtain each frequency band group specifically comprises:
comparing the energies of two adjacent frequency bands and, when the energy of one band is lower than an energy threshold calculated from the energy of the adjacent band, combining the two bands into one frequency band group; and/or
calculating a masking threshold of a frequency band according to a psychoacoustic model and, when the energy of the band is lower than the masking threshold, combining the band with an adjacent band into one frequency band group; and/or
calculating the distances between the mapping matrices of two or more adjacent frequency bands and, when the maximum distance is smaller than a distance threshold, combining the two or more bands into one frequency band group.
4. The method according to claim 1, wherein the quantization encoding of the coefficients to be encoded in the vector specifically comprises:
selecting a first coefficient from the vector according to the property that the first mapping matrix is an orthonormal matrix or the property that the first mapping matrix is an identity matrix, performing quantization encoding on the first coefficient, and either not encoding the remaining coefficients in the vector or encoding only their sign bits.
5. The method of claim 4, wherein the PCA grouping number is 2, the method further comprising:
determining a position identifier, the position identifier indicating the position of the first coefficient in the vector;
performing quantization encoding on the position identifier when the first coefficient is quantization-encoded.
6. The method according to any of claims 1 to 4, wherein the PCA grouping number is 3, and the performing quantization coding on the coefficients to be coded in the vector specifically comprises:
determining first position information and second position information according to the magnitude relation of each coefficient in the vector, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value;
and carrying out quantization coding on the coefficient with the minimum absolute value, the coefficient with the second minimum absolute value, the first position information and the second position information in the vector.
7. A method for decoding a Principal Component Analysis (PCA) mapping model, the method comprising:
determining a vector encoded in the encoded mapping matrix;
decoding the coded coefficient in the vector to obtain a reconstruction value of the coefficient;
reconstructing the vector from the reconstructed values of the coefficients;
reconstructing the mapping matrix according to the vector, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
8. The method of claim 7, wherein the PCA grouping number is 2, and wherein prior to said reconstructing the vector from the reconstructed values of the coefficients, the method further comprises:
decoding the coding of the position identifier to obtain the position identifier, wherein the position identifier is used for indicating the position of the coded coefficient in the vector;
reconstructing the vector according to the reconstruction value of the coefficient specifically includes: and reconstructing the vector according to the position identification and the reconstruction value of the coefficient.
9. The method of claim 7, wherein the PCA grouping number is 3, the coefficients include the coefficient with the smallest absolute value and the coefficient with the second-smallest absolute value in the vector, and before reconstructing the vector from the reconstructed values of the coefficients, the method further comprises:
decoding the coding of the first position information and the coding of the second position information to obtain the first position information and the second position information, wherein the first position information is used for indicating the position of the coefficient with the smallest absolute value, and the second position information is used for indicating the position of the coefficient with the second smallest absolute value;
reconstructing the vector according to the reconstruction value of the coefficient specifically includes:
determining a reconstruction value of a coefficient with the maximum absolute value in the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the first position information and the second position information;
and reconstructing the vector according to the reconstruction value of the coefficient with the minimum absolute value in the vector, the reconstruction value of the coefficient with the second minimum absolute value, the reconstruction value of the coefficient with the maximum absolute value in the vector, the first position information and the second position information.
10. An apparatus for encoding a Principal Component Analysis (PCA) mapping model, the apparatus comprising:
a frequency band combination unit, configured to perform frequency band combination processing on each frequency band after frequency band division to obtain each frequency band group;
a matrix determining unit, configured to determine a first mapping matrix for each of the band groups obtained by the band combining unit, where the first mapping matrix is a mapping matrix of a group of PCA mapping models shared by the bands in the band group;
an encoding unit, configured to perform quantization encoding on the first mapping matrix determined by the matrix determining unit, the mapping matrix being composed of a series of coefficients, wherein the encoding unit determines a vector to be encoded in the first mapping matrix according to the PCA grouping number and the grouping of the multi-channel sound signal selected for encoding, and performs quantization encoding on the coefficients to be encoded in the vector.
11. The apparatus according to claim 10, wherein the frequency band combining unit is specifically configured to perform frequency band combining processing on each frequency band after the frequency band division according to characteristics of the frequency band signal and/or a psychoacoustic model and/or a similarity of model parameters to obtain each frequency band group.
12. The apparatus as claimed in claim 10, wherein said band combining unit specifically comprises:
a first frequency band combining subunit, configured to compare the energies of two adjacent frequency bands and, when the energy of one band is lower than an energy threshold calculated from the energy of the adjacent band, combine the two bands into one frequency band group; and/or
a second frequency band combining subunit, configured to calculate a masking threshold of a frequency band according to a psychoacoustic model and, when the energy of the band is lower than the masking threshold, combine the band with an adjacent band into one frequency band group; and/or
a third frequency band combining subunit, configured to calculate the distances between the mapping matrices of two or more adjacent frequency bands and, when the maximum distance is smaller than a distance threshold, combine the two or more bands into one frequency band group.
13. An apparatus for decoding a Principal Component Analysis (PCA) mapping model, the apparatus comprising:
a vector determination unit for determining a vector encoded in the encoded mapping matrix;
a decoding unit, configured to decode the encoded coefficient in the vector determined by the vector determination unit to obtain a reconstruction value of the coefficient;
a vector reconstruction unit for reconstructing the vector from the reconstructed value of the coefficient obtained by the decoding unit;
a matrix reconstruction unit, configured to reconstruct the mapping matrix according to the vector reconstructed by the vector reconstruction unit, wherein the mapping matrix is the one determined for each frequency band group after frequency band combining processing has been performed on the frequency bands obtained by frequency band division.
CN201410710991.2A 2014-11-28 2014-11-28 Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model Active CN105632505B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410710991.2A CN105632505B (en) 2014-11-28 2014-11-28 Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model
PCT/CN2014/095393 WO2016082278A1 (en) 2014-11-28 2014-12-29 Encoding/decoding method and apparatus for principal component analysis (pca) mapping module

Publications (2)

Publication Number Publication Date
CN105632505A (en) 2016-06-01
CN105632505B (en) 2019-12-20

Family

ID=56047346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410710991.2A Active CN105632505B (en) 2014-11-28 2014-11-28 Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model

Country Status (2)

Country Link
CN (1) CN105632505B (en)
WO (1) WO2016082278A1 (en)

Also Published As

Publication number Publication date
CN105632505A (en) 2016-06-01
WO2016082278A1 (en) 2016-06-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant