WO2021143692A1 - An audio coding and decoding method and audio coding and decoding device - Google Patents
An audio coding and decoding method and audio coding and decoding device
- Publication number
- WO2021143692A1 · PCT/CN2021/071328 · CN2021071328W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- band signal
- current
- signal
- current frame
- parameter
- Prior art date
Classifications
- G10L19/00 — Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; coding or decoding of speech or audio signals using source filter models or psychoacoustic analysis:
- G10L19/008 — multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- G10L19/02 — using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204 — using subband decomposition
- G10L19/167 — audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
- G10L19/20 — vocoders using multiple modes, using sound class specific coding, hybrid encoders or object based coding
- G10L25/18 — speech or voice analysis techniques characterised by the extracted parameters being spectral information of each sub-band
- G10L21/038 — speech enhancement, e.g. noise reduction or echo cancellation, using band spreading techniques
Definitions
- This application relates to the technical field of audio signal coding and decoding, and in particular to an audio coding and decoding method and audio coding and decoding equipment.
- the embodiments of the present application provide an audio coding and decoding method and an audio coding and decoding device, which can improve the quality of decoded audio signals.
- A first aspect of the present invention provides an audio encoding method, the method including: acquiring a current frame of an audio signal, the current frame including a high-band signal and a low-band signal; obtaining a first encoding parameter according to the high-band signal and the low-band signal; obtaining a second encoding parameter of the current frame according to the high-band signal, the second encoding parameter including tonal component information; and performing code stream multiplexing on the first encoding parameter and the second encoding parameter to obtain an encoded code stream.
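The encoding flow of this aspect can be sketched as follows. This is an illustrative sketch only, not the patent's implementation: the FFT-based band split, the parameter contents, and the representation of "code stream multiplexing" as packing a dictionary are all assumptions for demonstration.

```python
import numpy as np

def encode_frame(frame, band_threshold_bin):
    """Illustrative encoder for one frame: split into bands, derive the
    first and second encoding parameters, and multiplex them."""
    spectrum = np.fft.rfft(frame)
    low_band = spectrum[:band_threshold_bin]   # low-band signal
    high_band = spectrum[band_threshold_bin:]  # high-band signal

    # First encoding parameter: derived from both bands (placeholder: energies).
    first_param = {"low_energy": float(np.sum(np.abs(low_band) ** 2)),
                   "high_energy": float(np.sum(np.abs(high_band) ** 2))}

    # Second encoding parameter: tonal component information from the high band
    # (here just the position and amplitude of the strongest bin).
    mags = np.abs(high_band)
    peak = int(np.argmax(mags)) if mags.size else None
    second_param = {"tone_position": peak,
                    "tone_amplitude": float(mags[peak]) if peak is not None else 0.0}

    # "Code stream multiplexing": here, simply packing both parameter sets.
    return {"first": first_param, "second": second_param}

frame = np.cos(2 * np.pi * 0.3 * np.arange(64))  # a simple tonal test frame
stream = encode_frame(frame, band_threshold_bin=16)
```

With this 64-sample cosine, the tone falls near bin 19, i.e. index 3 of the high band when the split is at bin 16.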
- Obtaining the second encoding parameter of the current frame according to the high-band signal includes: detecting whether the high-band signal includes a tonal component; and if the high-band signal includes a tonal component, obtaining the second encoding parameter of the current frame according to the high-band signal.
- The tonal component information includes at least one of the following: quantity information of the tonal components, position information of the tonal components, amplitude information of the tonal components, or energy information of the tonal components.
- the second encoding parameter further includes a noise floor parameter.
- the noise floor parameter is used to indicate the noise floor energy.
- A second aspect of the present invention provides an audio decoding method, the method including: obtaining an encoded code stream; demultiplexing the encoded code stream to obtain a first encoding parameter and a second encoding parameter of a current frame of an audio signal, the second encoding parameter of the current frame including tonal component information; obtaining a first high-band signal of the current frame and a first low-band signal of the current frame according to the first encoding parameter; obtaining a second high-band signal of the current frame according to the second encoding parameter, the second high-band signal including a reconstructed tone signal; and obtaining a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame.
- The first high-band signal includes at least one of: a decoded high-band signal obtained by direct decoding according to the first encoding parameter, and an extended high-band signal obtained by performing band extension based on the first low-band signal.
- If the first high-band signal includes the extended high-band signal, obtaining the fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame includes: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies a preset condition, obtaining the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point and the noise floor information of the current subband; or, if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy the preset condition, obtaining the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point.
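Taking the preset condition to be that the reconstructed tone spectrum value is zero or below a threshold (as stated later in this aspect), the per-frequency-point fusion rule can be sketched as below; the noise floor gain is passed in as a single scalar for simplicity, which is an assumption of this sketch.

```python
import numpy as np

def fuse_high_band(tone_spec, ext_spec, noise_floor_gain, threshold=1e-6):
    """Fuse the reconstructed tone spectrum with the extended high-band
    spectrum, one frequency point at a time."""
    fused = np.empty_like(ext_spec)
    for k in range(len(ext_spec)):
        if abs(tone_spec[k]) <= threshold:
            # Preset condition met: no tone at this frequency point, so use
            # the extended high-band spectrum shaped by the noise floor gain.
            fused[k] = noise_floor_gain * ext_spec[k]
        else:
            # A tone was reconstructed at this frequency point: use it.
            fused[k] = tone_spec[k]
    return fused

tone = np.array([0.0, 0.0, 4.0, 0.0])
ext = np.array([1.0, 2.0, 1.0, 2.0])
fused = fuse_high_band(tone, ext, noise_floor_gain=0.5)
```

Only the third frequency point carries a reconstructed tone here; the others fall back to the gain-scaled extended spectrum.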
- the noise floor information includes a noise floor gain parameter.
- The noise floor gain parameter of the current subband is obtained based on the width of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the noise floor energy of the current subband.
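The exact gain formula is not given in this text; one reading consistent with the listed inputs (subband width, extended-spectrum energy, noise floor energy) is an energy-matching gain, sketched here as an assumption:

```python
import math

def noise_floor_gain(sb_width, ext_spectrum_energy, noise_floor_energy):
    """Assumed formula: scale the extended high-band spectrum so that the
    subband's total energy matches width * per-bin noise floor energy."""
    if ext_spectrum_energy <= 0.0:
        return 0.0  # nothing to scale
    return math.sqrt(sb_width * noise_floor_energy / ext_spectrum_energy)

g = noise_floor_gain(sb_width=8, ext_spectrum_energy=32.0, noise_floor_energy=1.0)
```

For a subband of width 8 with extended-spectrum energy 32 and noise floor energy 1, the gain comes out to 0.5.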
- If the first high-band signal includes the decoded high-band signal and the extended high-band signal, obtaining the fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame includes: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy a preset condition, obtaining the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point; or, if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies the preset condition, obtaining the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point, the spectrum of the decoded high-band signal at the current frequency point, and the noise floor information of the current subband.
- the noise floor information includes a noise floor gain parameter.
- The noise floor gain parameter of the current subband is obtained based on the width of the current subband, the noise floor energy of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the energy of the spectrum of the decoded high-band signal of the current subband.
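For this variant the formula is again unspecified; one plausible assumption is that the decoded high-band spectrum is kept unscaled and the extended spectrum is scaled so that the subband's total energy reaches the noise floor target:

```python
import math

def noise_floor_gain_with_decoded(sb_width, noise_floor_energy,
                                  ext_spectrum_energy, dec_spectrum_energy):
    """Assumed variant: the decoded high-band contribution is subtracted from
    the noise floor target before scaling the extended spectrum (clamped at
    zero if the decoded energy already meets the target)."""
    if ext_spectrum_energy <= 0.0:
        return 0.0
    target = sb_width * noise_floor_energy - dec_spectrum_energy
    return math.sqrt(max(target, 0.0) / ext_spectrum_energy)

g = noise_floor_gain_with_decoded(8, 1.0, 16.0, 4.0)
```

With width 8, noise floor energy 1, extended energy 16 and decoded energy 4, the remaining target is 4 and the gain is 0.5; if the decoded energy already exceeds the target, the gain clamps to 0.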
- The method further includes: selecting at least one signal from the decoded high-band signal, the extended high-band signal, and the reconstructed tone signal according to preset indication information or indication information obtained by decoding, to obtain the fused high-band signal of the current frame.
- the second encoding parameter further includes a noise floor parameter for indicating the energy of the noise floor.
- the preset condition includes: the value of the reconstructed tone signal spectrum is 0 or less than a preset threshold.
- A third aspect of the present invention provides an audio encoder, including: a signal acquisition unit, configured to acquire a current frame of an audio signal, the current frame including a high-band signal and a low-band signal; a parameter acquisition unit, configured to obtain a first encoding parameter according to the high-band signal and the low-band signal, and to obtain a second encoding parameter of the current frame according to the high-band signal, the second encoding parameter including tonal component information; and an encoding unit, configured to perform code stream multiplexing on the first encoding parameter and the second encoding parameter to obtain an encoded code stream.
- The parameter acquisition unit is further specifically configured to: detect whether the high-band signal includes a tonal component; and if the high-band signal includes a tonal component, obtain the second encoding parameter of the current frame according to the high-band signal.
- The tonal component information includes at least one of the following: quantity information of the tonal components, position information of the tonal components, amplitude information of the tonal components, or energy information of the tonal components.
- the second encoding parameter further includes a noise floor parameter.
- the noise floor parameter is used to indicate the noise floor energy.
- A fourth aspect of the present invention provides an audio decoder, including: a receiving unit, configured to obtain an encoded code stream; a demultiplexing unit, configured to demultiplex the encoded code stream to obtain a first encoding parameter and a second encoding parameter of a current frame of an audio signal, the second encoding parameter of the current frame including tonal component information; an acquisition unit, configured to obtain a first high-band signal of the current frame and a first low-band signal of the current frame according to the first encoding parameter, and to obtain a second high-band signal of the current frame according to the second encoding parameter, the second high-band signal including a reconstructed tone signal; and a fusion unit, configured to obtain a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame.
- The first high-band signal includes at least one of: a decoded high-band signal obtained by direct decoding according to the first encoding parameter, and an extended high-band signal obtained by performing band extension based on the first low-band signal.
- If the first high-band signal includes the extended high-band signal, the fusion unit is specifically configured to: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies a preset condition, obtain the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point and the noise floor information of the current subband.
- the noise floor information includes a noise floor gain parameter.
- The noise floor gain parameter of the current subband is obtained based on the width of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the noise floor energy of the current subband.
- The fusion unit is specifically configured to: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy a preset condition, obtain the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point; or, if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies the preset condition, obtain the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point, the spectrum of the decoded high-band signal at the current frequency point, and the noise floor information of the current subband.
- the noise floor information includes a noise floor gain parameter.
- The noise floor gain parameter of the current subband is obtained based on the width of the current subband, the noise floor energy of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the energy of the spectrum of the decoded high-band signal of the current subband.
- The fusion unit is further configured to: select at least one signal from the decoded high-band signal, the extended high-band signal, and the reconstructed tone signal according to preset indication information or indication information obtained by decoding, to obtain the fused high-band signal of the current frame.
- the second encoding parameter further includes a noise floor parameter for indicating the energy of the noise floor.
- the preset condition includes: the value of the reconstructed tone signal spectrum is 0 or less than a preset threshold.
- A fifth aspect of the present invention provides an audio encoding device, including at least one processor, the at least one processor being configured to be coupled with a memory and to read and execute instructions in the memory, so as to implement the method according to any one of the first aspect.
- A sixth aspect of the present invention provides an audio decoding device, including at least one processor, the at least one processor being configured to be coupled with a memory and to read and execute instructions in the memory, so as to implement the method according to any one of the second aspect.
- An embodiment of the present application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the method described in the first aspect or the second aspect above.
- Embodiments of the present application provide a computer program product containing instructions that, when run on a computer, cause the computer to execute the method described in the first aspect or the second aspect above.
- an embodiment of the present application provides a communication device.
- the communication device may include entities such as audio codec equipment or a chip.
- The communication device includes a processor and, optionally, a memory; the memory is configured to store instructions, and the processor is configured to execute the instructions in the memory, so that the communication device executes the method according to any one of the foregoing first aspect or second aspect.
- This application provides a chip system that includes a processor for supporting an audio codec device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
- the chip system further includes a memory, and the memory is used to store necessary program instructions and data of the audio codec device.
- The chip system may consist of chips, or may include chips and other discrete devices.
- The audio encoder in the embodiments of the present invention encodes the tonal component information, so that the audio decoder can decode the audio signal according to the received tonal component information and more accurately restore the tonal components in the audio signal, thereby improving the quality of the decoded audio signal.
- FIG. 1 is a schematic structural diagram of an audio coding and decoding system provided by an embodiment of this application.
- FIG. 2 is a schematic flowchart of an audio coding method provided by an embodiment of this application.
- FIG. 3 is a schematic flowchart of an audio decoding method provided by an embodiment of this application.
- FIG. 4 is a schematic diagram of a mobile terminal according to an embodiment of this application.
- FIG. 5 is a schematic diagram of a network element according to an embodiment of this application.
- FIG. 6 is a schematic diagram of the composition structure of an audio coding device provided by an embodiment of this application.
- FIG. 7 is a schematic diagram of the composition structure of an audio decoding device provided by an embodiment of this application.
- FIG. 8 is a schematic diagram of the composition structure of another audio coding device provided by an embodiment of this application.
- FIG. 9 is a schematic diagram of the composition structure of another audio decoding device provided by an embodiment of this application.
- the audio signal in the embodiment of the present application refers to the input signal in the audio encoding device.
- the audio signal may include multiple frames.
- The current frame may specifically refer to a certain frame in the audio signal; in the embodiments of the present application, the coding and decoding of the current frame are used as an example for illustration.
- The previous frame or the next frame of the current frame in the audio signal can be coded and decoded according to the coding and decoding mode of the current frame.
- The audio signal in the embodiments of the present application may be a mono audio signal, or may be a stereo signal.
- The stereo signal may be an original stereo signal, a stereo signal composed of two signals (a left channel signal and a right channel signal) included in a multi-channel signal, or a stereo signal formed from signals included in a multi-channel signal.
- Fig. 1 is a schematic structural diagram of an audio coding and decoding system according to an exemplary embodiment of the application.
- the audio codec system includes an encoding component 110 and a decoding component 120.
- the encoding component 110 is used to encode the current frame (audio signal) in the frequency domain or the time domain.
- the encoding component 110 can be implemented by software; alternatively, it can also be implemented by hardware; or, it can also be implemented by a combination of software and hardware, which is not limited in the embodiments of the present application.
- When the encoding component 110 encodes the current frame in the frequency domain or the time domain, in a possible implementation manner, the steps shown in FIG. 2 may be included.
- The encoding component 110 and the decoding component 120 may be connected in a wired or wireless manner, and the decoding component 120 may obtain the encoded code stream generated by the encoding component 110 through the connection between them; alternatively, the encoding component 110 may store the generated code stream in a memory, and the decoding component 120 reads the code stream from the memory.
- the decoding component 120 can be implemented by software; alternatively, it can also be implemented by hardware; or, it can also be implemented by a combination of software and hardware, which is not limited in the embodiment of the present application.
- When the decoding component 120 decodes the current frame (audio signal) in the frequency domain or the time domain, in a possible implementation manner, the steps shown in FIG. 3 may be included.
- the encoding component 110 and the decoding component 120 can be provided in the same device; or, they can also be provided in different devices.
- The device may be a terminal with audio signal processing functions, such as a mobile phone, tablet computer, laptop computer, desktop computer, Bluetooth speaker, voice recorder, or wearable device; it may also be a network element with audio signal processing capability in a core network or wireless network. This embodiment does not limit this.
- the encoding component 110 is installed in the mobile terminal 130
- the decoding component 120 is installed in the mobile terminal 140.
- The mobile terminal 130 and the mobile terminal 140 are independent of each other and both have audio signal processing capabilities.
- The mobile terminal 130 or the mobile terminal 140 may be, for example, a mobile phone, a wearable device, a virtual reality (VR) device, or an augmented reality (AR) device; the following takes as an example the case where the mobile terminal 130 and the mobile terminal 140 are connected through a wireless or wired network.
- The mobile terminal 130 may include an acquisition component 131, an encoding component 110, and a channel encoding component 132, where the acquisition component 131 is connected to the encoding component 110, and the encoding component 110 is connected to the channel encoding component 132.
- the mobile terminal 140 may include an audio playing component 141, a decoding component 120, and a channel decoding component 142.
- the audio playing component 141 is connected to the decoding component 120
- the decoding component 120 is connected to the channel decoding component 142.
- After the mobile terminal 130 collects the audio signal through the acquisition component 131, it encodes the audio signal through the encoding component 110 to obtain an encoded code stream; then, the channel encoding component 132 encodes the encoded code stream to obtain a transmission signal.
- the mobile terminal 130 transmits the transmission signal to the mobile terminal 140 through a wireless or wired network.
- After receiving the transmission signal, the mobile terminal 140 decodes the transmission signal through the channel decoding component 142 to obtain a code stream, decodes the code stream through the decoding component 120 to obtain an audio signal, and plays the audio signal through the audio playing component 141. It can be understood that the mobile terminal 130 may also include the components included in the mobile terminal 140, and the mobile terminal 140 may also include the components included in the mobile terminal 130.
- The following takes as an example the case where the encoding component 110 and the decoding component 120 are provided in a network element 150 capable of processing audio signals in the same core network or wireless network.
- the network element 150 includes a channel decoding component 151, a decoding component 120, an encoding component 110, and a channel encoding component 152.
- the channel decoding component 151 is connected to the decoding component 120
- the decoding component 120 is connected to the encoding component 110
- the encoding component 110 is connected to the channel encoding component 152.
- After the channel decoding component 151 receives a transmission signal sent by another device, it decodes the transmission signal to obtain a first encoded code stream; the decoding component 120 decodes the first encoded code stream to obtain an audio signal; the encoding component 110 encodes the audio signal to obtain a second encoded code stream; and the channel encoding component 152 encodes the second encoded code stream to obtain a transmission signal.
- the other device may be a mobile terminal with audio signal processing capability; or, it may also be other network elements with audio signal processing capability, which is not limited in this embodiment.
- the encoding component 110 and the decoding component 120 in the network element can transcode the encoded code stream sent by the mobile terminal.
- the device installed with the encoding component 110 may be referred to as an audio encoding device.
- the audio encoding device may also have an audio decoding function, which is not limited in the implementation of this application.
- the device installed with the decoding component 120 may be referred to as an audio decoding device.
- the audio decoding device may also have an audio encoding function, which is not limited in the implementation of this application.
- Figure 2 depicts the audio coding method process provided by an embodiment of the present invention, including:
- 201 Acquire a current frame of an audio signal, where the current frame includes a high-band signal and a low-band signal.
- the current frame can be any frame in the audio signal, and the current frame can include a high-band signal and a low-band signal.
- The division into the high-band signal and the low-band signal can be determined by a band threshold: the signal above the band threshold is the high-band signal, and the signal below the band threshold is the low-band signal.
- The band threshold can be determined according to the transmission bandwidth and the data processing capabilities of the encoding component 110 and the decoding component 120, and is not limited here.
- The high-band signal and the low-band signal are relative: for example, a signal below a certain frequency is a low-band signal, and a signal above that frequency is a high-band signal (the signal at the frequency itself may be assigned to either the low-band signal or the high-band signal).
- The frequency varies according to the bandwidth of the current frame. For example, when the current frame is a 0-8 kHz wideband signal, the frequency may be 4 kHz; when the current frame is a 0-16 kHz super-wideband signal, the frequency may be 8 kHz.
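The band split described above can be sketched as follows; splitting in the FFT domain and the function name are illustrative assumptions, and the boundary bin is assigned to the high band here purely by convention.

```python
import numpy as np

def split_bands(frame, sample_rate_hz, threshold_hz):
    """Split one frame's spectrum at a band threshold, e.g. 4 kHz for a
    0-8 kHz wideband signal or 8 kHz for a 0-16 kHz super-wideband signal."""
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate_hz)
    low = spectrum[freqs < threshold_hz]    # low-band signal
    high = spectrum[freqs >= threshold_hz]  # high-band signal (the boundary
                                            # bin could go to either band)
    return low, high

frame = np.zeros(16)
low, high = split_bands(frame, sample_rate_hz=16000, threshold_hz=4000)
```

A 16-sample frame at 16 kHz yields 9 real-FFT bins spaced 1 kHz apart, so a 4 kHz threshold leaves 4 bins in the low band and 5 in the high band.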
- the first coding parameters may specifically include: time domain noise shaping parameters, frequency domain noise shaping parameters, frequency spectrum quantization parameters, frequency band extension parameters, and so on.
- the tonal component information includes at least one of the following: quantity information of the tonal component, position information of the tonal component, amplitude information of the tonal component, or energy information of the tonal component.
- Only one of the amplitude information and the energy information may be included.
- step 203 may be performed only when the high frequency band signal includes tonal components.
- Obtaining the second encoding parameter of the current frame according to the high-band signal may include: detecting whether the high-band signal includes a tonal component; and if the high-band signal includes a tonal component, obtaining the second encoding parameter of the current frame according to the high-band signal.
- the second encoding parameter may further include a noise floor parameter, for example, the noise floor parameter may be used to indicate noise floor energy.
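The text does not specify how tonal components are detected; a common approach, assumed here for illustration, is thresholding on the peak-to-average magnitude ratio of the high-band spectrum, with the detected components forming the tonal component information of the second encoding parameter.

```python
import numpy as np

def detect_tonal_components(high_band_mags, ratio_threshold=4.0):
    """Assumed detector: a bin is a tonal component if its magnitude exceeds
    ratio_threshold times the mean magnitude of the high band."""
    mean_mag = np.mean(high_band_mags)
    if mean_mag == 0.0:
        return []  # silent band: no tonal components
    positions = np.nonzero(high_band_mags > ratio_threshold * mean_mag)[0]
    # Quantity, position, and amplitude information of the tonal components.
    return [{"position": int(p), "amplitude": float(high_band_mags[p])}
            for p in positions]

mags = np.array([0.1, 0.1, 5.0, 0.1, 0.1, 0.1, 0.1, 0.1])
tones = detect_tonal_components(mags)
```

Here the single strong bin at index 2 stands far above the band average, so exactly one tonal component is reported.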
- the audio encoder in this embodiment of the present invention encodes the tonal component information, so that the audio decoder can decode the audio signal according to the received tonal component information and restore the tonal components in the audio signal more accurately, thereby improving the quality of the decoded audio signal.
- FIG. 3 depicts the flow of an audio decoding method provided by another embodiment of the present invention, including:
- for the first encoding parameter and the second encoding parameter, refer to the description of the encoding method; details are not repeated here.
- the first high-band signal includes at least one of: a decoded high-band signal obtained by direct decoding according to the first encoding parameter, or an extended high-band signal obtained by performing band extension according to the first low-band signal.
- obtaining the fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame may include: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies a preset condition, obtaining the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point and the noise floor information of the current subband; or, if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy the preset condition, obtaining the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point.
- the noise floor information may include a noise floor gain parameter.
- the noise floor gain parameter of the current subband is obtained based on the width of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the noise floor energy of the current subband.
- obtaining the fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame may include: if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy a preset condition, obtaining the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point; or, if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies the preset condition, obtaining the fused high-band signal at the current frequency point according to the spectrum of the extended high-band signal at the current frequency point, the spectrum of the decoded high-band signal at the current frequency point, and the noise floor information of the current subband.
- the noise floor information includes noise floor gain parameters.
- the noise floor gain parameter of the current subband is obtained based on the width of the current subband, the noise floor energy of the current subband, the energy of the spectrum of the extended high-band signal of the current subband, and the energy of the spectrum of the decoded high-band signal of the current subband.
- in one embodiment, the preset condition includes: the value of the reconstructed tone signal spectrum is 0. In another embodiment of the present invention, the preset condition includes: the value of the reconstructed tone signal spectrum is less than a preset threshold, where the preset threshold is a real number greater than 0.
- the audio encoder in this embodiment of the present invention encodes the tonal component information, so that the audio decoder can decode the audio signal according to the received tonal component information and restore the tonal components in the audio signal more accurately, thereby improving the quality of the decoded audio signal.
- the audio decoding method described in FIG. 3 may further include:
- according to preset indication information or indication information obtained by decoding, at least one signal is selected from the decoded high-band signal, the extended high-band signal, and the reconstructed tone signal to obtain the fused high-band signal of the current frame.
- the spectrum of the decoded high-band signal obtained by direct decoding according to the first encoding parameter is denoted as enc_spec[sfb];
- the spectrum of the extended high-band signal obtained by performing band extension according to the first low-band signal is denoted as patch_spec[sfb];
- the spectrum of the reconstructed tone signal is denoted as recon_spec[sfb].
- the noise floor energy is denoted as E_noise_floor[sfb].
- the noise floor energy can be obtained, for example, from the noise floor energy parameter E_noise_floor[tile] of the spectrum interval according to the correspondence between spectrum intervals and subbands; that is, for each sfb in the tile-th spectrum interval, the noise floor energy is equal to E_noise_floor[tile].
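The tile-to-subband assignment just described can be sketched as a lookup. The function name and the `sfb_to_tile` correspondence table are illustrative assumptions; the text only states that every sfb in the tile-th spectrum interval inherits E_noise_floor[tile]:

```python
def noise_floor_per_sfb(e_noise_floor_tile, sfb_to_tile):
    """Give each subband the noise floor energy of its spectrum interval.

    e_noise_floor_tile: noise floor energy parameter per spectrum interval (tile)
    sfb_to_tile: for each sfb, the index of the tile containing it (assumed input)
    """
    return [e_noise_floor_tile[tile] for tile in sfb_to_tile]
```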
- the fused high-band signal of the current frame obtained according to the second high-band signal of the current frame and the first high-band signal of the current frame can be divided into the following cases:
- merge_spec[sfb][k] = patch_spec[sfb][k], k ∈ [sfb_offset[sfb], sfb_offset[sfb+1])
- where merge_spec[sfb][k] represents the fused signal spectrum at the k-th frequency point of the sfb-th subband;
- sfb_offset is the subband division table, and sfb_offset[sfb] and sfb_offset[sfb+1] are the starting points of the sfb-th and (sfb+1)-th subbands, respectively.
- g_noise_floor[sfb] is the noise floor gain parameter of the sfb-th subband, which is calculated from the noise floor energy parameter of the sfb-th subband and the energy of patch_spec[sfb].
- sfb_width[sfb] is the width of the sfb-th subband, expressed as:
- sfb_width[sfb] = sfb_offset[sfb+1] - sfb_offset[sfb]
- E_patch[sfb] is the energy of patch_spec[sfb], computed over the frequency points k ∈ [sfb_offset[sfb], sfb_offset[sfb+1]).
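The gain formula itself is not reproduced in this text, only its inputs (subband width, noise floor energy parameter, patch energy). The sketch below therefore assumes the common form that scales the patch energy to the noise floor level, g = sqrt(sfb_width * E_noise_floor / E_patch); this is an assumption for illustration, not the patent's literal formula:

```python
import math

def noise_floor_gain(sfb_offset, patch_spec, e_noise_floor, sfb):
    """Sketch of g_noise_floor[sfb] for the patch-only case (assumed formula).

    patch_spec is the full extended high-band spectrum indexed by absolute
    frequency point; e_noise_floor holds the per-subband noise floor energy.
    """
    lo, hi = sfb_offset[sfb], sfb_offset[sfb + 1]
    sfb_width = hi - lo                               # sfb_width[sfb]
    e_patch = sum(x * x for x in patch_spec[lo:hi])   # E_patch[sfb]
    # Assumed form: scale the patch energy to the subband noise floor level.
    return math.sqrt(sfb_width * e_noise_floor[sfb] / e_patch)
```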
- fusion methods can be divided into two types: one combines the above three spectra, with recon_spec[sfb] as the main component and the other two adjusted to the noise floor energy level; the other combines enc_spec[sfb] and patch_spec[sfb].
- in the first method, the high-band signal spectrum obtained from patch_spec[sfb] and enc_spec[sfb] is adjusted with the noise floor gain and combined with recon_spec[sfb] to obtain the fused signal spectrum.
- g_noise_floor[sfb] is the noise floor gain parameter of the sfb-th subband, calculated from the noise floor energy parameter of the sfb-th subband, the energy of patch_spec[sfb], and the energy of enc_spec[sfb].
- E_patch[sfb] is the energy of patch_spec[sfb], and E_enc[sfb] is the energy of enc_spec[sfb].
- the value of k ranges over k ∈ [sfb_offset[sfb], sfb_offset[sfb+1]).
- in the second method, recon_spec[sfb] is not retained, and the fused signal is composed of patch_spec[sfb] and enc_spec[sfb].
- one of method one and method two can be selected in a preset manner, or the choice can be made by a decision rule; for example, method one is selected when the signal satisfies a certain preset condition.
- the embodiment of the present invention does not limit the specific selection method.
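As a minimal per-subband sketch of the two fusion methods above: the zero-valued preset condition is used, and the exact rule for combining recon_spec with the gain-adjusted spectra is an assumption (the noise-adjusted background fills the frequency points where no tone was reconstructed); the function and its signature are illustrative, not the patent's literal procedure:

```python
def fuse_subband(patch, enc, recon, g_noise_floor, method):
    """Sketch of the two fusion methods for one subband.

    patch, enc, recon: spectra over the subband's frequency points.
    method 1: recon_spec is the main component; patch/enc are gain-adjusted
              to the noise floor level where no tone was reconstructed.
    method 2: recon_spec is dropped; the fused signal is built from
              patch_spec and enc_spec only.
    The per-point combination rule here is an assumed illustration.
    """
    merged = []
    for k in range(len(patch)):
        if method == 1 and recon[k] != 0:        # preset condition: tone present
            merged.append(recon[k])              # keep the reconstructed tone
        else:
            merged.append(g_noise_floor * (patch[k] + enc[k]))
    return merged
```

With method 1, a frequency point where `recon` is nonzero keeps the reconstructed tone; everywhere else (and for all points under method 2) the gain-adjusted background is used.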
- Figure 6 depicts the structure of an audio encoder provided by an embodiment of the present invention, including:
- the signal acquisition unit 601 is configured to acquire a current frame of an audio signal, where the current frame includes a high-band signal and a low-band signal.
- the parameter obtaining unit 602 is configured to obtain a first encoding parameter according to the high-band signal and the low-band signal, and obtain a second encoding parameter of the current frame according to the high-band signal, where the second encoding parameter includes tonal component information;
- the coding unit 603 is configured to perform code stream multiplexing on the first coding parameter and the second coding parameter to obtain a coded code stream.
- Figure 7 illustrates the structure of an audio decoder provided by an embodiment of the present invention, including:
- the receiving unit 701 is configured to obtain an encoded code stream;
- the demultiplexing unit 702 is configured to demultiplex the code stream to obtain the first encoding parameter of the current frame of the audio signal and the second encoding parameter of the current frame, where the second encoding parameter includes tonal component information;
- the obtaining unit 703 is configured to obtain the first high-band signal of the current frame and the first low-band signal of the current frame according to the first encoding parameter, and obtain the second high-band signal of the current frame according to the second encoding parameter, where the second high-band signal includes a reconstructed tone signal;
- the fusion unit 704 is configured to obtain the fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame.
- the specific implementation of the audio decoder can refer to the above-mentioned audio decoding method, which will not be repeated here.
- the embodiment of the present invention also provides a computer-readable storage medium, including instructions, which when run on a computer, cause the computer to execute the above-mentioned audio encoding method or audio decoding method.
- the embodiment of the present invention also provides a computer program product containing instructions, which when running on a computer, causes the computer to execute the above-mentioned audio encoding method or audio decoding method.
- An embodiment of the present application also provides a computer storage medium, wherein the computer storage medium stores a program, and the program executes some or all of the steps recorded in the above method embodiments.
- the audio coding device 1000 includes:
- the receiver 1001, the transmitter 1002, the processor 1003, and the memory 1004 (the number of processors 1003 in the audio encoding device 1000 may be one or more, and one processor is taken as an example in FIG. 8).
- the receiver 1001, the transmitter 1002, the processor 1003, and the memory 1004 may be connected by a bus or in other ways, where the bus connection is taken as an example in FIG. 8.
- the memory 1004 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1003. A part of the memory 1004 may also include a non-volatile random access memory (NVRAM).
- the memory 1004 stores an operating system and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
- the operating instructions may include various operating instructions for implementing various operations.
- the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
- the processor 1003 controls the operation of the audio encoding device, and the processor 1003 may also be referred to as a central processing unit (CPU).
- the various components of the audio encoding device are coupled together through a bus system.
- the bus system may also include a power bus, a control bus, and a status signal bus.
- various buses are referred to as bus systems in the figure.
- the methods disclosed in the foregoing embodiments of the present application may be applied to the processor 1003 or implemented by the processor 1003.
- the processor 1003 may be an integrated circuit chip with signal processing capability. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 1003 or instructions in the form of software.
- the above-mentioned processor 1003 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
- the storage medium is located in the memory 1004, and the processor 1003 reads the information in the memory 1004, and completes the steps of the foregoing method in combination with its hardware.
- the receiver 1001 can be used to receive input digital or character information, and generate signal input related to the related settings and function control of the audio coding device.
- the transmitter 1002 may include a display device such as a display screen, and the transmitter 1002 may be used to output digital or character information through an external interface.
- the processor 1003 is configured to execute the aforementioned audio coding method.
- the audio decoding device 1100 includes:
- the receiver 1101, the transmitter 1102, the processor 1103, and the memory 1104 (the number of processors 1103 in the audio decoding device 1100 may be one or more, and one processor is taken as an example in FIG. 9).
- the receiver 1101, the transmitter 1102, the processor 1103, and the memory 1104 may be connected by a bus or in other ways, wherein the connection by a bus is taken as an example in FIG. 9.
- the memory 1104 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1103. A part of the memory 1104 may also include NVRAM.
- the memory 1104 stores an operating system and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
- the operating instructions may include various operating instructions for implementing various operations.
- the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
- the processor 1103 controls the operation of the audio decoding device, and the processor 1103 may also be referred to as a CPU.
- the various components of the audio decoding device are coupled together through a bus system, where the bus system may include a power bus, a control bus, and a status signal bus in addition to a data bus.
- various buses are referred to as bus systems in the figure.
- the method disclosed in the foregoing embodiment of the present application may be applied to the processor 1103 or implemented by the processor 1103.
- the processor 1103 may be an integrated circuit chip with signal processing capability. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 1103 or instructions in the form of software.
- the aforementioned processor 1103 may be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
- the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
- the storage medium is located in the memory 1104, and the processor 1103 reads the information in the memory 1104, and completes the steps of the foregoing method in combination with its hardware.
- the processor 1103 is configured to execute the aforementioned audio decoding method.
- when the audio encoding device or the audio decoding device is a chip in a terminal, the chip includes a processing unit and a communication unit.
- the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
- the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the terminal executes the method of any one of the above-mentioned first aspects.
- the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit in the terminal located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
- processor mentioned in any of the foregoing may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the method in the first aspect.
- the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units.
- the physical unit can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the connection relationship between the modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
- this application can be implemented by means of software plus necessary general hardware.
- it can also be implemented by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memory, dedicated components, and so on.
- generally, any function completed by a computer program can easily be implemented with corresponding hardware.
- the specific hardware structures used to achieve the same function can also be diverse, such as analog circuits, digital circuits, or dedicated circuits.
- software program implementation is, however, the better implementation in most cases.
- the technical solution of this application, essentially or the part contributing to the prior art, can be embodied in the form of a software product; the computer software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc, and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments of this application.
- the computer program product includes one or more computer instructions.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner.
- the computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more available media.
- the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Mathematical Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Claims (35)
- 1. An audio encoding method, wherein the method comprises: obtaining a current frame of an audio signal, where the current frame comprises a high-band signal and a low-band signal; obtaining a first encoding parameter according to the high-band signal and the low-band signal; obtaining a second encoding parameter of the current frame according to the high-band signal, where the second encoding parameter comprises tonal component information; and performing code stream multiplexing on the first encoding parameter and the second encoding parameter to obtain an encoded code stream.
- 2. The method according to claim 1, wherein the obtaining a second encoding parameter of the current frame according to the high-band signal comprises: detecting whether the high-band signal comprises a tonal component; and if the high-band signal comprises a tonal component, obtaining the second encoding parameter of the current frame according to the high-band signal.
- 3. The method according to claim 1 or 2, wherein the tonal component information comprises at least one of the following: quantity information of tonal components, position information of tonal components, amplitude information of tonal components, or energy information of tonal components.
- 4. The method according to any one of claims 1 to 3, wherein the second encoding parameter further comprises a noise floor parameter.
- 5. The method according to claim 4, wherein the noise floor parameter comprises noise floor energy.
- 6. An audio decoding method, wherein the method comprises: obtaining an encoded code stream; performing code stream demultiplexing on the encoded code stream to obtain a first encoding parameter of a current frame of an audio signal and a second encoding parameter of the current frame, where the second encoding parameter of the current frame comprises tonal component information; obtaining a first high-band signal of the current frame and a first low-band signal of the current frame according to the first encoding parameter; obtaining a second high-band signal of the current frame according to the second encoding parameter, where the second high-band signal comprises a reconstructed tone signal; and obtaining a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame.
- 7. The method according to claim 6, wherein the first high-band signal comprises at least one of: a decoded high-band signal obtained by direct decoding according to the first encoding parameter, or an extended high-band signal obtained by performing band extension according to the first low-band signal.
- 8. The method according to claim 7, wherein if the first high-band signal comprises the extended high-band signal, the obtaining a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame comprises: if a value of a reconstructed tone signal spectrum at a current frequency point of a current subband of the current frame satisfies a preset condition, obtaining a fused high-band signal at the current frequency point according to a spectrum of the extended high-band signal at the current frequency point and noise floor information of the current subband; or if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy the preset condition, obtaining the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point.
- 9. The method according to claim 8, wherein the noise floor information comprises a noise floor gain parameter.
- 10. The method according to claim 9, wherein the noise floor gain parameter of the current subband is obtained according to a width of the current subband, energy of the spectrum of the extended high-band signal of the current subband, and noise floor energy of the current subband.
- 11. The method according to claim 7, wherein if the first high-band signal comprises the decoded high-band signal and the extended high-band signal, the obtaining a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame comprises: if a value of a reconstructed tone signal spectrum at a current frequency point of a current subband of the current frame does not satisfy a preset condition, obtaining a fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point; or if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies the preset condition, obtaining the fused high-band signal at the current frequency point according to a spectrum of the extended high-band signal at the current frequency point, a spectrum of the decoded high-band signal at the current frequency point, and noise floor information of the current subband.
- 12. The method according to claim 11, wherein the noise floor information comprises a noise floor gain parameter.
- 13. The method according to claim 12, wherein the noise floor gain parameter of the current subband is obtained according to a width of the current subband, noise floor energy of the current subband, energy of the spectrum of the extended high-band signal of the current subband, and energy of the spectrum of the decoded high-band signal of the current subband.
- 14. The method according to claim 7, wherein if the first high-band signal comprises the decoded high-band signal and the extended high-band signal, the method further comprises: selecting, according to preset indication information or indication information obtained by decoding, at least one signal from the decoded high-band signal, the extended high-band signal, and the reconstructed tone signal to obtain the fused high-band signal of the current frame.
- 15. The method according to claim 10 or 13, wherein the second encoding parameter comprises a noise floor parameter used to indicate the noise floor energy.
- 16. The method according to claim 8 or 11, wherein the preset condition comprises: the value of the reconstructed tone signal spectrum is 0, or the value of the reconstructed tone signal spectrum is less than a preset threshold.
- 17. An audio encoder, comprising: a signal obtaining unit, configured to obtain a current frame of an audio signal, where the current frame comprises a high-band signal and a low-band signal; a parameter obtaining unit, configured to obtain a first encoding parameter according to the high-band signal and the low-band signal, and obtain a second encoding parameter of the current frame according to the high-band signal, where the second encoding parameter comprises tonal component information; and an encoding unit, configured to perform code stream multiplexing on the first encoding parameter and the second encoding parameter to obtain an encoded code stream.
- 18. The audio encoder according to claim 17, wherein the parameter obtaining unit is specifically further configured to: detect whether the high-band signal comprises a tonal component; and if the high-band signal comprises a tonal component, obtain the second encoding parameter of the current frame according to the high-band signal.
- 19. The audio encoder according to claim 17 or 18, wherein the tonal component information comprises at least one of the following: quantity information of tonal components, position information of tonal components, amplitude information of tonal components, or energy information of tonal components.
- 20. The audio encoder according to any one of claims 17 to 19, wherein the second encoding parameter further comprises a noise floor parameter.
- 21. The audio encoder according to claim 20, wherein the noise floor parameter is used to indicate noise floor energy.
- 22. An audio decoder, comprising: a receiving unit, configured to obtain an encoded code stream; a demultiplexing unit, configured to perform code stream demultiplexing on the encoded code stream to obtain a first encoding parameter of a current frame of an audio signal and a second encoding parameter of the current frame, where the second encoding parameter of the current frame comprises tonal component information; an obtaining unit, configured to obtain a first high-band signal of the current frame and a first low-band signal of the current frame according to the first encoding parameter, and obtain a second high-band signal of the current frame according to the second encoding parameter, where the second high-band signal comprises a reconstructed tone signal; and a fusion unit, configured to obtain a fused high-band signal of the current frame according to the second high-band signal of the current frame and the first high-band signal of the current frame.
- 23. The audio decoder according to claim 22, wherein the first high-band signal comprises at least one of: a decoded high-band signal obtained by direct decoding according to the first encoding parameter, or an extended high-band signal obtained by performing band extension according to the first low-band signal.
- 24. The audio decoder according to claim 23, wherein the first high-band signal comprises the extended high-band signal, and the fusion unit is specifically configured to: if a value of a reconstructed tone signal spectrum at a current frequency point of a current subband of the current frame satisfies a preset condition, obtain a fused high-band signal at the current frequency point according to a spectrum of the extended high-band signal at the current frequency point and noise floor information of the current subband; or if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame does not satisfy the preset condition, obtain the fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point.
- 25. The audio decoder according to claim 24, wherein the noise floor information comprises a noise floor gain parameter.
- 26. The audio decoder according to claim 25, wherein the noise floor gain parameter of the current subband is obtained according to a width of the current subband, energy of the spectrum of the extended high-band signal of the current subband, and noise floor energy of the current subband.
- 27. The audio decoder according to claim 23, wherein if the first high-band signal comprises the decoded high-band signal and the extended high-band signal, the fusion unit is specifically configured to: if a value of a reconstructed tone signal spectrum at a current frequency point of a current subband of the current frame does not satisfy a preset condition, obtain a fused high-band signal at the current frequency point according to the reconstructed tone signal spectrum at the current frequency point; or if the value of the reconstructed tone signal spectrum at the current frequency point of the current subband of the current frame satisfies the preset condition, obtain the fused high-band signal at the current frequency point according to a spectrum of the extended high-band signal at the current frequency point, a spectrum of the decoded high-band signal at the current frequency point, and noise floor information of the current subband.
- 28. The audio decoder according to claim 27, wherein the noise floor information comprises a noise floor gain parameter.
- 29. The audio decoder according to claim 28, wherein the noise floor gain parameter of the current subband is obtained according to a width of the current subband, noise floor energy of the current subband, energy of the spectrum of the extended high-band signal of the current subband, and energy of the spectrum of the decoded high-band signal of the current subband.
- 30. The audio decoder according to claim 23, wherein if the first high-band signal comprises the decoded high-band signal and the extended high-band signal, the fusion unit is further configured to: select, according to preset indication information or indication information obtained by decoding, at least one signal from the decoded high-band signal, the extended high-band signal, and the reconstructed tone signal to obtain the fused high-band signal of the current frame.
- 31. The audio decoder according to claim 26 or 29, wherein the second encoding parameter comprises a noise floor parameter used to indicate the noise floor energy.
- 32. The audio decoder according to claim 31 or 34, wherein the preset condition comprises: the value of the reconstructed tone signal spectrum is 0, or the value of the reconstructed tone signal spectrum is less than a preset threshold.
- 33. An audio encoding device, comprising at least one processor, where the at least one processor is configured to couple to a memory, and to read and execute instructions in the memory to implement the method according to any one of claims 1 to 5.
- 34. An audio decoding device, comprising at least one processor, where the at least one processor is configured to couple to a memory, and to read and execute instructions in the memory to implement the method according to any one of claims 6 to 16.
- 35. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 16.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227026854A KR20220123108A (ko) | 2020-01-13 | 2021-01-12 | 오디오 인코딩 및 디코딩 방법 및 오디오 인코딩 및 디코딩 장치 |
EP21741759.1A EP4084001A4 (en) | 2020-01-13 | 2021-01-12 | AUDIO CODING AND DECODING METHODS AND DEVICES |
JP2022542749A JP7443534B2 (ja) | 2020-01-13 | 2021-01-12 | オーディオ符号化および復号方法ならびにオーディオ符号化および復号デバイス |
US17/864,116 US12039984B2 (en) | 2020-01-13 | 2022-07-13 | Audio encoding and decoding method and audio encoding and decoding device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010033326.XA CN113192523B (zh) | 2020-01-13 | 2020-01-13 | 一种音频编解码方法和音频编解码设备 |
CN202010033326.X | 2020-01-13 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/864,116 Continuation US12039984B2 (en) | 2020-01-13 | 2022-07-13 | Audio encoding and decoding method and audio encoding and decoding device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021143692A1 true WO2021143692A1 (zh) | 2021-07-22 |
Family
ID=76863590
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/071328 WO2021143692A1 (zh) | 2020-01-13 | 2021-01-12 | 一种音频编解码方法和音频编解码设备 |
Country Status (6)
Country | Link |
---|---|
US (1) | US12039984B2 (zh) |
EP (1) | EP4084001A4 (zh) |
JP (1) | JP7443534B2 (zh) |
KR (1) | KR20220123108A (zh) |
CN (1) | CN113192523B (zh) |
WO (1) | WO2021143692A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113808596A (zh) * | 2020-05-30 | 2021-12-17 | 华为技术有限公司 | 一种音频编码方法和音频编码装置 |
CN114127844A (zh) * | 2021-10-21 | 2022-03-01 | 北京小米移动软件有限公司 | 一种信号编解码方法、装置、编码设备、解码设备及存储介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1831940A (zh) * | 2006-04-07 | 2006-09-13 | 安凯(广州)软件技术有限公司 | 基于音频解码器的音调和节奏快速调节方法 |
CN102194458A (zh) * | 2010-03-02 | 2011-09-21 | 中兴通讯股份有限公司 | 频带复制方法、装置及音频解码方法、系统 |
CN104584124A (zh) * | 2013-01-22 | 2015-04-29 | 松下电器产业株式会社 | 带宽扩展参数生成装置、编码装置、解码装置、带宽扩展参数生成方法、编码方法、以及解码方法 |
US20180182403A1 (en) * | 2016-12-27 | 2018-06-28 | Fujitsu Limited | Audio coding device and audio coding method |
US20190035413A1 (en) * | 2017-07-28 | 2019-01-31 | Fujitsu Limited | Audio encoding apparatus and audio encoding method |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2555182C (en) * | 2004-03-12 | 2011-01-04 | Nokia Corporation | Synthesizing a mono audio signal based on an encoded multichannel audio signal |
CN101297356B (zh) * | 2005-11-04 | 2011-11-09 | 诺基亚公司 | 用于音频压缩的方法和设备 |
JP2008058727A (ja) * | 2006-08-31 | 2008-03-13 | Toshiba Corp | 音声符号化装置 |
KR101355376B1 (ko) * | 2007-04-30 | 2014-01-23 | 삼성전자주식회사 | 고주파수 영역 부호화 및 복호화 방법 및 장치 |
JP4932917B2 (ja) * | 2009-04-03 | 2012-05-16 | 株式会社エヌ・ティ・ティ・ドコモ | 音声復号装置、音声復号方法、及び音声復号プログラム |
KR101991421B1 (ko) * | 2013-06-21 | 2019-06-21 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에.베. | 에너지 조정 모듈을 갖는 대역폭 확장 모듈을 구비한 오디오 디코더 |
PL3550563T3 (pl) * | 2014-03-31 | 2024-07-08 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Enkoder, dekoder, sposób enkodowania, sposób dekodowania oraz powiązane programy |
PT3696813T (pt) * | 2016-04-12 | 2022-12-23 | Fraunhofer Ges Forschung | Codificador de áudio para codificar um sinal de áudio, método para codificar um sinal de áudio e programa de computador sob consideração de uma região espectral de pico detetada numa banda de frequência superior |
EP4303871A3 (en) * | 2018-01-26 | 2024-03-20 | Dolby International AB | Backward-compatible integration of high frequency reconstruction techniques for audio signals |
EP3662469A4 (en) * | 2018-04-25 | 2020-08-19 | Dolby International AB | INTEGRATION OF HIGH FREQUENCY RECONSTRUCTION TECHNIQUES WITH REDUCED POST-PROCESSING DELAY |
-
2020
- 2020-01-13 CN CN202010033326.XA patent/CN113192523B/zh active Active
-
2021
- 2021-01-12 WO PCT/CN2021/071328 patent/WO2021143692A1/zh unknown
- 2021-01-12 EP EP21741759.1A patent/EP4084001A4/en active Pending
- 2021-01-12 JP JP2022542749A patent/JP7443534B2/ja active Active
- 2021-01-12 KR KR1020227026854A patent/KR20220123108A/ko active Search and Examination
-
2022
- 2022-07-13 US US17/864,116 patent/US12039984B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1831940A (zh) * | 2006-04-07 | 2006-09-13 | 安凯(广州)软件技术有限公司 | 基于音频解码器的音调和节奏快速调节方法 |
CN102194458A (zh) * | 2010-03-02 | 2011-09-21 | 中兴通讯股份有限公司 | 频带复制方法、装置及音频解码方法、系统 |
CN104584124A (zh) * | 2013-01-22 | 2015-04-29 | 松下电器产业株式会社 | 带宽扩展参数生成装置、编码装置、解码装置、带宽扩展参数生成方法、编码方法、以及解码方法 |
US20180182403A1 (en) * | 2016-12-27 | 2018-06-28 | Fujitsu Limited | Audio coding device and audio coding method |
US20190035413A1 (en) * | 2017-07-28 | 2019-01-31 | Fujitsu Limited | Audio encoding apparatus and audio encoding method |
Non-Patent Citations (1)
Title |
---|
See also references of EP4084001A4 |
Also Published As
Publication number | Publication date |
---|---|
US20220358941A1 (en) | 2022-11-10 |
CN113192523A (zh) | 2021-07-30 |
EP4084001A4 (en) | 2023-03-08 |
KR20220123108A (ko) | 2022-09-05 |
JP7443534B2 (ja) | 2024-03-05 |
JP2023510556A (ja) | 2023-03-14 |
CN113192523B (zh) | 2024-07-16 |
US12039984B2 (en) | 2024-07-16 |
EP4084001A1 (en) | 2022-11-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021143694A1 (zh) | 一种音频编解码方法和音频编解码设备 | |
WO2021143692A1 (zh) | 一种音频编解码方法和音频编解码设备 | |
WO2021208792A1 (zh) | 音频信号编码方法、解码方法、编码设备以及解码设备 | |
WO2021244418A1 (zh) | 一种音频编码方法和音频编码装置 | |
WO2021143691A1 (zh) | 一种音频编解码方法和音频编解码设备 | |
US20230040515A1 (en) | Audio signal coding method and apparatus | |
US20220335962A1 (en) | Audio encoding method and device and audio decoding method and device | |
US20220335961A1 (en) | Audio signal encoding method and apparatus, and audio signal decoding method and apparatus | |
CA3193063A1 (en) | Spatial audio parameter encoding and associated decoding | |
WO2022012677A1 (zh) | 音频编解码方法和相关装置及计算机可读存储介质 | |
US20230154472A1 (en) | Multi-channel audio signal encoding method and apparatus | |
US12057130B2 (en) | Audio signal encoding method and apparatus, and audio signal decoding method and apparatus | |
TWI847276B (zh) | 編解碼方法、裝置、設備、儲存媒體及電腦程式產品 | |
US12100408B2 (en) | Audio coding with tonal component screening in bandwidth extension | |
EP4398242A1 (en) | Encoding and decoding methods and apparatus, device, storage medium, and computer program | |
WO2023051367A1 (zh) | 解码方法、装置、设备、存储介质及计算机程序产品 | |
CN115881140A (zh) | 编解码方法、装置、设备、存储介质及计算机程序产品 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21741759 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022542749 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20227026854 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2021741759 Country of ref document: EP Effective date: 20220725 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |