WO2021143691A1 - Audio encoding and decoding method and audio encoding and decoding device - Google Patents

Audio encoding and decoding method and audio encoding and decoding device

Info

Publication number
WO2021143691A1
Authority
WO
WIPO (PCT)
Prior art keywords
frequency region
parameter
component
current frequency
current
Prior art date
Application number
PCT/CN2021/071327
Other languages
English (en)
French (fr)
Inventor
夏丙寅
李佳蔚
王喆
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202010033973.0A (CN113192517B)
Application filed by 华为技术有限公司
Priority to EP21740645.3A (EP4080503A4)
Priority to KR1020227026986A (KR20220117340A)
Priority to JP2022542159A (JP2023509201A)
Publication of WO2021143691A1
Priority to US17/862,712 (US11887610B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208 Subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017 Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Definitions

  • This application relates to the technical field of audio signal coding and decoding, and in particular to an audio coding and decoding method and audio coding and decoding equipment.
  • The embodiments of the present application provide an audio encoding and decoding method and an audio encoding and decoding device that can improve the quality of decoded audio signals.
  • An audio encoding method, comprising: obtaining a current frame of an audio signal, the current frame including a high-band signal; obtaining high-band parameters of the current frame according to the high-band signal, where the high-band parameters indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal; and performing bitstream multiplexing on the high-band parameters to obtain an encoded bitstream.
  • The high-band parameters include a position quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components.
  • The high band corresponding to the high-band signal includes at least one frequency region, and one frequency region includes at least one subband. Obtaining the high-band parameters of the current frame according to the high-band signal includes: determining, according to the high-band signal of a current frequency region in the at least one frequency region, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • The method includes: determining whether the current frequency region includes a tonal component; and, when the current frequency region includes a tonal component, determining, according to the high-band signal of the current frequency region, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • The high-band parameters of the current frame further include tonal component indication information, which indicates whether the current frequency region includes a tonal component.
  • Determining the position quantity parameter and the amplitude parameter or energy parameter of the tonal components of the current frequency region according to the high-band signal of the current frequency region includes: performing a peak search in the current frequency region according to the high-band signal of the current frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determining, according to at least one of the peak quantity information, peak position information, and peak amplitude information, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • Performing the peak search in the current frequency region according to the high-band signal of the current frequency region includes: performing the peak search according to at least one of a power spectrum, an energy spectrum, or an amplitude spectrum of the current frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region.
  • Determining the position quantity parameter and the amplitude parameter or energy parameter of the tonal components of the current frequency region according to at least one of the peak quantity information, peak position information, and peak amplitude information includes: determining position information, quantity information, and amplitude information of the tonal components of the current frequency region according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determining the position quantity parameter and the amplitude parameter or energy parameter of the tonal components of the current frequency region according to the position information, quantity information, and amplitude information of the tonal components of the current frequency region.
  • The position quantity parameter of the tonal components of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region. If a first subband included in the current frequency region has a peak, the bit among the N bits that corresponds to the first subband takes a first value; if a second subband included in the current frequency region has no peak, the bit among the N bits that corresponds to the second subband takes a second value. The first value is different from the second value.
  • The position quantity parameter of the tonal components of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region. If a first subband included in the current frequency region has a tonal component, the bit among the N bits that corresponds to the first subband takes a first value; if a second subband included in the current frequency region has no tonal component, the bit among the N bits that corresponds to the second subband takes a second value. The first value is different from the second value.
  • The high-band parameters further include a noise floor parameter of the high-band signal.
  • An audio decoding method, including: obtaining an encoded bitstream; demultiplexing the encoded bitstream to obtain high-band parameters of a current frame of an audio signal, where the high-band parameters indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal of the current frame; obtaining a reconstructed high-band signal of the current frame according to the high-band parameters; and obtaining an audio output signal of the current frame according to the reconstructed high-band signal of the current frame.
  • The high-band parameters include a position quantity parameter of the tonal components of the high-band signal of the current frame, and an amplitude parameter or an energy parameter of the tonal components.
  • The high band corresponding to the high-band signal includes at least one frequency region, and one frequency region includes at least one subband. The position quantity parameter of the tonal components of the high-band signal of the current frame includes the position quantity parameter of the tonal components of each of the at least one frequency region, and the amplitude parameter or energy parameter of the tonal components of the high-band signal of the current frame includes the amplitude parameter or energy parameter of the tonal components of each of the at least one frequency region.
  • Demultiplexing the encoded bitstream to obtain the high-band parameters of the current frame of the audio signal includes: obtaining the position quantity parameter of the tonal components of a current frequency region of the at least one frequency region; and parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position quantity parameter of the tonal components of the current frequency region.
  • Parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position quantity parameter includes: determining a quantity parameter of the tonal components of the current frequency region according to the position quantity parameter of the tonal components of the current frequency region; and parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter.
  • Demultiplexing the encoded bitstream to obtain the high-band parameters of the current frame of the audio signal includes: obtaining the position quantity parameter of the tonal components of the current frequency region of the at least one frequency region; determining a position parameter and a quantity parameter of the tonal components of the current frequency region according to the position quantity parameter; and parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter.
  • Before obtaining the position quantity parameter of the tonal components of the current frequency region of the at least one frequency region, the method includes: obtaining tonal component indication information of the current frequency region, which indicates whether the current frequency region includes a tonal component; and, when the current frequency region includes a tonal component, obtaining the position quantity parameter of the tonal components of the current frequency region.
  • Obtaining the position quantity parameter of the tonal components of the current frequency region includes: reading N bits from the encoded bitstream according to the number of subbands included in the current frequency region, where the N bits are the position quantity parameter of the tonal components of the current frequency region, N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region.
  • Obtaining the reconstructed high-band signal of the current frame according to the high-band parameters includes: determining the position of the tonal components in the current frequency region according to the position quantity parameter of the tonal components of the current frequency region; determining the amplitude or energy corresponding to the position of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and obtaining the reconstructed high-band signal according to the position of the tonal components in the current frequency region and the amplitude or energy corresponding to that position.
  • Determining the position of the tonal components in the current frequency region according to the position quantity parameter of the tonal components of the high-band signal in the current frequency region includes: determining the position parameter of the tonal components of the current frequency region according to the position quantity parameter; and determining the position of the tonal components in the current frequency region according to the position parameter.
  • Obtaining the reconstructed high-band signal of the current frame according to the high-band parameters includes: determining the position of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region; determining the amplitude or energy corresponding to the position of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and obtaining the reconstructed high-band signal according to the position of the tonal components in the current frequency region and the amplitude or energy corresponding to that position.
  • The position parameter of the tonal components of the current frequency region indicates the sequence number of each subband in the current frequency region that includes a tonal component.
  • The position of a tonal component in the current frequency region is a designated position within the subband where the tonal component is located.
  • The designated position of the subband is the center position of the subband.
  • Obtaining the reconstructed high-band signal according to the position of the tonal component in the current frequency region and the amplitude corresponding to that position includes: determining the frequency-domain signal at the position of the tonal component according to the following formula: pSpectralData[tone_pos] = tone_val, where pSpectralData represents the reconstructed high-band frequency-domain signal of the current frequency region, tone_val represents the amplitude corresponding to the position of the tonal component in the current frequency region, and tone_pos represents the position of the tonal component in the current frequency region.
  • An audio encoder, including: a signal acquisition unit, configured to acquire a current frame of an audio signal, where the current frame includes a high-band signal; and a parameter acquisition unit, configured to obtain high-band parameters of the current frame according to the high-band signal, where the high-band parameters indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal. Bitstream multiplexing is performed on the high-band parameters to obtain an encoded bitstream.
  • The high-band parameters include a position quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components.
  • The high band corresponding to the high-band signal includes at least one frequency region, and one frequency region includes at least one subband. The parameter acquisition unit is specifically configured to determine, according to the high-band signal of a current frequency region in the at least one frequency region, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • The audio encoder further includes a determining unit, configured to determine whether the current frequency region includes a tonal component. The parameter acquisition unit is specifically configured to, when the current frequency region includes a tonal component, determine, according to the high-band signal of the current frequency region, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • The high-band parameters of the current frame further include tonal component indication information, which indicates whether the current frequency region includes a tonal component.
  • The parameter acquisition unit is specifically configured to: perform a peak search in the current frequency region according to the high-band signal of the current frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determine, according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region, the position quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  • The parameter acquisition unit is specifically configured to perform the peak search in the current frequency region according to at least one of a power spectrum, an energy spectrum, or an amplitude spectrum of the current frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region.
  • The parameter acquisition unit is specifically configured to: determine position information, quantity information, and amplitude information of the tonal components of the current frequency region according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determine the position quantity parameter and the amplitude parameter or energy parameter of the tonal components of the current frequency region according to the position information, quantity information, and amplitude information of the tonal components of the current frequency region.
  • The position quantity parameter of the tonal components of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region. If a first subband included in the current frequency region has a peak, the bit among the N bits that corresponds to the first subband takes a first value; if a second subband included in the current frequency region has no peak, the bit among the N bits that corresponds to the second subband takes a second value. The first value is different from the second value.
  • The position quantity parameter of the tonal components of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region. If a first subband included in the current frequency region has a tonal component, the bit among the N bits that corresponds to the first subband takes a first value; if a second subband included in the current frequency region has no tonal component, the bit among the N bits that corresponds to the second subband takes a second value. The first value is different from the second value.
  • The high-band parameters further include a noise floor parameter of the high-band signal.
  • A fourth aspect provides an audio decoder, including: a receiving unit, configured to obtain an encoded bitstream; a demultiplexing unit, configured to demultiplex the encoded bitstream to obtain high-band parameters of a current frame of an audio signal, where the high-band parameters indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal of the current frame; and a reconstruction unit, configured to obtain a reconstructed high-band signal of the current frame according to the high-band parameters, and to obtain an audio output signal of the current frame according to the reconstructed high-band signal of the current frame.
  • The high-band parameters include a position quantity parameter of the tonal components of the high-band signal of the current frame, and an amplitude parameter or an energy parameter of the tonal components.
  • The high band corresponding to the high-band signal includes at least one frequency region, and one frequency region includes at least one subband. The position quantity parameter of the tonal components of the high-band signal of the current frame includes the position quantity parameter of the tonal components of each of the at least one frequency region, and the amplitude parameter or energy parameter of the tonal components of the high-band signal of the current frame includes the amplitude parameter or energy parameter of the tonal components of each of the at least one frequency region.
  • The demultiplexing unit is specifically configured to: obtain the position quantity parameter of the tonal components of a current frequency region of the at least one frequency region; and parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position quantity parameter of the tonal components of the current frequency region.
  • The demultiplexing unit is specifically configured to: determine a quantity parameter of the tonal components of the current frequency region according to the position quantity parameter of the tonal components of the current frequency region; and parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter.
  • The demultiplexing unit is specifically configured to: obtain the position quantity parameter of the tonal components of the current frequency region of the at least one frequency region; determine a position parameter and a quantity parameter of the tonal components of the current frequency region according to the position quantity parameter; and parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter.
  • The demultiplexing unit is specifically configured to: obtain tonal component indication information of the current frequency region, which indicates whether the current frequency region includes a tonal component; and, when the current frequency region includes a tonal component, obtain the position quantity parameter of the tonal components of the current frequency region of the at least one frequency region.
  • The demultiplexing unit is specifically configured to read N bits from the encoded bitstream according to the number of subbands included in the current frequency region, where the N bits are the position quantity parameter of the tonal components of the current frequency region, N is the number of subbands included in the current frequency region, and the N bits correspond one-to-one to the subbands included in the current frequency region.
  • The reconstruction unit is specifically configured to: determine the position of the tonal components in the current frequency region according to the position quantity parameter of the tonal components of the current frequency region; determine the amplitude or energy corresponding to the position of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and obtain the reconstructed high-band signal according to the position of the tonal components in the current frequency region and the amplitude or energy corresponding to that position.
  • The reconstruction unit is specifically configured to: determine the position parameter of the tonal components of the current frequency region according to the position quantity parameter of the tonal components of the high-band signal in the current frequency region; and determine the position of the tonal components in the current frequency region according to the position parameter.
  • The reconstruction unit is specifically configured to: determine the position of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region; determine the amplitude or energy corresponding to the position of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and obtain the reconstructed high-band signal according to the position of the tonal components in the current frequency region and the amplitude or energy corresponding to that position.
  • The position parameter of the tonal components of the current frequency region indicates the sequence number of each subband in the current frequency region that includes a tonal component.
  • The position of a tonal component in the current frequency region is a designated position within the subband where the tonal component is located.
  • The designated position of the subband is the center position of the subband.
  • obtaining the reconstructed high-band signal according to the position of the tonal component in the current frequency region and the amplitude corresponding to the position of the tonal component includes: determining the frequency domain signal at the position of the tonal component according to the following calculation formula:
  • pSpectralData represents the reconstructed high-band frequency domain signal in the current frequency region
  • tone_val represents the amplitude value corresponding to the position of the tone component in the current frequency region
  • tone_pos represents the position of the tone component in the current frequency region
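  • The formula itself is not reproduced in this text. Given the three symbol definitions above, one plausible reading is that the amplitude `tone_val` is written into the spectrum buffer at index `tone_pos`. The sketch below illustrates that reading only; the zero-initialised buffer, the equal-width subbands, and the helper names `place_tones` and `subband_center` are assumptions made for illustration, not details taken from the application.

```python
def subband_center(tile_start, subband_width, subband_index):
    """Return the bin index at the center of a subband (assumes equal-width subbands)."""
    return tile_start + subband_index * subband_width + subband_width // 2


def place_tones(spectrum_len, tone_positions, tone_values):
    """Write each tonal amplitude into a zero-initialised spectrum buffer.

    Assumed reading of the missing formula: pSpectralData[tone_pos] = tone_val.
    """
    p_spectral_data = [0.0] * spectrum_len
    for tone_pos, tone_val in zip(tone_positions, tone_values):
        if not 0 <= tone_pos < spectrum_len:
            raise ValueError("tone position outside the frequency region")
        p_spectral_data[tone_pos] = tone_val
    return p_spectral_data
```

  • For instance, with subbands 4 bins wide, a tonal component signalled for subband 1 of a region starting at bin 0 would be placed at bin `subband_center(0, 4, 1)`.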
  • the embodiments of the present application provide a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to execute the method described in the first aspect or the second aspect.
  • the embodiments of the present application provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the method described in the first aspect or the second aspect.
  • an embodiment of the present application provides an audio encoder, including a processor and a memory; the memory is used to store instructions; the processor is used to execute the instructions in the memory, so that the audio encoder performs any one of the methods of the aforementioned first aspect.
  • an embodiment of the present application provides an audio decoder, including a processor and a memory; the memory is used to store instructions; the processor is used to execute the instructions in the memory, so that the audio decoder performs any one of the methods of the aforementioned second aspect.
  • an embodiment of the present application provides a communication device.
  • the communication device may include entities such as audio codec equipment or a chip.
  • the communication device includes a processor and optionally a memory; the memory is used for storing instructions; the processor is configured to execute the instructions in the memory, so that the communication device executes the method according to any one of the foregoing first aspect or second aspect.
  • this application provides a chip system that includes a processor for supporting audio codec devices in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data of the audio codec device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • the audio encoder in the embodiment of the present invention encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder recovers the tonal components according to their position, quantity, and amplitude or energy. This makes the position and energy of the recovered tonal components more accurate, thereby improving the quality of the decoded signal.
  • FIG. 1 is a schematic structural diagram of an audio codec system provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of an audio coding method provided by an embodiment of the application
  • FIG. 3 is a schematic flowchart of an audio decoding method provided by an embodiment of this application.
  • FIG. 4 is a schematic diagram of a mobile terminal according to an embodiment of the application.
  • Fig. 5 is a schematic diagram of a network element according to an embodiment of the application.
  • FIG. 6 is a schematic diagram of the composition structure of an audio coding device provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of the composition structure of an audio decoding device provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of the composition structure of another audio coding device provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of the composition structure of another audio decoding device provided by an embodiment of the application.
  • the audio signal in the embodiment of the present application refers to the input signal in the audio encoding device.
  • the audio signal may include multiple frames.
  • the current frame may specifically refer to a certain frame in the audio signal.
  • the embodiments of this application use the encoding and decoding of the current frame of the audio signal as an example for description.
  • the previous frame or the next frame of the current frame in the audio signal can be coded and decoded according to the coding and decoding mode of the current frame audio signal.
  • the audio signal in the embodiment of the present application may be a mono audio signal, or may also be a stereo signal.
  • the stereo signal can be the original stereo signal, it can also be a stereo signal composed of two signals (left channel signal and right channel signal) included in the multi-channel signal, or it can be composed of the multi-channel signal.
  • Fig. 1 is a schematic structural diagram of an audio coding and decoding system according to an exemplary embodiment of the application.
  • the audio codec system includes an encoding component 110 and a decoding component 120.
  • the encoding component 110 is used to encode the current frame (audio signal) in the frequency domain or the time domain.
  • the encoding component 110 can be implemented by software; alternatively, it can also be implemented by hardware; or, it can also be implemented by a combination of software and hardware, which is not limited in the embodiments of the present application.
  • when the encoding component 110 encodes the current frame in the frequency domain or the time domain, in a possible implementation, the steps shown in FIG. 2 may be included.
  • the encoding component 110 can generate an encoded bitstream after encoding is completed and send it to the decoding component 120, so that the decoding component 120 can receive the encoded bitstream and then obtain the audio output signal from it.
  • the encoding method shown in FIG. 2 is only an example and not a limitation.
  • the embodiment of the present application does not limit the execution order of the steps in FIG. 2, and the encoding method shown in FIG. 2 may also include more or fewer steps, which is not limited in the embodiments of the present application.
  • the encoding component 110 and the decoding component 120 may be connected in a wired or wireless manner, and the decoding component 120 may obtain the encoded bitstream generated by the encoding component 110 through the connection between the decoding component 120 and the encoding component 110; alternatively, the encoding component 110 may store the generated code stream in a memory, and the decoding component 120 reads the code stream from the memory.
  • the decoding component 120 can be implemented by software; alternatively, it can also be implemented by hardware; or, it can also be implemented by a combination of software and hardware, which is not limited in the embodiment of the present application.
  • when the decoding component 120 decodes the current frame (audio signal) in the frequency domain or the time domain, in a possible implementation, the steps shown in FIG. 3 may be included.
  • the encoding component 110 and the decoding component 120 can be provided in the same device; or, they can also be provided in different devices.
  • the device can be a terminal with audio signal processing functions, such as a mobile phone, tablet computer, laptop computer, desktop computer, Bluetooth speaker, voice recorder, or wearable device, or it can be a network element with audio signal processing capabilities in a core network or wireless network, which is not limited in this embodiment.
  • in this example, the encoding component 110 is installed in the mobile terminal 130 and the decoding component 120 is installed in the mobile terminal 140.
  • the mobile terminal 130 and the mobile terminal 140 are mutually independent electronic devices with audio signal processing capabilities.
  • each electronic device may be, for example, a mobile phone, a wearable device, a virtual reality (VR) device, or an augmented reality (AR) device; the description below takes as an example the case in which the mobile terminal 130 and the mobile terminal 140 are connected through a wireless or wired network.
  • the mobile terminal 130 may include a collection component 131, an encoding component 110, and a channel encoding component 132, where the collection component 131 is connected to the encoding component 110, and the encoding component 110 is connected to the channel encoding component 132.
  • the mobile terminal 140 may include an audio playing component 141, a decoding component 120, and a channel decoding component 142.
  • the audio playing component 141 is connected to the decoding component 120
  • the decoding component 120 is connected to the channel decoding component 142.
  • after the mobile terminal 130 collects the audio signal through the collection component 131, it encodes the audio signal through the encoding component 110 to obtain an encoded code stream; then, the channel encoding component 132 encodes the encoded code stream to obtain a transmission signal.
  • the mobile terminal 130 transmits the transmission signal to the mobile terminal 140 through a wireless or wired network.
  • after receiving the transmission signal, the mobile terminal 140 decodes the transmission signal through the channel decoding component 142 to obtain a code stream, decodes the code stream through the decoding component 120 to obtain an audio signal, and plays the audio signal through the audio playing component 141. It can be understood that the mobile terminal 130 may also include the components included in the mobile terminal 140, and the mobile terminal 140 may also include the components included in the mobile terminal 130.
  • as another example, the encoding component 110 and the decoding component 120 are both provided in one network element 150 that has audio signal processing capability and is located in a core network or wireless network.
  • the network element 150 includes a channel decoding component 151, a decoding component 120, an encoding component 110, and a channel encoding component 152.
  • the channel decoding component 151 is connected to the decoding component 120
  • the decoding component 120 is connected to the encoding component 110
  • the encoding component 110 is connected to the channel encoding component 152.
  • after the channel decoding component 151 receives the transmission signal sent by another device, it decodes the transmission signal to obtain a first encoded code stream; the decoding component 120 decodes the first encoded code stream to obtain an audio signal; the encoding component 110 encodes the audio signal to obtain a second encoded code stream; and the channel encoding component 152 encodes the second encoded code stream to obtain a transmission signal.
  • the other device may be a mobile terminal with audio signal processing capability; or, it may also be other network elements with audio signal processing capability, which is not limited in this embodiment.
  • the encoding component 110 and the decoding component 120 in the network element can transcode the encoded code stream sent by the mobile terminal.
  • the device installed with the encoding component 110 may be referred to as an audio encoding device.
  • the audio encoding device may also have an audio decoding function, which is not limited in the implementation of this application.
  • the device installed with the decoding component 120 may be referred to as an audio decoding device.
  • the audio decoding device may also have an audio encoding function, which is not limited in the implementation of this application.
  • Figure 2 describes the flow of an audio coding method provided by an embodiment of the present invention, including:
  • the current frame can be any frame in the audio signal, and the current frame can include a high-band signal and a low-band signal.
  • the division of the high-band signal and the low-band signal can be determined by a frequency band threshold: the signal above the frequency band threshold is a high-band signal, and the signal below the frequency band threshold is a low-band signal.
  • the frequency band threshold can be determined according to the transmission bandwidth and the data processing capability of the encoding component 110 and the decoding component 120, which is not limited here.
  • the high-band signal and the low-band signal are relative. For example, a signal below a certain frequency is a low-band signal, and a signal above this frequency is a high-band signal (the signal at that frequency itself can be assigned to either the low-band signal or the high-band signal).
  • this frequency varies with the bandwidth of the current frame. For example, when the current frame is a 0-8 kHz wideband signal, the frequency may be 4 kHz; when the current frame is a 0-16 kHz ultra-wideband signal, the frequency may be 8 kHz.
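  • As a minimal sketch of the band split described above, the following assumes the current frame is available as a list of frequency-domain bins that uniformly cover 0 Hz up to the signal bandwidth. The function name `split_bands` and the choice to assign the bin at the threshold to the high band are illustrative assumptions; the text allows either assignment.

```python
def split_bands(spectrum, bandwidth_hz, threshold_hz):
    """Split frequency-domain bins into (low_band, high_band) at a frequency threshold.

    Assumes the bins uniformly span 0..bandwidth_hz; the bin at the threshold
    is assigned to the high band here (the text permits either choice).
    """
    if not 0 < threshold_hz < bandwidth_hz:
        raise ValueError("threshold must lie inside the signal bandwidth")
    split_bin = round(len(spectrum) * threshold_hz / bandwidth_hz)
    return spectrum[:split_bin], spectrum[split_bin:]


# Example values from the text: 4 kHz threshold for a 0-8 kHz frame.
low, high = split_bands(list(range(16)), 8000, 4000)
```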
  • the high-frequency band parameter is used to indicate the position, quantity, and amplitude or energy of the tonal components included in the high-frequency band signal.
  • the high frequency band parameters include a position quantity parameter of the tone component, and an amplitude parameter or an energy parameter of the tone component.
  • the position quantity parameter means that the position of the tonal component and the number of tonal components are represented by the same parameter.
  • alternatively, the high-band parameters include the position parameter of the tonal component, the quantity parameter of the tonal component, and the amplitude parameter or energy parameter of the tonal component; in this case, the position and quantity of the tonal component are represented by different parameters.
  • the high frequency band corresponding to the high frequency band signal includes at least one frequency region (Tile), and one frequency region includes at least one subband.
  • obtaining the high-band parameter of the current frame according to the high-band signal includes: determining, according to the high-band signal of the current frequency region in the at least one frequency region, the position quantity parameter of the tonal component of the current frequency region and the amplitude parameter or energy parameter of the tonal component of the current frequency region.
  • before this determination, the method includes: determining whether the current frequency region includes a tonal component; and, when the current frequency region includes a tonal component, determining, according to the high-band signal of the current frequency region in the at least one frequency region, the position quantity parameter of the tonal component in the current frequency region and the amplitude parameter or energy parameter of the tonal component in the current frequency region. In this way, parameters are obtained only for frequency regions that contain tonal components, which improves coding efficiency.
  • the high-band parameters of the current frame further include tonal component indication information, which is used to indicate whether a tonal component is included in the current frequency region. This allows the audio decoder to perform decoding according to the indication information, which improves decoding efficiency.
  • determining the position quantity parameter of the tonal component of the current frequency region and the amplitude parameter or energy parameter of the tonal component of the current frequency region according to the high-band signal of the current frequency region in the at least one frequency region includes: performing a peak search in the current frequency region according to the high-band signal of the current frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determining, according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region, the position quantity parameter of the tonal component in the current frequency region and the amplitude parameter or energy parameter of the tonal component in the current frequency region.
  • the high-band signal for peak search may be a frequency domain signal or a time domain signal.
  • the peak search may be specifically performed according to at least one of a power spectrum, an energy spectrum, or an amplitude spectrum of the current frequency region.
  • determining the position quantity parameter of the tonal component of the current frequency region and the amplitude parameter or energy parameter of the tonal component of the current frequency region according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region includes: determining the position information, quantity information, and amplitude information of the tonal component in the current frequency region according to at least one of the peak quantity information, peak position information, and peak amplitude information of the current frequency region; and determining, according to the position information, quantity information, and amplitude information of the tonal component of the current frequency region, the position quantity parameter of the tonal component of the current frequency region and the amplitude parameter or energy parameter of the tonal component of the current frequency region.
  • the position quantity parameter of the tonal component of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits have a one-to-one correspondence with the subbands included in the current frequency region; if a first subband included in the current frequency region has a peak, the value of the bit corresponding to the first subband among the N bits is a first value; or if a second subband included in the current frequency region has no peak, the value of the bit corresponding to the second subband among the N bits is a second value, and the first value is different from the second value.
  • alternatively, the position quantity parameter of the tonal component of the current frequency region includes N bits, where N is the number of subbands included in the current frequency region, and the N bits have a one-to-one correspondence with the subbands included in the current frequency region; if a first subband included in the current frequency region has a tonal component, the value of the bit corresponding to the first subband among the N bits is a first value; or if a second subband included in the current frequency region has no tonal component, the value of the bit corresponding to the second subband among the N bits is a second value, and the first value is different from the second value.
  • the high-band parameters may further include a noise floor parameter of the high-band signal.
  • the audio encoding method may include the following procedures:
  • Case 1: The high-band parameters include the position parameter, quantity parameter, and amplitude parameter of the tonal components.
  • determining the high-band parameters according to the high-band signal may specifically be:
  • a peak search is performed on the power spectrum of the high-band signal to obtain peak number information, peak position information, and peak amplitude information.
  • the embodiment of the present invention does not limit the specific way of peak search. For example, if the value of the power spectrum at the current frequency point differs significantly from the values of the power spectrum at the left and right adjacent frequency points, the frequency point is a peak.
  • filtering is performed according to at least one of the peak position, the peak amplitude, and the number of peaks to determine the position parameter, the quantity parameter, and the amplitude parameter of the tonal component.
  • filtering based on the peak amplitude may use, as the preset condition, that the peak amplitude is greater than a preset threshold value.
  • the number of peaks that meet the preset condition can be used as the number parameter of the tonal components.
  • the corresponding peak position is used as the position parameter of the tonal component, or the position parameter of the tonal component is determined according to the corresponding peak position.
  • the subband sequence number corresponding to the peak position is obtained according to the corresponding peak position, and the subband sequence number corresponding to the peak position is used as the position parameter of the pitch component.
  • the corresponding peak amplitude is used as the amplitude parameter of the tonal component or the amplitude parameter of the tonal component is determined according to the corresponding peak amplitude.
  • the peak amplitude can be characterized by the energy of the frequency domain signal or the power of the frequency domain signal.
  • the amplitude parameter of the tonal component can be replaced with the energy parameter of the tonal component as the high-frequency band parameter.
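  • The Case 1 steps above (peak search, filtering by amplitude, mapping peak positions to subband sequence numbers) can be sketched as follows. The text leaves the peak-search rule open, so this sketch assumes the simplest one: a bin is a peak if it is strictly greater than both neighbours. Equal-width subbands and the names `find_peaks` and `tonal_parameters` are likewise assumptions for illustration.

```python
def find_peaks(power_spectrum):
    """Return indices of bins strictly greater than both neighbours (assumed peak rule)."""
    return [
        k
        for k in range(1, len(power_spectrum) - 1)
        if power_spectrum[k] > power_spectrum[k - 1]
        and power_spectrum[k] > power_spectrum[k + 1]
    ]


def tonal_parameters(power_spectrum, threshold, subband_width):
    """Filter peaks by amplitude and derive quantity, position, and amplitude parameters."""
    kept = [k for k in find_peaks(power_spectrum) if power_spectrum[k] > threshold]
    quantity = len(kept)                               # quantity parameter
    positions = [k // subband_width for k in kept]     # subband sequence numbers
    amplitudes = [power_spectrum[k] for k in kept]     # peak values used as amplitudes
    return quantity, positions, amplitudes


spectrum = [0.1, 0.2, 5.0, 0.3, 0.1, 0.4, 0.2, 0.1, 9.0, 0.2]
quantity, positions, amplitudes = tonal_parameters(spectrum, threshold=1.0, subband_width=4)
```

  • In this example the small local maximum at bin 5 is discarded by the threshold, leaving two tonal components in subbands 0 and 2.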
  • the high frequency band is divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the determination of the high-band parameters based on the high-band signals can also be performed separately for each frequency region.
  • K and N are both integers greater than or equal to 1.
  • Case 2: The high-band parameters include the position quantity parameter and the amplitude parameter of the tonal component.
  • the high frequency band can be divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the determination of the high-frequency band parameters can be performed in units of frequency regions. Take a frequency region as an example.
  • the method for determining the high-band parameters according to the high-band signal may specifically be:
  • Peak search is performed in units of frequency regions.
  • a peak search is performed on the power spectrum of a high-band signal in a frequency region to obtain peak number information, peak position information, and peak amplitude information in the frequency region.
  • Screening is performed according to at least one of the peak position, the peak amplitude, and the number of peaks, and the position quantity parameter and the amplitude parameter of the tone component are determined.
  • Specifically, screening is performed according to at least one of the peak position, the peak amplitude, and the number of peaks, and the position parameter, quantity parameter, and amplitude parameter of the tonal component are determined.
  • the position parameter of the tonal component may be the sequence number of the subband with the peak in the frequency region.
  • the quantity parameter of the tonal component is the number of subbands with peaks in the frequency region.
  • the amplitude parameter of the tonal component may be equal to the peak amplitude of the subband with the peak in the frequency region or calculated according to the peak amplitude of the subband with the peak in the frequency region.
  • the peak amplitude can be characterized by the energy of the frequency domain signal or the power of the frequency domain signal.
  • the amplitude parameter of the tonal component can be replaced with the energy parameter of the tonal component as the high-frequency band parameter.
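  • For Case 2, the same idea is applied to one frequency region at a time. The sketch below assumes the region is divided into equal-width subbands, uses a strict local-maximum rule plus an amplitude threshold as the screening step, and keeps the largest peak when several fall into the same subband. All of these choices, and the name `tile_tonal_info`, are assumptions rather than details stated in the text.

```python
def tile_tonal_info(tile_power, num_subbands, threshold):
    """Return (positions, quantity, amplitudes) for one frequency region.

    positions  : sequence numbers of subbands that contain a screened peak
    quantity   : number of such subbands
    amplitudes : largest screened peak value in each of those subbands
    """
    width = len(tile_power) // num_subbands
    best = {}
    for k in range(1, len(tile_power) - 1):
        is_peak = tile_power[k] > tile_power[k - 1] and tile_power[k] > tile_power[k + 1]
        if is_peak and tile_power[k] > threshold:
            subband = min(k // width, num_subbands - 1)
            best[subband] = max(best.get(subband, 0.0), tile_power[k])
    positions = sorted(best)
    return positions, len(positions), [best[sb] for sb in positions]


tile = [0.0, 3.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.0, 0.0, 0.0]
positions, quantity, amplitudes = tile_tonal_info(tile, num_subbands=3, threshold=1.0)
```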
  • the position quantity parameter of the tonal component is then determined according to the position parameter of the tonal component.
  • the position quantity parameter of the tonal component can be represented by an N-bit sequence, where N is the number of subbands in a frequency region.
  • for example, the bits of the sequence from low to high correspond to subband sequence numbers from small to large.
  • alternatively, the bits of the sequence from low to high correspond to subband sequence numbers from large to small.
  • the sequence number of the subband corresponding to each bit of the bit sequence can also be specified in advance.
  • according to the sequence numbers of the subbands with peaks in the frequency region, it is determined whether there is a peak in the subband corresponding to each bit of the N-bit sequence, and the N-bit sequence, that is, the position quantity parameter of the tonal component, is obtained. If the sequence number of the subband corresponding to a bit is equal to the sequence number of a subband with a peak in the frequency region, the value of the bit is 1; otherwise the value of the bit is 0.
  • for example, if the number of subbands in a frequency region is 5, the position quantity parameter of the tonal component is represented by a 5-bit sequence. Suppose the binary representation of the 5-bit sequence value is 10011 and the bits from low to high correspond to subband sequence numbers from small to large. The value of this bit sequence then indicates that there are peaks in the 0th, 1st, and 4th subbands in the frequency region, that is, the sequence numbers of the subbands with peaks are 0, 1, and 4.
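  • The N-bit position quantity parameter can be sketched as a simple bitmask. The code below assumes the low-to-high bit order maps to ascending subband sequence numbers (one of the orderings the text allows); the function names are illustrative.

```python
def pack_position_quantity(num_subbands, peak_subbands):
    """Build the N-bit position quantity parameter: bit i is 1 if subband i has a peak."""
    value = 0
    for subband in peak_subbands:
        if not 0 <= subband < num_subbands:
            raise ValueError("subband sequence number out of range")
        value |= 1 << subband
    return value


def unpack_position_quantity(num_subbands, value):
    """Recover the sequence numbers of the subbands whose bit is set."""
    return [sb for sb in range(num_subbands) if (value >> sb) & 1]


param = pack_position_quantity(5, [0, 1, 4])
print(format(param, "05b"))                  # binary form of the 5-bit sequence
print(unpack_position_quantity(5, param))    # subbands with peaks
```

  • With five subbands and peaks in subbands 0, 1, and 4, this reproduces the 10011 example from the text. A single parameter carries both position and quantity because the number of set bits equals the number of tonal components.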
  • Case 3: The high-band parameters may also include noise floor parameters. Case 3 can be implemented in combination with Case 1 or Case 2.
  • the method further includes:
  • Case 4: The high-band parameters may also include signal type information. Case 4 can be implemented in combination with Cases 1-3.
  • the determination of the high-band parameters further includes: determining the signal type information according to the quantity parameter of the tonal component or the position quantity parameter of the tonal component. Specifically:
  • the signal type information is determined according to the quantity parameter of the tonal component. For example, if the value of the quantity parameter of the tonal component is greater than 0, the signal type information indicates the tonal signal type.
  • alternatively, the signal type information is determined according to the position quantity parameter of the tonal component: the quantity parameter of the tonal component is obtained from the position quantity parameter of the tonal component, and the signal type information is then determined according to the quantity parameter. Note that if the quantity parameter of the tonal component has already been obtained while determining the position quantity parameter, it does not need to be derived again from the position quantity parameter; the signal type information can be determined directly from the quantity parameter.
  • the signal type information can be indicated by a flag for the presence or absence of tonal components.
  • the flag of the presence or absence of a tone component may also be referred to as tone component indication information.
  • the flag value of the presence or absence of a tonal component is 1, which indicates that there is a tonal component.
  • the signal type information can be represented by a flag indicating whether there are tonal components in the frequency region. For example, the flag value of whether there is a tonal component in the frequency region is 1, which indicates that there is a tonal component in the frequency region.
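  • A minimal sketch of the Case 4 derivation, assuming the position quantity parameter is an integer bitmask with one bit per subband: the quantity parameter is the number of set bits, and the presence flag is 1 when that count is greater than 0. The function names are hypothetical.

```python
def tone_count(position_quantity):
    """Quantity parameter: number of subbands flagged in the position quantity bitmask."""
    return bin(position_quantity).count("1")


def tone_flag(position_quantity):
    """Signal type flag: 1 if the frequency region has at least one tonal component, else 0."""
    return 1 if tone_count(position_quantity) > 0 else 0
```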
  • Special processing for Case 4: If the signal type information indicates the tonal signal type, both the signal type information and the high-band parameters other than the signal type information need to be written into the code stream. Otherwise, only the signal type information is written into the code stream. If coding is performed per frequency region, the frequency regions are processed in sequence: if the signal type information corresponding to a frequency region indicates the tonal signal type, both the signal type information and the high-band parameters other than the signal type information need to be written into the code stream; otherwise, only the signal type information is written into the code stream.
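  • The conditional write can be sketched as below for a single frequency region. The bit layout is entirely hypothetical: one flag bit, then N position quantity bits (lowest subband first), then one fixed-width quantised amplitude index per tonal component. The text does not specify field order, field widths, or how amplitudes are quantised.

```python
def write_tile(bits, num_subbands, position_quantity, amplitude_indices, amp_bits=4):
    """Append one frequency region's parameters to a list of bits (hypothetical layout)."""
    flag = 1 if position_quantity else 0
    bits.append(flag)                      # signal type information is always written
    if not flag:
        return                             # no tonal component: nothing else is written
    if len(amplitude_indices) != bin(position_quantity).count("1"):
        raise ValueError("one amplitude index is required per tonal component")
    for subband in range(num_subbands):    # N-bit position quantity parameter
        bits.append((position_quantity >> subband) & 1)
    for index in amplitude_indices:        # fixed-width amplitude indices, MSB first
        for shift in reversed(range(amp_bits)):
            bits.append((index >> shift) & 1)


stream = []
write_tile(stream, 5, 0b10011, [3, 1, 2], amp_bits=2)
```

  • A region without tonal components costs a single bit in this layout, which is the saving the special processing is aiming at.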
  • the audio encoder in the embodiment of the present invention encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder recovers the tonal components according to their position, quantity, and amplitude or energy. This makes the position and energy of the recovered tonal components more accurate, thereby improving the quality of the decoded signal.
  • Figure 3 describes the flow of an audio decoding method provided by an embodiment of the present invention, including:
  • the high frequency band parameters include a position quantity parameter of the tone component, and an amplitude parameter or an energy parameter of the tone component.
  • the position quantity parameter means that the position of the tonal component and the number of tonal components are represented by the same parameter.
  • alternatively, the high-band parameters include the position parameter of the tonal component, the quantity parameter of the tonal component, and the amplitude parameter or energy parameter of the tonal component; in this case, the position and quantity of the tonal component are represented by different parameters.
  • the high frequency band corresponding to the high-band signal includes at least one frequency region, and one frequency region includes at least one subband; accordingly, the position quantity parameter of the tonal component of the high-band signal of the current frame, included in the high-band parameter, includes the position quantity parameter of the tonal component of each of the at least one frequency region, and the amplitude parameter or energy parameter of the tonal component of the high-band signal of the current frame includes the amplitude parameter or energy parameter of the tonal component of each of the at least one frequency region.
  • demultiplexing the code stream to obtain the high-band parameters of the current frame of the audio signal includes: obtaining the position quantity parameter of the tonal component of the current frequency region of the at least one frequency region; and parsing the amplitude parameter or energy parameter of the tonal component of the current frequency region from the code stream according to the position quantity parameter of the tonal component of the current frequency region.
  • parsing the amplitude parameter or energy parameter of the tonal component of the current frequency region from the code stream according to the position quantity parameter of the tonal component of the current frequency region includes: determining the quantity parameter of the tonal component in the current frequency region according to the position quantity parameter of the tonal component in the current frequency region; and parsing the amplitude parameter or energy parameter of the tonal component of the current frequency region from the code stream according to the quantity parameter of the tonal component in the current frequency region.
  • alternatively, demultiplexing the code stream to obtain the high-band parameters of the current frame of the audio signal includes: obtaining the position quantity parameter of the tonal component of the current frequency region of the at least one frequency region; determining, according to the position quantity parameter of the tonal component in the current frequency region, the position parameter of the tonal component in the current frequency region and the quantity parameter of the tonal component in the current frequency region; and parsing the amplitude parameter or energy parameter of the tonal component of the current frequency region from the code stream according to the quantity parameter of the tonal component in the current frequency region.
  • before the obtaining of the position quantity parameter of the tonal component of the current frequency region of the at least one frequency region, the method includes: obtaining tonal component indication information of the current frequency region, where the tonal component indication information indicates whether the current frequency region includes a tonal component; and, when the current frequency region includes a tonal component, obtaining the position quantity parameter of the tonal component of the current frequency region. In this way, the parameters of the tonal components are decoded only for frequency regions that include tonal components, which improves decoding efficiency.
  • the obtaining of the reconstructed high-band signal of the current frame according to the high-band parameters includes: determining the position of the tonal component in the current frequency region according to the position quantity parameter of the tonal component of the current frequency region; determining the amplitude or energy corresponding to the position of the tonal component according to the amplitude parameter or energy parameter of the tonal component of the current frequency region; and obtaining the reconstructed high-band signal according to the position of the tonal component in the current frequency region and the amplitude or energy corresponding to that position.
  • the determining of the position of the tonal component in the current frequency region according to the position quantity parameter of the tonal component of the current frequency region may include: determining the position parameter of the tonal component of the current frequency region according to the position quantity parameter of the tonal component of the current frequency region; and determining the position of the tonal component in the current frequency region according to the position parameter of the tonal component of the current frequency region.
  • the obtaining of the reconstructed high-band signal of the current frame according to the high-band parameters may specifically include: determining the position of the tonal component in the current frequency region according to the position parameter of the tonal component of the current frequency region; determining the amplitude or energy corresponding to the position of the tonal component according to the amplitude parameter or energy parameter of the tonal component of the current frequency region; and obtaining the reconstructed high-band signal according to the position of the tonal component in the current frequency region and the amplitude or energy corresponding to that position.
  • the obtaining of the reconstructed high-band signal according to the position of the tonal component in the current frequency region and the amplitude corresponding to that position may be performed in the following manner:
  • pSpectralData[tone_pos] = tone_val
  • pSpectralData represents the reconstructed high-band frequency domain signal in the current frequency region
  • tone_val represents the amplitude value corresponding to the position of the tonal component in the current frequency region
  • tone_pos represents the position of the tonal component in the current frequency region
  • the position quantity parameter of the tonal component of the current frequency region includes N bits. Accordingly, the obtaining of the position quantity parameter of the tonal component of the current frequency region of the at least one frequency region includes: reading N bits from the code stream according to the number of subbands included in the current frequency region, where the N bits are the position quantity parameter of the tonal component of the current frequency region, N is the number of subbands included in the current frequency region, and the N bits are in one-to-one correspondence with the subbands included in the current frequency region.
  • the position parameter of the tonal component in the current frequency region is used to indicate the sequence number of the sub-band including the tonal component in the current frequency region.
  • the position of the tonal component in the current frequency region is located at a designated position in the subband where the tonal component is located in the current frequency region.
  • the designated position of the subband may be the center position of the subband, or the start position of the subband, or the end position of the subband.
  • Another embodiment of the present invention provides an audio decoding method, including the following processes:
  • the high frequency band can be divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the determination of the high-frequency band parameters can be performed in units of frequency regions.
  • the following takes the method of obtaining high-frequency band parameters according to the code stream in a frequency region as an example.
  • the methods for obtaining high-band parameters in different frequency regions according to the coded bitstream can be the same or different.
  • the high frequency band parameters can be obtained through the following process:
  • the code stream is parsed to determine the position parameter of the tonal component.
  • the code stream is parsed to determine the amplitude parameter of the tonal component.
  • the high frequency band parameters can be obtained through the following process:
  • the position quantity parameter of the tonal component represents the position information of the tonal component and the quantity information of the tonal component.
  • the decoding side parses the code stream, and first obtains the position quantity parameter of the tonal component.
  • the position quantity parameter of the tone component can be represented by an N-bit sequence, where N is the number of subbands in a frequency region.
  • the frequency domain resolution tone_res[p] may be preset, or it may be obtained by parsing the code stream. Assuming that the bandwidth of the p-th frequency region is tile_width[p], the number of subbands in the frequency region can be
  • num_subband = tile_width[p] / tone_res[p]
  • for example, if the number of subbands in the frequency region is 5, then 5 bits are read from the code stream; the binary representation of the position quantity parameter of the tonal component obtained in this way is, for example, 10011.
  • the number of subbands num_subband in the frequency region can also be preset, in which case num_subband bits are read directly from the code stream; these bits are the position quantity parameter of the tonal component.
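The reading step described above can be sketched as follows. This is a minimal illustration, not the codec's actual bitstream reader: the code stream is modelled as a string of '0'/'1' characters, the function name `read_position_quantity` is hypothetical, and the values 40 and 8 for tile_width[p] and tone_res[p] are assumptions chosen so that num_subband is 5, as in the example.

```python
def read_position_quantity(bits, offset, tile_width, tone_res):
    """Read the position quantity parameter of one frequency region.

    bits       -- code stream modelled as a '0'/'1' string (assumption)
    offset     -- index of the next unread bit
    tile_width -- bandwidth of the frequency region, tile_width[p]
    tone_res   -- frequency domain resolution, tone_res[p]
    Returns (num_subband, pos_quant, new_offset).
    """
    # num_subband = tile_width[p] / tone_res[p]; integer division assumes
    # the bandwidth is a whole multiple of the resolution.
    num_subband = tile_width // tone_res
    field = bits[offset:offset + num_subband]
    if len(field) < num_subband:
        raise ValueError("code stream too short")
    # One bit per subband, most significant bit first.
    return num_subband, int(field, 2), offset + num_subband


num_subband, pos_quant, offset = read_position_quantity("100110110", 0, 40, 8)
print(num_subband, format(pos_quant, "05b"), offset)  # prints: 5 10011 5
```

Returning the updated offset lets the caller continue parsing the amplitude parameters from the same stream.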
  • the quantity parameter of the tone component is obtained according to the position quantity parameter of the tone component.
  • specifically, the number of subbands containing a tonal component in the frequency region, that is, the quantity parameter tone_cnt[p] of the tonal component, may be determined according to the position quantity parameter of the tonal component.
  • the number of subbands containing a tonal component in the frequency region is equal to the number of bits with a value of 1 in the binary representation of the position quantity parameter of the tonal component.
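Counting the 1 bits can be sketched as below; the helper name `count_tonal_subbands` is hypothetical and the position quantity parameter is assumed to be held as a non-negative integer.

```python
def count_tonal_subbands(pos_quant):
    """Return tone_cnt: the number of 1 bits in the position quantity parameter."""
    return bin(pos_quant).count("1")


tone_cnt = count_tonal_subbands(0b10011)
print(tone_cnt)  # prints: 3
```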
  • the code stream is parsed according to the quantity parameter of the tonal component, and the amplitude parameter of the tonal component is determined.
  • the amplitude parameters of the tonal components are parsed from the code stream in sequence, each with a preset number of bits; the number of amplitude parameters of the tonal components is equal to the quantity parameter of the tonal components.
  • this yields the amplitude parameters tone_val_q[p][i], i = 0, ..., tone_cnt[p]-1, of the tonal components.
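A sketch of this sequential parsing follows. The code stream is again modelled as a '0'/'1' string, the name `parse_amplitudes` is hypothetical, and the width of 4 bits per amplitude parameter is an assumption, since the text only says the number of bits is preset.

```python
AMP_BITS = 4  # assumed preset width of one amplitude parameter


def parse_amplitudes(bits, offset, tone_cnt, amp_bits=AMP_BITS):
    """Parse tone_cnt amplitude parameters of amp_bits bits each.

    Returns (tone_val_q, new_offset), where tone_val_q is a list of length tone_cnt.
    """
    tone_val_q = []
    for _ in range(tone_cnt):
        field = bits[offset:offset + amp_bits]
        if len(field) < amp_bits:
            raise ValueError("code stream too short")
        tone_val_q.append(int(field, 2))
        offset += amp_bits
    return tone_val_q, offset


tone_val_q, offset = parse_amplitudes("001110001111", 0, tone_cnt=3)
print(tone_val_q, offset)  # prints: [3, 8, 15] 12
```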
  • the high-band parameters may also include the noise floor parameters of the tonal components.
  • Obtaining the high-frequency band parameters according to the coded code stream also includes: parsing the code stream to determine the noise floor parameters. Specifically, the noise floor parameter noise_floor[p] may be parsed from the code stream according to a preset number of bits.
  • the high frequency band parameters also include signal type information.
  • Obtaining the high frequency band parameters according to the coded code stream also includes: parsing the code stream to determine the signal type information.
  • the high frequency band parameters are obtained, which can be specifically:
  • the signal type information can be a flag indicating whether there are tonal components in the frequency region, and can also be referred to as tonal component indication information.
  • according to the signal type information, it is determined whether it is necessary to decode the high-frequency band parameters other than the signal type information.
  • if the value of the flag indicating whether there is a tonal component in the frequency region is 1, that is, the signal type information indicates the tonal signal type, parsing of the code stream continues.
  • the method of parsing the code stream to determine other high-band parameters except the signal type information can be any of Case 1, Case 2, and Case 3 on the decoding side.
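The flag-gated parsing described above can be sketched as follows. The field order (a one-bit flag, then the position quantity parameter, then the amplitude parameters), the 4-bit amplitude width, the '0'/'1' string model of the code stream, and the name `parse_tile` are all assumptions for illustration; the noise floor parameter is omitted.

```python
def parse_tile(bits, offset, num_subband, amp_bits=4):
    """Parse one frequency region; skip all other fields when the flag is 0."""
    has_tone = bits[offset] == "1"  # tonal component indication information
    offset += 1
    if not has_tone:
        return {"has_tone": False, "pos_quant": 0, "tone_val_q": []}, offset

    # Position quantity parameter: one bit per subband.
    pos_quant = int(bits[offset:offset + num_subband], 2)
    offset += num_subband

    # Quantity parameter: number of 1 bits in the position quantity parameter.
    tone_cnt = bin(pos_quant).count("1")

    tone_val_q = []
    for _ in range(tone_cnt):
        tone_val_q.append(int(bits[offset:offset + amp_bits], 2))
        offset += amp_bits
    return {"has_tone": True, "pos_quant": pos_quant, "tone_val_q": tone_val_q}, offset


stream = "0" + "1" + "10011" + "0011" + "1000" + "1111"
tile0, pos = parse_tile(stream, 0, num_subband=5)
tile1, pos = parse_tile(stream, pos, num_subband=5)
print(tile0["has_tone"], tile1["tone_val_q"], pos)  # prints: False [3, 8, 15] 19
```

A region without tonal components costs a single bit in this layout, which is the efficiency gain the text attributes to the indication information.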
  • the high frequency band can be divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the reconstruction of the high-band signal can be performed in units of frequency regions.
  • the method for obtaining the reconstructed high-band signal according to the high-band parameters in different frequency regions may be the same or different.
  • according to the reconstructed high-band signal in each frequency region, the reconstructed high-band signal of the current frame is obtained.
  • the high frequency band signal can be a frequency domain signal or a time domain signal
  • the position parameter of the tonal component represents the subband sequence number corresponding to the position of the tonal component.
  • the quantity parameter of tonal components characterizes the quantity of tonal components. According to the quantity parameter of the tonal component, the position parameter and the amplitude parameter of the tonal component, the high frequency band signal of the current frame is reconstructed.
  • tone_pos = tile[p] + (sfb + 0.5) * tone_res[p]
  • tone_val = pow(2.0, 0.25 * tone_val_q[p][tone_idx] - 4.0)
  • tile[p] is the starting frequency point of the p-th frequency region
  • sfb is the position parameter of the tonal component (that is, the subband number corresponding to the position of the tonal component)
  • tone_res[p] is the frequency domain resolution of the subband
  • tone_pos represents the position of the tone_idx-th tonal component in the p-th frequency region
  • tone_val_q[p][tone_idx] represents the amplitude parameter of the tone component corresponding to the tone_idx-th tone component in the p-th frequency region
  • tone_val represents the amplitude value corresponding to the tone_idx-th tone component in the p-th frequency region.
  • pSpectralData[tone_pos] represents the frequency domain signal corresponding to the position tone_pos of the tone component.
  • the value range of tone_idx belongs to [0, tone_cnt[p]-1], and tone_cnt[p] is the quantity parameter of the tone component.
  • for a frequency point in the frequency region at which no tonal component is present, the frequency domain signal at that frequency point can be directly set to 0.
  • the present invention does not limit the reconstruction method of other frequency points without tonal components.
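The reconstruction of one frequency region from its parameters can be sketched as below. This is an illustration under assumptions, not a definitive implementation: the spectrum is a plain list, tone_pos is truncated to an integer bin index, bins without a tonal component are set to 0 (one of the options the text allows), and the function name and the numeric values are hypothetical.

```python
def reconstruct_tile(spectrum_len, tile_start, tone_res, subbands, tone_val_q):
    """Place each tonal component at the centre of its subband.

    tile_start -- tile[p], starting frequency point of the region
    tone_res   -- tone_res[p], width of one subband in frequency points
    subbands   -- position parameters (subband numbers sfb), one per component
    tone_val_q -- quantized amplitude parameters, same order as subbands
    """
    p_spectral_data = [0.0] * spectrum_len  # bins without tonal components stay 0
    for sfb, q in zip(subbands, tone_val_q):
        # tone_pos = tile[p] + (sfb + 0.5) * tone_res[p], truncated to a bin index
        tone_pos = int(tile_start + (sfb + 0.5) * tone_res)
        # tone_val = pow(2.0, 0.25 * tone_val_q[p][tone_idx] - 4.0)
        tone_val = 2.0 ** (0.25 * q - 4.0)
        p_spectral_data[tone_pos] = tone_val
    return p_spectral_data


spec = reconstruct_tile(64, tile_start=16, tone_res=8,
                        subbands=[0, 1, 4], tone_val_q=[16, 8, 20])
print(spec[20], spec[28], spec[52])  # prints: 1.0 0.25 2.0
```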
  • the position quantity parameter of the tonal component can be represented by an N-bit sequence, where N is the number of subbands in a frequency region. Specifically, bit-shift operations may be applied to the position quantity parameter of the tonal component to determine the sequence numbers of the subbands that contain a tonal component in the frequency region and the number of such subbands.
  • the subband number of the tone component in the frequency region is the position parameter of the tone component.
  • the number of subbands containing tonal components in the frequency region is the quantity parameter of the tonal components.
  • in one example, the bits of the sequence from low to high correspond to the subband sequence numbers from small to large.
  • for example, if the number of subbands in the frequency region is 5, the lowest bit of the 5-bit sequence corresponds to subband sequence number 0, and the highest bit of the 5-bit sequence corresponds to subband sequence number 4.
  • if the binary representation of the position quantity parameter of the tonal component is 10011, the sequence numbers of the subbands containing a tonal component in the frequency region are 0, 1, and 4.
  • in another example, the bits of the sequence from low to high correspond to the subband sequence numbers from large to small.
  • for example, if the number of subbands in the frequency region is 5, the lowest bit of the 5-bit sequence corresponds to subband sequence number 4, and the highest bit of the 5-bit sequence corresponds to subband sequence number 0.
  • if the binary representation of the position quantity parameter of the tonal component is 10011, the sequence numbers of the subbands containing a tonal component in the frequency region are 0, 3, and 4.
  • the sequence number of the subband corresponding to each bit of the bit sequence may also be predetermined in another manner, which is not limited in the present invention.
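The two bit orderings in the examples above can be sketched with shift operations as follows; the function name `tonal_subbands` and its `low_bit_is_subband_0` flag are hypothetical names introduced for illustration.

```python
def tonal_subbands(pos_quant, num_subband, low_bit_is_subband_0=True):
    """Return the sorted subband numbers whose bit is 1 in pos_quant."""
    indices = []
    for bit in range(num_subband):
        if (pos_quant >> bit) & 1:  # shift the bit of interest to the lowest position
            if low_bit_is_subband_0:
                sfb = bit                    # lowest bit <-> subband 0
            else:
                sfb = num_subband - 1 - bit  # lowest bit <-> highest subband
            indices.append(sfb)
    return sorted(indices)


print(tonal_subbands(0b10011, 5))         # prints: [0, 1, 4]
print(tonal_subbands(0b10011, 5, False))  # prints: [0, 3, 4]
```

The length of the returned list is the quantity parameter, so the position and quantity parameters both come from the same N bits.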
  • the quantity parameter of the tonal component can be obtained.
  • the number of subband numbers of tonal components in the frequency region is the quantity parameter of the tonal components.
  • the high-frequency band signal is reconstructed.
  • specifically, the position of the tonal component may be calculated according to the position parameter of the tonal component:
  • tone_pos = tile[p] + (sfb + 0.5) * tone_res[p]
  • tile[p] is the starting frequency point of the p-th frequency region
  • sfb is the subband number of the tone component in the frequency region
  • tone_res[p] is the frequency-domain resolution of the p-th frequency region.
  • the subband number of the tone component in the frequency region is the position parameter of the tone component. 0.5 means that the position of the tonal component in the sub-band where the tonal component exists is at the center of the sub-band.
  • the reconstructed tonal components can also be located in other positions of the subband.
  • specifically, the amplitude of the tonal component may be calculated according to the amplitude parameter of the tonal component:
  • tone_val = pow(2.0, 0.25 * tone_val_q[p][tone_idx] - 4.0)
  • tone_val_q[p][tone_idx] represents the amplitude parameter corresponding to the tone_idx-th position parameter in the p-th frequency region
  • tone_val represents the amplitude value of the frequency point corresponding to the tone_idx-th position parameter in the p-th frequency region.
  • tone_idx belongs to [0, tone_cnt[p]-1], and tone_cnt[p] is the quantity parameter of the tone component.
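As a worked illustration of the amplitude formula above (derived only from the stated expression, with q used here as shorthand for tone_val_q[p][tone_idx]), each unit step of the quantized amplitude parameter scales the amplitude by a fixed factor:

```latex
\begin{align*}
\text{tone\_val}(q) &= 2^{\,0.25\,q - 4}, \\
\frac{\text{tone\_val}(q+1)}{\text{tone\_val}(q)} &= 2^{0.25} \approx 1.189, \\
20\log_{10}\!\left(2^{0.25}\right) &\approx 1.505\ \text{dB}, \\
\text{tone\_val}(16) &= 2^{0} = 1 .
\end{align*}
```

Under this reading, and assuming tone_val is a linear amplitude, the quantizer is uniform on a logarithmic scale with a step of about 1.5 dB.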
  • the frequency domain signal corresponding to the position tone_pos of the tonal component satisfies:
  • pSpectralData[tone_pos] = tone_val
  • pSpectralData[tone_pos] represents the frequency domain signal corresponding to the position tone_pos of the tonal component
  • tone_val represents the amplitude value of the frequency point corresponding to the tone_idx-th position parameter in the p-th frequency region
  • tone_pos indicates the position of the tonal component corresponding to the tone_idx-th position parameter in the p-th frequency region.
  • for a frequency point in the frequency region at which no tonal component is present, the frequency domain signal of the frequency point can be directly set to 0.
  • the present invention does not limit the reconstruction method of other frequency points without tonal components.
  • the audio signal of the current frame is obtained.
  • the third embodiment of the present invention provides an audio decoding method, including the following processes:
  • the high frequency band can be divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the determination of the high-frequency band parameters can be performed in units of frequency regions. The following takes the method of obtaining high-frequency band parameters according to the code stream in a frequency region as an example.
  • the position quantity parameter of the tonal component represents the position information of the tonal component and the quantity information of the tonal component.
  • the decoding side parses the code stream, and first obtains the position quantity parameter of the tonal component.
  • the position quantity parameter of the tone component can be represented by an N-bit sequence, where N is the number of subbands in a frequency region.
  • the frequency domain resolution tone_res[p] may be preset, or it may be obtained by parsing the code stream. Assuming that the bandwidth of the p-th frequency region is tile_width[p], the number of subbands in the frequency region can be
  • num_subband = tile_width[p] / tone_res[p]
  • for example, if the number of subbands in the frequency region is 5, then 5 bits are read from the code stream; the binary representation of the position quantity parameter of the tonal component obtained in this way is, for example, 10011.
  • the number of subbands num_subband in the frequency region can also be preset, in which case num_subband bits are read directly from the code stream; these bits are the position quantity parameter of the tonal component.
  • the position quantity parameter of the tonal component can be represented by an N-bit sequence, where N is the number of subbands in a frequency region. Specifically, bit-shift operations may be applied to the position quantity parameter of the tonal component to determine the sequence numbers of the subbands that contain a tonal component in the frequency region and the number of such subbands.
  • the subband number of the tone component in the frequency region is the position parameter of the tone component.
  • the number of subbands containing tonal components in the frequency region is the quantity parameter of the tonal components.
  • in one example, the bits of the sequence from low to high correspond to the subband sequence numbers from small to large.
  • for example, if the number of subbands in the frequency region is 5, the lowest bit of the 5-bit sequence corresponds to subband sequence number 0, and the highest bit of the 5-bit sequence corresponds to subband sequence number 4.
  • if the binary representation of the position quantity parameter of the tonal component is 10011, the sequence numbers of the subbands containing a tonal component in the frequency region are 0, 1, and 4.
  • in another example, the bits of the sequence from low to high correspond to the subband sequence numbers from large to small.
  • for example, if the number of subbands in the frequency region is 5, the lowest bit of the 5-bit sequence corresponds to subband sequence number 4, and the highest bit of the 5-bit sequence corresponds to subband sequence number 0.
  • if the binary representation of the position quantity parameter of the tonal component is 10011, the sequence numbers of the subbands containing a tonal component in the frequency region are 0, 3, and 4.
  • the sequence number of the subband corresponding to each bit of the bit sequence may also be predetermined in another manner, which is not limited in the present invention.
  • the quantity parameter of the tonal component can be obtained.
  • the number of subband numbers of tonal components in the frequency region is the quantity parameter of the tonal components.
  • specifically, the number of subbands containing a tonal component in the frequency region, that is, the quantity parameter tone_cnt[p] of the tonal component, may be determined according to the position quantity parameter of the tonal component.
  • the number of subbands containing a tonal component in the frequency region is equal to the number of bits with a value of 1 in the binary representation of the position quantity parameter of the tonal component.
  • the amplitude parameters of the tonal components are parsed from the code stream in sequence, each with a preset number of bits; the number of amplitude parameters of the tonal components is equal to the quantity parameter of the tonal components.
  • this yields the amplitude parameters tone_val_q[p][i], i = 0, ..., tone_cnt[p]-1, of the tonal components.
  • the high frequency band can be divided into K frequency regions (tile), and each frequency region is divided into N subbands.
  • the reconstruction of the high-band signal can be performed in units of frequency regions. The following are examples of methods for obtaining a reconstructed high-band signal based on high-band parameters in a frequency region. According to the reconstructed high-frequency signal in each frequency region, the reconstructed high-frequency signal is obtained.
  • the high frequency band signal can be a frequency domain signal or a time domain signal.
  • the high-band signal of the current frame may be reconstructed according to the position parameter, the quantity parameter, and the amplitude parameter of the tonal component.
  • the quantity parameter of the tonal components characterizes the number of tonal components.
  • the reconstruction method of the tonal component at a position can be specifically:
  • specifically, the position of the tonal component may be calculated according to the position parameter of the tonal component:
  • tone_pos = tile[p] + (sfb + 0.5) * tone_res[p]
  • tile[p] is the starting frequency point of the p-th frequency region
  • sfb is the subband number of the tone component in the frequency region
  • tone_res[p] is the frequency-domain resolution of the p-th frequency region.
  • the subband number of the tone component in the frequency region is the position parameter of the tone component. 0.5 means that the position of the tonal component in the sub-band where the tonal component exists is at the center of the sub-band.
  • the reconstructed tonal components can also be located in other positions of the subband.
  • specifically, the amplitude of the tonal component may be calculated according to the amplitude parameter of the tonal component:
  • tone_val = pow(2.0, 0.25 * tone_val_q[p][tone_idx] - 4.0)
  • tone_val_q[p][tone_idx] represents the amplitude parameter corresponding to the tone_idx-th position parameter in the p-th frequency region
  • tone_val represents the amplitude value of the frequency point corresponding to the tone_idx-th position parameter in the p-th frequency region.
  • tone_idx belongs to [0, tone_cnt[p]-1], and tone_cnt[p] is the number of tone components.
  • the high-frequency band signal is reconstructed.
  • the frequency domain signal corresponding to the position tone_pos of the tonal component satisfies:
  • pSpectralData[tone_pos] = tone_val
  • pSpectralData[tone_pos] represents the frequency domain signal corresponding to the position tone_pos of the tonal component
  • tone_val represents the amplitude value of the frequency point corresponding to the tone_idx-th position parameter in the p-th frequency region
  • tone_pos indicates the position of the tonal component corresponding to the tone_idx-th position parameter in the p-th frequency region.
  • for a frequency point in the frequency region at which no tonal component is present, the frequency domain signal of the frequency point can be directly set to 0.
  • the present invention does not limit the reconstruction method of other frequency points without tonal components.
  • the audio signal of the current frame is obtained.
  • the audio encoder in the embodiment of the present invention encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder restores the tonal components according to their position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.
  • Figure 6 depicts the structure of an audio encoder provided by an embodiment of the present invention, including:
  • the signal acquisition unit 601 is configured to acquire a current frame of an audio signal, where the current frame includes a high frequency band signal;
  • the parameter obtaining unit 602 is configured to obtain the high-band parameters of the current frame according to the high-band signal, where the high-band parameters are used to indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal;
  • the encoding unit 603 is configured to perform code stream multiplexing on the high frequency band coding parameters to obtain an encoded code stream.
  • the audio encoder may further include a determining unit, configured to determine whether the current frequency region includes a tonal component; the parameter obtaining unit is specifically configured to, when the current frequency region includes a tonal component, determine the position quantity parameter of the tonal component of the current frequency region and the amplitude parameter or energy parameter of the tonal component of the current frequency region according to the high-band signal of the current frequency region in the at least one frequency region.
  • the specific implementation of the audio encoder can refer to the aforementioned audio encoding method, which will not be repeated here.
  • the audio encoder in the embodiment of the present invention encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder restores the tonal components according to their position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.
  • Figure 7 illustrates the structure of an audio decoder provided by an embodiment of the present invention, including:
  • the receiving unit 701 is configured to obtain an encoded code stream;
  • the demultiplexing unit 702 is configured to demultiplex the code stream to obtain the high-band parameters of the current frame of the audio signal, where the high-band parameters are used to indicate the position, quantity, and amplitude or energy of the tonal components included in the high-band signal of the current frame;
  • the reconstruction unit 703 is configured to obtain the reconstructed high frequency band signal of the current frame according to the high frequency band parameter; and obtain the audio output signal of the current frame according to the reconstructed high frequency band signal of the current frame.
  • the specific implementation of the audio decoder can refer to the aforementioned audio decoding method, which will not be repeated here.
  • the audio encoder in the embodiment of the present invention encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder restores the tonal components according to their position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.
  • An embodiment of the present application also provides a computer storage medium, wherein the computer storage medium stores a program, and the program executes some or all of the steps recorded in the above method embodiments.
  • the audio coding device 800 includes:
  • the receiver 801, the transmitter 802, the processor 803, and the memory 804 (the number of the processors 803 in the audio encoding device 800 may be one or more, and one processor is taken as an example in FIG. 8).
  • the receiver 801, the transmitter 802, the processor 803, and the memory 804 may be connected by a bus or in other ways, wherein the connection by a bus is taken as an example in FIG. 8.
  • the memory 804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 803. A part of the memory 804 may also include a non-volatile random access memory (NVRAM).
  • the memory 804 stores an operating system and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
  • the processor 803 controls the operation of the audio encoding device, and the processor 803 may also be referred to as a central processing unit (CPU).
  • the various components of the audio encoding device are coupled together through a bus system.
  • in addition to a data bus, the bus system may also include a power bus, a control bus, and a status signal bus.
  • the various buses are collectively referred to as the bus system in the figure.
  • the method disclosed in the foregoing embodiment of the present application may be applied to the processor 803 or implemented by the processor 803.
  • the processor 803 may be an integrated circuit chip with signal processing capability. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 803 or instructions in the form of software.
  • the aforementioned processor 803 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory 804, and the processor 803 reads the information in the memory 804, and completes the steps of the foregoing method in combination with its hardware.
  • the receiver 801 can be used to receive input numeric or character information, and to generate signal input related to the settings and function control of the audio encoding device.
  • the transmitter 802 can include display devices such as a display screen, and the transmitter 802 can be used to output numeric or character information through an external interface.
  • the processor 803 is configured to execute the audio encoding method shown in FIG. 2 above.
  • the audio decoding device 900 includes:
  • the receiver 901, the transmitter 902, the processor 903, and the memory 904 (the number of the processors 903 in the audio decoding device 900 may be one or more, and one processor is taken as an example in FIG. 9).
  • the receiver 901, the transmitter 902, the processor 903, and the memory 904 may be connected by a bus or in other ways, wherein the connection by a bus is taken as an example in FIG. 9.
  • the memory 904 may include a read-only memory and a random access memory, and provides instructions and data to the processor 903. A part of the memory 904 may also include NVRAM.
  • the memory 904 stores an operating system and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
  • the processor 903 controls the operation of the audio decoding device, and the processor 903 may also be referred to as a CPU.
  • the various components of the audio decoding device are coupled together through a bus system, where the bus system may include a power bus, a control bus, and a status signal bus in addition to a data bus.
  • various buses are referred to as bus systems in the figure.
  • the method disclosed in the foregoing embodiment of the present application may be applied to the processor 903 or implemented by the processor 903.
  • the processor 903 may be an integrated circuit chip with signal processing capability. In the implementation process, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 903 or instructions in the form of software.
  • the aforementioned processor 903 may be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 904, and the processor 903 reads the information in the memory 904, and completes the steps of the foregoing method in combination with its hardware.
  • the processor 903 is configured to execute the audio decoding method shown in FIG. 3.
  • When the audio encoding device or the audio decoding device is a chip in a terminal, the chip includes a processing unit and a communication unit.
  • The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the terminal executes the method of any one of the above-mentioned first aspects.
  • The storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit in the terminal located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • processor mentioned in any of the foregoing may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the method in the first aspect.
  • The device embodiments described above are only illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units.
  • That is, they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the connection relationship between the modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • This application can be implemented by means of software plus the necessary general-purpose hardware.
  • It can also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memory, dedicated components, and so on.
  • all functions completed by computer programs can be easily implemented with corresponding hardware.
  • The specific hardware structures used to achieve the same function can also be diverse, such as analog circuits, digital circuits, or special-purpose circuits.
  • software program implementation is a better implementation in more cases.
  • The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disk, and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Abstract

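Embodiments of this application disclose an audio encoding and decoding method and an audio encoding and decoding device, which can improve the decoding quality of an audio signal. An embodiment of this application provides an audio encoding method, the method including: obtaining a current frame of an audio signal, the current frame including a high-band signal; obtaining a high-band parameter of the current frame based on the high-band signal, the high-band parameter being used to indicate the position, the quantity, and the amplitude or energy of tonal components included in the high-band signal; and performing bitstream multiplexing on the high-band encoding parameter to obtain an encoded bitstream.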

Description

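An audio encoding and decoding method and an audio encoding and decoding device
This application claims priority to Chinese Patent Application No. 202010033973.0, filed with the China Intellectual Property Office on January 13, 2020 and entitled "一种音频编解码方法和音频编解码设备" ("An audio encoding and decoding method and an audio encoding and decoding device"), which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of audio signal encoding and decoding technologies, and in particular to an audio encoding and decoding method and an audio encoding and decoding device.
Background
As quality of life improves, demand for high-quality audio keeps growing. To transmit an audio signal better over limited bandwidth, the audio signal usually needs to be encoded first, and the encoded bitstream is then transmitted to the decoder side. The decoder side decodes the received bitstream to obtain a decoded audio signal, which is used for playback.
How to improve the quality of the decoded audio signal has become a technical problem that urgently needs to be solved.
Summary
Embodiments of this application provide an audio encoding and decoding method and an audio encoding and decoding device, which can improve the quality of the decoded audio signal.
To solve the foregoing technical problem, the embodiments of this application provide the following technical solutions: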
第一方面,提供了一种音频编码方法,所述方法包括:获取音频信号的当前帧,所述当前帧包括高频带信号;根据所述高频带信号获得所述当前帧的高频带参数,所述高频带参数用于表示所述高频带信号包括的音调成分的位置、数量以及幅度或能量;对所述高频带编码参数进行码流复用,以得到编码码流。
结合第一方面,在一种实施方式中,所述高频带参数包括音调成分的位置数量参数、以及所述音调成分的幅度参数或能量参数。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述高频带信号对应的高频带包括至少一个频率区域,一个所述频率区域包括至少一个子带,所述根据所述高频带信号获得所述当前帧的高频带参数包括:根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数前,所述方法包括:确定所述当前频率区域内是否包括音调成分;在所述当前频率区域内包括音调成分时,根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述当前帧的高频带参数还包括音调成分指示信息,所述音调成分指示信息用于指示所述当前频率区域内是否 包括音调成分。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数包括:根据所述至少一个频率区域中的当前频率区域的高频带信号在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种;根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,根据所述至少一个频率区域中的当前频率区域的高频带信号在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种包括:根据所述至少一个频率区域中的当前频率区域的功率谱、能量谱或幅度谱中的至少一种在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数包括:根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息;根据所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在峰值,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带不存在峰值,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在音调成分,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带不存在音调成分,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
结合第一方面或第一方面的前述实施方式,在一种实施方式中,所述高频带参数还包括所述高频带信号的噪声基底参数。
第二方面,提供了一种音频解码方法,包括:获取编码码流;对所述编码码流进行码 流解复用,以得到音频信号的当前帧的高频带参数,所述高频带参数用于表示所述当前帧的高频带信号包括的音调成分的位置、数量以及幅度或能量;根据所述高频带参数获得所述当前帧的重建高频带信号;根据所述当前帧的重建高频带信号获得所述当前帧的音频输出信号。
结合第二方面,在一种实施方式中,所述高频带参数包括所述当前帧的高频信号的音调成分的位置数量参数和所述音调成分的幅度参数或能量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,将所述高频带信号对应的高频带包括至少一个频率区域,一个所述频率区域包括至少一个子带;所述高频带参数包括所述当前帧的高频信号的音调成分的位置数量参数包括所述至少一个频率区域各自的音调成分的位置数量参数,所述当前帧的高频信号的音调成分的幅度参数或能量参数包括所述至少一个频率区域各自的音调成分的幅度参数或能量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数包括:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述根据所述当前频率区域的音调成分的位置数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数,包括:根据所述当前频率区域的音调成分的位置数量参数,确定所述当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数,从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数包括:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数,确定当前频率区域的音调成分的位置参数和当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数前包括:获取所述当前频率区域的音调成分指示信息;所述音调成分指示信息用于指示所述当前频率区域内是否包括音调成分;当所述当前频率区域内包括音调成分时,获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数包括:根据所述当前频率区域包括的子带数量从所述编码码流中读取N个比特位,所述N个比特位为所述当前频率区域的音调成分的位置数量参数,其中,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述根据所述高频带 参数获得所述当前帧的重建高频带信号包括:根据所述当前频率区域的音调成分的位置数量参数确定所述当前频率区域中音调成分的位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述根据所述当前频率区域的高频信号的音调成分的位置数量参数确定所述当前频率区域中音调成分的位置包括:根据所述当前频率区域的高频信号的音调成分的位置数量参数,确定所述当前频率区域的音调成分的位置参数;根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述根据所述高频带参数获得所述当前帧的重建高频带信号包括:根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置参数用于指示所述当前频率区域中包括音调成分的子带的序号。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述当前频率区域中音调成分位置位于所述当前频率区域中音调成分所在子带的指定位置。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述子带的指定位置为子带的中心位置。
结合第二方面或第二方面的前述实施方式,在一种实施方式中,所述根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度获得所述重建高频带信号包括:根据如下计算式确定音调成分的位置的频域信号:
pSpectralData[tone_pos]=tone_val
其中,pSpectralData表示所述当前频率区域的重建高频带频域信号,tone_val表示所述当前频率区域内音调成分的位置对应的幅度值,tone_pos表示所述当前频率区域内音调成分的位置。
第三方面,提供了一种音频编码器,包括:信号获取单元,用于获取音频信号的当前帧,所述当前帧包括高频带信号;参数获取单元,用于根据所述高频带信号获得所述当前帧的高频带参数,所述高频带参数用于表示所述高频带信号包括的音调成分的位置、数量以及幅度或能量;编码单元,用于对所述高频带编码参数进行码流复用,以得到编码码流。
结合第三方面,在一种实施方式中,所述高频带参数包括音调成分的位置数量参数、以及所述音调成分的幅度参数或能量参数。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述高频带信号对应的高频带包括至少一个频率区域,一个所述频率区域包括至少一个子带;所述参数获取单元,具体用于:根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量 参数。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述音频编码器还包括:确定单元,用于确定所述当前频率区域内是否包括音调成分;所述参数获取单元,具体用于在所述当前频率区域内包括音调成分时,根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述当前帧的高频带参数还包括音调成分指示信息,所述音调成分指示信息用于指示所述当前频率区域内是否包括音调成分。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述参数获取单元,具体用于:根据所述至少一个频率区域中的当前频率区域的高频带信号在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种;根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述参数获取单元,具体用于:根据所述至少一个频率区域中的当前频率区域的功率谱、能量谱或幅度谱中的至少一种在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述参数获取单元,具体用于:根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息;根据所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在峰值,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带不存在峰值,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在音调成分,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带不存在音调成分,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
结合第三方面或第三方面的前述实施方式,在一种实施方式中,所述高频带参数还包 括所述高频带信号的噪声基底参数。
第四方面提供了一种音频解码器,包括:接收单元,用于获取编码码流;解复用单元,用于对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数,所述高频带参数用于表示所述当前帧的高频带信号包括的音调成分的位置、数量以及幅度或能量;重建单元,用于根据所述高频带参数获得所述当前帧的重建高频带信号;根据所述当前帧的重建高频带信号获得所述当前帧的音频输出信号。
结合第四方面,在一种实施方式中,所述高频带参数包括所述当前帧的高频信号的音调成分的位置数量参数和所述音调成分的幅度参数或能量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,将所述高频带信号对应的高频带包括至少一个频率区域,一个所述频率区域包括至少一个子带;所述高频带参数包括所述当前帧的高频信号的音调成分的位置数量参数包括所述至少一个频率区域各自的音调成分的位置数量参数,所述当前帧的高频信号的音调成分的幅度参数或能量参数包括所述至少一个频率区域各自的音调成分的幅度参数或能量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述解复用单元,具体用于:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述解复用单元,具体用于:根据所述当前频率区域的音调成分的位置数量参数,确定所述当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数,从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述解复用单元,具体用于:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数,确定当前频率区域的音调成分的位置参数和当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述解复用单元,具体用于:获取所述当前频率区域的音调成分指示信息;所述音调成分指示信息用于指示所述当前频率区域内是否包括音调成分;当所述当前频率区域内包括音调成分时,获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述解复用单元,具体用于:根据所述当前频率区域包括的子带数量从所述编码码流中读取N个比特位,所述N个比特位为所述当前频率区域的音调成分的位置数量参数,其中,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述重建单元,具体用于:根据所述当前频率区域的音调成分的位置数量参数确定所述当前频率区域中音调成分的位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的 位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述重建单元,具体用于:根据所述当前频率区域的高频信号的音调成分的位置数量参数,确定所述当前频率区域的音调成分的位置参数;根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述重建单元,具体用于:根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述当前频率区域的音调成分的位置参数用于指示所述当前频率区域中包括音调成分的子带的序号。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述当前频率区域中音调成分位置位于所述当前频率区域中音调成分所在子带的指定位置。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述子带的指定位置为子带的中心位置。
结合第四方面或第四方面的前述实施方式,在一种实施方式中,所述根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度获得所述重建高频带信号包括:根据如下计算式确定音调成分的位置的频域信号:
pSpectralData[tone_pos]=tone_val
其中,pSpectralData表示所述当前频率区域的重建高频带频域信号,tone_val表示所述当前频率区域内音调成分的位置对应的幅度值,tone_pos表示所述当前频率区域内音调成分的位置。
第五方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一方面或第二方面所述的方法。
第六方面,本申请实施例提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一方面或第二方面所述的方法。
第七方面,本申请实施例提供了一种音频编码器,包括处理器和存储器;所述存储器用于存储指令;所述处理器用于执行所述存储器中的所述指令,使得所述音频编码器执行如前述第一方面的任一项方法。
第八方面,本申请实施例提供了一种音频解码器,包括处理器和存储器;所述存储器用于存储指令;所述处理器用于执行所述存储器中的所述指令,使得所述音频解码器执行如前述第二方面的任一项方法。
第九方面,本申请实施例提供一种通信装置,该通信装置可以包括音频编解码设备或者芯片等实体,所述通信装置包括:处理器,可选的,还包括存储器;所述存储器用于存 储指令;所述处理器用于执行所述存储器中的所述指令,使得所述通信装置执行如前述第一方面或第二方面中任一项所述的方法。
第十方面,本申请提供了一种芯片系统,该芯片系统包括处理器,用于支持音频编解码设备实现上述方面中所涉及的功能,例如,发送或处理上述方法中所涉及的数据和/或信息。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器,用于保存音频编解码设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包括芯片和其他分立器件。
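As can be seen from the above, in the embodiments of the present invention the audio encoder encodes the position, the quantity, and the amplitude or energy of the tonal components in the high-band signal, so that the audio decoder can restore the tonal components based on that position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.
Brief Description of Drawings
FIG. 1 is a schematic structural diagram of an audio encoding and decoding system according to an embodiment of this application;
FIG. 2 is a schematic flowchart of an audio encoding method according to an embodiment of this application;
FIG. 3 is a schematic flowchart of an audio decoding method according to an embodiment of this application;
FIG. 4 is a schematic diagram of a mobile terminal according to an embodiment of this application;
FIG. 5 is a schematic diagram of a network element according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of an audio encoding device according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of an audio decoding device according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of another audio encoding device according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of another audio decoding device according to an embodiment of this application.
Detailed Description
The following describes the embodiments of this application with reference to the accompanying drawings.
The terms "first", "second", and the like in the specification, the claims, and the accompanying drawings of this application are used to distinguish between similar objects, and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the way objects with the same attributes are distinguished when the embodiments of this application are described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not clearly listed or that are inherent to the process, method, product, or device.
The audio signal in the embodiments of this application is the input signal of the audio encoding device. The audio signal may include a plurality of frames; for example, the current frame may refer to a particular frame of the audio signal. The embodiments of this application use the encoding and decoding of the current frame as an example. The frame before or after the current frame may be encoded and decoded in the same way as the current frame, so those processes are not described one by one. In addition, the audio signal in the embodiments of this application may be a mono audio signal or a stereo signal. The stereo signal may be an original stereo signal, a stereo signal composed of two signals (a left-channel signal and a right-channel signal) included in a multi-channel signal, or a stereo signal composed of two signals generated from at least three signals included in a multi-channel signal. This is not limited in the embodiments of this application.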
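FIG. 1 is a schematic structural diagram of an audio encoding and decoding system according to an exemplary embodiment of this application. The audio encoding and decoding system includes an encoding component 110 and a decoding component 120.
The encoding component 110 is configured to encode the current frame (audio signal) in the frequency domain or the time domain. Optionally, the encoding component 110 may be implemented in software, in hardware, or in a combination of software and hardware. This is not limited in the embodiments of this application.
When the encoding component 110 encodes the current frame in the frequency domain or the time domain, one possible implementation includes the steps shown in FIG. 2.
In the embodiments of this application, after completing encoding, the encoding component 110 may generate an encoded bitstream and send it to the decoding component 120, so that the decoding component 120 receives the encoded bitstream and then obtains the audio output signal from it.
It should be noted that the encoding method shown in FIG. 2 is only an example and not a limitation. The embodiments of this application do not limit the order in which the steps in FIG. 2 are performed, and the encoding method shown in FIG. 2 may include more or fewer steps. This is not limited in the embodiments of this application.
Optionally, the encoding component 110 and the decoding component 120 may be connected in a wired or wireless manner, and the decoding component 120 may obtain the encoded bitstream generated by the encoding component 110 through that connection. Alternatively, the encoding component 110 may store the generated encoded bitstream in a memory, and the decoding component 120 reads the encoded bitstream from the memory.
Optionally, the decoding component 120 may be implemented in software, in hardware, or in a combination of software and hardware. This is not limited in the embodiments of this application.
When the decoding component 120 decodes the current frame (audio signal) in the frequency domain or the time domain, one possible implementation includes the steps shown in FIG. 3.
Optionally, the encoding component 110 and the decoding component 120 may be disposed in the same device or in different devices. The device may be a terminal with an audio signal processing function, such as a mobile phone, a tablet computer, a laptop computer, a desktop computer, a Bluetooth speaker, a voice recorder, or a wearable device, or it may be a network element with audio signal processing capability in a core network or a wireless network. This is not limited in this embodiment.
For example, as shown in FIG. 4, this embodiment is described using an example in which the encoding component 110 is disposed in a mobile terminal 130 and the decoding component 120 is disposed in a mobile terminal 140. The mobile terminal 130 and the mobile terminal 140 are mutually independent electronic devices with audio signal processing capability, for example mobile phones, wearable devices, virtual reality (VR) devices, or augmented reality (AR) devices, and are connected to each other through a wireless or wired network.
Optionally, the mobile terminal 130 may include a collection component 131, the encoding component 110, and a channel encoding component 132, where the collection component 131 is connected to the encoding component 110, and the encoding component 110 is connected to the channel encoding component 132.
Optionally, the mobile terminal 140 may include an audio playback component 141, the decoding component 120, and a channel decoding component 142, where the audio playback component 141 is connected to the decoding component 120, and the decoding component 120 is connected to the channel decoding component 142.
After collecting an audio signal through the collection component 131, the mobile terminal 130 encodes the audio signal using the encoding component 110 to obtain an encoded bitstream, and then encodes the encoded bitstream using the channel encoding component 132 to obtain a transmission signal.
The mobile terminal 130 sends the transmission signal to the mobile terminal 140 through the wireless or wired network.
After receiving the transmission signal, the mobile terminal 140 decodes it using the channel decoding component 142 to obtain the encoded bitstream, decodes the encoded bitstream using the decoding component 120 to obtain the audio signal, and plays the audio signal using the audio playback component 141. It can be understood that the mobile terminal 130 may also include the components included in the mobile terminal 140, and the mobile terminal 140 may also include the components included in the mobile terminal 130.
For example, as shown in FIG. 5, the description uses an example in which the encoding component 110 and the decoding component 120 are disposed in the same network element 150 with audio signal processing capability in a core network or a wireless network.
Optionally, the network element 150 includes a channel decoding component 151, the decoding component 120, the encoding component 110, and a channel encoding component 152. The channel decoding component 151 is connected to the decoding component 120, the decoding component 120 is connected to the encoding component 110, and the encoding component 110 is connected to the channel encoding component 152.
After receiving a transmission signal sent by another device, the channel decoding component 151 decodes the transmission signal to obtain a first encoded bitstream. The decoding component 120 decodes the first encoded bitstream to obtain an audio signal. The encoding component 110 encodes the audio signal to obtain a second encoded bitstream. The channel encoding component 152 encodes the second encoded bitstream to obtain a transmission signal.
The other device may be a mobile terminal with audio signal processing capability, or another network element with audio signal processing capability. This is not limited in this embodiment.
Optionally, the encoding component 110 and the decoding component 120 in the network element may transcode the encoded bitstream sent by a mobile terminal.
Optionally, in the embodiments of this application, a device in which the encoding component 110 is installed may be referred to as an audio encoding device. In an actual implementation, the audio encoding device may also have an audio decoding function. This is not limited in the embodiments of this application.
Optionally, in the embodiments of this application, a device in which the decoding component 120 is installed may be referred to as an audio decoding device. In an actual implementation, the audio decoding device may also have an audio encoding function. This is not limited in the embodiments of this application.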
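FIG. 2 shows the procedure of an audio encoding method according to an embodiment of the present invention, including:
201. Obtain a current frame of an audio signal, where the current frame includes a high-band signal.
The current frame may be any frame of the audio signal, and may include a high-band signal and a low-band signal. The split between the two may be determined by a band threshold: the signal above the threshold is the high-band signal, and the signal below it is the low-band signal. The band threshold may be determined based on the transmission bandwidth and the data processing capabilities of the encoding component 110 and the decoding component 120, and is not limited here.
The high-band signal and the low-band signal are relative. For example, the signal below a certain frequency is the low-band signal and the signal above that frequency is the high-band signal (the signal at that frequency itself may be assigned to either band). The frequency differs depending on the bandwidth of the current frame. For example, when the current frame is a wideband signal of 0-8 kHz, the frequency may be 4 kHz; when the current frame is a super-wideband signal of 0-16 kHz, the frequency may be 8 kHz.
202. Obtain a high-band parameter of the current frame based on the high-band signal, where the high-band parameter is used to indicate the position, the quantity, and the amplitude or energy of the tonal components included in the high-band signal.
Specifically, the high-band parameter includes a position-quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components. The position-quantity parameter means that one and the same parameter indicates both the positions of the tonal components and the number of tonal components. In another implementation, the high-band parameter includes a position parameter of the tonal components, a quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components; in this case, the position and the quantity of the tonal components are indicated by different parameters.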
在一种具体实施方式中,所述高频带信号对应的高频带包括至少一个频率区域(Tile),一个所述频率区域包括至少一个子带,所述根据所述高频带信号获得所述当前帧的高频带参数包括:根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
在另一种实施方式中,所述根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数前,所述方法包括:确定所述当前频率区域内是否包括音调成分;在所述当前频率区域内包括音调成分时,根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。从而仅获取具有音调成分的频率区域的参数,从而提高编码效率。
相应地,所述当前帧的高频带参数还包括音调成分指示信息,所述音调成分指示信息用于指示所述当前频率区域内是否包括音调成分。使得音频解码器可以根据该指示信息进行解码,提高解码效率。
其中,在一个实施方式中,所述根据所述至少一个频率区域中的当前频率区域的高频带信号,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数包括:根据所述至少一个频率区域中的当前频率区域的高频带信号在所述当前频率区域内进行峰值搜索,以获得所述当前区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种;根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
其中,进行峰值搜索的高频带信号可以是频域信号,也可以是时域信号。
具体地,在一个实施方式中,所述峰值搜索具体可以根据当前频率区域的功率谱、能量谱或幅度谱中的至少一种进行。
其中,在一个实施方式中,所述根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数包括:根据所述当前频率区域的峰值数量信息、峰值位置信息以及峰值幅度信息中的至少一种,确定所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息;根据所述当前频率区域的音调成分的位置信息,数量信息以及幅度信息确定所述当前频率区域的音调成分的位置数量参数和所述当前频率区域的音调成分的幅度参数或能量参数。
203、对所述高频带编码参数进行码流复用,以得到编码码流。
其中,在一个实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在峰值,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带 不存在峰值,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
其中,在一个实施方式中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应;其中,若所述当前频率区域包括的第一子带存在音调成分,则所述N个比特位中与所述第一子带对应的比特位的值为第一值;或若所述当前频率区域包括的第二子带不存在音调成分,则所述N个比特位中与所述第二子带对应的比特位的值为第二值,所述第一值与所述第二值不同。
在一个实施方式中,所述高频带参数还可以包括所述高频带信号的噪声基底参数。
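In another embodiment of the present invention, the audio encoding method may include the following procedure:
1. Obtain the high-band signal of the audio signal.
2. Determine the high-band parameter based on the high-band signal. Specifically, the following four cases are possible.
Case 1: The high-band parameter includes a position parameter, a quantity parameter, and an amplitude parameter of the tonal components.
Determining the high-band parameter based on the high-band signal may specifically be as follows:
First, obtain the power spectrum of the high-band signal based on the high-band signal.
Next, perform a peak search on the power spectrum of the high-band signal to obtain peak quantity information, peak position information, and peak amplitude information. There are many ways to perform a peak search, and the embodiments of the present invention do not limit the specific way. For example, if the power-spectrum value at the current frequency bin differs greatly from the power-spectrum values at its left and right neighbouring bins, that bin is a peak.
Then, filter the peaks based on at least one of the peak positions, the peak amplitudes, and the number of peaks, and determine the position parameter, the quantity parameter, and the amplitude parameter of the tonal components.
For example, filtering by peak amplitude may use the following preset condition: the peak amplitude is greater than a preset threshold.
Specifically, the number of peaks that meet the preset condition may be used as the quantity parameter of the tonal components.
The corresponding peak positions are used as the position parameter of the tonal components, or the position parameter is determined from the corresponding peak positions. For example, the sub-band index corresponding to each peak position is derived from the peak position and used as the position parameter of the tonal components.
The corresponding peak amplitudes are used as the amplitude parameter of the tonal components, or the amplitude parameter is determined from the corresponding peak amplitudes. The peak amplitude may be represented by the energy of the frequency-domain signal or by the power of the frequency-domain signal. The energy parameter of the tonal components may replace the amplitude parameter as a high-band parameter.
If, during encoding, the high band is divided into K frequency regions (tiles) and each frequency region is further divided into N sub-bands, determining the high-band parameter based on the high-band signal may also be done within each frequency region. K and N are both integers greater than or equal to 1.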
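The Case 1 steps can be sketched as follows for a single frequency region. This is only an illustration under stated assumptions: the text leaves the peak-search rule open, so the neighbour `ratio` test, the `threshold` value, and the function name `find_tonal_components` are all hypothetical, and the sketch does not merge several peaks that fall into the same sub-band.

```python
def find_tonal_components(power, tone_res, threshold, ratio=4.0):
    """Peak search plus amplitude filtering for one frequency region.

    power     -- power-spectrum values of the region, one per frequency bin
    tone_res  -- sub-band width in bins
    threshold -- hypothetical minimum power for a peak to count as tonal
    ratio     -- hypothetical factor by which a peak must exceed both neighbours
    """
    positions, amplitudes = [], []
    for k in range(1, len(power) - 1):
        # A bin is a peak if it differs greatly from its left and right neighbours.
        is_peak = power[k] > ratio * power[k - 1] and power[k] > ratio * power[k + 1]
        # Filtering step: keep only peaks above the preset threshold.
        if is_peak and power[k] > threshold:
            positions.append(k // tone_res)  # sub-band index used as position parameter
            amplitudes.append(power[k])      # peak power used as amplitude parameter
    return len(positions), positions, amplitudes


power = [1.0, 1.2, 40.0, 1.1, 0.9, 1.0, 0.8, 1.3, 1.0, 25.0, 1.2, 1.0]
count, positions, amplitudes = find_tonal_components(power, tone_res=4, threshold=10.0)
```

Here the quantity parameter is simply the number of surviving peaks, which matches the description that the count of peaks meeting the preset condition may be used directly.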
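Case 2: The high-band parameter includes a position-quantity parameter and an amplitude parameter of the tonal components.
During encoding, the high band may be divided into K frequency regions (tiles), and each frequency region is further divided into N sub-bands. The high-band parameter may be determined one frequency region at a time. One frequency region is used as an example here. The method of determining the high-band parameter based on the high-band signal may specifically be as follows:
First, obtain the power spectrum of the high-band signal based on the high-band signal.
Next, perform a peak search on the power spectrum of the high-band signal to obtain peak quantity information, peak position information, and peak amplitude information.
The peak search is performed per frequency region. A peak search on the power spectrum of the high-band signal within one frequency region yields the peak quantity information, peak position information, and peak amplitude information of that frequency region.
Filter the peaks based on at least one of the peak positions, the peak amplitudes, and the number of peaks, and determine the position-quantity parameter and the amplitude parameter of the tonal components.
Filter the peaks based on at least one of the peak positions, the peak amplitudes, and the number of peaks, and determine the position parameter, the quantity parameter, and the amplitude parameter of the tonal components.
The position parameter of the tonal components may be the indexes of the sub-bands in the frequency region that contain a peak. The quantity parameter of the tonal components is the number of sub-bands in the frequency region that contain a peak. The amplitude parameter of the tonal components may equal the peak amplitudes of the sub-bands in the frequency region that contain a peak, or may be calculated from those peak amplitudes. The peak amplitude may be represented by the energy of the frequency-domain signal or by the power of the frequency-domain signal. The energy parameter of the tonal components may replace the amplitude parameter as a high-band parameter.
Determine the position-quantity parameter of the tonal components based on the position parameter of the tonal components.
The position-quantity parameter of the tonal components may be represented by an N-bit sequence, where N is the number of sub-bands in one frequency region. One possibility is that the bits, from least significant to most significant, correspond to sub-band indexes in ascending order. Another possibility is that the bits, from least significant to most significant, correspond to sub-band indexes in descending order. Alternatively, the sub-band index that each bit corresponds to may be specified in advance.
Based on the indexes of the sub-bands in the frequency region that contain a peak, determine for each bit of the N-bit sequence whether its sub-band contains a peak, which yields the N-bit sequence, that is, the position-quantity parameter of the tonal components. If the sub-band index corresponding to a bit equals the index of a sub-band in the frequency region that contains a peak, the bit is set to 1; otherwise it is set to 0.
For example, suppose a frequency region has 5 sub-bands, the position-quantity parameter of the tonal components is represented by a 5-bit sequence, and the binary value of that sequence is 10011. If the bits of the 5-bit sequence, from least significant to most significant, correspond to sub-band indexes in ascending order, this value indicates that sub-bands 0, 1, and 4 of the frequency region contain a peak, that is, the indexes of the sub-bands that contain a peak are 0, 1, and 4.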
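The mapping from sub-band indexes to the N-bit position-quantity parameter can be sketched as below. The function name `pack_position_quantity` and the `low_bit_first` switch are hypothetical; the switch selects between the two bit orders the text allows.

```python
def pack_position_quantity(subbands_with_peak, num_subband, low_bit_first=True):
    """Build the N-bit position-quantity parameter for one frequency region.

    low_bit_first=True  -- least significant bit corresponds to sub-band 0
    low_bit_first=False -- least significant bit corresponds to sub-band N-1
    """
    mask = 0
    for sfb in subbands_with_peak:
        bit = sfb if low_bit_first else num_subband - 1 - sfb
        mask |= 1 << bit  # a bit value of 1 marks a sub-band that contains a peak
    return mask


mask = pack_position_quantity([0, 1, 4], num_subband=5)
bits = format(mask, "05b")  # binary string, most significant bit first
reversed_bits = format(pack_position_quantity([0, 3, 4], 5, low_bit_first=False), "05b")
```

With the ascending order, sub-bands 0, 1, and 4 give the 5-bit value 10011 from the example above; with the descending order the same bit pattern is produced by sub-bands 0, 3, and 4.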
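Case 3: The high-band parameter may further include a noise floor parameter. Case 3 may be combined with Case 1 or Case 2.
Determining the high-band parameter based on the high-band signal further includes:
obtaining a power-spectrum estimate of the noise floor based on the power spectrum of the high-band signal;
obtaining a to-be-encoded noise floor parameter based on the power-spectrum estimate of the noise floor; and
quantizing and encoding the to-be-encoded noise floor parameter to obtain the noise floor parameter.
Case 4: The high-band parameter may further include signal type information. Case 4 may be combined with any of Cases 1 to 3.
Determining the high-band parameter based on the high-band signal further includes: determining the signal type information based on the quantity parameter of the tonal components or the position-quantity parameter of the tonal components. Specifically:
Determine the signal type information based on the quantity parameter of the tonal components. For example, if the value of the quantity parameter is greater than 0, the signal type information indicates the tonal signal type.
Determine the signal type information based on the position-quantity parameter of the tonal components. This may be done by deriving the quantity parameter from the position-quantity parameter and then determining the signal type information from the quantity parameter. Note that if the quantity parameter was already obtained while the position-quantity parameter was being determined, it does not need to be derived again; the signal type information can be determined directly from the quantity parameter.
The signal type information may be represented by a flag that indicates whether tonal components exist. This flag may also be referred to as tonal component indication information.
For example, a flag value of 1 indicates that tonal components exist.
If encoding is performed per frequency region, the signal type information is also determined per frequency region. The signal type information may be represented by a flag that indicates whether tonal components exist in the frequency region. For example, a flag value of 1 indicates that tonal components exist in that frequency region.
3. Perform bitstream multiplexing on the high-band parameter to obtain the encoded bitstream.
Special handling for Case 4: if the signal type information indicates the tonal signal type, the signal type information and the high-band parameters other than the signal type information are written into the bitstream; otherwise, only the signal type information is written into the bitstream. If encoding is performed per frequency region, the frequency regions are processed in turn: if the signal type information of a frequency region indicates the tonal signal type, the signal type information and the high-band parameters other than the signal type information are written into the bitstream; otherwise, only the signal type information is written into the bitstream.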
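The per-region multiplexing rule for Case 4 can be sketched as follows. The bit widths, the field order (flag, then position-quantity bits, then amplitude parameters), and the helper names `write_bits` and `write_tile` are assumptions made for illustration; the text only states which fields are written in which situation.

```python
def write_bits(stream, value, nbits):
    """Append `value` to `stream` (a list of 0/1) as `nbits` bits, MSB first."""
    for shift in range(nbits - 1, -1, -1):
        stream.append((value >> shift) & 1)


def write_tile(stream, mask, amps_q, num_subband, amp_bits=4):
    """Write one frequency region; amp_bits is a hypothetical preset bit count."""
    has_tone = 1 if mask != 0 else 0
    write_bits(stream, has_tone, 1)            # signal type information (flag)
    if has_tone:                               # other parameters only for tonal regions
        write_bits(stream, mask, num_subband)  # position-quantity parameter
        for q in amps_q:
            write_bits(stream, q, amp_bits)    # quantized amplitude parameters


stream = []
write_tile(stream, 0b10011, [12, 9, 15], num_subband=5)  # tonal region
write_tile(stream, 0, [], num_subband=5)                 # region without tones
```

A region without tonal components costs a single bit in this layout, which is the saving the flag is meant to provide.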
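As can be seen from the above, in the embodiments of the present invention the audio encoder encodes the position, the quantity, and the amplitude or energy of the tonal components in the high-band signal, so that the audio decoder can restore the tonal components based on that position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.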
图3描述了本发明一个实施例提供的音频解码方法的流程,包括:
301、获取编码码流。
302、对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数,所述高频带参数用于表示所述当前帧的高频带信号包括的音调成分的位置、数量以及幅度或能量。
具体地,所述高频带参数包括音调成分的位置数量参数、以及所述音调成分的幅度参数或能量参数。其中,位置数量参数表示由同一个参数表示音调成分的位置和音调成分的数量。在另一种实施方式中,高频带参数包括音调成分的位置参数、音调成分的数量参数以及所述音调成分的幅度参数或能量参数;在这种情况下,音调成分的位置和数量采用不同的参数表示。
在一个实施方式中,所述高频带信号对应的高频带包括至少一个频率区域,一个所述频率区域包括至少一个子带;相应地,所述高频带参数包括所述当前帧的高频信号的音调成分的位置数量参数包括所述至少一个频率区域各自的音调成分的位置数量参数,所述当前帧的高频信号的音调成分的幅度参数或能量参数包括所述至少一个频率区域各自的音调成分的幅度参数或能量参数。
在一个实施方式中,所述对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数包括:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数从所述编码码流中解析所述当前频 率区域的音调成分的幅度参数或能量参数。
在一个实施方式中,所述根据所述当前频率区域的音调成分的位置数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数,包括:根据所述当前频率区域的音调成分的位置数量参数,确定所述当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数,从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
在一个实施方式中,所述对所述编码码流进行码流解复用,以得到音频信号的当前帧的高频带参数包括:获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数;根据所述当前频率区域的音调成分的位置数量参数,确定当前频率区域的音调成分的位置参数和当前频率区域的音调成分的数量参数;根据所述当前频率区域的音调成分的数量参数从所述编码码流中解析所述当前频率区域的音调成分的幅度参数或能量参数。
在一个实施方式中,所述获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数前包括:获取所述当前频率区域的音调成分指示信息;所述音调成分指示信息用于指示所述当前频率区域内是否包括音调成分;当所述当前频率区域内包括音调成分时,获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数。从而可以仅对包括了音调成分的频率区域进行音调成分的参数的解码,提高解码效率。
在一个实施方式中,所述根据所述高频带参数获得所述当前帧的重建高频带信号包括:根据所述当前频率区域的音调成分的位置数量参数确定所述当前频率区域中音调成分的位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
具体地,所述根据所述当前频率区域的高频信号的音调成分的位置数量参数确定所述当前频率区域中音调成分的位置可以包括:根据所述当前频率区域的高频信号的音调成分的位置数量参数,确定所述当前频率区域的音调成分的位置参数;根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置。
303、根据所述高频带参数获得所述当前帧的重建高频带信号。
在一个实施方式中,所述根据所述高频带参数获得所述当前帧的重建高频带信号具体可以包括:根据所述当前频率区域的音调成分的位置参数,确定所述当前频率区域中音调成分位置;根据所述当前频率区域的音调成分的幅度参数或能量参数确定所述音调成分的位置对应的幅度或能量;根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度或能量获得所述重建高频带信号。
具体地,所述根据所述当前频率区域中音调成分的位置和所述音调成分的位置对应的幅度获得所述重建高频带信号可以采用如下方式进行:
根据如下计算式确定音调成分的位置的频域信号:
pSpectralData[tone_pos]=tone_val
其中,pSpectralData表示所述当前频率区域的重建高频带频域信号,tone_val表示所述当前频率区域内音调成分的位置对应的幅度值,tone_pos表示所述当前频率区域内 音调成分的位置。
304、根据所述当前帧的重建高频带信号获得所述当前帧的音频输出信号。
在一个实施例中,所述当前频率区域的音调成分的位置数量参数包括N个比特位,相应地,所述获取所述至少一个频率区域的当前频率区域的音调成分的位置数量参数包括:根据所述当前频率区域包括的子带数量从所述编码码流中读取N个比特位,所述N个比特位为所述当前频率区域的音调成分的位置数量参数,其中,N为所述当前频率区域包括的子带数量,所述N个比特位与所述当前频率区域包括的子带一一对应。
在一个实施方式中,所述当前频率区域的音调成分的位置参数用于指示所述当前频率区域中包括音调成分的子带的序号。
在一个实施方式中,所述当前频率区域中音调成分位置位于所述当前频率区域中音调成分所在子带的指定位置。例如,所述子带的指定位置可以为子带的中心位置,或子带的起始位置,或子带的结束位置。
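Another embodiment of the present invention provides an audio decoding method, including the following procedure:
1. Obtain the encoded bitstream.
2. Obtain the high-band parameter based on the encoded bitstream.
The high band may be divided into K frequency regions (tiles), and each frequency region is further divided into N sub-bands. The high-band parameter may be determined one frequency region at a time. The following uses the method of obtaining the high-band parameter from the encoded bitstream within one frequency region as an example. The method used may be the same or different for different frequency regions.
In Case 1, the high-band parameter may be obtained through the following procedure:
Parse the bitstream and determine the position parameter, the quantity parameter, and the amplitude parameter of the tonal components.
Parse the bitstream and determine the quantity parameter of the tonal components.
Based on the quantity parameter of the tonal components, parse the bitstream and determine the position parameter of the tonal components.
Based on the quantity parameter of the tonal components, parse the bitstream and determine the amplitude parameter of the tonal components.
In Case 2, the high-band parameter may be obtained through the following procedure:
Parse the bitstream and determine the position-quantity parameter of the tonal components.
The position-quantity parameter of the tonal components represents both the position information and the quantity information of the tonal components. The decoder side parses the bitstream and first obtains the position-quantity parameter. The position-quantity parameter may be represented by an N-bit sequence, where N is the number of sub-bands in one frequency region.
Specifically, first determine the number of sub-bands in the frequency region, num_subband, based on the frequency-domain resolution; then read num_subband bits from the bitstream. These bits are the position-quantity parameter of the tonal components.
The frequency-domain resolution tone_res[p] may be preset, or may be parsed from the obtained encoded bitstream. Assuming the bandwidth of the p-th frequency region is tile_width[p], the number of sub-bands in the frequency region may be
num_subband=tile_width[p]/tone_res[p]
For example, if the number of sub-bands in the frequency region is 5, 5 bits are read from the bitstream, and the binary representation of the resulting position-quantity parameter is 10011.
The number of sub-bands in the frequency region, num_subband, may also be preset. In that case num_subband bits can be read from the bitstream directly, and these bits are the position-quantity parameter of the tonal components.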
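Reading the position-quantity parameter can be sketched as below. The `BitReader` class is a hypothetical helper, the MSB-first read order is an assumption, and the sketch assumes tile_width[p] is an exact multiple of tone_res[p] so that integer division gives the sub-band count.

```python
class BitReader:
    """Minimal MSB-first reader over a list of 0/1 values (hypothetical helper)."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value


tile_width = [20]  # width of frequency region p, in frequency bins
tone_res = [4]     # frequency-domain resolution (sub-band width) of region p
p = 0

num_subband = tile_width[p] // tone_res[p]
reader = BitReader([1, 0, 0, 1, 1, 0, 1])
pos_cnt = reader.read(num_subband)  # position-quantity parameter of region p
```

After the call, the reader is positioned at the first bit that follows the position-quantity parameter, which is where the amplitude parameters would start in the layout assumed here.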
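Parse the bitstream and determine the amplitude parameter of the tonal components.
First, obtain the quantity parameter of the tonal components from the position-quantity parameter of the tonal components.
Specifically, based on the position-quantity parameter, determine the number of sub-bands in the frequency region that contain a tonal component, that is, the quantity parameter tone_cnt[p]. The number of sub-bands in the frequency region that contain a tonal component equals the number of bits set to 1 in the binary representation of the position-quantity parameter.
For example, suppose the binary representation of the position-quantity parameter is 10011. The number of sub-bands in the frequency region that contain a tonal component is then 3, that is, the quantity parameter tone_cnt[p]=3.
Alternatively, 0 may be used to indicate that a sub-band contains a tonal component. In that case, when the binary representation of the position-quantity parameter is 10011, the number of sub-bands in the frequency region that contain a tonal component is 2, that is, the quantity parameter tone_cnt[p]=2.
Then, parse the bitstream based on the quantity parameter of the tonal components and determine the amplitude parameter of the tonal components.
Specifically, the amplitude parameters of the tonal components may be parsed from the bitstream one after another using a preset number of bits each; the number of amplitude parameters equals the quantity parameter of the tonal components. The amplitude parameters are tone_val_q[p][i], i=0,…,tone_cnt[p]-1.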
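Deriving the quantity parameter and then parsing that many amplitude parameters can be sketched as follows. The function names, the `tone_bit` switch, and the 4-bit amplitude width are hypothetical; the text only says the bit count is preset.

```python
def count_tones(pos_cnt, num_subband, tone_bit=1):
    """Number of sub-bands marked as tonal in the position-quantity parameter."""
    ones = bin(pos_cnt).count("1")
    # tone_bit selects which bit value means "tonal component present".
    return ones if tone_bit == 1 else num_subband - ones


def parse_amplitudes(bits, pos, tone_cnt, amp_bits=4):
    """Read tone_cnt amplitude parameters of amp_bits bits each, MSB first."""
    tone_val_q = []
    for _ in range(tone_cnt):
        value = 0
        for b in bits[pos:pos + amp_bits]:
            value = (value << 1) | b
        tone_val_q.append(value)
        pos += amp_bits
    return tone_val_q, pos


tone_cnt = count_tones(0b10011, num_subband=5)
alt_cnt = count_tones(0b10011, num_subband=5, tone_bit=0)
payload = [1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
tone_val_q, pos = parse_amplitudes(payload, 0, tone_cnt)
```

The two counts reproduce the examples above: 3 when a 1 marks a tonal sub-band, and 2 when a 0 does.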
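Case 3: The high-band parameter may further include a noise floor parameter. Obtaining the high-band parameter based on the encoded bitstream further includes parsing the bitstream and determining the noise floor parameter. Specifically, the noise floor parameter noise_floor[p] may be parsed from the bitstream using a preset number of bits.
Case 4: The high-band parameter further includes signal type information. Obtaining the high-band parameter based on the encoded bitstream further includes parsing the bitstream and determining the signal type information.
Obtaining the high-band parameter based on the encoded bitstream may specifically be as follows:
Parse the bitstream and determine the signal type information.
The signal type information may be a flag that indicates whether tonal components exist in the frequency region, and may also be referred to as tonal component indication information.
Based on the signal type information, determine whether the high-band parameters other than the signal type information need to be decoded.
If the flag that indicates whether tonal components exist in the frequency region has the value 1, that is, the signal type information indicates the tonal signal type, continue parsing the bitstream.
Parse the bitstream and determine the high-band parameters other than the signal type information.
The method of parsing the bitstream and determining the high-band parameters other than the signal type information may be any one of Case 1, Case 2, and Case 3 on the decoder side.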
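The flag-driven parsing of Case 4, combined with Case 2, can be sketched per frequency region as follows. The field layout (1-bit flag, N-bit position-quantity parameter, then fixed-width amplitude parameters) and the name `parse_tiles` are assumptions for illustration only.

```python
def parse_tiles(bits, num_tiles, num_subband, amp_bits=4):
    """Parse num_tiles frequency regions from a list of 0/1 values."""
    pos = 0

    def read(nbits):
        nonlocal pos
        value = 0
        for b in bits[pos:pos + nbits]:
            value = (value << 1) | b
        pos += nbits
        return value

    tiles = []
    for _ in range(num_tiles):
        if read(1) == 0:          # signal type information: no tonal components
            tiles.append(None)    # nothing else was written for this region
            continue
        mask = read(num_subband)  # position-quantity parameter
        tone_cnt = bin(mask).count("1")
        tiles.append((mask, [read(amp_bits) for _ in range(tone_cnt)]))
    return tiles, pos


bits = [1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
tiles, consumed = parse_tiles(bits, num_tiles=2, num_subband=5)
```

The first region carries three tonal components; the second is skipped after its single flag bit.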
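3. Obtain the reconstructed high-band signal based on the high-band parameter.
The high band may be divided into K frequency regions (tiles), and each frequency region is further divided into N sub-bands. The high-band signal may be reconstructed one frequency region at a time. The following uses the method of obtaining the reconstructed high-band signal from the high-band parameter within one frequency region as an example. The method used may be the same or different for different frequency regions. The reconstructed high-band signal is obtained from the high-band signals reconstructed in the individual frequency regions. The high-band signal may be a frequency-domain signal or a time-domain signal.
For Case 1: reconstruct the high-band signal based on the quantity parameter, the position parameter, and the amplitude parameter of the tonal components.
For example, the position parameter of the tonal components represents the sub-band index corresponding to the position of each tonal component, and the quantity parameter represents the number of tonal components. The high-band signal of the current frame is reconstructed based on the quantity parameter, the position parameter, and the amplitude parameter of the tonal components.
Specifically, this may be:
tone_pos=tile[p]+(sfb+0.5)*tone_res[p]
tone_val=pow(2.0,0.25*tone_val_q[p][tone_idx]-4.0)
pSpectralData[tone_pos]=tone_val
Here, tile[p] is the starting frequency bin of the p-th frequency region, sfb is the position parameter of the tonal component (that is, the sub-band index corresponding to the position of the tonal component), tone_res[p] is the frequency-domain resolution of the sub-bands, and tone_pos is the position of the tone_idx-th tonal component in the p-th frequency region. tone_val_q[p][tone_idx] is the amplitude parameter of the tone_idx-th tonal component in the p-th frequency region, and tone_val is the amplitude value of the tone_idx-th tonal component in the p-th frequency region. pSpectralData[tone_pos] is the frequency-domain signal at the tonal component position tone_pos. tone_idx takes values in [0,tone_cnt[p]-1], where tone_cnt[p] is the quantity parameter of the tonal components.
Within the high band, if a frequency bin index does not equal a tonal component position tone_pos, the frequency-domain signal at that bin may simply be set to 0. The present invention does not limit the reconstruction method for the other bins that contain no tonal component.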
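The three formulas above can be sketched for one frequency region as follows. The function name `reconstruct_tile` is hypothetical, and truncating the position with `int()` is an assumption for the case where (sfb+0.5)*tone_res[p] is not an integer; the text does not say how a fractional position is rounded.

```python
def reconstruct_tile(spectrum, tile_start, tone_res, subbands, tone_val_q):
    """Place dequantized tonal components into a zero-initialised spectrum.

    spectrum   -- list of frequency-domain values (pSpectralData)
    tile_start -- starting frequency bin of the region (tile[p])
    tone_res   -- sub-band width in bins (tone_res[p])
    subbands   -- sub-band index (sfb) of each tonal component
    tone_val_q -- quantized amplitude parameter of each tonal component
    """
    for tone_idx, sfb in enumerate(subbands):
        tone_pos = int(tile_start + (sfb + 0.5) * tone_res)   # centre of the sub-band
        tone_val = pow(2.0, 0.25 * tone_val_q[tone_idx] - 4.0)
        spectrum[tone_pos] = tone_val
    return spectrum


spectrum = [0.0] * 40  # bins without a tonal component stay at 0
reconstruct_tile(spectrum, tile_start=20, tone_res=4,
                 subbands=[0, 1, 4], tone_val_q=[16, 20, 12])
```

Starting from an all-zero spectrum implements the option of setting every non-tonal bin to 0; other fill strategies are equally permitted by the text.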
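For Case 2: reconstruct the high-band signal of the current frame based on the position-quantity parameter and the amplitude parameter of the tonal components.
(1) Determine the position parameter of the tonal components based on the position-quantity parameter of the tonal components.
The position-quantity parameter of the tonal components may be represented by an N-bit sequence, where N is the number of sub-bands in one frequency region. Specifically, shift operations may be applied to the position-quantity parameter to determine the indexes of the sub-bands in the frequency region that contain a tonal component and the number of such sub-bands. The indexes of the sub-bands that contain a tonal component are the position parameter of the tonal components. The number of sub-bands that contain a tonal component is the quantity parameter of the tonal components.
One possibility is that the bits, from least significant to most significant, correspond to sub-band indexes in ascending order. For example, if the frequency region has 5 sub-bands, the least significant bit of the 5-bit sequence corresponds to sub-band 0 and the most significant bit corresponds to sub-band 4. In this case, if the binary representation of the position-quantity parameter is 10011, the indexes of the sub-bands in the frequency region that contain a tonal component are 0, 1, and 4.
Another possibility is that the bits, from least significant to most significant, correspond to sub-band indexes in descending order. For example, if the frequency region has 5 sub-bands, the least significant bit of the 5-bit sequence corresponds to sub-band 4 and the most significant bit corresponds to sub-band 0. In this case, if the binary representation of the position-quantity parameter is 10011, the indexes of the sub-bands in the frequency region that contain a tonal component are 0, 3, and 4.
Alternatively, the sub-band index that each bit corresponds to may be specified in advance. This is not limited in the present invention.
The quantity parameter of the tonal components can be obtained at the same time as the position parameter is determined from the position-quantity parameter. The number of indexes of sub-bands in the frequency region that contain a tonal component is the quantity parameter of the tonal components.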
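The shift-based unpacking for both bit orders can be sketched as below. The function name `unpack_position_quantity` and the `low_bit_first` switch are hypothetical, and the sketch assumes a bit value of 1 marks a tonal sub-band.

```python
def unpack_position_quantity(pos_cnt, num_subband, low_bit_first=True):
    """Return (sub-band indexes, count) encoded in the N-bit parameter."""
    subbands = []
    for bit in range(num_subband):
        if (pos_cnt >> bit) & 1:  # shift the parameter and test the lowest bit
            subbands.append(bit if low_bit_first else num_subband - 1 - bit)
    subbands.sort()
    return subbands, len(subbands)


ascending, cnt_a = unpack_position_quantity(0b10011, 5, low_bit_first=True)
descending, cnt_d = unpack_position_quantity(0b10011, 5, low_bit_first=False)
```

The position parameter and the quantity parameter fall out of the same loop, which is why no separate count needs to be transmitted in Case 2.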
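(2) Reconstruct the high-band signal based on the position parameter and the amplitude parameter of the tonal components.
Calculate the position of the tonal component.
Specifically, the position of the tonal component may be calculated from the position parameter of the tonal component:
tone_pos=tile[p]+(sfb+0.5)*tone_res[p]
Here, tile[p] is the starting frequency bin of the p-th frequency region, sfb is the index of a sub-band in the frequency region that contains a tonal component, and tone_res[p] is the frequency-domain resolution of the p-th frequency region. The index of a sub-band in the frequency region that contains a tonal component is the position parameter of the tonal component. The value 0.5 means that, in a sub-band that contains a tonal component, the tonal component is placed at the centre of the sub-band. The reconstructed tonal component may of course also be placed at another position in the sub-band.
Calculate the amplitude of the tonal component.
Specifically, the amplitude of the tonal component may be calculated from the amplitude parameter of the tonal component.
Specifically, this may be:
tone_val=pow(2.0,0.25*tone_val_q[p][tone_idx]-4.0)
Here, tone_val_q[p][tone_idx] is the amplitude parameter corresponding to the tone_idx-th position parameter in the p-th frequency region, and tone_val is the amplitude value of the frequency bin corresponding to the tone_idx-th position parameter in the p-th frequency region.
tone_idx takes values in [0,tone_cnt[p]-1], where tone_cnt[p] is the quantity parameter of the tonal components.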
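As a worked example of the amplitude formula, write q for tone_val_q[p][tone_idx]; the sample values of q below are chosen purely for illustration and do not come from the text.

```latex
\[
\text{tone\_val} = 2^{\,0.25\,q - 4}, \qquad q = \text{tone\_val\_q}[p][\text{tone\_idx}]
\]
\[
q = 16 \;\Rightarrow\; 2^{0} = 1, \qquad
q = 20 \;\Rightarrow\; 2^{1} = 2, \qquad
q = 12 \;\Rightarrow\; 2^{-1} = 0.5
\]
\[
\frac{\text{tone\_val}(q+1)}{\text{tone\_val}(q)} = 2^{0.25} \approx 1.189
\]
```

Each step of 1 in the quantized parameter therefore scales the reconstructed value by a constant factor of about 1.189, so four steps double it.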
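Reconstruct the high-band signal based on the position and the amplitude of the tonal component.
The frequency-domain signal at the tonal component position tone_pos satisfies:
pSpectralData[tone_pos]=tone_val
Here, pSpectralData[tone_pos] is the frequency-domain signal at the tonal component position tone_pos, tone_val is the amplitude value of the frequency bin corresponding to the tone_idx-th position parameter in the p-th frequency region, and tone_pos is the position of the tonal component corresponding to the tone_idx-th position parameter in the p-th frequency region.
Within the high band, if a frequency bin index does not equal a tonal component position tone_pos, the frequency-domain signal at that bin may simply be set to 0. The present invention does not limit the reconstruction method for the other bins that contain no tonal component.
4. Obtain the audio signal of the current frame based on the reconstructed high-band signal.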
本发明第三个实施例提供了一个音频解码方法,包括如下流程:
1、获取编码码流。
2、根据编码码流,得到高频带参数。
其中,高频带可以划分成K个频率区域(tile),每一个频率区域内又划分为N个子带。高频带参数的确定,可以以频率区域为单位进行。下面均以一个频率区域内根据编码码流得到高频带参数的方法为例。
(1)解析码流,确定音调成分的位置数量参数。
音调成分的位置数量参数表征了音调成分的位置信息和音调成分的数量信息。解码侧解析码流,先获得音调成分的位置数量参数。音调成分的位置数量参数可以由N位比特序列表示,N为一个频率区域内的子带个数。
具体地,先根据频域分辨率确定频率区域内的子带个数num_subband;然后,根据频率区域内的子带个数num_subband,从码流中读取num_subband个比特位,即为音调成分的位置数量参数。
其中,频域分辨率tone_res[p]可以是预先设定的,也可以是从获得的编码码流中解析得到的。假设第p个频率区域的频带宽度为tile_width[p],则频率区域内的子带个数,可以是
num_subband=tile_width[p]/tone_res[p]
例如,频率区域内的子带个数为5,从码流中读取5个比特位,得到的音调成分的位置数量参数的二进制表示为10011。
频率区域内的子带个数num_subband还可以是预设的,可以直接根据频率区域内的子带个数num_subband,从码流中读取num_subband个比特位,即为音调成分的位置数量参数。
(2)根据音调成分的位置数量参数,确定音调成分的位置参数和音调成分的数量参数。
音调成分的位置数量参数可以由N位比特序列表示,N为一个频率区域内的子带个数。具体地,可以是对音调成分的位置数量参数进行移位操作,以确定频率区域内存在音调成分的子带序号以及存在音调成分的子带数量。频率区域内存在音调成分的子带序号即为音调成分的位置参数。频率区域内存在音调成分的子带数量即为音调成分的数量参数。
一种可能的情况是:比特序列由低位到高位分别表示子带的序号从小到大。例如,频率区域内的子带个数为5,5位比特序列的最低比特位对应子带的序号为0,5位比特序列的最高比特位对应子带的序号为4。这种情况下,如果音调成分的位置数量参数的二进制表示为10011,频率区域内存在音调成分的子带序号分别为0、1、4。
另一种可能的情况是:比特序列由低位到高位分别表示子带的序号从大到小。例如,频率区域内的子带个数为5,5位比特序列的最低比特位对应子带的序号为4,5位比特序列的最高比特位对应子带的序号为0。这种情况下,如果音调成分的位置数量参数的二进制表示为10011,频率区域内存在音调成分的子带序号分别为0、3、4。
除此之外,比特序列的每一位所对应的子带的序号还可以是预先规定的,本发明不做限定。
根据音调成分的位置数量参数确定音调成分的位置参数的同时,可以获得音调成分的数量参数。频率区域内存在音调成分的子带序号的个数即音调成分的数量参数。
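Specifically, based on the position-quantity parameter, determine the number of sub-bands in the frequency region that contain a tonal component, that is, the quantity parameter tone_cnt[p]. The number of sub-bands in the frequency region that contain a tonal component equals the number of bits set to 1 in the binary representation of the position-quantity parameter.
For example, suppose the binary representation of the position-quantity parameter is 10011. The number of sub-bands in the frequency region that contain a tonal component is then 3, that is, the quantity parameter tone_cnt[p]=3.
Alternatively, 0 may be used to indicate that a sub-band contains a tonal component. In that case, when the binary representation of the position-quantity parameter is 10011, the number of sub-bands in the frequency region that contain a tonal component is 2, that is, the quantity parameter tone_cnt[p]=2.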
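(3) Parse the bitstream according to the quantity parameter of the tonal components to determine the amplitude parameters of the tonal components.
Specifically, the amplitude parameters of the tonal components are parsed from the bitstream one after another, each using a preset number of bits; the number of amplitude parameters equals the quantity parameter of the tonal components. The amplitude parameters are tone_val_q[p][i], i=0,…,tone_cnt[p]-1.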
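A minimal sketch of sub-step (3) follows. The text only says that each amplitude parameter uses a preset number of bits; the width of 4 bits, the representation of the bitstream as a string of '0'/'1' characters, and the function name are assumptions made for illustration.

```python
from typing import List, Tuple


def parse_amplitudes(bits: str, pos: int, tone_cnt: int, bits_per_amp: int = 4) -> Tuple[List[int], int]:
    """Parse tone_cnt fixed-width amplitude parameters starting at bit offset pos.

    Returns the parsed values tone_val_q[p][0..tone_cnt-1] and the new bit offset.
    """
    amps = []
    for _ in range(tone_cnt):
        amps.append(int(bits[pos:pos + bits_per_amp], 2))
        pos += bits_per_amp
    return amps, pos


# Example payload: 5-bit position-quantity parameter 10011 (three tonal subbands),
# followed by three 4-bit amplitude parameters.
payload = "10011" + "0101" + "1111" + "0000"
tone_val_q, next_pos = parse_amplitudes(payload, pos=5, tone_cnt=3)
```

The quantity parameter is what tells the parser how many fields to read, which is why it must be derived before this step.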
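3. Obtain the reconstructed high-band signal according to the high-band parameters.
The high band may be divided into K frequency regions (tiles), and each frequency region is further divided into N subbands. The high-band signal may be reconstructed region by region. The following describes, for a single frequency region, how the reconstructed high-band signal is obtained from the high-band parameters. The reconstructed high-band signal as a whole is obtained from the signals reconstructed in the individual frequency regions. The high-band signal may be a frequency-domain signal or a time-domain signal.
Specifically, the high-band signal of the current frame may be reconstructed according to the position parameter, the quantity parameter, and the amplitude parameters of the tonal components. The quantity parameter represents the number of tonal components. A tonal component at one position may be reconstructed as follows:
(1) Calculate the position of the tonal component.
Specifically, the position of the tonal component is calculated from the position parameter of the tonal component:
tone_pos=tile[p]+(sfb+0.5)*tone_res[p]
Here, tile[p] is the starting frequency bin of the p-th frequency region, sfb is the index of a subband in the frequency region that contains a tonal component, and tone_res[p] is the frequency-domain resolution of the p-th frequency region. The index of a subband that contains a tonal component is the position parameter of the tonal component. The offset 0.5 places the tonal component at the center of the subband that contains it. The reconstructed tonal component may also be placed at another position within the subband.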
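The position formula can be sketched as below. Truncating the result to an integer bin index is an assumption (the text does not say how a fractional result is handled), and the `offset` parameter is a hypothetical generalization of the constant 0.5.

```python
def tone_position(tile_start: int, sfb: int, tone_res: int, offset: float = 0.5) -> int:
    """Return tone_pos = tile[p] + (sfb + offset) * tone_res[p] as an integer bin index."""
    # offset = 0.5 places the tone at the center of subband sfb;
    # int() truncation toward zero is assumed for odd resolutions.
    return int(tile_start + (sfb + offset) * tone_res)


# Region starting at bin 320 with 4 bins per subband.
positions = [tone_position(320, sfb, 4) for sfb in (0, 1, 4)]
```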
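(2) Calculate the amplitude of the tonal component.
Specifically, the amplitude of the tonal component is calculated from the amplitude parameter of the tonal component, for example:
tone_val=pow(2.0,0.25*tone_val_q[p][tone_idx]-4.0)
Here, tone_val_q[p][tone_idx] is the amplitude parameter corresponding to the tone_idx-th position parameter in the p-th frequency region, and tone_val is the amplitude value of the frequency bin corresponding to the tone_idx-th position parameter in the p-th frequency region.
tone_idx takes values in [0, tone_cnt[p]-1], where tone_cnt[p] is the number of tonal components.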
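The amplitude mapping is a base-2 exponential in which each step of the quantized parameter scales the amplitude by 2^0.25 (about 1.189). A direct transcription, with a hypothetical function name:

```python
import math


def dequantize_amplitude(tone_val_q: int) -> float:
    """Return tone_val = pow(2.0, 0.25 * tone_val_q - 4.0)."""
    return math.pow(2.0, 0.25 * tone_val_q - 4.0)


# Four steps of the quantized parameter double the amplitude.
unit = dequantize_amplitude(16)
doubled = dequantize_amplitude(20)
smallest = dequantize_amplitude(0)
```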
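(3) Reconstruct the high-band signal according to the position and amplitude of the tonal component.
The frequency-domain signal at the tonal-component position tone_pos satisfies:
pSpectralData[tone_pos]=tone_val
Here, pSpectralData[tone_pos] is the frequency-domain signal at the tonal-component position tone_pos, tone_val is the amplitude value of the frequency bin corresponding to the tone_idx-th position parameter in the p-th frequency region, and tone_pos is the position of the tonal component corresponding to the tone_idx-th position parameter in the p-th frequency region.
Within the high band, if the index of a frequency bin is not equal to any tonal-component position tone_pos, the frequency-domain signal at that bin may simply be set to 0. The present invention does not limit how the other bins, which contain no tonal component, are reconstructed.
4. Obtain the audio signal of the current frame according to the reconstructed high-band signal.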
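The three reconstruction sub-steps of step 3 can be combined into one short sketch for a single frequency region. It is only an illustration under stated assumptions: the spectrum is a plain Python list initialized to zero (the simplest option the text allows for non-tonal bins), positions are truncated to integer bin indices, and the function name is hypothetical.

```python
import math
from typing import List


def reconstruct_tile(spectrum: List[float], tile_start: int, tone_res: int,
                     subbands: List[int], tone_val_q: List[int]) -> List[float]:
    """Write one tonal component per flagged subband into the spectrum, in place."""
    for tone_idx, sfb in enumerate(subbands):
        # (1) position: center of the subband that contains the tone
        tone_pos = int(tile_start + (sfb + 0.5) * tone_res)
        # (2) amplitude: inverse of the logarithmic quantization
        tone_val = math.pow(2.0, 0.25 * tone_val_q[tone_idx] - 4.0)
        # (3) pSpectralData[tone_pos] = tone_val
        spectrum[tone_pos] = tone_val
    return spectrum


# Region starting at bin 20, 4 bins per subband, tones in subbands 0, 1 and 4.
spectrum = [0.0] * 40
reconstruct_tile(spectrum, tile_start=20, tone_res=4,
                 subbands=[0, 1, 4], tone_val_q=[16, 20, 0])
nonzero_bins = [k for k, v in enumerate(spectrum) if v != 0.0]
```

Repeating the call for every frequency region that carries tonal components fills in the high band; converting the spectrum to the time domain for step 4 is outside the scope of this sketch.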
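As can be seen from the above, in the embodiments of the present invention the audio encoder encodes the position, quantity, and amplitude or energy of the tonal components in the high-band signal, so that the audio decoder can restore the tonal components from that position, quantity, and amplitude or energy. The position and energy of the restored tonal components are therefore more accurate, which improves the quality of the decoded signal.
FIG. 6 shows the structure of an audio encoder provided by an embodiment of the present invention, which includes:
a signal obtaining unit 601, configured to obtain a current frame of an audio signal, where the current frame includes a high-band signal;
a parameter obtaining unit 602, configured to obtain a high-band parameter of the current frame according to the high-band signal, where the high-band parameter is used to represent the position, quantity, and amplitude or energy of the tonal components included in the high-band signal; and
an encoding unit 603, configured to perform bitstream multiplexing on the high-band parameter to obtain an encoded bitstream.
In one implementation, the audio encoder may further include a determining unit, configured to determine whether the current frequency region includes a tonal component. The parameter obtaining unit is specifically configured to: when the current frequency region includes a tonal component, determine, according to the high-band signal of the current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
For the specific implementation of the audio encoder, refer to the foregoing audio encoding method; details are not repeated here.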
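On the encoder side, the position-quantity parameter can be built by setting one bit per tonal subband. The sketch below is hypothetical: it assumes that a bit value of 1 marks a tonal subband and that the least significant bit corresponds to subband 0, which is only one of the conventions the description permits.

```python
from typing import Iterable


def pack_position_quantity(tonal_subbands: Iterable[int], num_subband: int) -> int:
    """Build the N-bit position-quantity parameter from the tonal subband indices."""
    mask = 0
    for sfb in tonal_subbands:
        if not 0 <= sfb < num_subband:
            raise ValueError(f"subband index {sfb} out of range")
        mask |= 1 << sfb  # bit sfb set to 1: subband sfb contains a tonal component
    return mask


mask = pack_position_quantity([0, 1, 4], num_subband=5)
bit_string = format(mask, "05b")
```

The single bit field carries both where the tones are and how many there are, so no separate count needs to be written to the bitstream.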
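FIG. 7 shows the structure of an audio decoder provided by an embodiment of the present invention, which includes:
a receiving unit 701, configured to obtain an encoded bitstream;
a demultiplexing unit 702, configured to perform bitstream demultiplexing on the encoded bitstream to obtain a high-band parameter of a current frame of an audio signal, where the high-band parameter is used to represent the position, quantity, and amplitude or energy of the tonal components included in the high-band signal of the current frame; and
a reconstruction unit 703, configured to obtain a reconstructed high-band signal of the current frame according to the high-band parameter, and to obtain an audio output signal of the current frame according to the reconstructed high-band signal of the current frame.
For the specific implementation of the audio decoder, refer to the foregoing audio decoding method; details are not repeated here.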
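It should be noted that the information exchange between, and the execution processes of, the modules/units of the foregoing apparatuses are based on the same concept as the method embodiments of this application and bring the same technical effects. For details, refer to the descriptions in the foregoing method embodiments of this application; they are not repeated here.
An embodiment of this application further provides a computer storage medium. The computer storage medium stores a program, and the program performs some or all of the steps described in the foregoing method embodiments.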
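The following describes another audio encoding device provided by an embodiment of this application. As shown in FIG. 8, the audio encoding device 800 includes:
a receiver 801, a transmitter 802, a processor 803, and a memory 804 (the audio encoding device 800 may have one or more processors 803; FIG. 8 uses one processor as an example). In some embodiments of this application, the receiver 801, the transmitter 802, the processor 803, and the memory 804 may be connected by a bus or in another manner; FIG. 8 uses a bus connection as an example.
The memory 804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 803. A part of the memory 804 may further include a non-volatile random access memory (NVRAM). The memory 804 stores an operating system and operation instructions, executable modules, or data structures, or a subset or an extended set thereof. The operation instructions may include various operation instructions used to implement various operations. The operating system may include various system programs used to implement basic services and handle hardware-based tasks.
The processor 803 controls the operation of the audio encoding device; the processor 803 may also be referred to as a central processing unit (CPU). In a specific application, the components of the audio encoding device are coupled together through a bus system. In addition to a data bus, the bus system may include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all referred to as the bus system in the figure.
The methods disclosed in the foregoing embodiments of this application may be applied to the processor 803 or implemented by the processor 803. The processor 803 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 803 or by instructions in the form of software. The processor 803 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 804; the processor 803 reads the information in the memory 804 and completes the steps of the foregoing methods in combination with its hardware.
The receiver 801 may be configured to receive input digit or character information and to generate signal input related to the settings and function control of the audio encoding device. The transmitter 802 may include a display device such as a display screen, and may be configured to output digit or character information through an external interface.
In this embodiment of this application, the processor 803 is configured to perform the audio encoding method shown in FIG. 2.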
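The following describes another audio decoding device provided by an embodiment of this application. As shown in FIG. 9, the audio decoding device 900 includes:
a receiver 901, a transmitter 902, a processor 903, and a memory 904 (the audio decoding device 900 may have one or more processors 903; FIG. 9 uses one processor as an example). In some embodiments of this application, the receiver 901, the transmitter 902, the processor 903, and the memory 904 may be connected by a bus or in another manner; FIG. 9 uses a bus connection as an example.
The memory 904 may include a read-only memory and a random access memory, and provides instructions and data to the processor 903. A part of the memory 904 may further include an NVRAM. The memory 904 stores an operating system and operation instructions, executable modules, or data structures, or a subset or an extended set thereof. The operation instructions may include various operation instructions used to implement various operations. The operating system may include various system programs used to implement basic services and handle hardware-based tasks.
The processor 903 controls the operation of the audio decoding device; the processor 903 may also be referred to as a CPU. In a specific application, the components of the audio decoding device are coupled together through a bus system. In addition to a data bus, the bus system may include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all referred to as the bus system in the figure.
The methods disclosed in the foregoing embodiments of this application may be applied to the processor 903 or implemented by the processor 903. The processor 903 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 903 or by instructions in the form of software. The processor 903 may be a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 904; the processor 903 reads the information in the memory 904 and completes the steps of the foregoing methods in combination with its hardware.
In this embodiment of this application, the processor 903 is configured to perform the audio decoding method shown in FIG. 3.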
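In another possible design, when the audio encoding device or the audio decoding device is a chip in a terminal, the chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute computer-executable instructions stored in a storage unit, so that the chip in the terminal performs the method according to any one of the implementations of the first aspect. Optionally, the storage unit is a storage unit inside the chip, such as a register or a cache; the storage unit may also be a storage unit in the terminal located outside the chip, such as a read-only memory (ROM), another type of static storage device that can store static information and instructions, or a random access memory (RAM).
The processor mentioned anywhere above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits configured to control execution of the program of the method of the first aspect.
It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, in the drawings of the apparatus embodiments provided by this application, a connection between modules indicates that they have a communication connection, which may be specifically implemented as one or more communication buses or signal lines.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that this application may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware including an application-specific integrated circuit, a dedicated CPU, a dedicated memory, dedicated components, and the like. In general, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function may take many forms, such as an analog circuit, a digital circuit, or a dedicated circuit. For this application, however, a software implementation is the better choice in most cases. Based on this understanding, the technical solutions of this application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application.
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).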

Claims (53)

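  1. An audio encoding method, wherein the method comprises:
    obtaining a current frame of an audio signal, wherein the current frame comprises a high-band signal;
    obtaining a high-band parameter of the current frame according to the high-band signal, wherein the high-band parameter is used to represent the position, quantity, and amplitude or energy of tonal components comprised in the high-band signal; and
    performing bitstream multiplexing on the high-band parameter to obtain an encoded bitstream.
  2. The method according to claim 1, wherein the high-band parameter comprises a position-quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components.
  3. The method according to claim 2, wherein a high band corresponding to the high-band signal comprises at least one frequency region, one frequency region comprises at least one subband, and the obtaining a high-band parameter of the current frame according to the high-band signal comprises:
    determining, according to the high-band signal of a current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  4. The method according to claim 3, wherein before the determining, according to the high-band signal of the current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region, the method comprises:
    determining whether the current frequency region comprises a tonal component; and
    when the current frequency region comprises a tonal component, determining, according to the high-band signal of the current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  5. The method according to claim 4, wherein the high-band parameter of the current frame further comprises tonal component indication information, and the tonal component indication information is used to indicate whether the current frequency region comprises a tonal component.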
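  6. The method according to any one of claims 3 to 5, wherein the determining, according to the high-band signal of the current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region comprises:
    performing a peak search in the current frequency region according to the high-band signal of the current frequency region in the at least one frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region; and
    determining, according to the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  7. The method according to claim 6, wherein the performing a peak search in the current frequency region according to the high-band signal of the current frequency region in the at least one frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region comprises:
    performing the peak search in the current frequency region according to at least one of a power spectrum, an energy spectrum, or an amplitude spectrum of the current frequency region in the at least one frequency region, to obtain the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region.
  8. The method according to claim 6, wherein the determining, according to the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region comprises:
    determining position information, quantity information, and amplitude information of the tonal components of the current frequency region according to the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region; and
    determining, according to the position information, the quantity information, and the amplitude information of the tonal components of the current frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  9. The method according to any one of claims 3 to 8, wherein the position-quantity parameter of the tonal components of the current frequency region comprises N bits, N is the quantity of subbands comprised in the current frequency region, and the N bits are in one-to-one correspondence with the subbands comprised in the current frequency region; wherein if a first subband comprised in the current frequency region contains a tonal component, the value of the bit corresponding to the first subband in the N bits is a first value; or if a second subband comprised in the current frequency region contains no tonal component, the value of the bit corresponding to the second subband in the N bits is a second value, and the first value is different from the second value.
  10. The method according to any one of claims 1 to 9, wherein the high-band parameter further comprises a noise floor parameter of the high-band signal.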
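  11. An audio decoding method, comprising:
    obtaining an encoded bitstream;
    performing bitstream demultiplexing on the encoded bitstream to obtain a high-band parameter of a current frame of an audio signal, wherein the high-band parameter is used to represent the position, quantity, and amplitude or energy of tonal components comprised in a high-band signal of the current frame;
    obtaining a reconstructed high-band signal of the current frame according to the high-band parameter; and
    obtaining an audio output signal of the current frame according to the reconstructed high-band signal of the current frame.
  12. The method according to claim 11, wherein the high-band parameter comprises a position-quantity parameter of the tonal components of the high-band signal of the current frame and an amplitude parameter or an energy parameter of the tonal components.
  13. The method according to claim 12, wherein a high band corresponding to the high-band signal comprises at least one frequency region, and one frequency region comprises at least one subband; and
    the position-quantity parameter of the tonal components of the high-band signal of the current frame comprises the respective position-quantity parameters of the tonal components of the at least one frequency region, and the amplitude parameter or energy parameter of the tonal components of the high-band signal of the current frame comprises the respective amplitude parameters or energy parameters of the tonal components of the at least one frequency region.
  14. The method according to claim 13, wherein the performing bitstream demultiplexing on the encoded bitstream to obtain a high-band parameter of a current frame of an audio signal comprises:
    obtaining the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region; and
    parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position-quantity parameter of the tonal components of the current frequency region.
  15. The method according to claim 14, wherein the parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position-quantity parameter of the tonal components of the current frequency region comprises:
    determining a quantity parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region; and
    parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter of the tonal components of the current frequency region.
  16. The method according to claim 13, wherein the performing bitstream demultiplexing on the encoded bitstream to obtain a high-band parameter of a current frame of an audio signal comprises:
    obtaining the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region;
    determining a position parameter of the tonal components of the current frequency region and a quantity parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region; and
    parsing the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter of the tonal components of the current frequency region.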
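  17. The method according to any one of claims 14 to 16, wherein
    before the obtaining the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region, the method comprises:
    obtaining tonal component indication information of the current frequency region,
    wherein the tonal component indication information is used to indicate whether the current frequency region comprises a tonal component; and
    when the current frequency region comprises a tonal component, obtaining the position-quantity parameter of the tonal components of the current frequency region in the at least one frequency region.
  18. The method according to any one of claims 14 to 17, wherein the obtaining the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region comprises:
    reading N bits from the encoded bitstream according to the quantity of subbands comprised in the current frequency region, wherein the N bits are the position-quantity parameter of the tonal components of the current frequency region, N is the quantity of subbands comprised in the current frequency region, and the N bits are in one-to-one correspondence with the subbands comprised in the current frequency region.
  19. The method according to any one of claims 14, 15, 17, and 18, wherein the obtaining a reconstructed high-band signal of the current frame according to the high-band parameter comprises:
    determining the positions of the tonal components in the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region;
    determining the amplitude or energy corresponding to the positions of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and
    obtaining the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude or energy corresponding to the positions of the tonal components.
  20. The method according to claim 19, wherein the determining the positions of the tonal components in the current frequency region according to the position-quantity parameter of the tonal components of the high-band signal of the current frequency region comprises:
    determining a position parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the high-band signal of the current frequency region; and
    determining the positions of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region.
  21. The method according to any one of claims 16 to 18, wherein the obtaining a reconstructed high-band signal of the current frame according to the high-band parameter comprises:
    determining the positions of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region;
    determining the amplitude or energy corresponding to the positions of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and
    obtaining the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude or energy corresponding to the positions of the tonal components.
  22. The method according to any one of claims 16 to 21, wherein the position parameter of the tonal components of the current frequency region is used to indicate the indices of the subbands in the current frequency region that comprise a tonal component.
  23. The method according to claim 20 or 21, wherein the position of a tonal component in the current frequency region is at a specified position of the subband in which the tonal component is located in the current frequency region.
  24. The method according to claim 23, wherein the specified position of the subband is the center position of the subband.
  25. The method according to any one of claims 19 to 21, wherein the obtaining the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude corresponding to the positions of the tonal components comprises:
    determining the frequency-domain signal at the position of a tonal component according to the following formula:
    pSpectralData[tone_pos]=tone_val
    wherein pSpectralData represents the reconstructed high-band frequency-domain signal of the current frequency region, tone_val represents the amplitude value corresponding to the position of the tonal component in the current frequency region, and tone_pos represents the position of the tonal component in the current frequency region.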
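  26. An audio encoder, comprising:
    a signal obtaining unit, configured to obtain a current frame of an audio signal, wherein the current frame comprises a high-band signal;
    a parameter obtaining unit, configured to obtain a high-band parameter of the current frame according to the high-band signal, wherein the high-band parameter is used to represent the position, quantity, and amplitude or energy of tonal components comprised in the high-band signal; and
    an encoding unit, configured to perform bitstream multiplexing on the high-band parameter to obtain an encoded bitstream.
  27. The audio encoder according to claim 26, wherein the high-band parameter comprises a position-quantity parameter of the tonal components, and an amplitude parameter or an energy parameter of the tonal components.
  28. The audio encoder according to claim 27, wherein a high band corresponding to the high-band signal comprises at least one frequency region, and one frequency region comprises at least one subband; and
    the parameter obtaining unit is specifically configured to:
    determine, according to the high-band signal of a current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  29. The audio encoder according to claim 28, wherein the audio encoder further comprises:
    a determining unit, configured to determine whether the current frequency region comprises a tonal component; and
    the parameter obtaining unit is specifically configured to: when the current frequency region comprises a tonal component, determine, according to the high-band signal of the current frequency region in the at least one frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  30. The audio encoder according to claim 29, wherein the high-band parameter of the current frame further comprises tonal component indication information, and the tonal component indication information is used to indicate whether the current frequency region comprises a tonal component.
  31. The audio encoder according to any one of claims 28 to 30, wherein the parameter obtaining unit is specifically configured to:
    perform a peak search in the current frequency region according to the high-band signal of the current frequency region in the at least one frequency region, to obtain at least one of peak quantity information, peak position information, and peak amplitude information of the current frequency region; and
    determine, according to the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  32. The audio encoder according to claim 31, wherein the parameter obtaining unit is specifically configured to:
    perform the peak search in the current frequency region according to at least one of a power spectrum, an energy spectrum, or an amplitude spectrum of the current frequency region in the at least one frequency region, to obtain the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region.
  33. The audio encoder according to claim 31, wherein the parameter obtaining unit is specifically configured to:
    determine position information, quantity information, and amplitude information of the tonal components of the current frequency region according to the at least one of the peak quantity information, the peak position information, and the peak amplitude information of the current frequency region; and
    determine, according to the position information, the quantity information, and the amplitude information of the tonal components of the current frequency region, the position-quantity parameter of the tonal components of the current frequency region and the amplitude parameter or energy parameter of the tonal components of the current frequency region.
  34. The audio encoder according to any one of claims 28 to 33, wherein the position-quantity parameter of the tonal components of the current frequency region comprises N bits, N is the quantity of subbands comprised in the current frequency region, and the N bits are in one-to-one correspondence with the subbands comprised in the current frequency region; wherein if a first subband comprised in the current frequency region contains a tonal component, the value of the bit corresponding to the first subband in the N bits is a first value; or if a second subband comprised in the current frequency region contains no tonal component, the value of the bit corresponding to the second subband in the N bits is a second value, and the first value is different from the second value.
  35. The audio encoder according to any one of claims 26 to 34, wherein the high-band parameter further comprises a noise floor parameter of the high-band signal.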
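  36. An audio decoder, comprising:
    a receiving unit, configured to obtain an encoded bitstream;
    a demultiplexing unit, configured to perform bitstream demultiplexing on the encoded bitstream to obtain a high-band parameter of a current frame of an audio signal, wherein the high-band parameter is used to represent the position, quantity, and amplitude or energy of tonal components comprised in a high-band signal of the current frame; and
    a reconstruction unit, configured to obtain a reconstructed high-band signal of the current frame according to the high-band parameter, and to obtain an audio output signal of the current frame according to the reconstructed high-band signal of the current frame.
  37. The audio decoder according to claim 36, wherein the high-band parameter comprises a position-quantity parameter of the tonal components of the high-band signal of the current frame and an amplitude parameter or an energy parameter of the tonal components.
  38. The audio decoder according to claim 37, wherein a high band corresponding to the high-band signal comprises at least one frequency region, and one frequency region comprises at least one subband; and
    the position-quantity parameter of the tonal components of the high-band signal of the current frame comprises the respective position-quantity parameters of the tonal components of the at least one frequency region, and the amplitude parameter or energy parameter of the tonal components of the high-band signal of the current frame comprises the respective amplitude parameters or energy parameters of the tonal components of the at least one frequency region.
  39. The audio decoder according to claim 38, wherein the demultiplexing unit is specifically configured to:
    obtain the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region; and
    parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the position-quantity parameter of the tonal components of the current frequency region.
  40. The audio decoder according to claim 39, wherein the demultiplexing unit is specifically configured to:
    determine a quantity parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region; and
    parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter of the tonal components of the current frequency region.
  41. The audio decoder according to claim 38, wherein the demultiplexing unit is specifically configured to:
    obtain the position-quantity parameter of the tonal components of a current frequency region in the at least one frequency region;
    determine a position parameter of the tonal components of the current frequency region and a quantity parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region; and
    parse the amplitude parameter or energy parameter of the tonal components of the current frequency region from the encoded bitstream according to the quantity parameter of the tonal components of the current frequency region.
  42. The audio decoder according to any one of claims 39 to 41, wherein the demultiplexing unit is specifically configured to: obtain tonal component indication information of the current frequency region, wherein the tonal component indication information is used to indicate whether the current frequency region comprises a tonal component; and when the current frequency region comprises a tonal component, obtain the position-quantity parameter of the tonal components of the current frequency region in the at least one frequency region.
  43. The audio decoder according to any one of claims 39 to 42, wherein the demultiplexing unit is specifically configured to:
    read N bits from the encoded bitstream according to the quantity of subbands comprised in the current frequency region, wherein the N bits are the position-quantity parameter of the tonal components of the current frequency region, N is the quantity of subbands comprised in the current frequency region, and the N bits are in one-to-one correspondence with the subbands comprised in the current frequency region.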
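  44. The audio decoder according to any one of claims 39, 40, 42, and 43, wherein the reconstruction unit is specifically configured to:
    determine the positions of the tonal components in the current frequency region according to the position-quantity parameter of the tonal components of the current frequency region;
    determine the amplitude or energy corresponding to the positions of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and
    obtain the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude or energy corresponding to the positions of the tonal components.
  45. The audio decoder according to claim 44, wherein the reconstruction unit is specifically configured to:
    determine a position parameter of the tonal components of the current frequency region according to the position-quantity parameter of the tonal components of the high-band signal of the current frequency region; and
    determine the positions of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region.
  46. The audio decoder according to any one of claims 41 to 43, wherein the reconstruction unit is specifically configured to:
    determine the positions of the tonal components in the current frequency region according to the position parameter of the tonal components of the current frequency region;
    determine the amplitude or energy corresponding to the positions of the tonal components according to the amplitude parameter or energy parameter of the tonal components of the current frequency region; and
    obtain the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude or energy corresponding to the positions of the tonal components.
  47. The audio decoder according to any one of claims 41 to 46, wherein the position parameter of the tonal components of the current frequency region is used to indicate the indices of the subbands in the current frequency region that comprise a tonal component.
  48. The audio decoder according to claim 45 or 46, wherein the position of a tonal component in the current frequency region is at a specified position of the subband in which the tonal component is located in the current frequency region.
  49. The audio decoder according to claim 48, wherein the specified position of the subband is the center position of the subband.
  50. The audio decoder according to any one of claims 44 to 49, wherein the obtaining the reconstructed high-band signal according to the positions of the tonal components in the current frequency region and the amplitude corresponding to the positions of the tonal components comprises:
    determining the frequency-domain signal at the position of a tonal component according to the following formula:
    pSpectralData[tone_pos]=tone_val
    wherein pSpectralData represents the reconstructed high-band frequency-domain signal of the current frequency region, tone_val represents the amplitude value corresponding to the position of the tonal component in the current frequency region, and tone_pos represents the position of the tonal component in the current frequency region.
  51. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 25.
  52. An audio encoding device, comprising at least one processor, wherein the at least one processor is configured to be coupled to a memory, and to read and execute instructions in the memory to implement the method according to any one of claims 1 to 10.
  53. An audio decoding device, comprising at least one processor, wherein the at least one processor is configured to be coupled to a memory, and to read and execute instructions in the memory to implement the method according to any one of claims 11 to 15.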
PCT/CN2021/071327 2020-01-13 2021-01-12 Audio encoding and decoding method and audio encoding and decoding device WO2021143691A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP21740645.3A EP4080503A4 (en) 2020-01-13 2021-01-12 AUDIO ENCODING AND DECODING METHODS AND AUDIO ENCODING AND DECODING DEVICES
KR1020227026986A KR20220117340A (ko) 2020-01-13 2021-01-12 Audio encoding and decoding method and audio encoding and decoding device
JP2022542159A JP2023509201A (ja) 2020-01-13 2021-01-12 Audio encoding and decoding method and audio encoding and decoding device
US17/862,712 US11887610B2 (en) 2020-01-13 2022-07-12 Audio encoding and decoding method and audio encoding and decoding device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010033973.0A CN113192517B (zh) 2020-01-13 Audio encoding and decoding method and audio encoding and decoding device
CN202010033973.0 2020-01-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/862,712 Continuation US11887610B2 (en) 2020-01-13 2022-07-12 Audio encoding and decoding method and audio encoding and decoding device

Publications (1)

Publication Number Publication Date
WO2021143691A1 true WO2021143691A1 (zh) 2021-07-22

Family

ID=76863583

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/071327 WO2021143691A1 (zh) 2020-01-13 2021-01-12 Audio encoding and decoding method and audio encoding and decoding device

Country Status (5)

Country Link
US (1) US11887610B2 (zh)
EP (1) EP4080503A4 (zh)
JP (1) JP2023509201A (zh)
KR (1) KR20220117340A (zh)
WO (1) WO2021143691A1 (zh)

Also Published As

Publication number Publication date
KR20220117340A (ko) 2022-08-23
EP4080503A4 (en) 2023-05-03
JP2023509201A (ja) 2023-03-07
CN113192517A (zh) 2021-07-30
US20220343926A1 (en) 2022-10-27
US11887610B2 (en) 2024-01-30
EP4080503A1 (en) 2022-10-26

Legal Events

Code Title (Description)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21740645; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022542159; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2021740645; Country of ref document: EP; Effective date: 20220721)
ENP Entry into the national phase (Ref document number: 20227026986; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)