WO2024050673A1 - Audio signal band extension method, apparatus, device, and storage medium - Google Patents

Audio signal band extension method, apparatus, device, and storage medium

Info

Publication number
WO2024050673A1
Authority
WO
WIPO (PCT)
Prior art keywords
frequency
frequency point
point
band
spectrum signal
Prior art date
Application number
PCT/CN2022/117110
Other languages
English (en)
French (fr)
Inventor
王宾
吴鑫
王薇洁
Original Assignee
北京小米移动软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京小米移动软件有限公司
Priority to PCT/CN2022/117110
Publication of WO2024050673A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic

Definitions

  • the present disclosure relates to the field of communication technology, and in particular, to an audio signal band extension method/device/equipment and a storage medium.
  • the signal transmitter when transmitting audio signals, usually first converts the audio signals from time domain signals to frequency domain signals, and then uses encoding equipment to compress and encode the frequency domain signals before transmitting them. And after the signal receiving end receives the encoded signal, it must first use a decoding device to perform a decoding operation to reconstruct the audio frequency domain signal, and then convert the reconstructed audio frequency domain signal into a time domain signal to obtain a reconstructed audio time domain signal.
  • the encoding device spends most of the bits on fine quantization of the relatively important low-spectrum signal in the audio signal, so the quantization parameters of the low-spectrum signal occupy most of the bits. Only a small number of bits are used to coarsely quantize the high-spectrum signal in order to obtain its frequency domain envelope. The frequency domain envelope of the high-spectrum signal and the quantization parameters of the low-spectrum signal are then sent to the decoding device in the form of a bit stream.
  • the decoding device first demultiplexes and decodes the received bit stream to obtain the quantization parameters of the low-spectrum signal and the frequency domain envelope of the high-spectrum signal. It then recovers the low-spectrum signal from the decoded quantization parameters and, based on those parameters, uses a band extension technique to obtain the high-spectrum signal above the starting frequency point of the preset bandwidth extension band.
  • the decoding device relies on the following concepts when decoding audio signals:
  • bandwidth extension band: the extended high frequency band, that is, the frequency band between the starting frequency point and the highest frequency point of the preset bandwidth extension band;
  • frequency point with bit allocation: a frequency point corresponding to the encoded low-spectrum signal;
  • highest frequency point with bit allocation: the highest frequency point of the encoded low-spectrum signal. In other words, no low-spectrum signal is decoded above the highest frequency point with bit allocation.
  • the frequency band above the highest frequency point with bit allocation can be called the high frequency band, and the frequency band below it can be called the low frequency band. There are two possible distributions between the bandwidth extension band and the highest frequency point with bit allocation.
  • Figures 1a-1b are distribution relationship diagrams between a bandwidth extension band and the highest frequency point with bit allocation provided by an embodiment of the present disclosure. As shown in Figure 1a, the starting frequency point of the bandwidth extension band can be higher than the highest frequency point with bit allocation. As shown in Figure 1b, the starting frequency point of the bandwidth extension band can be lower than the highest frequency point with bit allocation.
  • the audio signal band extension method, apparatus, device, and storage medium proposed in this disclosure aim to solve the technical problems of related methods: imbalance between high and low frequency energy within a frame, a mechanical sound caused by spectrum holes, and low reconstructed audio quality.
  • embodiments of the present disclosure provide an audio signal frequency band extension method, which is executed by a decoding device and includes:
  • receiving the bit stream sent by the encoding device, and decoding the bit stream to obtain a decoded audio frequency domain signal;
  • in response to the highest frequency point with bit allocation of the audio frequency domain signal being lower than the starting frequency point of a preset bandwidth extension band, or the frequency band with bit allocation of the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, predicting, based on the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal, the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • in the audio signal band extension method, the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation of the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation of the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal.
  • the method further includes:
  • the starting frequency point and the highest frequency point of the preset bandwidth extension band are determined based on the encoding rate of the encoding device and the frequency band range required for encoding of the audio signal.
  • the frequency points in the predetermined frequency band range or the predetermined frequency point range are lower than the highest frequency point with bit allocation.
  • predicting, based on the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal, the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band includes:
  • n is a positive integer or a positive fraction.
  • the copy method of the n copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal includes:
  • predicting, based on the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal, the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band includes:
  • the copying method of the m copies or h copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal includes:
  • the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal is mirrored multiple times to obtain m copies or h copies of the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal.
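Mirrored (folded) copying can be sketched in Python as follows. This is a minimal illustration that assumes the spectrum is a list of coefficients indexed by frequency point and that the number of copies is a whole number. The helper name `mirror_copies` and the `first_reversed` flag are hypothetical and do not come from the disclosure.

```python
def mirror_copies(source, count, first_reversed=True):
    """Return `count` copies of `source`, alternating the copy direction.

    Hypothetical helper: each copy is the mirror image of the previous one,
    so neighbouring copies meet at the same coefficient at every fold.
    """
    out = []
    reverse = first_reversed
    for _ in range(count):
        # Copy high-to-low when `reverse` is set, low-to-high otherwise.
        out.extend(reversed(source) if reverse else source)
        reverse = not reverse
    return out


if __name__ == "__main__":
    print(mirror_copies([1, 2, 3], 2))
```

Starting with a reversed copy places the highest source coefficient next to the point where filling begins; whether that is the intended starting direction is an assumption of this sketch.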
  • the same method is used in different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • the method further includes:
  • performing frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
  • the frequency domain envelope correction of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band includes at least one of the following:
  • based on the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation, correcting the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and an intermediate frequency point, where the intermediate frequency point lies between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; and, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point, correcting the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band; wherein the first frequency point is W1 - 0.5 × Wx, W1 represents the highest frequency point with bit allocation, and Wx represents the frequency bandwidth between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the second frequency point is W2 + 0.5 × Wx, and W2 represents the starting frequency point of the preset bandwidth extension band;
  • based on the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation, correcting the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; wherein the third frequency point is W1 - Wx;
  • based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point, correcting the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; wherein the fourth frequency point is W2 + Wx.
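The four reference frequency points follow directly from W1, W2, and Wx. The Python sketch below only computes those points; it does not perform the envelope correction itself, which the text does not specify in detail. The function name `envelope_reference_points` and the dictionary keys are hypothetical, and the inputs may be given in Hz or as frequency point indices as long as both use the same unit.

```python
def envelope_reference_points(w1, w2):
    """Compute the first to fourth frequency points from W1 and W2.

    w1: highest frequency point with bit allocation (W1).
    w2: starting frequency point of the preset bandwidth extension band (W2).
    """
    if w2 <= w1:
        raise ValueError("W2 must be above W1 for a gap to exist")
    wx = w2 - w1  # Wx: bandwidth of the gap between W1 and W2
    return {
        "first": w1 - 0.5 * wx,   # W1 - 0.5 x Wx
        "second": w2 + 0.5 * wx,  # W2 + 0.5 x Wx
        "third": w1 - wx,         # W1 - Wx
        "fourth": w2 + wx,        # W2 + Wx
    }


if __name__ == "__main__":
    print(envelope_reference_points(6000, 8000))
```

For example, with W1 = 6000 and W2 = 8000 the gap Wx is 2000, so the reference bands used for correction each span either half or all of the gap width on the respective side.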
  • the method further includes:
  • at least one of the following: the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation; the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point; the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation; and the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point.
  • the method further includes:
  • the frequency band between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension frequency band is filled with noise.
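The text does not say how the noise is generated or scaled. The Python sketch below shows one possible reading, assuming that only empty (zero-valued) frequency points in the given range receive noise and that the noise is uniform with a caller-chosen amplitude. The name `fill_noise` and the parameters `level` and `seed` are hypothetical.

```python
import random


def fill_noise(spectrum, start, stop, level, seed=None):
    """Return a copy of `spectrum` with noise in empty bins of [start, stop).

    Assumption of this sketch: bins that already hold a non-zero coefficient
    are left untouched; empty bins get uniform noise in [-level, level].
    """
    rng = random.Random(seed)
    out = list(spectrum)
    for k in range(start, stop):
        if out[k] == 0:
            out[k] = rng.uniform(-level, level)
    return out


if __name__ == "__main__":
    print(fill_noise([1.0, 2.0, 0.0, 0.5, 0.0], 2, 5, level=0.1, seed=0))
```

In a real decoder the noise level would more likely be derived from the decoded envelope than passed in as a constant.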
  • the method further includes:
  • the audio frequency domain signal and the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band are added together, and the combined signal is then converted from the frequency domain to the time domain to obtain a reconstructed audio time domain signal.
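The combination step can be sketched in Python as an element-wise addition followed by an inverse transform. The text does not name the transform, so the naive inverse DFT below is only a stand-in for whatever inverse transform the codec actually uses. The names `combine_spectra` and `inverse_dft` are hypothetical, and both spectra are assumed to have the same length, with zeros where a signal has no content.

```python
import cmath


def combine_spectra(decoded, predicted):
    """Add the decoded low-band spectrum and the predicted spectrum bin by bin."""
    if len(decoded) != len(predicted):
        raise ValueError("spectra must have the same number of frequency points")
    return [a + b for a, b in zip(decoded, predicted)]


def inverse_dft(spectrum):
    """Naive inverse DFT (placeholder transform); returns the real part only."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


if __name__ == "__main__":
    combined = combine_spectra([4, 0, 0, 0], [0, 1, 0, 1])
    print(inverse_dft(combined))
```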
  • an embodiment of the present disclosure provides a communication device, which is configured in a decoding device and includes:
  • a transceiver module used to receive the bit stream sent by the encoding device, and decode the bit stream to obtain a decoded audio frequency domain signal
  • a processing module, configured to, in response to the highest frequency point with bit allocation of the audio frequency domain signal being lower than the starting frequency point of a preset bandwidth extension band, or the frequency band with bit allocation of the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, predict, based on the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal, the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • an embodiment of the present disclosure provides a communication device.
  • the communication device includes a processor.
  • when the processor calls a computer program in a memory, it executes the method described in the first aspect.
  • an embodiment of the present disclosure provides a communication device.
  • the communication device includes a processor and a memory, and a computer program is stored in the memory; the processor executes the computer program stored in the memory, so that the communication device executes the method described in the first aspect above.
  • an embodiment of the present disclosure provides a communication device.
  • the device includes a processor and an interface circuit.
  • the interface circuit is used to receive code instructions and transmit them to the processor.
  • the processor is used to run the code instructions to cause the device to perform the method described in the first aspect.
  • embodiments of the present disclosure provide a communication system.
  • the system includes the communication device described in the second aspect, or the communication device described in the third aspect, or the communication device described in the fourth aspect, or the communication device described in the fifth aspect.
  • embodiments of the present invention provide a computer-readable storage medium for storing instructions used by the above-mentioned network device. When the instructions are executed, the terminal device is caused to perform the method described in the first aspect.
  • the present disclosure also provides a computer program product including a computer program, which when run on a computer causes the computer to execute the method described in the first aspect.
  • the present disclosure provides a chip system, which includes at least one processor and an interface for supporting a network device to implement functions involved in any of the methods described in the first aspect, for example, determining or processing at least one of the data and information involved in the above method.
  • the chip system further includes a memory, and the memory is used to store necessary computer programs and data of the source secondary node.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the present disclosure provides a computer program that, when run on a computer, causes the computer to execute the method described in the first aspect.
  • Figures 1a-1b are distribution relationship diagrams between a bandwidth extension frequency band and the highest frequency point with bit allocation provided by an embodiment of the present disclosure
  • Figure 1c is a schematic architectural diagram of a communication system provided by an embodiment of the present disclosure.
  • Figure 2 is a schematic flowchart of an audio signal frequency band expansion method provided by an embodiment of the present disclosure
  • Figure 3a is a schematic flow chart of an audio signal band expansion method provided by an embodiment of the present disclosure
  • Figure 3b is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on n copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal.
  • Figure 3c is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on n copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal.
  • Figure 4a is a schematic flowchart of an audio signal band expansion method provided by yet another embodiment of the present disclosure.
  • Figure 4b is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on m copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal.
  • Figure 4c is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on h copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal.
  • FIG. 5 is a schematic flowchart of an audio signal band expansion method provided by yet another embodiment of the present disclosure.
  • Figure 6 is a schematic flowchart of an audio signal band expansion method provided by yet another embodiment of the present disclosure.
  • Figure 7 is a schematic flowchart of an audio signal band expansion method provided by yet another embodiment of the present disclosure.
  • Figure 8 is a schematic structural diagram of a communication device provided by an embodiment of the present disclosure.
  • Figure 9 is a schematic structural diagram of a communication device provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a chip provided by an embodiment of the present disclosure.
  • although the terms first, second, third, etc. may be used to describe various information in the embodiments of the present disclosure, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other.
  • for example, first information may also be called second information, and similarly, second information may also be called first information.
  • the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
  • a frequency band is a frequency range or spectrum width, and a frequency point is a single frequency on the frequency band.
  • Figure 1c is a schematic architectural diagram of a communication system provided by an embodiment of the present disclosure.
  • the communication system may include but is not limited to an encoding device and a decoding device, wherein the above encoding device and decoding device may both be network devices or terminal devices.
  • the number and shape of the devices shown in Figure 1c are only examples and do not constitute a limitation on the embodiments of the present disclosure. Practical applications may include two or more encoding devices and two or more decoding devices.
  • the communication system shown in Figure 1c takes as an example a coding device 11 and a decoding device 12.
  • the coding device 11 is a network device and the decoding device 12 is a terminal device.
  • long term evolution (LTE) system
  • fifth generation (5G) mobile communication system
  • 5G new radio (NR) system
  • the network device in the embodiment of the present disclosure is an entity on the network side that is used to transmit or receive signals.
  • the network device 11 may be an evolved base station (evolved NodeB, eNB), a transmission reception point (TRP), a next generation base station (next generation NodeB, gNB) in an NR system, a base station in another future mobile communication system, or an access node in a wireless fidelity (WiFi) system, etc.
  • the embodiments of the present disclosure do not limit the specific technologies and specific equipment forms used by network equipment.
  • the network equipment provided by the embodiments of the present disclosure may be composed of a centralized unit (CU) and a distributed unit (DU).
  • the CU may also be called a control unit.
  • using the CU-DU structure, the protocol layers of a network device such as a base station can be split: some protocol layer functions are placed under centralized control in the CU, the remaining part or all of the protocol layer functions are distributed in the DU, and the CU centrally controls the DU.
  • the terminal device in the embodiment of the present disclosure is an entity on the user side that is used to receive or transmit signals, such as a mobile phone.
  • Terminal equipment can also be called terminal equipment (terminal), user equipment (user equipment, UE), mobile station (mobile station, MS), mobile terminal equipment (mobile terminal, MT), etc.
  • the terminal device can be a car with communication functions, a smart car, a mobile phone, a wearable device, a tablet computer (Pad), a computer with wireless transceiver functions, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal device in industrial control, a wireless terminal device in self-driving, a wireless terminal device in remote medical surgery, a wireless terminal device in a smart grid, a wireless terminal device in transportation safety, a wireless terminal device in a smart city, a wireless terminal device in a smart home, etc.
  • the embodiments of the present disclosure do not limit the specific technology and specific equipment form used by the terminal equipment.
  • FIG. 2 is a schematic flowchart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 2, the audio signal band expansion method may include the following steps:
  • Step 201 Receive the bit stream sent by the encoding device, and decode the bit stream to obtain a decoded audio frequency domain signal.
  • the specific execution method of step 201 is similar to the existing technology and is not described in detail here.
  • the audio frequency domain signal obtained by decoding in this step is the low-spectrum signal of the audio signal, that is, the spectrum signal corresponding to the frequency band below the highest frequency point with bit allocation in Figures 1a-1b above.
  • the “spectrum signal” mentioned in the embodiment of the present disclosure may be a frequency band signal or a frequency point signal.
  • Step 202: In response to the highest frequency point with bit allocation of the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation of the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, predict, based on the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal, the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • the starting frequency point and the highest frequency point of the preset bandwidth extension band may be predetermined by the decoding device based on the encoding rate (i.e., the total number of bits) of the encoding device and the frequency band range required for encoding the audio signal.
  • specifically, the higher the encoding rate, the higher the starting frequency point of the bandwidth extension band can be set. For example, for ultra-wideband signals, when the encoding rate is 24 kbps, the starting frequency point of the preset bandwidth extension band of the frequency domain signal can be 6.4 kHz (kilohertz); when the encoding rate is 32 kbps, the starting frequency point of the preset bandwidth extension band of the frequency domain signal can be 8 kHz.
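The two example settings above can be captured in a small lookup. The Python sketch below holds only the two values quoted in the text for ultra-wideband signals and adds a hypothetical helper that maps a frequency to a frequency point index, assuming the points are evenly spaced from 0 Hz to half the sampling rate. The names `UWB_BWE_START_HZ`, `bwe_start_frequency_hz`, and `frequency_to_bin`, as well as the sampling rate and frame size in the usage lines, are assumptions for illustration.

```python
# Example values quoted in the text for ultra-wideband signals:
# encoding rate in kbps -> starting frequency of the bandwidth extension band in Hz.
UWB_BWE_START_HZ = {24: 6400, 32: 8000}


def bwe_start_frequency_hz(rate_kbps):
    """Look up the example starting frequency for a given encoding rate."""
    try:
        return UWB_BWE_START_HZ[rate_kbps]
    except KeyError:
        raise ValueError(f"no example start frequency given for {rate_kbps} kbps") from None


def frequency_to_bin(freq_hz, sample_rate_hz, num_bins):
    """Map a frequency to a bin index (assumes bins span 0 Hz to Nyquist evenly)."""
    return round(freq_hz * num_bins / (sample_rate_hz / 2))


if __name__ == "__main__":
    start_hz = bwe_start_frequency_hz(24)
    # 32 kHz sampling and 640 bins per frame are illustrative values only.
    print(start_hz, frequency_to_bin(start_hz, 32000, 640))
```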
  • the highest frequency point of the bandwidth extension band refers to the highest point of the frequency band required to output the signal or a specified frequency point.
  • the highest frequency point of the preset bandwidth extension band can be 7kHz or 8kHz.
  • the highest frequency point of the preset bandwidth extension band can be 14kHz or 16kHz or other preset specific frequency points.
  • the frequency points in the above-mentioned predetermined frequency band range or predetermined frequency point range are lower than the highest frequency point with bit allocation.
  • for example, the predetermined frequency band range is the black part in Figure 1a and Figure 1b, and the frequency points in the predetermined frequency band range are all lower than the highest frequency point with bit allocation.
  • the predetermined frequency band range or the predetermined frequency point range may be determined based on the signal type and coding rate of the audio signal. For example, at a lower coding rate, for harmonic signals, the frequency band range or frequency point range of the relatively well-coded lower part of the low-spectrum signal can be selected as the predetermined frequency band range or predetermined frequency point range; for non-harmonic signals, the frequency band range or frequency point range of the relatively poorly coded higher part of the low-spectrum signal can be selected. At a higher coding rate, for harmonic signals, a slightly higher frequency band or frequency point range in the low-spectrum signal can be selected as the predetermined frequency band range or predetermined frequency point range.
  • it can be seen from step 202 above that what is predicted in this disclosure is the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, rather than only the spectrum signal of the bandwidth extension band. As a result, there is a predicted spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band. This avoids the situation in which there is no spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, which keeps high and low frequency energy balanced within the frame, avoids the mechanical sound caused by spectrum holes, and improves the quality of the reconstructed audio.
  • how the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band is explained in subsequent embodiments.
  • in summary, the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation of the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation of the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal.
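The condition in step 202 and the range that gets predicted can be sketched in Python as follows, assuming frequency points are represented by integer indices. The function name `prediction_range` is hypothetical, and returning `None` for the Figure 1b case is a choice of this sketch, since that case is outside the condition described in step 202.

```python
def prediction_range(highest_alloc_bin, bwe_start_bin, bwe_end_bin):
    """Return the half-open bin range (start, stop) to predict, or None.

    highest_alloc_bin: highest frequency point with bit allocation.
    bwe_start_bin: starting frequency point of the preset bandwidth extension band.
    bwe_end_bin: highest frequency point of the preset bandwidth extension band.
    """
    if highest_alloc_bin < bwe_start_bin:
        # The range begins right above the last decoded bin, so it also covers
        # the gap below the bandwidth extension band, not just the band itself.
        return (highest_alloc_bin + 1, bwe_end_bin + 1)
    return None


if __name__ == "__main__":
    print(prediction_range(255, 320, 639))
```

With a highest allocated bin of 255 and an extension band of 320 to 639, the predicted range starts at bin 256, so bins 256 to 319 are filled rather than left as a spectrum hole.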
  • FIG 3a is a schematic flow chart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 3a, the audio signal band expansion method may include the following steps:
  • Step 301: Taking the highest frequency point with bit allocation as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, sequentially copy n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range in the audio frequency domain signal, and use them as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • n is a positive integer or a positive fraction.
  • n can be the ratio of the number of frequency points between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension frequency band and the number of frequency points within the predetermined frequency band range or the predetermined frequency point range.
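Since n is a ratio of two frequency point counts, it can be an integer or a fraction. A minimal Python sketch, with the hypothetical name `copy_count`, uses `fractions.Fraction` to keep the ratio exact:

```python
from fractions import Fraction


def copy_count(num_target_points, num_source_points):
    """Return n = (points to fill) / (points in the predetermined range)."""
    if num_source_points <= 0:
        raise ValueError("the predetermined range must contain at least one point")
    return Fraction(num_target_points, num_source_points)


if __name__ == "__main__":
    print(copy_count(384, 96))   # whole number of copies
    print(copy_count(100, 40))   # fractional number of copies
```

A fractional n means the final copy covers only part of the predetermined range.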
  • the method of copying the n copies of the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal includes any of the following:
  • the first method is to copy the spectrum signal within the predetermined frequency band range or predetermined frequency point range in the audio frequency domain signal sequentially and repeatedly, to obtain n copies of that spectrum signal.
  • in this method, each of the n copies of the spectrum signal within the predetermined frequency band range or the predetermined frequency point range is copied along the same direction (for example, from high frequency to low frequency, or from low frequency to high frequency).
  • FIG. 3b is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in FIG. 3b, taking the highest frequency point with bit allocation as the starting point, 4 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range are made by sequential repeated copying and used as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. Each copy is made in the direction from low frequency to high frequency.
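  • The sequential repeated copying described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the spectrum is modelled as a plain list of bin values, `fill_sequential` is a hypothetical name, every copy runs from low to high frequency, and a fractional n is handled by truncating the last copy.

```python
def fill_sequential(source, target_len):
    """Tile `source` from low to high frequency until `target_len` bins are filled.

    Every copy has the same orientation; the last copy is cut short when
    target_len is not a multiple of len(source) (fractional n).
    """
    if not source:
        raise ValueError("source range must contain at least one bin")
    return [source[k % len(source)] for k in range(target_len)]


# Hypothetical 4-bin source range, 10 bins to fill (n = 2.5).
src = [1.0, 2.0, 3.0, 4.0]
filled = fill_sequential(src, 10)
# filled == [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0]
```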
  • the second method is multiple mirror copying (also called folded copying): the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal is mirror-copied multiple times to obtain the n copies.
  • in this method, adjacent copies have opposite directions: either the copy direction of the i-th copy is from high frequency to low frequency and the copy direction of the (i+1)-th copy is from low frequency to high frequency, or the copy direction of the i-th copy is from low frequency to high frequency and the copy direction of the (i+1)-th copy is from high frequency to low frequency, where i = 1, 2, 3, ..., n.
  • FIG. 3c is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in FIG. 3c, taking the highest frequency point with bit allocation as the starting point, 4 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range are made by mirror copying and used as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
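  • Mirror (folded) copying can be sketched in the same style. The name `fill_mirrored` and the `first_forward` flag are assumptions made for illustration; the sketch only shows that consecutive copies alternate between the low-to-high and high-to-low orientation.

```python
def fill_mirrored(source, target_len, first_forward=True):
    """Fill `target_len` bins with copies of `source` whose orientation alternates.

    first_forward=True  -> copy 1 is low-to-high, copy 2 is high-to-low, ...
    first_forward=False -> copy 1 is high-to-low, copy 2 is low-to-high, ...
    The last copy is truncated when n is fractional.
    """
    if not source:
        raise ValueError("source range must contain at least one bin")
    out = []
    forward = first_forward
    while len(out) < target_len:
        out.extend(source if forward else source[::-1])
        forward = not forward  # flip orientation for the next copy
    return out[:target_len]


src = [1.0, 2.0, 3.0, 4.0]
filled = fill_mirrored(src, 10)
# filled == [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0, 1.0, 2.0]
```

  • Because each copy starts with the value the previous copy ended on, the seam between adjacent copies has no jump in this sketch, which is one plausible motivation for folding.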
  • the same method is used across different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. For example, the method of the embodiment corresponding to FIG. 3a can be used in every frame to predict this spectrum signal, which keeps the predicted spectrum signal consistent from frame to frame, preserves the inter-frame continuity of the audio signal, and safeguards the reconstructed audio quality.
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • FIG. 4a is a schematic flowchart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in FIG. 4a, the audio signal band expansion method may include the following steps:
  • Step 401: Taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, copy m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal, and use them as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band.
  • m is a positive integer or a positive fraction.
  • m may be the ratio of the number of frequency points between the starting frequency point and the highest frequency point of the preset bandwidth extension band to the number of frequency points within the predetermined frequency band range or predetermined frequency point range.
  • the method of copying the m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal includes any of the following:
  • the first method is sequential repeated copying: the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal is copied repeatedly, one copy after another, to obtain the m copies.
  • the second method is multiple mirror copying (also called folded copying): the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal is mirror-copied multiple times to obtain the m copies.
  • Step 402: Taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point with bit allocation as the starting point, copy h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal, and use them as the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
  • h is a positive integer or a positive fraction. h may be the ratio of the number of frequency points between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band to the number of frequency points within the predetermined frequency band range or predetermined frequency point range.
  • the method of copying the h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal includes any of the following:
  • the first method is sequential repeated copying: the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal is copied repeatedly, one copy after another, to obtain the h copies.
  • the second method is multiple mirror copying (also called folded copying): the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal is mirror-copied multiple times to obtain the h copies.
  • for a detailed description of steps 401 to 402, refer to the embodiments described above.
  • the method used to copy the above-mentioned m copies of the spectrum signal is kept consistent with the method used to copy the above-mentioned h copies of the spectrum signal. That is, either the first method (sequential repeated copying) is used to obtain both the m copies and the h copies, or the second method (multiple mirror copying) is used to obtain both the m copies and the h copies.
  • in addition, consider the filling of the frequency band between the starting frequency point and the highest frequency point of the preset bandwidth extension band, and the filling of the frequency band between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band. If the two filling directions are the same (for example, the former band is filled starting from the starting frequency point of the preset bandwidth extension band, and the latter band is filled starting from the highest frequency point with bit allocation), then the copy direction of the m copies of the spectrum signal should be the same as the copy direction of the h copies of the spectrum signal.
  • for example, the copy direction of the m copies may be from high frequency to low frequency, and the copy direction of the h copies may also be from high frequency to low frequency.
  • conversely, if the two filling directions are different (for example, the frequency band between the starting frequency point and the highest frequency point of the preset bandwidth extension band is filled starting from the starting frequency point of the preset bandwidth extension band, and the frequency band between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band is also filled starting from the starting frequency point of the preset bandwidth extension band), then the copy direction of the m copies of the spectrum signal should be opposite to the copy direction of the h copies of the spectrum signal.
  • for example, the copy direction of the m copies may be from high frequency to low frequency, and the copy direction of the h copies may be from low frequency to high frequency.
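  • The direction rule above can be illustrated with a small sketch. It rests on one possible reading of the text, stated here as an assumption: "copy direction" is taken to mean the order in which the source bins are read, and "filling direction" the order in which the target bins are written. The function `fill` and its flags are hypothetical. Under this reading, both rules leave the two filled bands with the same spectral orientation.

```python
def fill(source, length, read_low_to_high, write_upward):
    """Fill `length` bins and return them in ascending-frequency order.

    read_low_to_high: copy direction (order in which source bins are read).
    write_upward:     filling direction (True = start at the low edge of the
                      target band and move up, False = start at the high edge
                      and move down).
    """
    if not source:
        raise ValueError("source range must contain at least one bin")
    ordered = source if read_low_to_high else source[::-1]
    written = [ordered[k % len(ordered)] for k in range(length)]
    return written if write_upward else written[::-1]


src = [1.0, 2.0, 3.0, 4.0]

# Same filling direction (both bands written upward): same copy direction.
gap_same = fill(src, 4, read_low_to_high=False, write_upward=True)
bwe_same = fill(src, 4, read_low_to_high=False, write_upward=True)

# Different filling directions (both start at the extension band's starting
# point, so the gap is written downward): opposite copy directions.
gap_diff = fill(src, 4, read_low_to_high=True, write_upward=False)
bwe_diff = fill(src, 4, read_low_to_high=False, write_upward=True)
```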
  • FIG. 4b is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in FIG. 4b, for the frequency band from the starting frequency point of the preset bandwidth extension band to the highest frequency point of the preset bandwidth extension band, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range are made by sequential repeated copying and used as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band, where the copy direction of each copy is from low frequency to high frequency.
  • FIG. 4c is a schematic structural diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in FIG. 4c, for the frequency band from the starting frequency point of the preset bandwidth extension band to the highest frequency point of the preset bandwidth extension band, taking the starting frequency point of the preset bandwidth extension band as the starting point, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range are made by mirror copying and used as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band, where the copy direction of the first copy is from low frequency to high frequency and the copy direction of the second copy is from high frequency to low frequency.
  • the same method is used across different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. For example, the method of the embodiment corresponding to FIG. 4a can be used in every frame to predict this spectrum signal, which keeps the predicted spectrum signal consistent from frame to frame, preserves the inter-frame continuity of the audio signal, and safeguards the reconstructed audio quality.
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • FIG. 5 is a schematic flowchart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 5, the audio signal band expansion method may include the following steps:
  • Step 501: Perform frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
  • the method of performing frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band may include any of the following:
  • the first method: based on the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation, correct the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the intermediate frequency point, where the intermediate frequency point lies between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; and, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point, correct the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band.
  • the first frequency point is W1 - 0.5 × Wx, where W1 represents the highest frequency point with bit allocation and Wx represents the frequency bandwidth between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the second frequency point is W2 + 0.5 × Wx, where W2 represents the starting frequency point of the preset bandwidth extension band.
  • the above-mentioned correction based on the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the intermediate frequency point equal to the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation; or making the changing trend of the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the intermediate frequency point equal to the changing trend of the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation.
  • the above-mentioned correction based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point may specifically include: making the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point; or making the changing trend of the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band equal to the changing trend of the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point.
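  • A minimal sketch of the first correction method follows. Several points are assumptions made only for illustration: the envelope of a band is taken to be its RMS value (the text does not define it), frequency points are bin indices, the gap occupies bins `[w1, w2)`, the intermediate frequency point is the midpoint `w1 + Wx/2`, and the "equal envelope" variant is implemented as a single gain per half. The names `rms`, `match_envelope` and `correct_gap_first_method` are hypothetical.

```python
import math


def rms(bins):
    """Root-mean-square of a band, used here as a stand-in envelope value."""
    return math.sqrt(sum(b * b for b in bins) / len(bins))


def match_envelope(target, reference):
    """Scale `target` so that its RMS equals the RMS of `reference`."""
    t = rms(target)
    if t == 0.0:
        return list(target)  # nothing to scale
    gain = rms(reference) / t
    return [b * gain for b in target]


def correct_gap_first_method(spec, w1, w2):
    """Correct the filled gap spec[w1:w2] half by half.

    Lower half is matched to spec[w1 - Wx/2 : w1] (first frequency point to W1),
    upper half is matched to spec[w2 : w2 + Wx/2] (W2 to second frequency point).
    """
    wx = w2 - w1
    half = wx // 2
    if half < 1 or w1 - half < 0 or w2 + half > len(spec):
        raise ValueError("gap too narrow or reference bands out of range")
    mid = w1 + half  # assumed intermediate frequency point
    out = list(spec)
    out[w1:mid] = match_envelope(spec[w1:mid], spec[w1 - half:w1])
    out[mid:w2] = match_envelope(spec[mid:w2], spec[w2:w2 + half])
    return out


# Hypothetical spectrum: coded band at level 2, filled gap at level 1,
# extension band at level 4; the gap is bins 4..7.
spec = [2.0] * 4 + [1.0] * 4 + [4.0] * 4
corrected = correct_gap_first_method(spec, w1=4, w2=8)
# corrected[4:6] == [2.0, 2.0] and corrected[6:8] == [4.0, 4.0]
```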
  • the second method: based on the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation, correct the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
  • the third frequency point may be W1 - Wx.
  • this correction may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation; or making the changing trend of the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the changing trend of the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation.
  • the frequency domain envelope value of the frequency band or frequency point near the starting frequency point of the preset bandwidth extension band can also be corrected based on the frequency domain envelope value at the starting frequency point of the preset bandwidth extension band, to ensure that the frequency domain envelope value of the frequency band or frequency point below the starting frequency point of the preset bandwidth extension band remains continuous with the frequency domain envelope value at that starting frequency point.
  • the third method: based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point, correct the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
  • the fourth frequency point is W2 + Wx.
  • this correction may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point; or making the changing trend of the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the changing trend of the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point.
  • the frequency domain envelope value of the frequency band or frequency point near the highest frequency point with bit allocation can also be corrected based on the frequency domain envelope value at the highest frequency point with bit allocation, to ensure that the frequency domain envelope value of the frequency band or frequency point above the highest frequency point with bit allocation remains continuous with the frequency domain envelope value at that highest frequency point.
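  • The second and third methods differ only in which neighbouring band supplies the reference envelope, which the sketch below makes explicit. As before, treating the envelope as a band RMS, treating frequency points as bin indices with the gap at `[w1, w2)`, and applying one gain to the whole gap are assumptions for illustration; `correct_gap` and `use_lower_reference` are hypothetical names.

```python
import math


def band_rms(bins):
    """Root-mean-square of a band, used here as a stand-in envelope value."""
    return math.sqrt(sum(b * b for b in bins) / len(bins))


def correct_gap(spec, w1, w2, use_lower_reference):
    """Scale the filled gap spec[w1:w2] to the envelope of a reference band.

    use_lower_reference=True : reference is [W1 - Wx, W1)  (second method)
    use_lower_reference=False: reference is [W2, W2 + Wx)  (third method)
    """
    wx = w2 - w1
    if wx < 1:
        raise ValueError("gap must contain at least one bin")
    if use_lower_reference:
        if w1 - wx < 0:
            raise ValueError("third frequency point W1 - Wx is out of range")
        reference = spec[w1 - wx:w1]
    else:
        if w2 + wx > len(spec):
            raise ValueError("fourth frequency point W2 + Wx is out of range")
        reference = spec[w2:w2 + wx]
    gap = spec[w1:w2]
    gap_level = band_rms(gap)
    gain = band_rms(reference) / gap_level if gap_level > 0.0 else 1.0
    out = list(spec)
    out[w1:w2] = [b * gain for b in gap]
    return out


spec = [3.0, 3.0, 1.0, 1.0, 5.0, 5.0]  # hypothetical: gap is bins 2..3
second = correct_gap(spec, 2, 4, use_lower_reference=True)
third = correct_gap(spec, 2, 4, use_lower_reference=False)
# second[2:4] == [3.0, 3.0]; third[2:4] == [5.0, 5.0]
```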
  • the frequency domain envelope values of the spectrum signals between the first frequency point and the highest frequency point with bit allocation, between the starting frequency point of the preset bandwidth extension band and the second frequency point, between the third frequency point and the highest frequency point with bit allocation, and between the starting frequency point of the preset bandwidth extension band and the fourth frequency point can each be obtained by the decoding device by decoding the bit stream it receives.
  • frequency domain envelope correction is also performed on the filled spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band. This ensures the continuity of the frequency domain envelope values of the spectrum signal between these two points. It also ensures continuity between the frequency domain envelope value of the frequency band or frequency point below the starting frequency point of the preset bandwidth extension band and the frequency domain envelope value at that starting frequency point, and between the frequency domain envelope value of the frequency band or frequency point above the highest frequency point with bit allocation and the frequency domain envelope value at that highest frequency point. The subsequently reconstructed audio signal is therefore continuous, the mechanical-sounding artifacts caused by spectrum holes are avoided, and the reconstructed audio quality is ensured.
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • FIG. 6 is a schematic flowchart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 6, the audio signal band expansion method may include the following steps:
  • Step 601: Perform noise filling on the frequency band between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
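  • The text does not say how the noise is generated or which bins receive it, so the following is only a sketch under assumptions: uniform noise of a fixed amplitude `level` is written into bins of the band that are still exactly zero, and a seeded generator is used so the example is repeatable. `noise_fill` and all of its parameters are hypothetical.

```python
import random


def noise_fill(spec, start, stop, level, seed=0):
    """Replace zero-valued bins in spec[start:stop] with low-level uniform noise.

    Bins that already carry a value are left untouched. `level` bounds the
    noise amplitude; `seed` makes the sketch deterministic.
    """
    if not 0 <= start <= stop <= len(spec):
        raise ValueError("band limits out of range")
    rng = random.Random(seed)
    out = list(spec)
    for k in range(start, stop):
        if out[k] == 0.0:
            out[k] = rng.uniform(-level, level)
    return out


# Hypothetical spectrum with two empty bins inside the band [2, 6).
spec = [0.9, 0.7, 0.0, 0.4, 0.0, 0.2, 0.0]
filled = noise_fill(spec, start=2, stop=6, level=0.05)
```

  • In this sketch the bin at index 6 stays zero because it lies outside the band being filled.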
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • FIG. 7 is a schematic flowchart of an audio signal band expansion method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 7, the audio signal band expansion method may include the following steps:
  • Step 701: Add the audio frequency domain signal and the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band to combine them, then transform the result from the frequency domain to the time domain to obtain the reconstructed audio time domain signal.
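  • The combine-then-transform step can be sketched as below. The disclosure does not name the transform, so a naive DCT-II and its inverse are swapped in purely as a stand-in; real codecs typically use an overlapped transform with windowing, which is not shown. The function names and the zero-padded representation of the two spectra are assumptions.

```python
import math


def dct2(x):
    """Naive unnormalised DCT-II (stand-in forward transform)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (2 * t + 1) / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def idct2(coeffs):
    """Inverse of dct2 above (scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] + 2.0 * sum(
            coeffs[k] * math.cos(math.pi * k * (2 * t + 1) / (2 * n))
            for k in range(1, n)
        )) / n
        for t in range(n)
    ]


def reconstruct(decoded_spec, predicted_spec):
    """Add the decoded spectrum and the predicted high-band spectrum bin by bin,
    then transform the sum back to the time domain."""
    if len(decoded_spec) != len(predicted_spec):
        raise ValueError("both spectra must cover the same number of bins")
    combined = [a + b for a, b in zip(decoded_spec, predicted_spec)]
    return idct2(combined)


# Hypothetical check: split a known spectrum into a "decoded" low band and a
# "predicted" high band (each zero outside its band) and recombine.
x = [1.0, 2.0, 3.0, 4.0]
c = dct2(x)
low = c[:2] + [0.0, 0.0]
high = [0.0, 0.0] + c[2:]
y = reconstruct(low, high)  # approximately equal to x
```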
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • FIG. 8 is a schematic structural diagram of a communication device provided by an embodiment of the present disclosure. As shown in FIG. 8, the device may include:
  • a transceiver module, configured to receive the bit stream sent by the encoding device and decode the bit stream to obtain a decoded audio frequency domain signal;
  • a processing module, configured to, in response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. In response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting frequency band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
  • the device is also used for:
  • the starting frequency point and the highest frequency point of the preset bandwidth extension band are determined based on the encoding rate of the encoding device and the frequency band range required for encoding of the audio signal.
  • the frequency points in the predetermined frequency band range or the predetermined frequency point range are lower than the highest frequency point with bit allocation.
  • the processing module is also used to:
  • n is a positive integer or a positive fraction.
  • the copy method of the n copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal includes:
  • the processing module is also used to:
  • the copying method of the m copies or h copies of the spectrum signal within a predetermined frequency band range or a predetermined frequency point range in the audio frequency domain signal includes:
  • the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal is mirrored multiple times to obtain m copies or h copies of the spectrum signal within the predetermined frequency band range or the predetermined frequency point range in the audio frequency domain signal.
  • The same method is used in different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
  • the device is also used for:
  • Frequency domain envelope correction is performed on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension frequency band.
  • the device is also used for any of the following:
  • correcting, based on the frequency domain envelope value of the spectrum signal between a first frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and an intermediate frequency point, the intermediate frequency point lying midway between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; and correcting, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a second frequency point, the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band; wherein the first frequency point is W1-0.5×Wx, W1 represents the highest frequency point with bit allocation, and Wx represents the bandwidth between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the second frequency point is W2+0.5×Wx, and W2 represents the starting frequency point of the preset bandwidth extension band;
  • correcting, based on the frequency domain envelope value of the spectrum signal between a third frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; wherein the third frequency point is W1-Wx;
  • correcting, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a fourth frequency point, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; wherein the fourth frequency point is W2+Wx.
  • the device is also used for:
  • decoding the bit stream to obtain at least one of: the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation; the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point; the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation; and the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point.
  • the device is used for:
  • the frequency band between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension frequency band is filled with noise.
  • the device is also used for:
  • The audio frequency domain signal and the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band are added together and then transformed from the frequency domain to the time domain to obtain the reconstructed audio time domain signal.
  • FIG. 9 is a schematic structural diagram of a communication device 900 provided by an embodiment of the present application.
  • The communication device 900 may be a network device or a terminal device, or a chip, chip system, or processor that supports a network device in implementing the above method, or a chip, chip system, or processor that supports a terminal device in implementing the above method.
  • the device can be used to implement the method described in the above method embodiment. For details, please refer to the description in the above method embodiment.
  • Communication device 900 may include one or more processors 901.
  • the processor 901 may be a general-purpose processor or a special-purpose processor, or the like. For example, it can be a baseband processor or a central processing unit.
  • the baseband processor can be used to process communication protocols and communication data.
  • The central processing unit can be used to control communication devices (such as base stations, baseband chips, terminal equipment, terminal equipment chips, DUs, or CUs), execute computer programs, and process the data of computer programs.
  • the communication device 900 may also include one or more memories 902, on which a computer program 904 may be stored.
  • The processor 901 executes the computer program 904, so that the communication device 900 performs the method described in the above method embodiments.
  • the memory 902 may also store data.
  • the communication device 900 and the memory 902 can be provided separately or integrated together.
  • the communication device 900 may also include a transceiver 905 and an antenna 906.
  • The transceiver 905 may also be called a transceiver unit or a transceiver circuit, and is used to implement transceiver functions.
  • The transceiver 905 may include a receiver and a transmitter.
  • The receiver, which may also be called a receiving circuit, is used to implement the receiving function;
  • the transmitter, which may also be called a transmitting circuit, is used to implement the transmitting function.
  • the communication device 900 may also include one or more interface circuits 907.
  • the interface circuit 907 is used to receive code instructions and transmit them to the processor 901 .
  • the processor 901 executes the code instructions to cause the communication device 900 to perform the method described in the above method embodiment.
  • the processor 901 may include a transceiver for implementing receiving and transmitting functions.
  • the transceiver may be a transceiver circuit, an interface, or an interface circuit.
  • the transceiver circuits, interfaces or interface circuits used to implement the receiving and transmitting functions can be separate or integrated together.
  • the above-mentioned transceiver circuit, interface or interface circuit can be used for reading and writing codes/data, or the above-mentioned transceiver circuit, interface or interface circuit can be used for signal transmission or transfer.
  • the processor 901 may store a computer program 903, and the computer program 903 runs on the processor 901, causing the communication device 900 to perform the method described in the above method embodiment.
  • the computer program 903 may be solidified in the processor 901, in which case the processor 901 may be implemented by hardware.
  • the communication device 900 may include a circuit, and the circuit may implement the functions of sending or receiving or communicating in the foregoing method embodiments.
  • The processor and transceiver described in this application can be implemented in integrated circuits (ICs), analog ICs, radio frequency integrated circuits (RFICs), mixed-signal ICs, application-specific integrated circuits (ASICs), printed circuit boards (PCBs), electronic equipment, etc.
  • The processor and transceiver can also be manufactured using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), n-type metal oxide semiconductor (NMOS), p-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), etc.
  • the communication device described in the above embodiments may be a network device or a terminal device, but the scope of the communication device described in this application is not limited thereto, and the structure of the communication device may not be limited by FIG. 9 .
  • the communication device may be a stand-alone device or may be part of a larger device.
  • the communication device may be:
  • the IC collection may also include storage components for storing data and computer programs;
  • the communication device may be a chip or a chip system
  • the chip shown in Figure 10 includes a processor 1001 and an interface 1002.
  • the number of processors 1001 may be one or more, and the number of interfaces 1002 may be multiple.
  • the chip also includes a memory 1003, which is used to store necessary computer programs and data.
  • This application also provides a readable storage medium on which instructions are stored. When the instructions are executed by a computer, the functions of any of the above method embodiments are implemented.
  • This application also provides a computer program product, which, when executed by a computer, implements the functions of any of the above method embodiments.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer programs.
  • When the computer program is loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • The computer program may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer program may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more available media integrated therein.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., high-density digital video discs (DVDs)), or semiconductor media (e.g., solid state disks (SSDs)), etc.
  • At least one in this application can also be described as one or more, and the plurality can be two, three, four or more, which is not limited by this application.
  • Where technical features are distinguished by "first", "second", "third", "A", "B", "C", "D", and so on,
  • the technical features so described have no order of precedence or order of magnitude.
  • the corresponding relationships shown in each table in this application can be configured or predefined.
  • the values of the information in each table are only examples and can be configured as other values, which are not limited by this application.
  • the corresponding relationships shown in some rows may not be configured.
  • appropriate deformation adjustments can be made based on the above table, such as splitting, merging, etc.
  • the names of the parameters shown in the titles of the above tables may also be other names understandable by the communication device, and the values or expressions of the parameters may also be other values or expressions understandable by the communication device.
  • Other data structures can also be used, such as arrays, queues, containers, stacks, linear lists, pointers, linked lists, trees, graphs, structures, classes, heaps, or hash tables.
  • Predefinition in this application can be understood as definition, pre-definition, storage, pre-storage, pre-negotiation, pre-configuration, solidification, or pre-burning.


Abstract

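An audio signal band extension method, apparatus, device, and storage medium, belonging to the field of communication technologies. The method includes: receiving a bit stream sent by an encoding device, and decoding the bit stream to obtain a decoded audio frequency domain signal; and, in response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of a preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting band, predicting the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. The method avoids the situation in which no spectrum signal exists between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, keeps high-frequency and low-frequency energy balanced within a frame, avoids the mechanical sound caused by spectral holes, and improves the quality of the reconstructed audio.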

Description

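Audio signal band extension method, apparatus, device, and storage medium
Technical Field
The present disclosure relates to the field of communication technologies, and in particular to an audio signal band extension method, apparatus, device, and storage medium.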
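Background
To reduce the resources occupied when transmitting an audio signal, the sending end usually first converts the audio signal from a time domain signal into a frequency domain signal, and then uses an encoding device to compress and encode the frequency domain signal before transmission. After receiving the encoded signal, the receiving end first uses a decoding device to decode it and reconstruct the audio frequency domain signal, and then converts the reconstructed audio frequency domain signal into a time domain signal to obtain the reconstructed audio time domain signal.
In the related art, at low bit rates the limited number of quantization bits is not enough to quantize all of the audio signal to be quantized. The encoding device therefore spends most of the bits on finely quantizing the relatively important low-frequency spectrum signal of the audio signal, so that the quantization parameters of the low-frequency spectrum signal occupy most of the bits, and spends only a few bits on coarsely quantizing and encoding the high-frequency spectrum signal to obtain its frequency domain envelope. The frequency domain envelope of the high-frequency spectrum signal and the quantization parameters of the low-frequency spectrum signal are then sent to the decoding device as a bit stream. When decoding, the decoding device first demultiplexes the received bit stream to obtain the quantization parameters of the low-frequency spectrum signal and the frequency domain envelope of the high-frequency spectrum signal, then recovers the low-frequency spectrum signal from the decoded quantization parameters, and finally applies a band extension technique, based on the decoded quantization parameters of the low-frequency spectrum signal, to obtain the high-frequency spectrum signal above the starting frequency point of the preset bandwidth extension band.
As can be seen from the above, decoding an audio signal involves the following concepts: the bandwidth extension band (the extended high band, that is, the band from the starting frequency point of the preset bandwidth extension band to the highest frequency point of the preset bandwidth extension band); a frequency point with bit allocation (a frequency point corresponding to the encoded low-frequency spectrum signal); and the highest frequency point with bit allocation, which is the highest frequency point of the encoded low-frequency spectrum signal. In other words, no low-frequency spectrum signal is decoded above the highest frequency point with bit allocation. The band above the highest frequency point with bit allocation may be called the high band, and the band below it may be called the low band. The bandwidth extension band and the highest frequency point with bit allocation can be arranged in two ways. Figures 1a-1b show the distribution relationship between the bandwidth extension band and the highest frequency point with bit allocation provided by an embodiment of the present disclosure. As shown in Figure 1a, the starting frequency point of the bandwidth extension band may be higher than the highest frequency point with bit allocation. As shown in Figure 1b, the starting frequency point of the bandwidth extension band may be lower than the highest frequency point with bit allocation.
In the case of Figure 1a, because the decoding method in the related art predicts only the high-frequency spectrum signal corresponding to the bandwidth extension band, no spectrum signal exists in the region between the highest frequency point with bit allocation and the starting frequency point of the bandwidth extension band. This leaves the high-frequency and low-frequency energy within a frame unbalanced, which causes the technical problem of a mechanical sound due to spectral holes and lowers the quality of the reconstructed audio.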
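Summary
The audio signal band extension method, apparatus, device, and storage medium proposed in the present disclosure address the technical problems of the related art: unbalanced high-frequency and low-frequency energy within a frame, a mechanical sound caused by spectral holes, and low reconstructed audio quality.
In a first aspect, an embodiment of the present disclosure provides an audio signal band extension method, performed by a decoding device, including:
receiving a bit stream sent by an encoding device, and decoding the bit stream to obtain a decoded audio frequency domain signal;
in response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of a preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting band, predicting the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
The present disclosure provides an audio signal band extension method in which the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. When the highest frequency point with bit allocation in the audio frequency domain signal is lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation is lower than the preset bandwidth extension starting band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. In that case the prediction covers the whole range from the highest frequency point with bit allocation to the highest frequency point of the preset bandwidth extension band, rather than only the bandwidth extension band. A predicted spectrum signal therefore exists between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, which avoids the situation in which no spectrum signal exists in that region, keeps high-frequency and low-frequency energy balanced within a frame, avoids the mechanical sound caused by spectral holes, and improves the quality of the reconstructed audio.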
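Optionally, in an embodiment of the present disclosure, the method further includes:
determining the starting frequency point and the highest frequency point of the preset bandwidth extension band based on the encoding rate of the encoding device and the frequency band range of the audio signal to be encoded.
Optionally, in an embodiment of the present disclosure, all frequency points in the predetermined frequency band range or predetermined frequency point range are lower than the highest frequency point with bit allocation.
Optionally, in an embodiment of the present disclosure, predicting the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal includes:
taking the highest frequency point with bit allocation as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, and using n successive copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, where n is a positive integer or a positive fraction.
Optionally, in an embodiment of the present disclosure, the n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained by:
sequentially and repeatedly copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal to obtain the n copies; or
mirror-copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal multiple times to obtain the n copies.
Optionally, in an embodiment of the present disclosure, predicting the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal includes:
taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, and using m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band, where m is a positive integer or a positive fraction;
taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point with bit allocation as the starting point, and using h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, where h is a positive integer or a positive fraction.
Optionally, in an embodiment of the present disclosure, the m copies or h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained by:
sequentially and repeatedly copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal to obtain the m copies or h copies; or
mirror-copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal multiple times to obtain the m copies or h copies.
Optionally, in an embodiment of the present disclosure, the same method is used in different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.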
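Optionally, in an embodiment of the present disclosure, the method further includes:
performing frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
Optionally, in an embodiment of the present disclosure, performing frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band includes at least one of the following:
correcting, based on the frequency domain envelope value of the spectrum signal between a first frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and an intermediate frequency point, the intermediate frequency point lying midway between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; and correcting, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a second frequency point, the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band; where the first frequency point is W1-0.5×Wx, W1 denotes the highest frequency point with bit allocation, and Wx denotes the bandwidth between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the second frequency point is W2+0.5×Wx, and W2 denotes the starting frequency point of the preset bandwidth extension band;
correcting, based on the frequency domain envelope value of the spectrum signal between a third frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; where the third frequency point is W1-Wx;
correcting, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a fourth frequency point, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; where the fourth frequency point is W2+Wx.
Optionally, in an embodiment of the present disclosure, the method further includes:
decoding the bit stream to obtain at least one of: the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation; the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point; the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation; and the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point.
Optionally, in an embodiment of the present disclosure, the method further includes:
performing noise filling on the band between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
Optionally, in an embodiment of the present disclosure, the method further includes:
adding together the audio frequency domain signal and the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, and then transforming the result from the frequency domain to the time domain to obtain the reconstructed audio time domain signal.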
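In a second aspect, an embodiment of the present disclosure provides a communication apparatus configured in a decoding device, including:
a transceiver module, configured to receive a bit stream sent by an encoding device and decode the bit stream to obtain a decoded audio frequency domain signal;
a processing module, configured to, in response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of a preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting band, predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
In a third aspect, an embodiment of the present disclosure provides a communication apparatus including a processor; when the processor calls a computer program in a memory, the method of the first aspect is performed.
In a fourth aspect, an embodiment of the present disclosure provides a communication apparatus including a processor and a memory, the memory storing a computer program; the processor executes the computer program stored in the memory so that the communication apparatus performs the method of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a communication apparatus including a processor and an interface circuit, the interface circuit being configured to receive code instructions and transmit them to the processor, and the processor being configured to run the code instructions so that the apparatus performs the method of the first aspect.
In a sixth aspect, an embodiment of the present disclosure provides a communication system including the communication apparatus of the second aspect, or the communication apparatus of the third aspect, or the communication apparatus of the fourth aspect, or the communication apparatus of the fifth aspect.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing instructions used by the above network device; when the instructions are executed, the terminal device is caused to perform the method of the first aspect.
In a seventh aspect, the present disclosure further provides a computer program product including a computer program which, when run on a computer, causes the computer to perform the method of the first aspect.
In an eighth aspect, the present disclosure provides a chip system including at least one processor and an interface, configured to support a network device in implementing the functions involved in the method of the first aspect, for example, determining or processing at least one of the data and information involved in the above method. In a possible design, the chip system further includes a memory configured to store the computer programs and data necessary for the source secondary node. The chip system may consist of a chip, or may include a chip and other discrete devices.
In a ninth aspect, the present disclosure provides a computer program which, when run on a computer, causes the computer to perform the method of the first aspect.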
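Brief Description of the Drawings
The above and/or additional aspects and advantages of the present disclosure will become apparent and easy to understand from the following description of the embodiments with reference to the accompanying drawings, in which:
Figures 1a-1b show the distribution relationship between the bandwidth extension band and the highest frequency point with bit allocation provided by an embodiment of the present disclosure;
Figure 1c is a schematic architecture diagram of a communication system provided by an embodiment of the present disclosure;
Figure 2 is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure;
Figure 3a is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure;
Figure 3b is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal;
Figure 3c is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal;
Figure 4a is a schematic flowchart of an audio signal band extension method provided by yet another embodiment of the present disclosure;
Figure 4b is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal;
Figure 4c is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal;
Figure 5 is a schematic flowchart of an audio signal band extension method provided by yet another embodiment of the present disclosure;
Figure 6 is a schematic flowchart of an audio signal band extension method provided by yet another embodiment of the present disclosure;
Figure 7 is a schematic flowchart of an audio signal band extension method provided by yet another embodiment of the present disclosure;
Figure 8 is a schematic structural diagram of a communication apparatus provided by an embodiment of the present disclosure;
Figure 9 is a schematic structural diagram of a communication apparatus provided by an embodiment of the present disclosure;
Figure 10 is a schematic structural diagram of a chip provided by an embodiment of the present disclosure.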
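Detailed Description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of the present disclosure as detailed in the appended claims.
The terms used in the embodiments of the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the embodiments of the present disclosure. The singular forms "a" and "the" used in the embodiments of the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in the embodiments of the present disclosure to describe various kinds of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the embodiments of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used here may be interpreted as "when", "upon", or "in response to determining".
Embodiments of the present disclosure are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements throughout. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present disclosure; they are not to be construed as limiting it.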
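For ease of understanding, the terms used in this application are introduced first.
1. Frequency band
A frequency band is a range of frequencies, or the width of a spectrum; a frequency point is a single frequency within a frequency band.
To better understand the audio signal band extension method disclosed in the embodiments of the present disclosure, the communication system to which the embodiments apply is described first.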
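Referring to Figure 1c, Figure 1c is a schematic architecture diagram of a communication system provided by an embodiment of the present disclosure. The communication system may include, but is not limited to, one encoding device and one decoding device, each of which may be a network device or a terminal device. The number and form of the devices shown in Figure 1c are only examples and do not limit the embodiments of the present disclosure; a practical application may include two or more encoding devices and two or more decoding devices. The communication system shown in Figure 1c takes as an example one encoding device 11 and one decoding device 12, with the encoding device 11 being a network device and the decoding device 12 being a terminal device.
It should be noted that the technical solutions of the embodiments of the present disclosure can be applied to various communication systems, for example, a long term evolution (LTE) system, a fifth generation (5G) mobile communication system, a 5G new radio (NR) system, or other future mobile communication systems.
The network device in the embodiments of the present disclosure is an entity on the network side for transmitting or receiving signals. For example, the network device 11 may be an evolved NodeB (eNB), a transmission reception point (TRP), a next generation NodeB (gNB) in an NR system, a base station in another future mobile communication system, or an access node in a wireless fidelity (WiFi) system. The embodiments of the present disclosure do not limit the specific technology or device form used by the network device. The network device provided by the embodiments of the present disclosure may be composed of a central unit (CU) and distributed units (DUs), where the CU may also be called a control unit. With the CU-DU structure, the protocol layers of a network device such as a base station can be split: the functions of some protocol layers are placed in the CU for centralized control, the functions of some or all of the remaining protocol layers are distributed in the DUs, and the CU centrally controls the DUs.
The terminal device in the embodiments of the present disclosure is an entity on the user side for receiving or transmitting signals, such as a mobile phone. The terminal device may also be called a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on. The terminal device may be a car with communication capability, a smart car, a mobile phone, a wearable device, a tablet computer (Pad), a computer with wireless transceiver capability, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal device in industrial control, a wireless terminal device in self-driving, a wireless terminal device in remote medical surgery, a wireless terminal device in a smart grid, a wireless terminal device in transportation safety, a wireless terminal device in a smart city, a wireless terminal device in a smart home, and so on. The embodiments of the present disclosure do not limit the specific technology or device form used by the terminal device.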
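Figure 2 is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 2, the audio signal band extension method may include the following steps:
Step 201: receive the bit stream sent by the encoding device, and decode the bit stream to obtain a decoded audio frequency domain signal.
In an embodiment of the present disclosure, step 201 is performed in a manner similar to the prior art and is not described again here.
As can be seen from the Background, the audio frequency domain signal obtained by decoding in this step is specifically the low-frequency spectrum signal of the audio signal, that is, the spectrum signal corresponding to the band below the highest frequency point with bit allocation in Figures 1a-1b.
It should be noted that the "spectrum signal" mentioned in the embodiments of the present disclosure may be a frequency band signal or a frequency point signal.
Step 202: in response to the highest frequency point with bit allocation in the audio frequency domain signal being lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation in the audio frequency domain signal being lower than the preset bandwidth extension starting band, predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal.
In an embodiment of the present disclosure, the starting frequency point and the highest frequency point of the preset bandwidth extension band may be predetermined by the decoding device based on the encoding rate of the encoding device (that is, the total number of bits) and the frequency band range of the audio signal to be encoded. Specifically, the higher the encoding rate, the higher the starting frequency point of the bandwidth extension band can be set. For example, for a super-wideband signal, the preset starting frequency point of the bandwidth extension band may be 6.4 kHz at an encoding rate of 24 kbps, and 8 kHz at an encoding rate of 32 kbps. The highest frequency point of the bandwidth extension band is the highest point of the band required for the output signal, or some specified frequency point: for a wideband signal it may be 7 kHz or 8 kHz, and for a super-wideband signal it may be 14 kHz, 16 kHz, or another preset frequency point.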
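As a minimal sketch of the rate-dependent start point and the trigger condition of step 202, the snippet below uses only the two super-wideband rate points quoted above (24 kbps and 32 kbps). The table name, the function names, and the choice to reject unlisted rates are illustrative assumptions, not part of the disclosure.

```python
# Only these two super-wideband entries come from the text; everything else
# here (names, error handling) is a hypothetical illustration.
SWB_BWE_START_HZ = {24_000: 6_400.0, 32_000: 8_000.0}


def bwe_start_hz(bitrate_bps):
    """Return the preset bandwidth-extension start frequency for a listed rate."""
    try:
        return SWB_BWE_START_HZ[bitrate_bps]
    except KeyError:
        # Assumption: rates not listed in the text are rejected rather than guessed.
        raise ValueError(f"no preset start frequency for {bitrate_bps} bps") from None


def needs_gap_prediction(highest_allocated_hz, bitrate_bps):
    """True when the highest bit-allocated frequency point lies below the BWE start.

    This is the Figure 1a case, in which a gap would otherwise be left
    between the decoded low band and the bandwidth extension band.
    """
    return highest_allocated_hz < bwe_start_hz(bitrate_bps)
```

For instance, a frame whose highest bit-allocated point is 5.6 kHz at 24 kbps falls below the 6.4 kHz start point, so the whole range up to the highest frequency point of the extension band would be predicted.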
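In an embodiment of the present disclosure, all frequency points in the predetermined frequency band range or predetermined frequency point range are lower than the highest frequency point with bit allocation. Referring to Figures 1a and 1b, the predetermined frequency band range is the black portion of Figures 1a and 1b, and every frequency point in it is lower than the highest frequency point with bit allocation.
Further, in an embodiment of the present disclosure, the predetermined frequency band range or predetermined frequency point range may be determined based on the signal type and encoding rate of the audio signal. Specifically, at a lower encoding rate, for a harmonic signal, the band or frequency point range of the lower-frequency part of the low-frequency spectrum signal, which is encoded relatively well, may be selected as the predetermined frequency band range or predetermined frequency point range; for a non-harmonic signal, the band or frequency point range of the higher-frequency part of the low-frequency spectrum signal, which is encoded relatively poorly, may be selected; at a higher encoding rate, for a harmonic signal, a somewhat higher band or frequency point of the low-frequency spectrum signal may be selected as the predetermined frequency band range or predetermined frequency point range.
In addition, as can be seen from step 202, the present disclosure predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, rather than only the spectrum signal of the bandwidth extension band. A predicted spectrum signal therefore exists between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, which avoids the situation in which no spectrum signal exists in that region, keeps high-frequency and low-frequency energy balanced within a frame, avoids the mechanical sound caused by spectral holes, and improves the quality of the reconstructed audio. How the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band is described in detail in the subsequent embodiments.
In summary, in the audio signal band extension method provided by the present disclosure, the decoding device receives the bit stream sent by the encoding device and decodes it to obtain a decoded audio frequency domain signal. When the highest frequency point with bit allocation in the audio frequency domain signal is lower than the starting frequency point of the preset bandwidth extension band, or the frequency band with bit allocation is lower than the preset bandwidth extension starting band, the decoding device predicts the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band, based on the spectrum signal within a predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. In that case the prediction covers the whole range from the highest frequency point with bit allocation to the highest frequency point of the preset bandwidth extension band, rather than only the bandwidth extension band. A predicted spectrum signal therefore exists between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band, which avoids the situation in which no spectrum signal exists in that region, keeps high-frequency and low-frequency energy balanced within a frame, avoids the mechanical sound caused by spectral holes, and improves the quality of the reconstructed audio.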
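Figure 3a is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 3a, the audio signal band extension method may include the following steps:
Step 301: taking the highest frequency point with bit allocation as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, use n successive copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band.
In an embodiment of the present disclosure, n is a positive integer or a positive fraction. n may be the ratio of the number of frequency points between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band to the number of frequency points in the predetermined frequency band range or predetermined frequency point range.
In an embodiment of the present disclosure, the n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained in either of the following ways:
First, sequentially and repeatedly copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal to obtain the n copies.
That is, each of the n copies is copied in the same direction (for example, from high frequency to low frequency, or from low frequency to high frequency).
For example, Figure 3b is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in Figure 3b, starting from the highest frequency point with bit allocation, 4 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by sequential repeated copying, are used as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. Each copy is copied in the low-frequency-to-high-frequency direction.
Second, mirror-copying (also called fold-copying) the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal multiple times to obtain the n copies.
That is, adjacent copies among the n copies are copied in different directions. For example, the i-th copy is copied from high frequency to low frequency and the (i+1)-th copy from low frequency to high frequency; or the i-th copy is copied from low frequency to high frequency and the (i+1)-th copy from high frequency to low frequency, where i = 1, 2, 3, ..., n.
For example, Figure 3c is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with n copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in Figure 3c, starting from the highest frequency point with bit allocation, 4 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by mirror copying, are used as the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. The first copy is copied in the low-frequency-to-high-frequency direction, the second in the high-frequency-to-low-frequency direction, the third in the low-frequency-to-high-frequency direction, and the fourth in the high-frequency-to-low-frequency direction.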
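The two copy modes can be sketched as follows for the case where filling starts at the highest frequency point with bit allocation and proceeds upward, as in Figures 3b and 3c. The function name `fill_by_copy` and the list-of-coefficients representation are assumptions made for illustration; truncating the last copy when n is fractional is also an assumption, since the text only states that n may be a positive fraction.

```python
def fill_by_copy(source, length, mirror=False):
    """Fill `length` bins with consecutive copies of `source`.

    mirror=False: every copy runs low-to-high (sequential repeated copy).
    mirror=True:  the 2nd, 4th, ... copies are reversed (mirror copy).
    Here n = length / len(source); a fractional n truncates the last copy
    (an assumption of this sketch).
    """
    if not source:
        raise ValueError("source range must contain at least one bin")
    out = []
    copy_index = 0
    while len(out) < length:
        block = list(source)
        if mirror and copy_index % 2 == 1:
            block.reverse()
        # Take only as many bins as are still missing.
        out.extend(block[: length - len(out)])
        copy_index += 1
    return out
```

With a three-bin source `[1, 2, 3]` and seven bins to fill (n = 7/3), the sequential mode yields `[1, 2, 3, 1, 2, 3, 1]` and the mirror mode yields `[1, 2, 3, 3, 2, 1, 1]`. Mirror copying keeps the spectrum continuous at each copy boundary, at the cost of reversing the spectral order in every other copy.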
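It should be noted that, in an embodiment of the present disclosure, the same method is used in different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. For example, all frames may use the method of the embodiment corresponding to Figure 3a to predict this spectrum signal, which keeps the spectrum signal consistent from frame to frame, ensures the continuity of the audio signal between frames, and safeguards the quality of the reconstructed audio.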
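Figure 4a is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 4a, the audio signal band extension method may include the following steps:
Step 401: taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point of the preset bandwidth extension band as the starting point, use m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band.
In an embodiment of the present disclosure, m is a positive integer or a positive fraction. m may be the ratio of the number of frequency points between the starting frequency point and the highest frequency point of the preset bandwidth extension band to the number of frequency points in the predetermined frequency band range or predetermined frequency point range.
In an embodiment of the present disclosure, the m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained in either of the following ways:
First, sequentially and repeatedly copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal to obtain the m copies.
Second, mirror-copying (also called fold-copying) the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal multiple times to obtain the m copies.
Step 402: taking the starting frequency point of the preset bandwidth extension band as the starting point, or taking the highest frequency point with bit allocation as the starting point, use h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal as the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
In an embodiment of the present disclosure, h is a positive integer or a positive fraction. h may be the ratio of the number of frequency points between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band to the number of frequency points in the predetermined frequency band range or predetermined frequency point range.
In an embodiment of the present disclosure, the h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained in either of the following ways:
First, sequentially and repeatedly copying the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal to obtain the h copies.
Second, mirror-copying (also called fold-copying) the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal multiple times to obtain the h copies.
For details of steps 401 to 402, refer to the description of the above embodiments.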
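Further, it should be noted that, in an embodiment of the present disclosure, the m copies and the h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal are obtained in the same way: both may be obtained by the first way (sequential repeated copying), or both may be obtained by the second way (multiple mirror copying).
Optionally, in an embodiment of the present disclosure, when sequential copying is used, suppose the band between the starting frequency point and the highest frequency point of the preset bandwidth extension band and the band between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band are filled in the same direction; for example, the former is filled starting from the starting frequency point of the preset bandwidth extension band and the latter is filled starting from the highest frequency point with bit allocation. Then the copy direction of the m copies should be the same as the copy direction of the h copies. For example, the m copies may be copied from high frequency to low frequency, and the h copies may also be copied from high frequency to low frequency.
Optionally, in another embodiment of the present disclosure, when sequential copying is used, suppose the two bands are filled in different directions; for example, the band between the starting frequency point and the highest frequency point of the preset bandwidth extension band is filled starting from the starting frequency point of the preset bandwidth extension band, and the band between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band is also filled starting from the starting frequency point of the preset bandwidth extension band. Then the copy direction of the m copies should be opposite to the copy direction of the h copies. For example, the m copies may be copied from high frequency to low frequency, and the h copies may be copied from low frequency to high frequency.
For example, Figure 4b is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with m copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in Figure 4b, for the band from the starting frequency point of the preset bandwidth extension band to the highest frequency point of the preset bandwidth extension band, starting from the starting frequency point of the preset bandwidth extension band, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by sequential repeated copying, are used as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band; each copy is copied from low frequency to high frequency.
Correspondingly, as shown in Figure 4b, for the band from the highest frequency point with bit allocation to the starting frequency point of the preset bandwidth extension band, starting from the starting frequency point of the preset bandwidth extension band, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by sequential repeated copying, are used as the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; each copy is copied from high frequency to low frequency.
As a further example, Figure 4c is a schematic diagram, provided by an embodiment of the present disclosure, of filling the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band with h copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range of the audio frequency domain signal. As shown in Figure 4c, for the band from the starting frequency point of the preset bandwidth extension band to the highest frequency point of the preset bandwidth extension band, starting from the starting frequency point of the preset bandwidth extension band, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by mirror copying, are used as the spectrum signal between the starting frequency point and the highest frequency point of the preset bandwidth extension band; the first copy is copied from low frequency to high frequency, and the second copy is copied from high frequency to low frequency.
Correspondingly, as shown in Figure 4c, for the band from the highest frequency point with bit allocation to the starting frequency point of the preset bandwidth extension band, starting from the starting frequency point of the preset bandwidth extension band, 2 copies of the spectrum signal within the predetermined frequency band range or predetermined frequency point range, obtained by mirror copying, are used as the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the first copy is copied from low frequency to high frequency, and the second copy is copied from high frequency to low frequency.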
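The two-segment filling of the Figure 4b case can be sketched as below, with both segments anchored at the starting frequency point of the bandwidth extension band. Bin indices stand in for frequency points: `w1` is the first bin without bit allocation, `w2` the first bin of the extension band, and `w_max` one past its last bin. These names, and the reading of "copied from high frequency to low frequency" as writing the source from its top bin downward, are assumptions of this sketch rather than definitions from the disclosure.

```python
def fill_up(source, length):
    """Sequential copies written upward from the segment's lowest bin."""
    return [source[k % len(source)] for k in range(length)]


def fill_down(source, length):
    """Sequential copies written downward from the segment's highest bin.

    The source is read from its top bin down; the result is returned in
    ascending-frequency order.
    """
    written = [source[len(source) - 1 - (k % len(source))] for k in range(length)]
    written.reverse()
    return written


def extend_two_segments(spectrum, src_lo, src_hi, w1, w2, w_max):
    """Fill [w2, w_max) with m copies and the gap [w1, w2) with h copies."""
    if not (0 <= src_lo < src_hi <= w1 <= w2 <= w_max) or len(spectrum) < w1:
        raise ValueError("inconsistent band boundaries")
    source = spectrum[src_lo:src_hi]      # predetermined range, all below w1
    out = list(spectrum[:w1]) + [0.0] * (w_max - w1)
    out[w2:w_max] = fill_up(source, w_max - w2)   # extension band, upward from w2
    out[w1:w2] = fill_down(source, w2 - w1)       # gap, downward from w2
    return out
```

Because both segments are laid out from the same anchor, any truncated (fractional) copy ends up at the outer edge of each segment rather than at the boundary between them.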
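It should be noted that, in an embodiment of the present disclosure, the same method is used in different frames to predict the spectrum signal between the highest frequency point with bit allocation and the highest frequency point of the preset bandwidth extension band. For example, all frames may use the method of the embodiment corresponding to Figure 4a to predict this spectrum signal, which keeps the spectrum signal consistent from frame to frame, ensures the continuity of the audio signal between frames, and safeguards the quality of the reconstructed audio.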
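Figure 5 is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Figure 5, the audio signal band extension method may include the following steps:
Step 501: perform frequency domain envelope correction on the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
In an embodiment of the present disclosure, the frequency domain envelope correction of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band may be performed in any of the following ways:
First, correct, based on the frequency domain envelope value of the spectrum signal between a first frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and an intermediate frequency point, the intermediate frequency point lying midway between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; and correct, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a second frequency point, the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band.
Specifically, in an embodiment of the present disclosure, the first frequency point is W1-0.5×Wx, where W1 denotes the highest frequency point with bit allocation and Wx denotes the bandwidth between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band; the second frequency point is W2+0.5×Wx, where W2 denotes the starting frequency point of the preset bandwidth extension band.
In an embodiment of the present disclosure, correcting the envelope between the highest frequency point with bit allocation and the intermediate frequency point based on the envelope between the first frequency point and the highest frequency point with bit allocation may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the intermediate frequency point equal to the frequency domain envelope value of the spectrum signal between the first frequency point and the highest frequency point with bit allocation; or making the variation trend of the former equal to the variation trend of the latter.
Likewise, correcting the envelope between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band based on the envelope between the starting frequency point of the preset bandwidth extension band and the second frequency point may include: making the frequency domain envelope value of the spectrum signal between the intermediate frequency point and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the second frequency point; or making the variation trend of the former equal to the variation trend of the latter.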
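A minimal sketch of the first correction mode follows, under two loudly stated assumptions: the "frequency domain envelope value" of a band is taken to be its RMS magnitude, and it is measured on the spectrum itself, whereas the disclosure obtains the reference envelope values by decoding the bit stream. Bin indices `w1` and `w2` stand in for W1 and W2, and all function names are hypothetical.

```python
import math


def band_rms(band):
    """RMS magnitude of a band; used here as a stand-in envelope value."""
    return math.sqrt(sum(x * x for x in band) / len(band)) if band else 0.0


def scale_to_envelope(spectrum, lo, hi, target):
    """Scale spectrum[lo:hi] in place so that its RMS equals `target`."""
    current = band_rms(spectrum[lo:hi])
    if current > 0.0:
        gain = target / current
        for k in range(lo, hi):
            spectrum[k] *= gain


def correct_gap_split(spectrum, w1, w2):
    """Match each half of the gap [w1, w2) to the adjacent reference band."""
    wx = w2 - w1
    half = wx // 2
    if half < 1 or w1 - half < 0 or w2 + half > len(spectrum):
        raise ValueError("gap too narrow or reference bands out of range")
    first = w1 - half     # first frequency point,  W1 - 0.5*Wx
    second = w2 + half    # second frequency point, W2 + 0.5*Wx
    mid = w1 + half       # intermediate frequency point
    out = [float(x) for x in spectrum]
    scale_to_envelope(out, w1, mid, band_rms(out[first:w1]))
    scale_to_envelope(out, mid, w2, band_rms(out[w2:second]))
    return out
```

Splitting the gap at its midpoint lets the lower half follow the decoded low band and the upper half follow the extension band, so the envelope is continuous with its neighbour on each side.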
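Second, correct, based on the frequency domain envelope value of the spectrum signal between a third frequency point and the highest frequency point with bit allocation, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
The third frequency point may be W1-Wx.
This correction may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the third frequency point and the highest frequency point with bit allocation; or making the variation trend of the former equal to the variation trend of the latter.
In addition, the frequency domain envelope values of the bands or frequency points near the starting frequency point of the preset bandwidth extension band may be corrected based on the frequency domain envelope value at that starting frequency point, so that the envelope values of bands or frequency points below the starting frequency point of the preset bandwidth extension band remain continuous with the envelope value at that starting frequency point.
Third, correct, based on the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and a fourth frequency point, the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band.
The fourth frequency point is W2+Wx.
This correction may specifically include: making the frequency domain envelope value of the spectrum signal between the highest frequency point with bit allocation and the starting frequency point of the preset bandwidth extension band equal to the frequency domain envelope value of the spectrum signal between the starting frequency point of the preset bandwidth extension band and the fourth frequency point; or making the variation trend of the former equal to the variation trend of the latter.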
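The second and third correction modes differ only in which neighbouring band of width Wx serves as the reference, which the sketch below makes explicit. As before, treating the envelope value as the RMS magnitude of a band and measuring it on the spectrum (rather than decoding it from the bit stream) are assumptions, and the function and parameter names are hypothetical.

```python
import math


def band_rms(band):
    """RMS magnitude of a band; used here as a stand-in envelope value."""
    return math.sqrt(sum(x * x for x in band) / len(band)) if band else 0.0


def correct_gap_single(spectrum, w1, w2, reference="lower"):
    """Scale the whole gap [w1, w2) to the envelope of one reference band.

    reference="lower": band from the third frequency point W1 - Wx up to W1.
    reference="upper": band from W2 up to the fourth frequency point W2 + Wx.
    """
    wx = w2 - w1
    if wx < 1:
        raise ValueError("empty gap")
    if reference == "lower":
        lo, hi = w1 - wx, w1
    elif reference == "upper":
        lo, hi = w2, w2 + wx
    else:
        raise ValueError("reference must be 'lower' or 'upper'")
    if lo < 0 or hi > len(spectrum):
        raise ValueError("reference band out of range")
    out = [float(x) for x in spectrum]
    target = band_rms(out[lo:hi])
    current = band_rms(out[w1:w2])
    if current > 0.0:
        gain = target / current
        for k in range(w1, w2):
            out[k] *= gain
    return out
```

The lower reference ties the gap to the decoded low band, while the upper reference ties it to the extension band.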
此外,还可以基于有比特分配的最高频点的频域包络值来修正有比特分配的最高频点附近的频带或者频点的频域包络值,以此来保证大于有比特分配的最高频点的频带或者频点的频域包络值与有比特分配的最高频点的频域包络值保持连续。
其中,需要说明的是,上述的第一频点至有比特分配的最高频点之间的频谱信号的频域包络值、预设的带宽扩展频带的起始频点至第二频点之间的频谱信号的频域包络值、第三频点至有比特分配的最高频点之间的频谱信号的频域包络值、预设的带宽扩展频带的起始频点至第四频点之间的频谱信号的频域包络值均可以是解码设备通过对其接收到的比特流解码以得到的。
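As can be seen from the above, in the present disclosure, after the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band has been filled in, frequency-domain envelope correction is further performed on the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band. This ensures the continuity of the frequency-domain envelope values within that spectral signal, the continuity between the envelope values of bands or frequency bins below the start frequency bin of the preset bandwidth extension band and the envelope value at that start frequency bin, and the continuity between the envelope values of bands or frequency bins above the highest frequency bin with bit allocation and the envelope value at that bin. The subsequently reconstructed audio signal is therefore continuous, which resolves the mechanical-sounding artifacts caused by spectral holes and safeguards the quality of the reconstructed audio.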
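In summary, in the audio signal band extension method provided by the present disclosure, the decoding device receives the bitstream sent by the encoding device and decodes it to obtain a decoded audio frequency-domain signal. In response to the highest frequency bin with bit allocation of the audio frequency-domain signal being lower than the start frequency bin of the preset bandwidth extension band, or the band with bit allocation of the audio frequency-domain signal being lower than the preset bandwidth extension start band, the decoding device predicts, based on the spectral signal within a predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band. In other words, in either case the prediction covers the spectral signal from the highest frequency bin with bit allocation up to the highest frequency bin of the preset bandwidth extension band, rather than only the spectral signal of the bandwidth extension band. A predicted spectral signal therefore exists between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band, which avoids the situation in which no spectral signal exists between those two bins. This keeps the high- and low-frequency energy balanced within a frame, avoids the mechanical-sounding artifacts caused by spectral holes, and improves the quality of the reconstructed audio.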
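The trigger condition summarized above can be sketched as a small helper. This is only an illustration under stated assumptions: frequency bins are integer indices, and the function name `prediction_range` and its `None` return value for the case where the condition does not hold are hypothetical, since this section does not describe that branch.

```python
from typing import Optional, Tuple


def prediction_range(highest_allocated_bin: int,
                     bwe_start_bin: int,
                     bwe_highest_bin: int) -> Optional[Tuple[int, int]]:
    """Return the (low, high) bin range to predict, or None if no gap exists.

    When the highest bin with bit allocation lies below the start bin of the
    preset bandwidth extension band, the predicted range begins at the highest
    allocated bin, so the gap below the extension band is covered as well.
    """
    if highest_allocated_bin < bwe_start_bin:
        return (highest_allocated_bin, bwe_highest_bin)
    return None  # branch not described in this section (assumption)
```

For example, with the highest allocated bin at 200, an extension band starting at 256 and ending at 320, the range to predict is 200 to 320 rather than 256 to 320.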
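Fig. 6 is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Fig. 6, the audio signal band extension method may include the following step:
Step 601: perform noise filling on the band between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.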
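Step 601 does not say how the noise is generated, so the following is only a sketch under assumptions: bins in the target range that are still exactly zero receive uniform random values bounded by a caller-supplied `level`, and bins that already carry a value are left alone. The function name `noise_fill`, the `level` parameter, and the fixed seed are all hypothetical.

```python
import random
from typing import List


def noise_fill(spectrum: List[float], lo: int, hi: int,
               level: float, seed: int = 0) -> List[float]:
    """Return a copy of spectrum with empty bins in [lo, hi) filled with noise."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = list(spectrum)
    for k in range(lo, hi):
        if out[k] == 0.0:  # only fill bins that carry no coefficient
            out[k] = rng.uniform(-level, level)
    return out
```

A real decoder would more likely derive the noise level from decoded envelope or energy information; that choice is outside what this section specifies.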
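Fig. 7 is a schematic flowchart of an audio signal band extension method provided by an embodiment of the present disclosure, applied to a decoding device. As shown in Fig. 7, the audio signal band extension method may include the following step:
Step 701: add and combine the audio frequency-domain signal and the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band, then transform the result from the frequency domain to the time domain to obtain the reconstructed audio time-domain signal.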
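Step 701 can be sketched as an element-wise addition followed by an inverse transform. The source does not name the transform, so the naive inverse DFT below is purely a stand-in chosen for illustration; the function name `reconstruct_time_signal` and the assumption that both spectra have the same length (the predicted one being zero outside the predicted range) are likewise hypothetical.

```python
import cmath
from typing import List, Sequence


def reconstruct_time_signal(decoded: Sequence[complex],
                            predicted: Sequence[complex]) -> List[float]:
    """Add two equal-length spectra and return the real part of their inverse DFT."""
    if len(decoded) != len(predicted):
        raise ValueError("spectra must have the same number of bins")
    combined = [d + p for d, p in zip(decoded, predicted)]
    n = len(combined)
    # Naive O(n^2) inverse DFT: x[t] = (1/n) * sum_k X[k] * exp(j*2*pi*k*t/n)
    return [
        sum(combined[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]
```

Because the decoded spectrum is empty above the highest bin with bit allocation and the predicted spectrum is empty below it, the addition simply merges the two ranges into one full-band spectrum before the transform.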
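Fig. 8 is a schematic structural diagram of a communication apparatus provided by an embodiment of the present disclosure. As shown in Fig. 8, the apparatus may include:
a transceiver module, configured to receive a bitstream sent by an encoding device and decode the bitstream to obtain a decoded audio frequency-domain signal;
a processing module, configured to, in response to the highest frequency bin with bit allocation of the audio frequency-domain signal being lower than the start frequency bin of a preset bandwidth extension band, or the band with bit allocation of the audio frequency-domain signal being lower than a preset bandwidth extension start band, predict, based on the spectral signal within a predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
In summary, in the communication apparatus provided by the embodiments of the present disclosure, the decoding device receives the bitstream sent by the encoding device and decodes it to obtain a decoded audio frequency-domain signal. In response to the highest frequency bin with bit allocation of the audio frequency-domain signal being lower than the start frequency bin of the preset bandwidth extension band, or the band with bit allocation of the audio frequency-domain signal being lower than the preset bandwidth extension start band, the decoding device predicts, based on the spectral signal within a predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band. In other words, in either case the prediction covers the spectral signal from the highest frequency bin with bit allocation up to the highest frequency bin of the preset bandwidth extension band, rather than only the spectral signal of the bandwidth extension band. A predicted spectral signal therefore exists between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band, which avoids the situation in which no spectral signal exists between those two bins. This keeps the high- and low-frequency energy balanced within a frame, avoids the mechanical-sounding artifacts caused by spectral holes, and improves the quality of the reconstructed audio.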
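Optionally, in an embodiment of the present disclosure, the apparatus is further configured to:
determine the start frequency bin and the highest frequency bin of the preset bandwidth extension band based on the encoding rate of the encoding device and the band range of the audio signal that needs to be encoded.
Optionally, in an embodiment of the present disclosure, all frequency bins in the predetermined band range or predetermined frequency-bin range are lower than the highest frequency bin with bit allocation.
Optionally, in an embodiment of the present disclosure, the processing module is further configured to:
taking the highest frequency bin with bit allocation as the starting point, or taking the highest frequency bin of the preset bandwidth extension band as the starting point, successively use n copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band, where n is a positive integer or a positive fraction.
Optionally, in an embodiment of the present disclosure, the n copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal are obtained by:
repeatedly copying, in sequence, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the n copies; or
mirror-copying, multiple times, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the n copies.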
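The two copy modes can be sketched as follows, filling upward from the highest bin with bit allocation. Several details are assumptions rather than statements from the source: `w1` is taken as the first bin to fill, a fractional n is realized by truncating the last copy, and in mirror mode the copies alternate between reversed and forward order starting with a reversed copy. The names `make_copies` and `predict_high_band` are hypothetical.

```python
from typing import List, Sequence


def make_copies(source: Sequence[float], length: int, mirror: bool = False) -> List[float]:
    """Tile source until `length` bins are produced; n = length / len(source)."""
    if not source:
        raise ValueError("source range must not be empty")
    out: List[float] = []
    copy_index = 0
    while len(out) < length:
        # Mirror mode (assumed): even-numbered copies are reversed, odd ones are not.
        reverse = mirror and copy_index % 2 == 0
        out.extend(reversed(source) if reverse else source)
        copy_index += 1
    return out[:length]  # truncation yields a fractional final copy


def predict_high_band(spectrum: Sequence[float], src_lo: int, src_hi: int,
                      w1: int, w_top: int, mirror: bool = False) -> List[float]:
    """Return a copy of spectrum with bins [w1, w_top) filled from [src_lo, src_hi)."""
    out = list(spectrum)
    out[w1:w_top] = make_copies(spectrum[src_lo:src_hi], w_top - w1, mirror)
    return out
```

Starting from the highest bin of the bandwidth extension band and filling downward, which the text also allows, is not shown.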
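Optionally, in an embodiment of the present disclosure, the processing module is further configured to:
taking the start frequency bin of the preset bandwidth extension band as the starting point, or taking the highest frequency bin of the preset bandwidth extension band as the starting point, copy m copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the start frequency bin of the preset bandwidth extension band and the highest frequency bin of the preset bandwidth extension band, where m is a positive integer or a positive fraction;
taking the start frequency bin of the preset bandwidth extension band as the starting point, or taking the highest frequency bin with bit allocation as the starting point, copy h copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band, where h is a positive integer or a positive fraction.
Optionally, in an embodiment of the present disclosure, the m copies or h copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal are obtained by:
repeatedly copying, in sequence, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the m copies or h copies; or
mirror-copying, multiple times, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the m copies or h copies.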
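The split into m copies for the bandwidth extension band and h copies for the gap below it can be sketched as two independent fills. The sketch assumes sequential (non-mirrored) copying, fills the extension band upward from its start bin, and fills the gap downward from that same start bin by aligning the end of the tiled source with it; the source does not spell out the orientation of a downward fill, so that alignment is an assumption. The names `repeat_to_length` and `predict_gap_and_bwe` are hypothetical.

```python
from typing import List, Sequence


def repeat_to_length(source: Sequence[float], length: int,
                     align_end: bool = False) -> List[float]:
    """Repeat source to exactly `length` bins, truncating at the end or the start."""
    if not source:
        raise ValueError("source range must not be empty")
    reps = -(-length // len(source))  # ceiling division
    tiled = list(source) * reps
    return tiled[len(tiled) - length:] if align_end else tiled[:length]


def predict_gap_and_bwe(spectrum: Sequence[float], src_lo: int, src_hi: int,
                        w1: int, w2: int, w_top: int) -> List[float]:
    """Fill the gap [w1, w2) with h copies and the band [w2, w_top) with m copies."""
    source = spectrum[src_lo:src_hi]
    out = list(spectrum)
    out[w2:w_top] = repeat_to_length(source, w_top - w2)              # m copies, upward from w2
    out[w1:w2] = repeat_to_length(source, w2 - w1, align_end=True)    # h copies, ending at w2
    return out
```

With a 4-bin source, a 2-bin gap and a 7-bin extension band, this corresponds to h = 0.5 and m = 1.75, both of which are positive fractions as the text permits.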
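Optionally, in an embodiment of the present disclosure, different frames use the same method to predict the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
Optionally, in an embodiment of the present disclosure, the apparatus is further configured to:
perform frequency-domain envelope correction on the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band.
Optionally, in an embodiment of the present disclosure, the apparatus is further configured to perform any one of the following:
based on the frequency-domain envelope values of the spectral signal between a first frequency bin and the highest frequency bin with bit allocation, correct the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and a midpoint frequency bin lying midway between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; and, based on the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and a second frequency bin, correct the frequency-domain envelope values of the spectral signal between the midpoint frequency bin and the start frequency bin of the preset bandwidth extension band; where the first frequency bin is W1-0.5×Wx, W1 denotes the highest frequency bin with bit allocation, and Wx denotes the bandwidth of the band between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; the second frequency bin is W2+0.5×Wx, and W2 denotes the start frequency bin of the preset bandwidth extension band;
based on the frequency-domain envelope values of the spectral signal between a third frequency bin and the highest frequency bin with bit allocation, correct the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; where the third frequency bin is W1-Wx;
based on the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and a fourth frequency bin, correct the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; where the fourth frequency bin is W2+Wx.
Optionally, in an embodiment of the present disclosure, the apparatus is further configured to:
obtain, by decoding the bitstream, at least one of: the frequency-domain envelope values of the spectral signal between the first frequency bin and the highest frequency bin with bit allocation, the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and the second frequency bin, the frequency-domain envelope values of the spectral signal between the third frequency bin and the highest frequency bin with bit allocation, and the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and the fourth frequency bin.
Optionally, in an embodiment of the present disclosure, the apparatus is further configured to:
perform noise filling on the band between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
Optionally, in an embodiment of the present disclosure, the apparatus is further configured to:
add and combine the audio frequency-domain signal and the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band, then transform the result from the frequency domain to the time domain to obtain the reconstructed audio time-domain signal.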
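Refer to Fig. 9, which is a schematic structural diagram of a communication apparatus 900 provided by an embodiment of the present application. The communication apparatus 900 may be a network device or a terminal device; it may be a chip, chip system, or processor that supports a network device in implementing the above methods; or it may be a chip, chip system, or processor that supports a terminal device in implementing the above methods. The apparatus may be used to implement the methods described in the above method embodiments; for details, refer to the descriptions in those embodiments.
The communication apparatus 900 may include one or more processors 901. The processor 901 may be a general-purpose processor, a special-purpose processor, or the like, for example a baseband processor or a central processing unit. The baseband processor may be used to process communication protocols and communication data; the central processing unit may be used to control the communication apparatus (for example, a base station, a baseband chip, a terminal device, a terminal device chip, a DU, or a CU), execute computer programs, and process the data of computer programs.
Optionally, the communication apparatus 900 may further include one or more memories 902 on which a computer program 904 may be stored; the processor 901 executes the computer program 904 so that the communication apparatus 900 performs the methods described in the above method embodiments. Optionally, the memory 902 may also store data. The communication apparatus 900 and the memory 902 may be provided separately or integrated together.
Optionally, the communication apparatus 900 may further include a transceiver 905 and an antenna 906. The transceiver 905 may be referred to as a transceiver unit, a transceiver machine, a transceiver circuit, or the like, and is used to implement transmitting and receiving functions. The transceiver 905 may include a receiver and a transmitter; the receiver may be referred to as a receiving machine, a receiving circuit, or the like, and implements the receiving function; the transmitter may be referred to as a transmitting machine, a transmitting circuit, or the like, and implements the transmitting function.
Optionally, the communication apparatus 900 may further include one or more interface circuits 907. The interface circuit 907 is used to receive code instructions and transmit them to the processor 901. The processor 901 runs the code instructions so that the communication apparatus 900 performs the methods described in the above method embodiments.
In one implementation, the processor 901 may include a transceiver for implementing the receiving and transmitting functions. For example, the transceiver may be a transceiver circuit, an interface, or an interface circuit. The transceiver circuit, interface, or interface circuit used to implement the receiving and transmitting functions may be separate or integrated together. The transceiver circuit, interface, or interface circuit may be used for reading and writing code or data, or for transmitting or transferring signals.
In one implementation, the processor 901 may store a computer program 903; when the computer program 903 runs on the processor 901, it can cause the communication apparatus 900 to perform the methods described in the above method embodiments. The computer program 903 may be hard-coded in the processor 901, in which case the processor 901 may be implemented in hardware.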
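In one implementation, the communication apparatus 900 may include a circuit that can implement the sending, receiving, or communication functions of the foregoing method embodiments. The processor and transceiver described in the present application may be implemented on an integrated circuit (IC), an analog IC, a radio-frequency integrated circuit (RFIC), a mixed-signal IC, an application-specific integrated circuit (ASIC), a printed circuit board (PCB), an electronic device, and so on. The processor and transceiver may also be manufactured using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), and gallium arsenide (GaAs).
The communication apparatus described in the above embodiments may be a network device or a terminal device, but the scope of the communication apparatus described in the present application is not limited thereto, and the structure of the communication apparatus is not limited by Fig. 9. The communication apparatus may be a standalone device or part of a larger device. For example, the communication apparatus may be:
(1) a standalone integrated circuit (IC), a chip, or a chip system or subsystem;
(2) a set of one or more ICs, which optionally may also include a storage component for storing data and computer programs;
(3) an ASIC, such as a modem;
(4) a module that can be embedded in other devices;
(5) a receiver, terminal device, smart terminal device, cellular phone, wireless device, handset, mobile unit, vehicle-mounted device, network device, cloud device, artificial intelligence device, and so on;
(6) others.
For the case in which the communication apparatus is a chip or chip system, refer to the schematic structural diagram of the chip shown in Fig. 10. The chip shown in Fig. 10 includes a processor 1001 and an interface 1002. There may be one or more processors 1001, and there may be multiple interfaces 1002.
Optionally, the chip further includes a memory 1003, which is used to store the necessary computer programs and data.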
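Those skilled in the art will also appreciate that the various illustrative logical blocks and steps listed in the embodiments of the present application may be implemented by electronic hardware, computer software, or a combination of both. Whether such functions are implemented in hardware or software depends on the specific application and the design requirements of the overall system. Those skilled in the art may use various methods to implement the described functions for each specific application, but such implementations should not be understood as going beyond the scope of protection of the embodiments of the present application.
The present application further provides a readable storage medium on which instructions are stored; when the instructions are executed by a computer, the functions of any of the above method embodiments are implemented.
The present application further provides a computer program product; when the computer program product is executed by a computer, the functions of any of the above method embodiments are implemented.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs. When the computer program is loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer program may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer program may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a high-density digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
Those of ordinary skill in the art will understand that the ordinal terms such as "first" and "second" used in the present application are merely distinctions made for ease of description; they are not intended to limit the scope of the embodiments of the present application, nor do they indicate an order.
In the present application, "at least one" may also be described as "one or more", and "multiple" may be two, three, four, or more; the present application imposes no limitation. In the embodiments of the present application, for a given type of technical feature, the individual features of that type are distinguished by "first", "second", "third", "A", "B", "C", "D", and so on; the technical features so described have no order of precedence or of magnitude.
The correspondences shown in the tables in the present application may be configured or may be predefined. The values of the information in the tables are merely examples and may be configured as other values; the present application imposes no limitation. When configuring the correspondence between information and parameters, it is not required that all correspondences shown in the tables be configured. For example, the correspondences shown in some rows of a table in the present application may be left unconfigured. As another example, appropriate adjustments such as splitting or merging may be made on the basis of the above tables. The parameter names shown in the table headings may also be replaced by other names that the communication apparatus can understand, and the parameter values or representations may also be replaced by other values or representations that the communication apparatus can understand. When the above tables are implemented, other data structures may also be used, such as arrays, queues, containers, stacks, linear lists, pointers, linked lists, trees, graphs, structures, classes, heaps, or hash tables.
"Predefined" in the present application may be understood as defined, defined in advance, stored, pre-stored, pre-negotiated, pre-configured, hard-coded, or pre-burned.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
The above are merely specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be defined by the scope of protection of the claims.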

Claims (17)

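  1. An audio signal band extension method, characterized in that the method is performed by a decoding device and comprises:
    receiving a bitstream sent by an encoding device, and decoding the bitstream to obtain a decoded audio frequency-domain signal;
    in response to the highest frequency bin with bit allocation of the audio frequency-domain signal being lower than the start frequency bin of a preset bandwidth extension band, or the band with bit allocation of the audio frequency-domain signal being lower than a preset bandwidth extension start band, predicting, based on the spectral signal within a predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
  2. The method according to claim 1, characterized in that the method further comprises:
    determining the start frequency bin and the highest frequency bin of the preset bandwidth extension band based on the encoding rate of the encoding device and the band range of the audio signal that needs to be encoded.
  3. The method according to claim 1, characterized in that all frequency bins in the predetermined band range or predetermined frequency-bin range are lower than the highest frequency bin with bit allocation.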
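  4. The method according to claim 1, characterized in that predicting, based on the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band comprises:
    taking the highest frequency bin with bit allocation as the starting point, or taking the highest frequency bin of the preset bandwidth extension band as the starting point, successively using n copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band, where n is a positive integer or a positive fraction.
  5. The method according to claim 4, characterized in that the n copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal are obtained by:
    repeatedly copying, in sequence, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the n copies; or
    mirror-copying, multiple times, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the n copies.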
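  6. The method according to claim 1, characterized in that predicting, based on the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band comprises:
    taking the start frequency bin of the preset bandwidth extension band as the starting point, or taking the highest frequency bin of the preset bandwidth extension band as the starting point, copying m copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the start frequency bin of the preset bandwidth extension band and the highest frequency bin of the preset bandwidth extension band, where m is a positive integer or a positive fraction;
    taking the start frequency bin of the preset bandwidth extension band as the starting point, or taking the highest frequency bin with bit allocation as the starting point, copying h copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal as the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band, where h is a positive integer or a positive fraction.
  7. The method according to claim 6, characterized in that
    the m copies or h copies of the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal are obtained by:
    repeatedly copying, in sequence, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the m copies or h copies; or
    mirror-copying, multiple times, the spectral signal within the predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal to obtain the m copies or h copies.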
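  8. The method according to any one of claims 1 to 7, characterized in that different frames use the same method to predict the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
  9. The method according to claim 1, characterized in that the method further comprises:
    performing frequency-domain envelope correction on the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band.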
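  10. The method according to claim 9, characterized in that performing frequency-domain envelope correction on the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band comprises at least one of the following:
    based on the frequency-domain envelope values of the spectral signal between a first frequency bin and the highest frequency bin with bit allocation, correcting the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and a midpoint frequency bin lying midway between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; and, based on the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and a second frequency bin, correcting the frequency-domain envelope values of the spectral signal between the midpoint frequency bin and the start frequency bin of the preset bandwidth extension band; where the first frequency bin is W1-0.5×Wx, W1 denotes the highest frequency bin with bit allocation, and Wx denotes the bandwidth of the band between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; the second frequency bin is W2+0.5×Wx, and W2 denotes the start frequency bin of the preset bandwidth extension band;
    based on the frequency-domain envelope values of the spectral signal between a third frequency bin and the highest frequency bin with bit allocation, correcting the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; where the third frequency bin is W1-Wx;
    based on the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and a fourth frequency bin, correcting the frequency-domain envelope values of the spectral signal between the highest frequency bin with bit allocation and the start frequency bin of the preset bandwidth extension band; where the fourth frequency bin is W2+Wx.
  11. The method according to claim 10, characterized in that the method further comprises:
    obtaining, by decoding the bitstream, at least one of: the frequency-domain envelope values of the spectral signal between the first frequency bin and the highest frequency bin with bit allocation, the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and the second frequency bin, the frequency-domain envelope values of the spectral signal between the third frequency bin and the highest frequency bin with bit allocation, and the frequency-domain envelope values of the spectral signal between the start frequency bin of the preset bandwidth extension band and the fourth frequency bin.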
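  12. The method according to claim 1, characterized in that the method further comprises:
    performing noise filling on the band between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
  13. The method according to claim 1, characterized in that the method further comprises:
    adding and combining the audio frequency-domain signal and the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band, then transforming the result from the frequency domain to the time domain to obtain a reconstructed audio time-domain signal.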
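  14. A communication apparatus, characterized in that the apparatus is configured in a decoding device and comprises:
    a transceiver module, configured to receive a bitstream sent by an encoding device and decode the bitstream to obtain a decoded audio frequency-domain signal;
    a processing module, configured to, in response to the highest frequency bin with bit allocation of the audio frequency-domain signal being lower than the start frequency bin of a preset bandwidth extension band, or the band with bit allocation of the audio frequency-domain signal being lower than a preset bandwidth extension start band, predict, based on the spectral signal within a predetermined band range or predetermined frequency-bin range of the audio frequency-domain signal, the spectral signal between the highest frequency bin with bit allocation and the highest frequency bin of the preset bandwidth extension band.
  15. A communication apparatus, characterized in that the apparatus comprises a processor and a memory, wherein the memory stores a computer program, and the processor executes the computer program stored in the memory so that the apparatus performs the method according to any one of claims 1 to 13.
  16. A communication apparatus, characterized by comprising a processor and an interface circuit, wherein the interface circuit is configured to receive code instructions and transmit them to the processor, and the processor is configured to run the code instructions to perform the method according to any one of claims 1 to 13.
  17. A computer-readable storage medium, configured to store instructions which, when executed, cause the method according to any one of claims 1 to 13 to be implemented.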
PCT/CN2022/117110 2022-09-05 2022-09-05 Audio signal band extension method, apparatus, device, and storage medium WO2024050673A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/117110 WO2024050673A1 (zh) 2022-09-05 2022-09-05 Audio signal band extension method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/117110 WO2024050673A1 (zh) 2022-09-05 2022-09-05 Audio signal band extension method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2024050673A1 true WO2024050673A1 (zh) 2024-03-14

Family

ID=90192646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/117110 WO2024050673A1 (zh) 2022-09-05 2022-09-05 Audio signal band extension method, apparatus, device, and storage medium

Country Status (1)

Country Link
WO (1) WO2024050673A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
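CN101083076A (zh) * 2006-06-03 2007-12-05 三星电子株式会社 Method and apparatus for encoding and decoding a signal using bandwidth extension technology
CN103971694A (zh) * 2013-01-29 2014-08-06 华为技术有限公司 Method for predicting a bandwidth extension band signal, and decoding device
CN111210831A (zh) * 2018-11-22 2020-05-29 广州广晟数码技术有限公司 Bandwidth extension audio encoding and decoding method and apparatus based on spectrum stretching
KR20220118158A (ko) * 2021-02-18 2022-08-25 한국전자통신연구원 Method for encoding and decoding an audio signal using frequency band extension, and encoder and decoder performing the method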

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22957647

Country of ref document: EP

Kind code of ref document: A1