EP3613042B1 - Non-harmonic speech detection and bandwidth extension in a multi-source environment - Google Patents

Publication number
EP3613042B1
EP3613042B1 (application EP18724649.1A)
Authority
EP
European Patent Office
Prior art keywords
band
signal
channel
harmonic
flag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP18724649.1A
Other languages
English (en)
French (fr)
Other versions
EP3613042A1 (de)
Inventor
Venkata Subrahmanyam Chandra Sekhar CHEBIYYAM
Venkatraman ATTI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Publication of EP3613042A1 publication Critical patent/EP3613042A1/de
Application granted granted Critical
Publication of EP3613042B1 publication Critical patent/EP3613042B1/de

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G10L21/0388Details of processing therefor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

Definitions

  • the present disclosure is generally related to encoding of an audio signal or decoding of an audio signal.
  • wireless telephones such as mobile and smart phones, tablets and laptop computers that are small, lightweight, and easily carried by users.
  • These devices can communicate voice and data packets over wireless networks.
  • many such devices incorporate additional functionality such as a digital still camera, a digital video camera, a digital recorder, and an audio file player.
  • such devices can process executable instructions, including software applications, such as a web browser application, that can be used to access the Internet. As such, these devices can include significant computing capabilities.
  • a first device may include or be coupled to one or more microphones to receive an audio signal.
  • the first device encodes the received audio signal and sends the encoded audio signal to a second device.
  • the second device may include one or more output devices (e.g., one or more speakers) to produce an output.
  • the second device decodes the encoded audio signal to generate an output signal that is provided to the one or more output devices.
  • an encoder may generate a low-band signal and a high-band signal based on a received audio signal.
  • the received audio signal may be a combination of sounds from multiple sound sources, such as two people talking concurrently.
  • a first sound source may provide a voiced segment (such as the sound of the letter "r") and a second sound source may provide an unvoiced segment (such as the sound "ssss").
  • an energy of the voiced segment may be concentrated in the low-band while an energy of the unvoiced segment is concentrated in the high-band.
  • the low-band is highly voiced because the majority (or all) of the energy of the low-band comes from the voiced segment of the first sound source, and the high-band is highly noisy because the majority (or all) of the energy of the high-band comes from the unvoiced segment of the second sound source.
  • Low-band voicing parameters may be generated based on a low-band signal.
  • the low-band voicing parameters may then be used to generate mixing factors (e.g., gain values that indicate how much of the low-band is noisy, how much of the low-band is harmonics, etc.) that are used to generate a high-band excitation.
  • the harmonic nature of the low-band is extrapolated into the high-band by extending a low-band excitation into the high-band. If the low-band voicing parameters indicate that the low-band is harmonic, the high-band extension will also be harmonic. Alternatively, if the low-band voicing parameters indicate that the low-band is noisy, the high-band extension will also be noisy.
  • the low-band voicing factors may not be reflective of (or indicate) the harmonicity of the high band. Accordingly, in this situation, using the low-band voicing parameters to control generation of the high-band excitation produces an excitation that is not reflective of the high-band.
  • a decoder receives an encoded low-band signal and an encoded high-band signal. To generate an output signal (reflective of an audio signal received by the encoder), the decoder generates a high-band excitation in a manner similar to the encoder. Similar to the problems described above with the encoder, if low-band voicing parameters used at the decoder are not reflective of the high-band (such as when low-band voicing factors indicate that the low-band is highly voiced and the high-band is highly noisy), a high-band excitation generated at the decoder may not match the high-band at the encoder and a play out quality of an output of the decoder may be degraded.
  • an ordinal term (e.g., "first", "second", "third", etc.) used to modify an element (such as a structure, a component, an operation, etc.) does not by itself indicate any priority or order of the element with respect to another element.
  • the term “set” refers to one or more of a particular element
  • the term “plurality” refers to multiple (e.g., two or more) of a particular element.
  • terms such as "determining" may be used to describe how one or more operations are performed. It should be noted that such terms are not to be construed as limiting and other techniques may be utilized to perform similar operations. Additionally, as referred to herein, "generating", "calculating", "estimating", "using", "selecting", "accessing", and "determining" may be used interchangeably. For example, "generating", "calculating", "estimating", or "determining" a parameter (or a signal) may refer to actively generating, estimating, calculating, or determining the parameter (or the signal) or may refer to using, selecting, or accessing the parameter (or signal) that is already generated, such as by another component or device.
  • systems and devices operable to encode multiple audio signals are disclosed.
  • the present disclosure is related to coding (e.g., encoding or decoding) signals in a high-band while a low-band may be either harmonic or non-harmonic.
  • systems, devices, and methods may be configured to detect a harmonicity of a high-band signal and to set a value of a flag that indicates a harmonic metric (e.g., the harmonicity, such as a relative degree of harmonicity) of a high band signal.
  • the systems, devices, and methods may further be configured to use the flag to generate high band signals and to modify the flag (e.g., modify the value of the flag).
  • the flag may be used to determine one or more mixing parameters, noise envelope parameters, gain shape parameters, gain frame parameters, or a combination thereof.
  • the systems, devices, and methods described herein are applicable to mono-coding (e.g., mono-encoding or mono-decoding) and to stereo/multi-channel coding (e.g., stereo/multi-channel encoding, stereo/multi-channel decoding, or both).
  • a device may include an encoder configured to encode the multiple audio signals.
  • the multiple audio signals may be captured concurrently in time using multiple recording devices, e.g., multiple microphones.
  • the multiple audio signals (or multi-channel audio) may be synthetically (e.g., artificially) generated by multiplexing several audio channels that are recorded at the same time or at different times.
  • the concurrent recording or multiplexing of the audio channels may result in a 2-channel configuration (i.e., Stereo: Left and Right), a 5.1 channel configuration (Left, Right, Center, Left Surround, Right Surround, and the low frequency emphasis (LFE) channels), a 7.1 channel configuration, a 7.1+4 channel configuration, a 22.2 channel configuration, or a N-channel configuration.
  • Audio capture devices in teleconference rooms may include multiple microphones that acquire spatial audio.
  • the spatial audio may include speech as well as background audio that is encoded and transmitted.
  • the speech/audio from a given source may arrive at the multiple microphones at different times depending on how the microphones are arranged as well as where the source (e.g., the talker) is located with respect to the microphones and room dimensions.
  • the device may receive a first audio signal via the first microphone and may receive a second audio signal via the second microphone.
  • Mid-side (MS) coding and parametric stereo (PS) coding are stereo coding techniques that may provide improved efficiency over the dual-mono coding techniques.
  • dual-mono coding the Left (L) channel (or signal) and the Right (R) channel (or signal) are independently coded without making use of inter-channel correlation.
  • MS coding reduces the redundancy between a correlated L/R channel-pair by transforming the Left channel and the Right channel to a sum-channel and a difference-channel (e.g., a side channel) prior to coding.
  • in MS coding, the sum signal and the difference signal are waveform coded or coded based on a model; relatively more bits are spent on the sum signal than on the side signal.
  • PS coding reduces redundancy in each sub-band by transforming the L/R signals into a sum signal and a set of side parameters.
  • the side parameters may indicate an inter-channel intensity difference (IID), an inter-channel phase difference (IPD), an inter-channel time difference (ITD), side or residual prediction gains, etc.
  • the sum signal is waveform coded and transmitted along with the side parameters.
  • the side-channel may be waveform coded in the lower bands (e.g., less than 2 kilohertz (kHz)) and PS coded in the upper bands (e.g., greater than or equal to 2 kHz) where the inter-channel phase preservation is perceptually less critical.
  • the PS coding may be used in the lower bands also to reduce the inter-channel redundancy before waveform coding.
  • the MS coding and the PS coding may be done in either the frequency-domain or in the sub-band domain.
  • the Left channel and the Right channel may be uncorrelated.
  • the Left channel and the Right channel may include uncorrelated synthetic signals.
  • the coding efficiency of the MS coding, the PS coding, or both may approach the coding efficiency of the dual-mono coding.
  • the sum channel and the difference channel may contain comparable energies, reducing the coding-gains associated with MS or PS techniques.
  • the reduction in the coding-gains may be based on the amount of temporal (or phase) shift.
  • the comparable energies of the sum signal and the difference signal may limit the usage of MS coding in certain frames where the channels are temporally shifted but are highly correlated.
  • a Mid channel (e.g., a sum channel) and a Side channel (e.g., a difference channel) may be generated as:
  • Formula 1: M = (L + R) / 2
  • Formula 2: S = (L - R) / 2
  • Generating the Mid channel and the Side channel based on Formula 1 or Formula 2 may be referred to as "downmixing”.
  • a reverse process of generating the Left channel and the Right channel from the Mid channel and the Side channel based on Formula 1 or Formula 2 may be referred to as "upmixing".
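As a sketch of the Formula 1/2 downmix and the reverse upmix described above (Python with numpy; function names are illustrative, not from the patent):

```python
import numpy as np

def downmix(left: np.ndarray, right: np.ndarray):
    """Downmix L/R to Mid/Side per Formula 1 and Formula 2."""
    mid = (left + right) / 2.0   # M = (L + R) / 2
    side = (left - right) / 2.0  # S = (L - R) / 2
    return mid, side

def upmix(mid: np.ndarray, side: np.ndarray):
    """Reverse process: recover L/R from Mid/Side."""
    left = mid + side   # L = M + S
    right = mid - side  # R = M - S
    return left, right
```

The two operations are exact inverses, so upmixing a downmixed pair recovers the original channels.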
  • An ad-hoc approach used to choose between MS coding or dual-mono coding for a particular frame may include generating a mid signal and a side signal, calculating energies of the mid signal and the side signal, and determining whether to perform MS coding based on the energies. For example, MS coding may be performed in response to determining that the ratio of energies of the side signal and the mid signal is less than a threshold.
  • a first energy of the mid signal (corresponding to a sum of the left signal and the right signal) may be comparable to a second energy of the side signal (corresponding to a difference between the left signal and the right signal) for voiced speech frames.
  • a higher number of bits may be used to encode the Side channel, thereby reducing coding efficiency of MS coding relative to dual-mono coding.
  • Dual-mono coding may thus be used when the first energy is comparable to the second energy (e.g., when the ratio of the first energy and the second energy is greater than or equal to the threshold).
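The ad-hoc MS versus dual-mono decision above can be sketched as follows (Python; the 0.5 threshold is an illustrative placeholder, not a value from the patent):

```python
import numpy as np

def choose_coding_mode(left: np.ndarray, right: np.ndarray, threshold: float = 0.5) -> str:
    """Ad-hoc per-frame choice between MS coding and dual-mono coding.

    MS coding is chosen when the side/mid energy ratio is below the
    (illustrative) threshold; dual-mono is chosen when the side energy is
    comparable to the mid energy.
    """
    mid = (left + right) / 2.0
    side = (left - right) / 2.0
    e_mid = float(np.sum(mid ** 2))
    e_side = float(np.sum(side ** 2))
    if e_mid > 0.0 and (e_side / e_mid) < threshold:
        return "MS"
    return "dual-mono"
```

Highly correlated channels yield a small side energy and select MS coding; anti-correlated (or temporally shifted) channels yield comparable energies and fall back to dual-mono.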
  • the decision between MS coding and dual-mono coding for a particular frame may be made based on a comparison of a threshold and normalized cross-correlation values of the Left channel and the Right channel.
  • the encoder may determine a mismatch value indicative of an amount of temporal misalignment between the first audio signal and the second audio signal.
  • a “temporal shift value”, a “shift value”, and a “mismatch value” may be used interchangeably.
  • the encoder may determine a temporal shift value indicative of a shift (e.g., the temporal mismatch) of the first audio signal relative to the second audio signal.
  • the temporal mismatch value may correspond to an amount of temporal delay between receipt of the first audio signal at the first microphone and receipt of the second audio signal at the second microphone.
  • the encoder may determine the temporal mismatch value on a frame-by-frame basis, e.g., based on each 20-millisecond (ms) speech/audio frame.
  • the temporal mismatch value may correspond to an amount of time that a second frame of the second audio signal is delayed with respect to a first frame of the first audio signal.
  • the temporal mismatch value may correspond to an amount of time that the first frame of the first audio signal is delayed with respect to the second frame of the second audio signal.
  • frames of the second audio signal may be delayed relative to frames of the first audio signal.
  • the first audio signal may be referred to as the "reference audio signal” or “reference channel” and the delayed second audio signal may be referred to as the "target audio signal” or “target channel”.
  • the second audio signal may be referred to as the reference audio signal or reference channel and the delayed first audio signal may be referred to as the target audio signal or target channel.
  • the reference channel and the target channel may change from one frame to another; similarly, the temporal delay value may also change from one frame to another.
  • the temporal mismatch value may always be positive to indicate an amount of delay of the "target" channel relative to the "reference” channel.
  • the temporal mismatch value may correspond to a "non-causal shift" value by which the delayed target channel is "pulled back" in time such that the target channel is aligned (e.g., maximally aligned) with the "reference” channel.
  • the downmix algorithm to determine the mid channel and the side channel may be performed on the reference channel and the non-causal shifted target channel.
  • the device may perform a framing or a buffering algorithm to generate a frame (e.g., 20 ms samples) at a first sampling rate (e.g., 32 kHz sampling rate (i.e., 640 samples per frame)).
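A minimal framing/buffering sketch matching the numbers in the text (20 ms frames at 32 kHz, i.e., 640 samples per frame; the function name is illustrative):

```python
def frame_signal(samples, sample_rate_hz: int = 32000, frame_ms: int = 20):
    """Split a buffered signal into fixed-length frames.

    At 32 kHz with 20 ms frames, each frame holds 640 samples. Trailing
    samples that do not fill a whole frame are returned as a remainder
    (e.g., to stay in the buffer for the next call).
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # 640 at 32 kHz / 20 ms
    n_frames = len(samples) // frame_len
    frames = [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]
    remainder = samples[n_frames * frame_len:]
    return frames, remainder
```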
  • the encoder may, in response to determining that a first frame of the first audio signal and a second frame of the second audio signal arrive at the same time at the device, estimate a temporal mismatch value (e.g., shift1) as equal to zero samples.
  • a Left channel (e.g., corresponding to the first audio signal) and a Right channel (e.g., corresponding to the second audio signal) may be temporally misaligned for various reasons (e.g., a sound source, such as a talker, may be closer to one of the microphones than to the other, and the two microphones may be more than a threshold distance (e.g., 1-20 centimeters) apart).
  • a location of the sound source relative to the microphones may introduce different delays in the Left channel and the Right channel.
  • a reference channel is initially selected based on the levels or energies of the channels, and subsequently refined based on the temporal mismatch values between different pairs of the channels, e.g., t1(ref, ch2), t2(ref, ch3), t3(ref, ch4), ..., where ch1 is the ref channel initially and t1(.), t2(.), etc. are the functions to estimate the mismatch values. If all temporal mismatch values are positive then ch1 is treated as the reference channel.
  • if any temporal mismatch value is negative, the reference channel is reconfigured to the channel associated with that negative mismatch value, and the above process is continued until the best selection of the reference channel (i.e., the one that maximally decorrelates the maximum number of side channels) is achieved.
  • a hysteresis may be used to overcome any sudden variations in reference channel selection.
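The reference-channel refinement loop described above might be sketched as follows (Python; `pairwise_mismatch` is a hypothetical callback standing in for the t1(.), t2(.) mismatch estimators, returning the delay of a channel relative to the current reference, with a negative value meaning the channel arrives earlier):

```python
def select_reference(channels, pairwise_mismatch):
    """Refine the reference-channel choice using pairwise mismatch values.

    Starting from an initial reference (assumed here to be channel 0, e.g.,
    chosen by level/energy), reassign the reference to any channel with a
    negative mismatch (i.e., one that leads the current reference), and
    repeat until all mismatch values are non-negative.
    """
    ref = 0  # initial choice, e.g., based on channel levels or energies
    changed = True
    while changed:
        changed = False
        for ch in range(len(channels)):
            if ch != ref and pairwise_mismatch(ref, ch) < 0:
                ref = ch  # this channel arrives earlier; make it the reference
                changed = True
                break
    return ref
```

In practice a hysteresis (not shown) would damp sudden frame-to-frame changes of the selected reference.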
  • a time of arrival of audio signals at the microphones from multiple sound sources may vary when the multiple talkers are alternately talking (e.g., without overlap).
  • the encoder may dynamically adjust a temporal mismatch value based on the talker to identify the reference channel.
  • the multiple talkers may be talking at the same time, which may result in varying temporal mismatch values depending on who is the loudest talker, closest to the microphone, etc.
  • identification of reference and target channels may be based on the varying temporal shift values in the current frame and the estimated temporal mismatch values in the previous frames, and based on the energy or temporal evolution of the first and second audio signals.
  • the first audio signal and second audio signal may be synthesized or artificially generated when the two signals potentially show less (e.g., no) correlation. It should be understood that the examples described herein are illustrative and may be instructive in determining a relationship between the first audio signal and the second audio signal in similar or different situations.
  • the encoder may generate comparison values (e.g., difference values or cross-correlation values) based on a comparison of a first frame of the first audio signal and a plurality of frames of the second audio signal. Each frame of the plurality of frames may correspond to a particular temporal mismatch value.
  • the encoder may generate a first estimated temporal mismatch value based on the comparison values. For example, the first estimated temporal mismatch value may correspond to a comparison value indicating a higher temporal-similarity (or lower difference) between the first frame of the first audio signal and a corresponding first frame of the second audio signal.
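The comparison-value search above can be sketched as a cross-correlation over candidate shifts (Python; a simplified exhaustive search, not the codec's actual multi-stage estimator):

```python
import numpy as np

def estimate_shift(ref_frame: np.ndarray, target_frame: np.ndarray, max_shift: int = 32) -> int:
    """Return the candidate temporal mismatch value with the best comparison value.

    Each candidate shift corresponds to one temporal mismatch value; the
    comparison value here is a plain cross-correlation, and the estimate is
    the shift with the highest similarity to the reference frame.
    """
    best_shift, best_corr = 0, -np.inf
    for shift in range(-max_shift, max_shift + 1):
        shifted = np.roll(target_frame, -shift)  # undo a delay of `shift` samples
        corr = float(np.dot(ref_frame, shifted))
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift
```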
  • the encoder may determine a final temporal mismatch value by refining, in multiple stages, a series of estimated temporal mismatch values. For example, the encoder may first estimate a "tentative" temporal mismatch value based on comparison values generated from stereo pre-processed and re-sampled versions of the first audio signal and the second audio signal. The encoder may generate interpolated comparison values associated with temporal mismatch values proximate to the estimated "tentative" temporal mismatch value. The encoder may determine a second estimated "interpolated" temporal mismatch value based on the interpolated comparison values.
  • the second estimated “interpolated” temporal mismatch value may correspond to a particular interpolated comparison value that indicates a higher temporal-similarity (or lower difference) than the remaining interpolated comparison values and the first estimated “tentative” temporal mismatch value. If the second estimated “interpolated” temporal mismatch value of the current frame (e.g., the first frame of the first audio signal) is different than a final temporal mismatch value of a previous frame (e.g., a frame of the first audio signal that precedes the first frame), then the "interpolated” temporal mismatch value of the current frame is further “amended” to improve the temporal-similarity between the first audio signal and the shifted second audio signal.
  • a third estimated "amended" temporal mismatch value, corresponding to a more accurate measure of temporal-similarity, may be obtained by searching around the second estimated "interpolated" temporal mismatch value of the current frame and the final estimated temporal mismatch value of the previous frame.
  • the third estimated “amended” temporal mismatch value is further conditioned to estimate the final temporal mismatch value by limiting any spurious changes in the temporal mismatch value between frames and further controlled to not switch from a negative temporal mismatch value to a positive temporal mismatch value (or vice versa) in two successive (or consecutive) frames as described herein.
  • the encoder may refrain from switching between a positive temporal mismatch value and a negative temporal mismatch value or vice-versa in consecutive frames or in adjacent frames. For example, the encoder may set the final temporal mismatch value to a particular value (e.g., 0) indicating no temporal-shift based on the estimated "interpolated” or “amended” temporal mismatch value of the first frame and a corresponding estimated “interpolated” or “amended” or final temporal mismatch value in a particular frame that precedes the first frame.
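The conditioning step that forbids a sign flip between successive frames might be sketched as (Python; a simplified guard, not the exact codec logic):

```python
def condition_mismatch(current_estimate: int, previous_final: int) -> int:
    """Guard against spurious sign flips between consecutive frames.

    If the "interpolated"/"amended" estimate for the current frame would
    switch from a positive to a negative mismatch (or vice versa) relative
    to the previous frame's final value, force the final value to 0,
    indicating no temporal shift.
    """
    if current_estimate * previous_final < 0:  # opposite signs in successive frames
        return 0
    return current_estimate
```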
  • the encoder may select a frame of the first audio signal or the second audio signal as a "reference” or "target” based on the temporal mismatch value. For example, in response to determining that the final temporal mismatch value is positive, the encoder may generate a reference channel or signal indicator having a first value (e.g., 0) indicating that the first audio signal is a "reference” signal and that the second audio signal is the "target” signal. Alternatively, in response to determining that the final temporal mismatch value is negative, the encoder may generate the reference channel or signal indicator having a second value (e.g., 1) indicating that the second audio signal is the "reference” signal and that the first audio signal is the "target” signal.
  • the encoder may estimate a relative gain (e.g., a relative gain parameter) associated with the reference signal and the non-causal shifted target signal. For example, in response to determining that the final temporal mismatch value is positive, the encoder may estimate a gain value to normalize or equalize the amplitude or power levels of the first audio signal relative to the second audio signal that is offset by the non-causal temporal mismatch value (e.g., an absolute value of the final temporal mismatch value). Alternatively, in response to determining that the final temporal mismatch value is negative, the encoder may estimate a gain value to normalize or equalize the power or amplitude levels of the non-causal shifted first audio signal relative to the second audio signal.
  • the encoder may estimate a gain value to normalize or equalize the amplitude or power levels of the "reference" signal relative to the non-causal shifted "target” signal. In other examples, the encoder may estimate the gain value (e.g., a relative gain value) based on the reference signal relative to the target signal (e.g., the unshifted target signal).
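The reference/target selection from the sign of the final mismatch value, plus a level-equalizing relative gain, might be sketched as follows (Python; the energy-ratio gain is an illustrative normalization, not necessarily the codec's):

```python
import numpy as np

def reference_indicator_and_gain(first: np.ndarray, second: np.ndarray, final_mismatch: int):
    """Select reference/target by the mismatch sign and estimate a relative gain.

    Indicator 0 means the first signal is the "reference" (positive mismatch);
    indicator 1 means the second signal is the "reference" (negative mismatch).
    The gain equalizes the power of the reference and the non-causally
    shifted target (illustrative energy-ratio choice).
    """
    if final_mismatch >= 0:
        indicator, ref, target = 0, first, second
    else:
        indicator, ref, target = 1, second, first
    shift = abs(final_mismatch)             # non-causal shift amount
    shifted_target = target[shift:]         # "pull back" the target in time
    ref_trim = ref[:len(shifted_target)]
    e_target = float(np.sum(shifted_target ** 2))
    gain = float(np.sqrt(np.sum(ref_trim ** 2) / e_target)) if e_target > 0 else 1.0
    return indicator, gain
```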
  • the encoder may generate at least one encoded signal (e.g., a mid signal, a side signal, or both) based on the reference signal, the target signal, the non-causal temporal mismatch value, and the relative gain parameter.
  • the encoder may generate at least one encoded signal (e.g., a mid channel, a side channel, or both) based on the reference channel and the temporal-mismatch adjusted target channel.
  • the side signal may correspond to a difference between first samples of the first frame of the first audio signal and selected samples of a selected frame of the second audio signal.
  • the encoder may select the selected frame based on the final temporal mismatch value.
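Generating the mid and side signals from the reference and the mismatch-adjusted target, as described above, might look like (Python; a sketch under the Formula 1/2 downmix, with an illustrative gain scaling):

```python
import numpy as np

def encode_mid_side(ref: np.ndarray, target: np.ndarray, final_mismatch: int, gain: float = 1.0):
    """Generate mid/side from the reference and the mismatch-adjusted target.

    The side signal is the difference between reference samples and the
    selected (non-causally shifted, gain-scaled) target samples.
    """
    shift = abs(final_mismatch)
    adjusted = gain * target[shift:]  # non-causal shift: pull the target back in time
    ref = ref[:len(adjusted)]
    mid = (ref + adjusted) / 2.0      # Formula 1 on the aligned pair
    side = (ref - adjusted) / 2.0     # Formula 2 on the aligned pair
    return mid, side
```

When the shift exactly compensates the delay between the channels, the side signal collapses toward zero, which is what makes MS coding efficient.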
  • a transmitter of the device may transmit the at least one encoded signal, the non-causal temporal mismatch value, the relative gain parameter, the reference channel or signal indicator, or a combination thereof.
  • the encoder may generate at least one encoded signal (e.g., a mid signal, a side signal, or both) based on the reference signal, the target signal, the non-causal temporal mismatch value, the relative gain parameter, low band parameters of a particular frame of the first audio signal, high band parameters of the particular frame, or a combination thereof.
  • the particular frame may precede the first frame.
  • Certain low band parameters, high band parameters, or a combination thereof, from one or more preceding frames may be used to encode a mid signal, a side signal, or both, of the first frame.
  • Encoding the mid signal, the side signal, or both, based on the low band parameters, the high band parameters, or a combination thereof, may improve estimates of the non-causal temporal mismatch value and inter-channel relative gain parameter.
  • the low band parameters, the high band parameters, or a combination thereof may include a pitch parameter, a voicing parameter, a coder type parameter, a low-band energy parameter, a high-band energy parameter, an envelope parameter (e.g., a tilt parameter), a pitch gain parameter, a FCB gain parameter, a coding mode parameter, a voice activity parameter, a noise estimate parameter, a signal-to-noise ratio parameter, a formants parameter, a speech/music decision parameter, the non-causal shift, the interchannel gain parameter, or a combination thereof.
  • a transmitter of the device may transmit the at least one encoded signal, the non-causal temporal mismatch value, the relative gain parameter, the reference channel (or signal) indicator, or a combination thereof.
  • terms such as “determining”, “calculating”, “estimating”, “shifting”, “adjusting”, etc. may be used to describe how one or more operations are performed. It should be noted that such terms are not to be construed as limiting and other techniques may be utilized to perform similar operations.
  • the encoder includes a down-mixer configured to convert a stereo pair of channels into a mid/side channel pair.
  • a low-band mid channel (a low-band portion of the mid channel) and a low-band side channel are provided to a low-band encoder.
  • the low-band encoder is configured to generate a low-band bit stream.
  • the low-band encoder is configured to generate low-band parameters, such as a low-band excitation, a low-band voicing parameter(s), etc.
  • the low-band excitation and a high-band mid channel are provided to a BWE encoder.
  • the BWE encoder generates a high-band mid channel bitstream and high-band parameters (e.g., LPC, gain frame, gain shape, etc.).
  • the encoder, such as the BWE encoder, is configured to determine a flag value that indicates a harmonicity of a high-band signal, such as the high-band mid signal.
  • the flag value may indicate a harmonicity metric of the high-band signal.
  • the flag value may indicate whether the high-band signal is harmonic or non-harmonic (e.g., noisy).
  • the flag value may indicate whether the high-band signal is strongly harmonic, strongly non-harmonic, or weakly harmonic (e.g., between strongly harmonic and strongly non-harmonic).
  • the flag value may be determined based on one or more low-band parameters, one or more high-band parameters, or a combination thereof.
  • the one or more low-band parameters and the one or more high-band parameters may correspond to a current frame or to a previous frame.
  • the encoder may determine, based on the Low Band (LB) and High Band (HB) parameters, a Non-Harmonic HB flag which indicates whether the HB is non-harmonic or not.
  • parameters that may be used to determine the flag value include a high-band long term energy, a high-band short term energy, a ratio based on the high-band short term energy and the high-band long term energy, a previous frame's high-band gain frame, a current frame's high-band gain frame, low-band voicing parameters, or a combination thereof. Additionally or alternatively, other parameters available to an encoder (or decoder) may be used to determine the flag value (the harmonicity of the high-band signal). In a particular implementation, a value of the flag (for a current frame) is determined based on low band voicing (of the current frame), a previous frame's gain frame, and the high-band mid channel (of the current frame).
  • an estimation or a prediction is made whether the high-band is harmonic (or is non harmonic).
  • One or more techniques may be used to determine a value of the flag (e.g., to determine the harmonic metric). Some techniques may include: If-else logic (Decision Trees) (with or without some smoothing/hysteresis for smoother decisions), Gaussian Mixture Model (GMM) (e.g., based on measures provided by the GMM such as the degree of HB Harmonic and the degree of HB Non-Harmonic), other classification tools (e.g., Support Vector Machines, Neural Networks, etc.), or a combination thereof.
  • a predetermined GMM may be used to determine probabilities of whether the high-band signal is harmonic and non harmonic. For example, a first likelihood that the high-band is harmonic may be determined. Alternatively, a second likelihood that the high-band is non harmonic may be determined. In some implementations, both the first likelihood and the second likelihood are determined. In implementations where the flag can have one of two values (e.g., a first value indicating harmonic and a second value indicating non harmonic), the first likelihood (of the high-band being harmonic) may be compared to a first threshold.
  • If the first likelihood is greater than or equal to the first threshold, the flag indicates that the high-band signal is harmonic; otherwise, the value of the flag indicates that the high-band signal is non harmonic.
  • the second likelihood (of the high-band being non harmonic) may be compared to a second threshold. If the second likelihood is greater than or equal to the second threshold, the flag indicates that the high-band signal is non harmonic; otherwise, the value of the flag indicates that the high-band signal is harmonic.
  • the value of the flag may be set to correspond to the greater of the first likelihood and the second likelihood.
  • the flag can have more than two values (e.g., a first value indicating harmonic, a second value indicating non harmonic, and a third value indicating neither dominant harmonic nor dominant non harmonic), if the first likelihood is less than the first threshold and the second likelihood is less than the second threshold, the flag is set to the third value. Additional thresholds may be applied to the first likelihood or the second likelihood to determine additional values of the flag that correspond to additional harmonic metrics. Additional examples of the flag, the value of the flag, and how the value of the flag can impact encoding or decoding operations are described further herein.
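The likelihood-threshold logic described above can be sketched as follows. This is a minimal illustration: the threshold values, the numeric flag encoding, and the function name are assumptions for the example, not values taken from this description.

```python
def harmonicity_flag(p_harmonic, p_non_harmonic,
                     harm_threshold=0.7, non_harm_threshold=0.7):
    """Map classifier likelihoods (e.g., from a GMM) to a three-valued
    harmonicity flag: 2 = strongly harmonic, 0 = strongly non-harmonic,
    1 = neither dominant. Thresholds are illustrative assumptions."""
    if p_harmonic >= harm_threshold:
        return 2          # first likelihood meets the first threshold
    if p_non_harmonic >= non_harm_threshold:
        return 0          # second likelihood meets the second threshold
    return 1              # neither likelihood dominates
```

A decision tree, SVM, or neural network could stand in for the GMM; only the thresholding of its output likelihoods is shown here.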
  • the low band excitation is non-linearly extended (e.g., by applying a non-linearity function) to generate a harmonic high-band excitation.
  • the harmonic high-band excitation can be used to determine a high band excitation, as described further below.
  • One or more high-band parameters may be determined based on the high band excitation.
  • envelope modulated noise is used to generate a noisy component of the high band excitation.
  • the envelope is extracted from (e.g., based on) the harmonic high-band excitation.
  • the envelope modulation is performed by applying a low pass filter on the absolute values of the harmonic high-band excitation.
  • a noise envelope modulator may extract an envelope from the harmonic high band excitation and apply that envelope on random noise (from a random noise generator) so that modulated noise output by the noise envelope modulator has a similar temporal envelope as the high band excitation.
  • the flag (indicating the harmonic metric) is used to control a noise envelope estimation process which estimates the noise envelope to be applied to the random noise by the noise envelope modulator (to generate the modulated noise).
  • noise envelope control parameters may include filter coefficients for the low pass filtering to be performed on the harmonic high band excitation.
  • the noise envelope control parameters indicate that the envelope to be applied to the random noise is to be a slowly varying envelope (e.g., the noise envelope modulator can use a large length of samples such that the noise envelope has a coarse resolution).
  • the noise envelope control parameters indicate that the envelope to be applied to the random noise is to be a fast-varying envelope (e.g., the noise envelope modulator can use a small length of samples such that the noise envelope has a fine resolution).
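The envelope modulation described above can be sketched with a moving-average low-pass filter acting on the absolute values of the harmonic high-band excitation; the filter choice, window lengths, and function name are illustrative assumptions rather than the codec's actual design.

```python
def modulate_noise(harmonic_excitation, noise, window):
    """Extract a temporal envelope from |harmonic_excitation| using a
    causal moving-average low-pass filter of the given window length,
    then apply that envelope to the noise. A large window yields a
    slowly varying (coarse) envelope; a small window yields a
    fast-varying (fine-resolution) envelope."""
    abs_exc = [abs(x) for x in harmonic_excitation]
    envelope = []
    for i in range(len(abs_exc)):
        lo = max(0, i - window + 1)
        seg = abs_exc[lo:i + 1]
        envelope.append(sum(seg) / len(seg))
    # modulated noise follows the temporal envelope of the excitation
    return [e * v for e, v in zip(envelope, noise)]
```

The flag could then select the window length (slowly varying versus fast-varying envelope); which harmonicity state maps to which window is left open here.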
  • mixing parameters (e.g., gain values, such as Gain1 (Encoder) and Gain2 (Encoder)) may be determined based on the flag.
  • the mixing parameters indicate the proportions of the harmonic high-band excitation and the modulated noise that are to be combined to generate the high band excitation.
  • Gain1 + Gain2 = 1.
  • Gain1 may be applied to the harmonic high-band excitation and Gain2 may be applied to the modulated noise.
  • the gain adjusted harmonic high-band excitation and the gain adjusted modulated noise may be combined (e.g., summed) to generate the high band excitation.
  • Gain2 is greater than Gain1.
  • Gain2 is set to one and Gain1 is set to zero.
  • the flag indicates that the high band is non harmonic (e.g., strongly non harmonic)
  • the high-band excitation should reflect a noisy high band.
  • Gain1 may be greater than Gain2. In some implementations, if the flag indicates that the high band is harmonic (e.g., strongly harmonic), Gain1 is set to one and Gain2 is set to zero. Thus, if the flag indicates that the high band is harmonic (e.g., strongly harmonic), the high-band excitation should reflect a harmonic high band.
  • Gain1 may be set to a first value and Gain2 may be set to a second value. In some examples, Gain1 may be greater than or equal to Gain2. In other examples, Gain1 may be less than or equal to Gain2. The value of Gain1 and the value of Gain2 may be determined based on the low band voice factors.
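The gain selection and mixing described above can be sketched as follows, assuming Gain1 + Gain2 = 1. The string flag encoding and the direct use of the low-band voicing factor in the intermediate case are illustrative assumptions.

```python
def mixing_gains(flag, low_band_voicing=0.5):
    """Choose Gain1 (applied to the harmonic high-band excitation) and
    Gain2 (applied to the modulated noise) from the harmonicity flag,
    with Gain1 + Gain2 = 1. Intermediate-case mapping is hypothetical."""
    if flag == "non_harmonic":      # strongly non-harmonic high band
        return 0.0, 1.0             # excitation is all modulated noise
    if flag == "harmonic":          # strongly harmonic high band
        return 1.0, 0.0             # excitation is all harmonic
    gain1 = max(0.0, min(1.0, low_band_voicing))   # weakly harmonic case
    return gain1, 1.0 - gain1

def high_band_excitation(harm_exc, mod_noise, gain1, gain2):
    # sum the gain-adjusted components to form the high-band excitation
    return [gain1 * h + gain2 * m for h, m in zip(harm_exc, mod_noise)]
```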
  • high band gain shapes and high-band gain frames may be determined based at least in part on the high-band excitation.
  • the value of the flag (for the current frame) can be modified to generate a modified flag. For example, if the high-band gain frame (of the current frame) is greater than a threshold, thus indicating that there is non-harmonic content in the high band, the flag may be modified to indicate the high-band is non-harmonic (e.g., strongly non-harmonic).
  • modification of the flag may be based on the pre-quantized high-band gain frame, the quantized high-band gain frame, the quantized or unquantized high-band gain shape, or a combination thereof.
  • the modified flag may be transmitted to the decoder.
  • the unmodified flag is transmitted to the decoder and the decoder may generate a modified version of the flag.
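The gain-frame-based flag modification described above can be sketched in a few lines; the threshold value and the flag encoding are assumptions for the example.

```python
def modify_flag(flag, gain_frame, gain_threshold=2.0):
    """If the high-band gain frame (quantized or pre-quantized) exceeds
    a threshold, indicating non-harmonic content in the high band, force
    the flag to non-harmonic; otherwise pass the flag through unchanged.
    The threshold of 2.0 is an illustrative assumption."""
    if gain_frame > gain_threshold:
        return "non_harmonic"
    return flag
```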
  • the flag may be used for coding the inter channel relationships to be transmitted to the decoder.
  • the flag may be used to determine mixing values (e.g., gains) associated with generation of the ICBWE non-reference channel excitation.
  • the decoder may receive the flag (or the modified flag). In implementations where the decoder receives the flag (and does not receive the modified flag), the decoder may generate a modified flag based on the flag. In some implementations, the decoder does not receive the flag or the modified flag and is configured to generate a modified flag based on one or more parameters, such as the parameters described above with respect to the encoder (and that are available to the decoder), front end stereo scene analysis results, downmix parameters, other parameters, or a combination thereof, as non-limiting, illustrative examples.
  • To generate an output signal (reflective of an audio signal received by the encoder), the decoder generates a high-band excitation in a manner similar to the encoder. To illustrate, based on the received modified flag, the decoder generates a gain adjusted modulated noise and a gain adjusted harmonic high-band excitation that are combined to generate a high-band excitation. Based on the generated excitation, the decoder generates values of the gain frame, the gain shapes, and other parameters. It is noted that, because the flag used at the encoder and the flag used at the decoder may differ in value for a particular frame, the high-band excitation from which the high-band gain frame and the high-band gain shapes are estimated at the encoder may differ from the excitation to which these values are applied at the decoder.
  • the flag may be used for coding the inter channel relationships at the decoder.
  • the flag may be used to determine mixing values (e.g., gains) associated with generation of the ICBWE non-reference channel excitation.
  • problems associated with low-band voicing parameters not reflecting a harmonicity of the high-band may be reduced or eliminated.
  • a high-band excitation generated at the decoder using the flag may better match the high-band at the encoder and a play out quality of an output of the decoder may not be degraded.
  • an encoder may generate a low-band signal and a high-band signal based on a received audio signal.
  • the received audio signal may be a combination of multiple sound sources, such as two people talking concurrently.
  • a first sound source may provide a voiced segment (such as the sound of the letter "r") and a second sound source may provide an unvoiced segment (such as the sound "ssss").
  • an energy of the voiced segment may be concentrated in the low-band while an energy of the unvoiced segment is concentrated in the high-band.
  • the low-band is highly voiced because the majority (or all) of the energy of the low-band comes from the voiced segment of the first sound source, and the high-band is highly noisy because the majority (or all) of the energy of the high-band comes from the unvoiced segment of the second sound source.
  • the flag (or the modified flag) may be used during encoding, decoding, or both, so that the nature of the low-band signal does not negatively impact the high-band excitation (i.e., does not cause the high-band excitation to be unreflective of the high-band).
  • the system 100 includes a first device 104 communicatively coupled, via a network 120, to a second device 106.
  • the network 120 may include one or more wireless networks, one or more wired networks, or a combination thereof.
  • the first device 104 may include a memory 153, an encoder 200, a transmitter 110, and one or more input interfaces 112.
  • the memory 153 may be a non-transitory computer-readable medium that includes instructions 191.
  • the instructions 191 may be executable by the encoder 200 to perform one or more of the operations described herein.
  • a first input interface of the input interfaces 112 may be coupled to a first microphone 146.
  • a second input interface of the input interfaces 112 may be coupled to a second microphone 148.
  • the encoder 200 may include an inter-channel bandwidth extension (ICBWE) encoder 204.
  • the ICBWE encoder 204 may be configured to estimate one or more spectral mapping parameters based on a synthesized non-reference high-band and a non-reference target channel.
  • the first device 104 may also include a flag (e.g., a non harmonic high-band (HB) flag (x) 910) or a modified flag (e.g., a modified non harmonic high-band (HB) flag (y) 920), as described further with reference to FIG. 9 .
  • the first device 104 may not include the modified flag (e.g., the modified non harmonic HB flag (y) 920).
  • the second device 106 may include a decoder 300.
  • the decoder 300 may include an ICBWE decoder 306.
  • the ICBWE decoder 306 may be configured to extract one or more spectral mapping parameters from a received spectral mapping bitstream. Additional details associated with the operations of the ICBWE decoder 306 are described with respect to FIGS. 3 and 6 .
  • the second device 106 may be coupled to a first loudspeaker 142, a second loudspeaker 144, or both. Although not shown, the second device 106 may include other components, such as a processor (e.g., a central processing unit), a microphone, a receiver, a transmitter, an antenna, a memory, etc.
  • the second device 106 may also include the modified flag (e.g., the modified non harmonic HB flag (y) 920), as described further with reference to FIG. 10 .
  • the second device 106 may additionally or alternatively include the flag (e.g., a non harmonic HB flag (x) 910).
  • the first device 104 may receive a first audio channel 130 (e.g., a first audio signal) via the first input interface from the first microphone 146 and may receive a second audio channel 132 (e.g., a second audio signal) via the second input interface from the second microphone 148.
  • the first audio channel 130 may correspond to one of a right channel or a left channel.
  • the second audio channel 132 may correspond to the other of the right channel or the left channel.
  • an audio signal from a sound source 152 (e.g., a user, a speaker, ambient noise, a musical instrument, etc.) may be received at the input interfaces 112 via the first microphone 146 at an earlier time than via the second microphone 148.
  • This natural delay in the multi-channel signal acquisition through the multiple microphones may introduce a temporal misalignment between the first audio channel 130 and the second audio channel 132.
  • the first audio channel 130 may be a "reference channel" and the second audio channel 132 may be a "target channel".
  • the target channel may be adjusted (e.g., temporally shifted) to substantially align with the reference channel.
  • the second audio channel 132 may be the reference channel and the first audio channel 130 may be the target channel.
  • the reference channel and the target channel may vary on a frame-to-frame basis. For example, for a first frame, the first audio channel 130 may be the reference channel and the second audio channel 132 may be the target channel. However, for a second frame (e.g., a subsequent frame), the first audio channel 130 may be the target channel and the second audio channel 132 may be the reference channel.
  • the first audio channel 130 is the reference channel and the second audio channel 132 is the target channel.
  • the reference channel described with respect to the audio channels 130, 132 may be independent from the high-band reference channel indicator that is described below.
  • the high-band reference channel indicator may indicate that a high-band of either of the audio channels 130, 132 is the high-band reference channel.
  • the high-band reference channel indicator may indicate a high-band reference channel which could be either the same channel or a different channel from the reference channel.
  • the encoder 200 may generate a down-mix bitstream 216, an ICBWE bitstream 242, a high-band mid channel bitstream 244, and a low-band bitstream 246.
  • the transmitter 110 may transmit the down-mix bitstream 216, the ICBWE bitstream 242, the high-band mid channel bitstream 244, or a combination thereof, via the network 120, to the second device 106.
  • the transmitter 110 may store the down-mix bitstream 216, the ICBWE bitstream 242, the high-band mid channel bitstream 244, or a combination thereof, at a device of the network 120 or a local device for further processing or decoding later.
  • the decoder 300 may perform decoding operations based on the down-mix bitstream 216, the ICBWE bitstream 242, the high-band mid channel bitstream 244, and the low-band bitstream 246. For example, the decoder 300 may generate a first channel (e.g., a first output channel 126) and a second channel (e.g., a second output channel 128) based on the down-mix bitstream 216, the low-band bitstream 246, the ICBWE bitstream 242, and the high-band mid channel bitstream 244.
  • the second device 106 may output the first output channel 126 via the first loudspeaker 142.
  • the second device 106 may output the second output channel 128 via the second loudspeaker 144.
  • the first output channel 126 and second output channel 128 may be transmitted as a stereo signal pair to a single output loudspeaker.
  • the ICBWE encoder 204 of FIG. 1 may estimate spectral mapping parameters based on a maximum-likelihood measure, or an open-loop or a closed-loop spectral distortion reduction measure such that a spectral shape (e.g., the spectral envelope or spectral tilt) of a spectrally shaped synthesized non-reference high-band channel is substantially similar to a spectral shape (e.g., spectral envelope) of a non-reference target channel.
  • the spectral mapping parameters may be transmitted to the decoder 300 in the ICBWE bitstream 242 and used at the decoder 300 to generate the output signals 126, 128 having reduced artifacts and improved spatial balance between left and right channels.
  • the encoder 200 receives an audio signal, such as the first audio channel 130.
  • the encoder 200 generates a high band signal (not shown) based on the received audio signal (e.g., the first audio channel 130).
  • the encoder 200 determines a first flag value (of the non harmonic HB flag (x) 910) indicating a harmonic metric of the high band signal.
  • the encoder 200 is further configured to generate a high band excitation signal (not shown) at least partially based on the first flag value (e.g., the non harmonic HB flag (x) 910).
  • the high band excitation signal may be used to generate one or more parameters, such as a gain shape parameter, a gain frame parameter, etc.
  • the encoder 200 outputs an encoded version of the high band signal, such as high-band mid channel bitstream 244.
  • the encoder 200 may determine a gain frame parameter corresponding to a frame of a high-band signal and may compare a gain frame parameter to a threshold. In response to the gain frame parameter being greater than the threshold, the encoder 200 can selectively modify the flag (e.g., the non harmonic HB flag (x) 910 that corresponds to the frame and that indicates a harmonic metric of the high band signal) to generate a modified flag (e.g., the modified non harmonic HB flag (y) 920). The encoder 200 may output the modified flag (e.g., the modified non harmonic HB flag (y) 920).
  • the decoder 300 may receive a bitstream corresponding to an encoded version of an audio signal.
  • the bitstream may include or correspond to the high-band mid channel bitstream 244, the low-band bitstream 246, the ICBWE bitstream 242, the down-mix bitstream 216, or a combination thereof.
  • the decoder 300 may generate a high band excitation signal (not shown) based on a low band excitation signal (not shown) and further based on a flag value (e.g., the modified non harmonic HB flag (y) 920) indicating a harmonic metric of a high band signal.
  • the high band signal corresponds to a high band portion of the audio signal, such as a high band portion of the first audio channel 130.
  • the encoder 200 includes a down-mixer 202, the ICBWE encoder 204, a mid channel BWE encoder 206, a low-band encoder 208, and a filterbank 290.
  • a left channel 212 and a right channel 214 may be provided to the down-mixer 202.
  • the left channel 212 and the right channel 214 may be frequency-domain channels (e.g., transform-domain channels).
  • the left channel 212 and the right channel 214 may be time-domain channels.
  • the down-mixer 202 may be configured to down-mix the left channel 212 and the right channel 214 to generate a down-mix bitstream 216, a mid channel 222, and a low-band side channel 224.
  • although the low-band side channel 224 is shown to be estimated, in alternative implementations a full bandwidth side channel may instead be generated and encoded, and a corresponding bitstream may be transmitted to a decoder.
  • the down-mix bitstream 216 may include down-mix parameters (e.g., shift parameters, target gain parameters, reference channel indicator, interchannel level differences, interchannel phase differences, etc.) based on the left channel 212 and the right channel 214.
  • the down-mix bitstream 216 may be transmitted from the encoder 200 to a decoder, such as a decoder 300 of FIG. 3A .
  • the mid channel 222 may represent an entire frequency band of the channels 212, 214, and the low-band side channel 224 may represent a low-band portion of the channels 212, 214.
  • the mid channel 222 may represent the entire frequency band (20 Hz to 16 kHz) of the channels 212, 214 if the channels 212, 214 are super-wideband channels
  • the low-band side channel 224 may represent the low-band portion (e.g., 20 Hz to 8 kHz or 20 Hz to 6.4 kHz) of the channels 212, 214.
  • the mid channel 222 may be provided to the filterbank 290, and the low-band side channel 224 may be provided to the low-band encoder 208.
  • the filterbank 290 may be configured to separate high-frequency components and low-frequency components of the mid channel 222.
  • the filterbank 290 may separate the high-frequency components of the mid channel 222 to generate a high-band mid channel 292
  • the filterbank 290 may separate the low-frequency components of the mid channel 222 to generate a low-band mid channel 294.
  • the high-band mid channel 292 may span from 8 kHz to 16 kHz
  • the low-band mid channel 294 may span from 20 Hz to 8 kHz. It should be appreciated that the coding mode and the frequency ranges described herein are merely for illustrative purposes and should not be construed as limiting.
  • the coding mode may be different (e.g., a wideband coding mode, a full-band coding mode, etc.) and/or the frequency ranges may be different.
  • the down-mixer 202 may be configured to directly provide the low-band mid channel 294 and the high-band mid channel 292. In such implementations, filtering operations at the filterbank 290 may be bypassed.
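The filterbank's band separation can be illustrated with a simple complementary low-pass/high-pass split. A real codec would use a proper filterbank (e.g., a QMF); the moving-average filter and its window length here are illustrative stand-ins for the actual crossover design.

```python
def split_bands(mid_channel, window=8):
    """Split a mid channel into low-band and high-band components using
    a causal moving-average low-pass filter; the high band is the
    complementary residual, so low + high reconstructs the input."""
    low = []
    for i in range(len(mid_channel)):
        lo = max(0, i - window + 1)
        seg = mid_channel[lo:i + 1]
        low.append(sum(seg) / len(seg))     # low-frequency components
    high = [x - l for x, l in zip(mid_channel, low)]  # high-frequency residual
    return low, high
```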
  • the high-band mid channel 292 may be provided to the mid channel BWE encoder 206, and the low-band mid channel 294 may be provided to the low-band encoder 208.
  • the low-band encoder 208 may be configured to encode the low-band mid channel 294 and the low-band side channel 224 to generate a low-band bitstream 246. In some implementations, one or more of the following steps including, generation of the low-band side channel 224, encoding of the low-band side channel 224, and including the information corresponding to the low-band side channel as a part of the low-band bitstream 246, may be bypassed. According to one implementation, the low-band encoder 208 may include a mid channel low-band encoder (e.g., not shown and based on ACELP or TCX coding) configured to generate a low-band mid channel bitstream by encoding the low-band mid channel 294.
  • a mid channel low-band encoder e.g., not shown and based on ACELP or TCX coding
  • the low-band encoder 208 may also include a side channel low-band encoder (e.g., not shown and based on ACELP or TCX coding) configured to generate a low-band side channel bitstream by encoding the low-band side channel 224.
  • the low-band bitstream 246 may be transmitted from the encoder 200 to a decoder (e.g., the decoder 300 of FIG. 3A ).
  • the low-band encoder 208 may also generate a low-band excitation 232 that is provided to the mid channel BWE encoder 206.
  • the mid channel BWE encoder 206 may be configured to encode the high-band mid channel 292 to generate a high-band mid channel bitstream 244.
  • the mid channel BWE encoder 206 may estimate linear prediction coefficients (LPCs), gain shape parameters, gain frame parameters, etc., based on the low-band excitation 232 and the high-band mid channel 292 to generate the high-band mid channel bitstream 244.
  • the mid channel BWE encoder 206 may encode the high-band mid channel 292 using time domain bandwidth extension.
  • the high-band mid channel bitstream 244 may be transmitted from the encoder 200 to a decoder (e.g., the decoder 300 of FIG. 3A ).
  • the mid channel BWE encoder 206 may provide one or more parameters 234 to the ICBWE encoder 204.
  • the one or more parameters 234 may include a harmonic high-band excitation (e.g., the harmonic high-band excitation 237 of FIG. 2B ), modulated noise (e.g., the modulated noise 482 of FIG. 4 ), quantized gain shapes, quantized linear prediction coefficients (LPCs), quantized gain frames, etc.
  • the left channel 212 and the right channel 214 may also be provided to the ICBWE encoder 204.
  • the ICBWE encoder 204 may be configured to extract gain mapping parameters associated with the channels 212, 214, spectral shape mapping parameters associated with the channels 212, 214, etc., to facilitate mapping the one or more parameters 234 to the channels 212, 214.
  • the extracted parameters may be included in the ICBWE bitstream 242.
  • the ICBWE bitstream 242 may be transmitted from the encoder 200 to the decoder. Operations associated with the ICBWE encoder 204 are described in further detail with respect to FIGS. 4-5 .
  • the ICBWE encoder 204 of FIG. 2A may estimate spectral shape mapping parameters, quantize the spectral shape mapping parameters into the ICBWE bitstream 242, and transmit the ICBWE bitstream 242 to the decoder.
  • the encoder 200 of FIG. 2A may receive two channels 212, 214 and perform a downmix of the channels 212, 214 to generate the mid channel 222, the down-mix bitstream 216, and, in some implementations, the low-band side channel 224.
  • the encoder 200 may encode the mid channel 222 and the low-band side channel 224 using the low-band encoder 208 to generate the low-band bitstream 246.
  • the encoder 200 may also generate mapping information indicating how to map left and right decoded high-band channels (at the decoder) from a high-band mid channel (at the decoder) using the ICBWE encoder 204.
  • the ICBWE encoder 204 of FIG. 2A may estimate spectral mapping parameters based on a maximum-likelihood measure, or an open-loop or a closed-loop spectral distortion reduction measure such that a spectral envelope of a spectrally shaped synthesized non-reference high-band channel is substantially similar to a spectral envelope of a non-reference target channel.
  • the spectral mapping parameters may be transmitted to the decoder 300 in the ICBWE bitstream 242 and used at the decoder 300 to generate the output signals having reduced artifacts.
  • FIG. 2A may not include the down-mixer 202, the ICBWE encoder 204, and the side LB encoding portion of the low-band encoder 208.
  • there is a single input channel and low-band and high band split encoding is performed.
  • the low band may undergo ACELP encoding, and an excitation from the low-band ACELP, may be used for the high band coding.
  • the mid channel BWE encoder 206 includes a linear prediction coefficient (LPC) estimator 251, an LPC quantizer 252, and an LPC synthesis filter 259.
  • the high-band mid channel 292 is provided to the LPC estimator 251, and the LPC estimator 251 may be configured to predict high-band LPCs 271 based on the high-band mid channel 292.
  • the high-band LPCs 271 are provided to the LPC quantizer 252.
  • the LPC quantizer 252 may be configured to quantize the high-band LPCs to generate quantized high-band LPCs 457 and a high-band LPC bitstream 272.
  • the quantized high-band LPCs 457 are provided to the LPC synthesis filter 259, and the high-band LPC bitstream is provided to a multiplexer 265.
  • the mid channel BWE encoder 206 also includes a high-band excitation generator 299 that includes a non-linear bandwidth extension (BWE) generator 253, a random noise generator 254, a multiplier 255, a noise envelope modulator 256, a summer 257, and a multiplier 258.
  • the low-band excitation 232 from the low-band encoder 208 is provided to the non-linear BWE generator 253.
  • the non-linear BWE generator 253 may perform a non-linear extension on the low-band excitation 232 to generate a harmonic high-band excitation 237.
  • the harmonic high-band excitation 237 may be included in the one or more parameters 234.
  • the harmonic high-band excitation 237 is provided to the multiplier 255 and the noise envelope modulator 256.
  • the multiplier 255 may be configured to adjust the harmonic high-band excitation 237 based on a gain factor (Gain(1) (encoder)) to generate a gain-adjusted harmonic high-band excitation 273.
  • the gain-adjusted harmonic high-band excitation 273 is provided to the summer 257.
  • the random noise generator 254 may be configured to generate noise 274 that is provided to the noise envelope modulator 256.
  • the noise envelope modulator 256 may be configured to modulate the noise 274 based on the harmonic high-band excitation 237 to generate modulated noise 482.
  • the modulated noise 482 is provided to the multiplier 258.
  • the multiplier 258 may be configured to adjust the modulated noise 482 based on a gain factor (Gain(2) (encoder)) to generate gain-adjusted modulated noise 275.
  • the gain-adjusted modulated noise 275 is provided to the summer 257, and the summer 257 may be configured to add the gain-adjusted harmonic high-band excitation 273 and the gain-adjusted modulated noise 275 to generate a high-band excitation 276.
  • the high-band excitation 276 is provided to the LPC synthesis filter 259.
  • Gain(1) (encoder) and Gain(2) (encoder) may be vectors with each value of the vector corresponding to a scaling factor of the corresponding signal in subframes.
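The high-band excitation generation described above (elements 253 through 258) can be sketched end to end. The squaring non-linearity, the uniform noise range, and the single-tap envelope are illustrative assumptions, not the codec's actual choices.

```python
import random

def generate_high_band_excitation(low_band_exc, gain1, gain2, seed=0):
    """Sketch of the high-band excitation generator 299: a non-linearity
    extends the low-band excitation (253), random noise (254) is
    envelope-modulated by it (256), and the gain-adjusted components
    (255, 258) are summed (257) into the high-band excitation."""
    # 253: non-linear extension (here a normalized squaring non-linearity)
    peak = max(abs(x) for x in low_band_exc) or 1.0
    harm = [x * abs(x) / peak for x in low_band_exc]
    # 254: random noise
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in harm]
    # 256: modulate noise with the envelope of the harmonic excitation
    modulated = [abs(h) * w for h, w in zip(harm, noise)]
    # 255/258/257: apply Gain(1) and Gain(2), then sum
    return [gain1 * h + gain2 * m for h, m in zip(harm, modulated)]
```

In the encoder, the result would then be shaped by the quantized high-band LPCs in the LPC synthesis filter 259.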
  • the LPC synthesis filter 259 may be configured to apply the quantized high-band LPCs 457 to the high-band excitation 276 to generate a synthesized high-band mid channel 277.
  • the synthesized high-band mid channel 277 is provided to a high-band gain shape estimator 260 and to a high-band gain shape scaler 262.
  • the high-band mid channel 292 is also provided to the high-band gain shape estimator 260.
  • the high-band gain shape estimator 260 may be configured to generate high-band gain shape parameters 278 based on the high-band mid channel 292 and the synthesized high-band mid channel 277.
  • the high-band gain shape parameters 278 are provided to a high-band gain shape quantizer 261.
  • the high-band gain shape quantizer 261 may be configured to quantize the high-band gain shape parameters 278 and generate quantized high-band gain shape parameters 279.
  • the quantized high-band gain shape parameters 279 are provided to the high-band gain shape scaler 262.
  • the high-band gain shape quantizer 261 may also be configured to generate a high-band gain shape bitstream 280 that is provided to the multiplexer 265.
  • the high-band gain shape scaler 262 may be configured to scale the synthesized high-band mid channel 277 based on the quantized high-band gain shape parameters 279 to generate a scaled synthesized high-band mid channel 281.
  • the scaled synthesized high-band mid channel 281 is provided to a high-band gain frame estimator 263.
  • the high-band gain frame estimator 263 may be configured to estimate high-band gain frame parameters 282 based on the scaled synthesized high-band mid channel 281.
  • the high-band gain frame parameters 282 are provided to a high-band gain frame quantizer 264.
  • the high-band gain frame quantizer 264 may be configured to quantize the high-band gain frame parameters 282 to generate a high-band gain frame bitstream 283.
  • the high-band gain frame bitstream 283 is provided to the multiplexer 265.
  • the multiplexer 265 may be configured to combine the high-band LPC bitstream 272, the high-band gain shape bitstream 280, the high-band gain frame bitstream 283, and other information to generate the high-band mid channel bitstream 244.
  • the other information may include information associated with the modulated noise 482, the harmonic high-band excitation 237, the quantized high-band LPCs 457, etc.
  • the ICBWE encoder 204 may use the information provided to the multiplexer 265 for signal processing operations.
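The gain shape estimation and scaling chain described above (blocks 260-262) can be sketched as per-subframe energy matching between the high-band mid channel and its synthesis. This is an illustrative sketch only; the function names, subframe count, and the energy-matching rule are assumptions, not the codec's specified estimator or quantizer.

```python
import numpy as np

def estimate_gain_shapes(target_hb, synth_hb, num_subframes=4, eps=1e-12):
    # Per-subframe energy-matching gains (illustrative; the codec's actual
    # gain shape estimation and quantization may differ).
    t_parts = np.array_split(np.asarray(target_hb, dtype=float), num_subframes)
    s_parts = np.array_split(np.asarray(synth_hb, dtype=float), num_subframes)
    return np.array([np.sqrt((np.dot(t, t) + eps) / (np.dot(s, s) + eps))
                     for t, s in zip(t_parts, s_parts)])

def apply_gain_shapes(synth_hb, gain_shapes):
    # Scale each subframe of the synthesized high-band by its gain shape.
    parts = np.array_split(np.asarray(synth_hb, dtype=float), len(gain_shapes))
    return np.concatenate([g * p for g, p in zip(gain_shapes, parts)])
```

After scaling, each subframe of the scaled synthesized channel approximately matches the energy of the corresponding target subframe, which is the property the gain frame estimator then operates on.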
  • the decoder 300 includes a mid channel BWE decoder 302, a low-band decoder 304, an ICBWE decoder 306, a low-band up-mixer 308, a signal combiner 310, a signal combiner 312, and an inter-channel shifter 314.
  • FIG. 3A illustrates the decoder 300 in a stereo implementation.
• the up-mix, inter-channel shifter, ICBWE, and side low-band decoding portions of the mid-side low-band decoder may be omitted.
• the inputs to the decoder are the mid low-band bitstream and the mid high-band bitstream, and the decoded low-band mid signal is mixed with the mid BWE decoded high-band signal to generate the decoded mid signal, which is output from the decoder.
  • the low-band bitstream 246, transmitted from the encoder 200 may be provided to the low-band decoder 304.
  • the low-band bitstream 246 may include the low-band mid channel bitstream and the low-band side channel bitstream.
  • the low-band decoder 304 may be configured to decode the low-band mid channel bitstream to generate a low-band mid channel 326 that is provided to the low-band up-mixer 308.
  • the low-band decoder 304 may also be configured to decode the low-band side channel bitstream to generate a low-band side channel 328 that is provided to the low-band up-mixer 308.
  • the low-band decoder 304 may also be configured to generate a low-band excitation signal 325 that is provided to the mid channel BWE decoder 302.
  • the mid channel BWE decoder 302 may be configured to decode the high-band mid channel bitstream 244 based on the low-band excitation signal 325 to generate one or more parameters 322 (e.g., a harmonic high-band excitation, modulated noise, quantized gain shapes, quantized linear prediction coefficients (LPCs), quantized gain frames, etc.) and a high-band mid channel 324.
  • the one or more parameters 322 may correspond to the one or more parameters 234 of FIG. 2A .
• the mid channel BWE decoder 302 may use time domain bandwidth extension decoding to decode the high-band mid channel bitstream 244.
  • the one or more parameters 322 and the high-band mid channel 324 are provided to the ICBWE decoder 306.
  • the ICBWE bitstream 242 may also be provided to the ICBWE decoder 306.
• the ICBWE decoder 306 may be configured to generate a left high-band channel 330 and a right high-band channel 332 based on the ICBWE bitstream 242, the one or more parameters 322, and the high-band mid channel 324.
  • the ICBWE decoder 306 may generate the decoded left high-band channel 330 and the decoded right high-band channel 332. Operations associated with the ICBWE decoder 306 are described in further detail with respect to FIG. 6 .
  • the left high-band channel 330 is provided to the signal combiner 310, and the right high-band channel 332 is provided to the signal combiner 312.
  • the low-band up-mixer 308 may be configured to up-mix the low-band mid channel 326 and the low-band side channel 328 based on the down-mix bitstream 216 to generate a left low-band channel 334 and a right low-band channel 336.
  • the left low-band channel 334 is provided to the signal combiner 310, and the right low-band channel 336 is provided to the signal combiner 312.
  • the signal combiner 310 may be configured to combine the left high-band channel 330 and the left low-band channel 334 to generate an unshifted left channel 340.
  • the unshifted left channel 340 is provided to the inter-channel shifter 314.
  • the signal combiner 312 may be configured to combine the right high-band channel 332 and the right low-band channel 336 to generate an unshifted right channel 342.
  • the unshifted right channel 342 is provided to the inter-channel shifter 314. It should be noted that in some implementations, operations associated with the inter-channel shifter 314 may be bypassed. For example, if the down-mixer at the corresponding encoder is not configured to shift any of the channels prior to mid channel and side channel generation, operations associated with the inter-channel shifter 314 may be bypassed.
  • the inter-channel shifter 314 may be configured to shift the unshifted left channel 340 based on the shift information associated with the down-mix bitstream 216 to generate a left channel 350.
  • the inter-channel shifter 314 may also be configured to shift the unshifted right channel 342 based on the shift information associated with the down-mix bitstream 216 to generate a right channel 352.
  • the inter-channel shifter 314 may use the shift information from the down-mix bitstream 216 to shift the unshifted left channel 340, the unshifted right channel 342, or a combination thereof, to generate the left channel 350 and the right channel 352.
  • the left channel 350 is a decoded version of the left channel 212
  • the right channel 352 is a decoded version of the right channel 214.
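The decoder flow above (low-band up-mix followed by band combination) can be sketched under a simple unit-gain mid/side convention. The actual up-mix gains and any inter-channel shift are derived from the down-mix bitstream 216 and are omitted here, so this is a sketch, not the codec's specified up-mixer.

```python
import numpy as np

def upmix_low_band(mid_lb, side_lb):
    # One common mid/side up-mix convention (unit gains assumed; the codec
    # derives the actual up-mix from the down-mix bitstream 216).
    return mid_lb + side_lb, mid_lb - side_lb

def combine_bands(low_band, high_band):
    # Combine the low-band and high-band contributions of one channel,
    # as the signal combiners 310 and 312 do.
    return low_band + high_band
```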
  • the mid channel BWE decoder 302 includes an LPC dequantizer 360, a high-band excitation generator 362, an LPC synthesis filter 364, a high-band gain shape dequantizer 366, a high-band gain shape scaler 368, a high-band gain frame dequantizer 370, and a high-band gain frame scaler 372.
  • the high-band LPC bitstream 272 is provided to the LPC dequantizer 360.
• the LPC dequantizer 360 may extract dequantized high-band LPCs 640 from the high-band LPC bitstream 272. As described with respect to FIG. 6, the dequantized high-band LPCs 640 may be used by the ICBWE decoder 306 for signal processing operations.
  • the low-band excitation signal 325 is provided to the high-band excitation generator 362.
  • the high-band excitation generator 362 may generate a harmonic high-band excitation 630 based on the low-band excitation signal 325 and may generate modulated noise 632. As described with respect to FIG. 6 , the harmonic high-band excitation 630 and the modulated noise 632 may be used by the ICBWE decoder 306 for signal processing operations.
  • the high-band excitation generator 362 may also generate a high-band excitation 380.
  • the high-band excitation generator 362 may be configured to operate in a substantially similar manner as the high-band excitation generator 299 of FIG. 2B .
  • the high-band excitation generator 362 may perform similar operations on the low-band excitation signal 325 (as the high-band excitation generator 299 performs on the low-band excitation 232) to generate the high-band excitation 380.
  • the high-band excitation 380 may be substantially similar to the high-band excitation 276 of FIG. 2B .
  • the high-band excitation 380 is provided to the LPC synthesis filter 364.
  • the LPC synthesis filter 364 may apply the dequantized high-band LPCs 640 to the high-band excitation 380 to generate a synthesized high-band mid channel 382.
  • the synthesized high-band mid channel 382 is provided to the high-band gain shape scaler 368.
  • the high-band gain shape bitstream 280 is provided to the high-band gain shape dequantizer 366.
  • the high-band gain shape dequantizer 366 may be configured to extract a dequantized high-band gain shape 648 from the high-band gain shape bitstream 280.
  • the dequantized high-band gain shape 648 is provided to the high-band gain shape scaler 368 and to the ICBWE decoder 306 for signal processing operations, as described with respect to FIG. 6 .
  • the high-band gain shape scaler 368 may be configured to scale the synthesized high-band mid channel 382 based on the dequantized high-band gain shape 648 to generate a scaled synthesized high-band mid channel 384.
  • the scaled synthesized high-band mid channel 384 is provided to the high-band gain frame scaler 372.
  • the high-band gain frame bitstream 283 is provided to the high-band gain frame dequantizer 370.
  • the high-band gain frame dequantizer 370 may be configured to extract a dequantized high-band gain frame 652 from the high-band gain frame bitstream 283.
  • the dequantized high-band gain frame 652 is provided to the high-band gain frame scaler 372 and to the ICBWE decoder 306 for signal processing operations, as described with respect to FIG. 6 .
  • the high-band gain frame scaler 372 may apply the dequantized high-band gain frame 652 to the scaled synthesized high-band mid channel 384 to generate a decoded high-band mid channel 662.
  • the decoded high-band mid channel 662 is provided to the ICBWE decoder 306 for signal processing operations, as described with respect to FIG. 6 .
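The mid channel BWE decoding chain above (LPC synthesis filter 364, high-band gain shape scaler 368, high-band gain frame scaler 372) can be sketched as follows. The all-pole sign convention and the per-subframe scaling rule are assumptions for illustration, not the codec's exact filter definitions.

```python
import numpy as np

def lpc_synthesis(excitation, lpc):
    # All-pole synthesis 1/A(z) with A(z) = 1 + a1*z^-1 + ... (the sign
    # convention is an assumption; codecs differ).
    out = np.zeros(len(excitation))
    for n, e in enumerate(excitation):
        acc = float(e)
        for k, ak in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]
        out[n] = acc
    return out

def decode_high_band(synth_hb, gain_shapes, gain_frame):
    # Apply per-subframe gain shapes, then the frame-level gain, mirroring
    # the scaler ordering described above.
    parts = np.array_split(np.asarray(synth_hb, dtype=float), len(gain_shapes))
    shaped = np.concatenate([g * p for g, p in zip(gain_shapes, parts)])
    return gain_frame * shaped
```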
• Referring to FIGS. 4-5, a particular implementation of the ICBWE encoder 204 is shown.
  • a first portion 204a of the ICBWE encoder 204 is shown in FIG. 4
  • a second portion 204b of the ICBWE encoder 204 is shown in FIG. 5 .
  • the first portion 204a of the ICBWE encoder 204 includes a high-band reference channel determination unit 404 and a high-band reference channel indicator encoder 406.
  • the left channel 212 and the right channel 214 are provided to the high-band reference channel determination unit 404.
  • the high-band reference channel determination unit 404 may be configured to determine whether the left channel 212 or the right channel 214 is the high-band reference channel.
  • the high-band reference channel determination unit 404 may generate a high-band reference channel indicator 440 indicating whether the left channel 212 or the right channel 214 is used to estimate the non-reference channel 459.
• the high-band reference channel indicator 440 may be estimated based on energies of the left channel 212 and the right channel 214, the inter-channel shift between the left channel 212 and the right channel 214, the reference channel indicator generated at the down-mixer, the reference channel indicator based on the non-causal shift estimation, and the left and right high-band channel energies.
  • the high-band reference channel indicator 440 may be determined using multi-stage techniques where each stage improves an output of a previous stage to determine the high-band reference channel indicator 440. For example, at a first stage, the high-band reference channel determination unit 404 may generate the high-band reference channel indicator 440 based on a reference signal. To illustrate, the high-band reference channel determination unit 404 may generate the high-band reference channel indicator 440 to indicate that the right channel 214 is designated as a high-band reference channel in response to determining that the reference signal indicates that the second audio channel 132 (e.g., a right audio signal) is designated as a reference signal.
  • the high-band reference channel determination unit 404 may generate the high-band reference channel indicator 440 to indicate that the left channel 212 is designated as a high-band reference channel in response to determining that the reference signal indicates that the first audio channel 130 (e.g., a left audio signal) is designated as a reference signal.
  • the high-band reference channel determination unit 404 may refine (e.g., update) the high-band reference channel indicator 440 based on a gain parameter, a first energy associated with the left channel 212, a second energy associated with the right channel 214, or a combination thereof.
  • the high-band reference channel determination unit 404 may set (e.g., update) the high-band reference channel indicator 440 to indicate that the left channel 212 is designated as a reference channel and that the right channel 214 is designated as a non-reference channel in response to determining that the gain parameter satisfies a first threshold, that a ratio of the first energy (e.g., the left full-band energy) and the right energy (e.g., the right full-band energy) satisfies a second threshold, or both.
  • the high-band reference channel determination unit 404 may set (e.g., update) the high-band reference channel indicator 440 to indicate that the right channel 214 is designated as a reference channel and that the left channel 212 is designated as a non-reference channel in response to determining that the gain parameter fails to satisfy the first threshold, that the ratio of the first energy (e.g., the left full-band energy) and the right energy (e.g., the right full-band energy) fails to satisfy the second threshold, or both.
  • the high-band reference channel determination unit 404 may refine (e.g., further update) the high-band reference channel indicator 440 based on the left energy and the right energy.
  • the high-band reference channel determination unit 404 may set (e.g., update) the high-band reference channel indicator 440 to indicate that the left channel 212 is designated as a reference channel and that the right channel 214 is designated as a non-reference channel in response to determining that a ratio of the left energy (e.g., the left HB energy) and the right energy (e.g., the right HB energy) satisfies a threshold.
  • the high-band reference channel determination unit 404 may set (e.g., update) the high-band reference channel indicator 440 to indicate that the right channel 214 is designated as a reference channel and that the left channel 212 is designated as a non-reference channel in response to determining that a ratio of the left energy (e.g., the left HB energy) and the right energy (e.g., the right HB energy) fails to satisfy a threshold.
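The multi-stage reference-channel decision described above can be sketched as follows. The threshold constants, the ratio tests, and the function signature are illustrative placeholders, not the codec's tuned values.

```python
def select_high_band_reference(initial_ref, gain_param,
                               left_fb_energy, right_fb_energy,
                               left_hb_energy, right_hb_energy,
                               gain_thr=1.0, fb_ratio_thr=1.0,
                               hb_ratio_thr=1.0, eps=1e-12):
    # Stage 1: start from the reference indicated by the down-mixer.
    ref = initial_ref
    # Stage 2: refine using the gain parameter and full-band energy ratio.
    if gain_param > gain_thr or left_fb_energy > fb_ratio_thr * (right_fb_energy + eps):
        ref = "left"
    else:
        ref = "right"
    # Stage 3: refine using the high-band energy ratio.
    if left_hb_energy > hb_ratio_thr * (right_hb_energy + eps):
        ref = "left"
    else:
        ref = "right"
    return ref
```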
  • the high-band reference channel indicator encoder 406 may encode the high-band reference channel indicator 440 to generate a high-band reference channel indicator bitstream 442.
  • the first portion 204a of the ICBWE encoder 204 also includes a non-reference high-band excitation generator 408, a linear prediction coefficient (LPC) synthesis filter 410, a high-band target channel generator 412, a spectral mapping estimator 414, and a spectral mapping quantizer 416.
  • the non-reference high-band excitation generator 408 includes a signal multiplier 418, a signal multiplier 420, and a signal combiner 422.
  • the harmonic high-band excitation 237 is provided to the signal multiplier 418, and modulated noise 482 is provided to the signal multiplier 420.
• the harmonic high-band excitation 237 may be based on a harmonic modeling of the low-band excitation (e.g., a (.)^2 non-linearity).
  • the harmonic high-band excitation 237 may be based on the non-reference low band excitation signal.
  • the modulated noise 482 may be based on the envelope modulated noise of the harmonic high-band excitation 237 or the low-band excitation 232.
  • the modulated noise 482 may be random noise that is temporally shaped based on the non-linear harmonic high-band excitation signal 237 (e.g., a whitened non-linear harmonic high-band excitation signal).
  • the temporal shaping may be based on a voice-factor controlled first-order adaptive filter.
  • the signal multiplier 418 applies a gain (Gain(a) (encoder)) to the harmonic high-band excitation 237 to generate a gain-adjusted harmonic high-band excitation 452, and the signal multiplier 420 applies a gain (Gain(b) (encoder)) to the modulated noise 482 to generate gain-adjusted modulated noise 454.
  • the gain-adjusted harmonic high-band excitation 452 and the gain-adjusted modulated noise 454 are provided to the signal combiner 422.
  • the signal combiner 422 may be configured to combine the gain-adjusted harmonic high-band excitation 452 and the gain-adjusted modulated noise 454 to generate a non-reference high-band excitation 456.
  • the non-reference high-band excitation 456 may be generated in a similar manner as the high-band mid channel excitation.
• the gains (Gain(a) (encoder) and Gain(b) (encoder)) may be modified versions of the gains used to generate the high-band mid channel excitation based on the relative energies of the high-band reference and high-band non-reference channels, the noise floor of the high-band non-reference channel, etc.
  • Gain(a) (encoder) and Gain(b) (encoder) may be vectors with each value of the vector corresponding to a scaling factor of the corresponding signal in subframes.
  • the mixing gains (Gain(a) (encoder) and Gain(b) (encoder)) may also be based on the voice factors corresponding to a high-band mid channel, a high-band non-reference channel, or derived from the low-band voice factor or voicing information.
  • the mixing gains (Gain(a) (encoder) and Gain(b) (encoder)) may also be based on the spectral envelope corresponding to the high-band mid channel and the high-band non-reference channel.
  • the mixing gains may be based on the number of talkers or background sources in the signal and the voiced-unvoiced characteristic of the left (or reference, target) and right (or target, reference) channels.
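The excitation mixing performed by the multipliers 418 and 420 and the combiner 422 can be sketched with per-subframe gain vectors. The derivation of Gain(a) and Gain(b) themselves (voice factors, spectral envelopes, talker count) is outside this sketch, and the function name is an assumption.

```python
import numpy as np

def mix_non_reference_excitation(harmonic_exc, modulated_noise, gain_a, gain_b):
    # Per-subframe mix: excitation = Gain(a)*harmonic + Gain(b)*noise,
    # where gain_a and gain_b are per-subframe vectors, as described above.
    h_parts = np.array_split(np.asarray(harmonic_exc, dtype=float), len(gain_a))
    n_parts = np.array_split(np.asarray(modulated_noise, dtype=float), len(gain_b))
    return np.concatenate([ga * h + gb * n
                           for ga, gb, h, n in zip(gain_a, gain_b, h_parts, n_parts)])
```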
  • the non-reference high-band excitation 456 is provided to the LPC synthesis filter 410.
  • the LPC synthesis filter 410 may be configured to generate a synthesized non-reference high-band 458 based on the non-reference high-band excitation 456 and quantized high-band LPCs 457 (e.g., LPCs of the high-band mid channel). For example, the LPC synthesis filter 410 may apply the quantized high-band LPCs 457 to the non-reference high-band excitation 456 to generate the synthesized non-reference high-band 458.
  • the synthesized non-reference high-band 458 is provided to the spectral mapping estimator 414.
  • the high-band reference channel indicator 440 may be provided (as a control signal) to a switch 424 that receives the left channel 212 and the right channel 214 as inputs. Based on the high-band reference channel indicator 440, the switch 424 may provide either the left channel 212 or the right channel 214 to the high-band target channel generator 412 as a non-reference channel 459. For example, if the high-band reference channel indicator 440 indicates that the left channel 212 is the reference channel, the switch 424 may provide the right channel 214 to the high-band target channel generator 412 as the non-reference channel 459. If the high-band reference channel indicator 440 indicates that the right channel 214 is the reference channel, the switch 424 may provide the left channel 212 to the high-band target channel generator 412 as the non-reference channel 459.
  • the high-band target channel generator 412 may filter low-band signal components of the non-reference channel 459 to generate a non-reference high-band channel 460 (e.g., the high-band portion of the non-reference channel 459).
  • the non-reference high-band channel 460 may be spectrally flipped based on further signal processing operations (e.g., a spectral flip operation).
  • the non-reference high-band channel 460 is provided to the spectral mapping estimator 414.
  • the spectral mapping estimator 414 may be configured to generate spectral mapping parameters 462 that map the spectrum (or energies) of the non-reference high-band channel 460 to the spectrum of the synthesized non-reference high-band 458.
  • the spectral mapping estimator 414 may generate filter coefficients that map the spectrum of the non-reference high-band channel 460 to the spectrum of the synthesized non-reference high-band 458. For example, the spectral mapping estimator 414 determines the spectral mapping parameters 462 that map the spectral envelope of the synthesized non-reference high-band 458 to be substantially approximate to the spectral envelope of the non-reference high-band channel 460 (e.g., the non-reference high-band signal). The spectral mapping parameters 462 are provided to the spectral mapping quantizer 416.
  • the spectral mapping quantizer 416 may be configured to quantize the spectral mapping parameters 462 to generate a high-band spectral mapping bitstream 464 and quantized spectral mapping parameters 466.
  • the second portion 204b of the ICBWE encoder 204 includes a spectral mapping applicator 502, a gain mapping estimator and quantizer 504, and a multiplexer 590.
  • the synthesized non-reference high-band 458 and the quantized spectral mapping parameters 466 are provided to the spectral mapping applicator 502.
  • the spectral mapping applicator 502 may be configured to generate a spectrally shaped synthesized non-reference high-band 514 based on the synthesized non-reference high-band 458 and the quantized spectral mapping parameters 466.
• the spectral mapping applicator 502 may apply the quantized spectral mapping parameters 466 to the synthesized non-reference high-band 458 to generate the spectrally shaped synthesized non-reference high-band 514.
  • the spectral mapping applicator 502 may apply the spectral mapping parameters 462 (e.g., the unquantized parameter) to the synthesized non-reference high-band 458 to generate the spectrally shaped synthesized non-reference high-band 514.
  • the spectrally shaped synthesized non-reference high-band 514 may be used to estimate the high-band gain mapping parameters.
  • the spectrally shaped synthesized non-reference high-band 514 is provided to the gain mapping estimator and quantizer 504.
  • the spectral mapping estimator 414 may use a spectral shape application that filters using the above-described filter h(z).
  • the spectral mapping estimator 414 may estimate and quantize a value for the parameter ( u i ) .
  • the filter h(z) may be a first order filter and the spectral envelope of a signal may be approximated as a ratio of autocorrelation coefficients of lag index one (lag(1)) and lag index zero (lag(0)).
  • t(n) represents the n th sample of the non-reference high-band channel 460
  • x(n) represents the n th sample of the synthesized non-reference high-band 458
  • y(n) represents the n th sample of the spectrally shaped synthesized non-reference high-band 514
• y(n) = h(n) * x(n), where * is the symbol for the signal convolution operation.
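Using these definitions, a stable mapping parameter u can be estimated, for illustration, by a grid search that matches the lag(1)/lag(0) autocorrelation ratio of the shaped synthesis to that of the target channel. The codec may instead solve for u in closed form, so this is a sketch under that assumption, not the specified method.

```python
import numpy as np

def lag_ratio(s):
    # Spectral-envelope proxy: autocorrelation at lag 1 over lag 0.
    s = np.asarray(s, dtype=float)
    return np.dot(s[1:], s[:-1]) / np.dot(s, s)

def first_order_shape(x, u):
    # Apply h(z) = 1/(1 - u*z^-1), i.e., y[n] = x[n] + u*y[n-1].
    y = np.zeros(len(x))
    prev = 0.0
    for n, xn in enumerate(np.asarray(x, dtype=float)):
        prev = xn + u * prev
        y[n] = prev
    return y

def estimate_u(target, synth):
    # Grid search over stable u (|u| < 1) matching the target's lag ratio.
    goal = lag_ratio(target)
    grid = np.linspace(-0.95, 0.95, 191)
    return min(grid, key=lambda u: abs(lag_ratio(first_order_shape(synth, u)) - goal))
```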
• if the non-reference channel has a steeper roll-off in spectral energy at higher frequencies, smaller values of (u), including negative values, may be preferred.
  • a smaller value of (u) envelopes the signal such that there is a steeper roll off in spectral energy at higher frequencies.
• values of (u) whose absolute value is less than one (i.e., |u| < 1) correspond to a stable filter.
• if there are no real solutions, the previous frame's (u) may be used as the current frame's (u). If there are one or more real solutions but no real solution with an absolute value less than one, the previous frame's u final value may be used for the current frame. If there is exactly one real solution with an absolute value less than one, the current frame may use that real solution as the u final value. If there is more than one real solution with an absolute value less than one, the current frame may use the smallest (u) value as the u final value, or the (u) value that is closest to the previous frame's (u) value.
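The frame-level fallback rules above can be sketched directly; "smallest (u)" is interpreted here as smallest magnitude, which is an assumption.

```python
def choose_u(real_solutions, previous_u, prefer_smallest=True):
    # Frame-level fallback rules for selecting the final u value.
    stable = [u for u in real_solutions if abs(u) < 1.0]
    if not stable:
        return previous_u                     # no usable real solution
    if len(stable) == 1:
        return stable[0]                      # single stable solution
    if prefer_smallest:
        return min(stable, key=abs)           # smallest stable solution (assumed: smallest magnitude)
    return min(stable, key=lambda u: abs(u - previous_u))  # closest to last frame's u
```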
  • the spectral mapping parameters may be estimated based on the spectral analysis of the non-reference high-band channel and the non-reference high-band excitation 456, to maximize the spectral match between the spectrally shaped non-reference HB signal and the non-reference HB target channel.
  • the spectral mapping parameters may be based on the LP analysis of the non-reference high-band channel and the synthesized high-band mid channel 520 or high-band mid channel 292.
  • a non-reference high-band channel 516, a synthesized high-band mid channel 520, and the high-band mid channel 292 are also provided to the gain mapping estimator and quantizer 504.
  • the gain mapping estimator and quantizer 504 may generate a high-band gain mapping bitstream 522 and a quantized high-band gain mapping bitstream 524 based on the spectrally shaped synthesized non-reference high-band 514, the non-reference high-band channel 516, the synthesized high-band mid channel 520, and the high-band mid channel 292.
  • the gain mapping estimator and quantizer 504 may generate a set of adjustment gain parameters based on the synthesized high-band mid channel 520 and the spectrally shaped synthesized non-reference high-band 514.
• the gain mapping estimator and quantizer 504 may determine a synthesized high-band gain corresponding to a difference (or ratio) between an energy (or power) of the synthesized high-band mid channel 520 and an energy (or power) of the spectrally shaped synthesized non-reference high-band 514.
  • the set of adjustment gain parameters may indicate the synthesized high-band gain.
  • the gain mapping estimator and quantizer 504 may generate the first set of adjustment gain parameters based on a set of adjustment gain parameters and a predicted set of adjustment gain parameters.
  • the first set of adjustment gain parameters may indicate a difference between the set of adjustment gain parameters and the predicted set of adjustment gain parameters.
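The adjustment-gain computation described above can be sketched as an energy-ratio gain plus a residual against the predicted set; the ratio form is one of the two options mentioned above (the difference form, e.g., in a log domain, is the other), and the function names are assumptions.

```python
import numpy as np

def synthesized_high_band_gain(synth_mid_hb, shaped_non_ref_hb, eps=1e-12):
    # Gain as an energy ratio between the synthesized high-band mid channel
    # and the spectrally shaped synthesized non-reference high-band.
    e_mid = np.dot(synth_mid_hb, synth_mid_hb)
    e_ref = np.dot(shaped_non_ref_hb, shaped_non_ref_hb)
    return float(np.sqrt((e_mid + eps) / (e_ref + eps)))

def residual_adjustment_gains(gains, predicted_gains):
    # First set of adjustment gains: difference from the predicted set.
    return np.asarray(gains, dtype=float) - np.asarray(predicted_gains, dtype=float)
```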
  • the high-band reference channel indicator bitstream 442, the high-band spectral mapping bitstream 464, and the high-band gain mapping bitstream 522 are provided to the multiplexer 590.
  • the multiplexer 590 may be configured to generate the ICBWE bitstream 242 by multiplexing the high-band reference channel indicator bitstream 442, the high-band spectral mapping bitstream 464, and the high-band gain mapping bitstream 522.
  • the ICBWE bitstream 242 may be transmitted to a decoder, such as the decoder 300 of FIG. 3A .
  • the ICBWE decoder 306 includes a non-reference high-band excitation generator 602, a LPC synthesis filter 604, a spectral mapping applicator 606, a spectral mapping dequantizer 608, a high-band gain shape scaler 610, a non-reference high-band gain scaler 612, a gain mapping dequantizer 616, a reference high-band gain scaler 618, and a high-band channel mapper 620.
  • the non-reference high-band excitation generator 602 includes a signal multiplier 622, a signal multiplier 624, and a signal combiner 626.
  • a harmonic high-band excitation 630 (generated from the low-band bitstream 246) is provided to the signal multiplier 622, and modulated noise 632 is provided to the signal multiplier 624.
  • the signal multiplier 622 applies a gain (Gain(a) (decoder)) to the harmonic high-band excitation 630 to generate a gain-adjusted harmonic high-band excitation 634
  • the signal multiplier 624 applies a gain (Gain(b) (decoder)) to the modulated noise 632 to generate gain-adjusted modulated noise 636.
  • Gain(a) (decoder) and Gain(b) (decoder) may be vectors with each value of the vector corresponding to a scaling factor of the corresponding signal in subframes.
  • the mixing gains (Gain(a) (decoder) and Gain(b) (decoder)) may also be based on the voice factors corresponding to synthesized high-band mid channel, synthesized high-band non-reference channel, or derived from the low-band voice factor or voicing information.
  • the mixing gains (Gain(a) (decoder) and Gain(b) (decoder)) may also be based on the spectral envelope corresponding to the synthesized high-band mid channel, synthesized high-band non-reference channel, or derived from the low-band voice factor or voicing information.
  • the mixing gains (Gain(a) (decoder) and Gain(b) (decoder)) may be based on the number of talkers or background sources in the signal and the voiced-unvoiced characteristic of the left (or reference, target) and right (or target, reference) channels.
  • the gain-adjusted harmonic high-band excitation 634 and the gain-adjusted modulated noise 636 are provided to the signal combiner 626.
  • the signal combiner 626 may be configured to combine the gain-adjusted harmonic high-band excitation 634 and the gain-adjusted modulated noise 636 to generate a non-reference high-band excitation 638.
  • the non-reference high-band excitation 638 may be generated in a substantially similar manner as the non-reference high-band excitation 456 of the ICBWE encoder 204.
  • the LPC synthesis filter 604 may be configured to generate a synthesized non-reference high-band 642 based on the non-reference high-band excitation 638 and dequantized high-band LPCs 640 (from a bitstream transmitted from the encoder 200) of the high-band mid channel.
  • the LPC synthesis filter 604 may apply the dequantized high-band LPCs 640 to the non-reference high-band excitation 638 to generate the synthesized non-reference high-band 642.
  • the synthesized non-reference high-band 642 is provided to the spectral mapping applicator 606.
  • the high-band spectral mapping bitstream 464 from the encoder 200 is provided to the spectral mapping dequantizer 608.
  • the spectral mapping dequantizer 608 may be configured to decode the high-band spectral mapping bitstream 464 to generate a dequantized spectral mapping bitstream 644.
  • the dequantized spectral mapping bitstream 644 is provided to the spectral mapping applicator 606.
  • the spectral mapping applicator 606 may be configured to apply the dequantized spectral mapping bitstream 644 to the synthesized non-reference high-band 642 (in a substantially similar manner as at the ICBWE encoder 204) to generate a spectrally shaped synthesized non-reference high-band 646.
• the dequantized spectral mapping bitstream 644 may be applied as a first-order filter of the form h(z) = 1 / (1 - u·z^-1), where u is the quantized spectral mapping parameter.
  • the spectrally shaped synthesized non-reference high-band 646 is provided to the high-band gain shape scaler 610.
  • the high-band gain shape scaler 610 may be configured to scale the spectrally shaped synthesized non-reference high-band 646 based on a quantized high-band gain shape (from a bitstream transmitted from the encoder 200) to generate a scaled signal 650.
  • the scaled signal 650 is provided to the non-reference high-band gain scaler 612.
  • a multiplier 651 may be configured to multiply a dequantized high-band gain frame 652 (e.g., the mid channel gain frame) by quantized high-band gain mapping parameters 660 (from the high-band gain mapping bitstream 522) to generate a resulting signal 656.
  • the resulting signal 656 may be generated by applying the product of the dequantized high-band gain frame 652 and the quantized high-band gain mapping parameters 660 or using two sequential gain stages.
  • the resulting signal 656 is provided to the non-reference high-band gain scaler 612.
  • the non-reference high-band gain scaler 612 may be configured to scale the scaled signal 650 by the resulting signal 656 to generate a decoded high-band non-reference channel 658.
  • the decoded high-band non-reference channel 658 is provided to the high-band channel mapper 620.
  • a predicted reference channel gain mapping parameter may be applied to the mid channel to generate the decoded high-band non-reference channel 658.
  • the high-band gain mapping bitstream 522 from the encoder 200 is provided to the gain mapping dequantizer 616.
  • the gain mapping dequantizer 616 may be configured to decode the high-band gain mapping bitstream 522 to generate quantized high-band gain mapping parameters 660.
  • the quantized high-band gain mapping parameters 660 are provided to the reference high-band gain scaler 618, and a decoded high-band mid channel 662 (generated from the high-band mid channel bitstream 244) is provided to the reference high-band gain scaler 618.
  • the reference high-band gain scaler 618 may be configured to scale the decoded high-band mid channel 662 based on the quantized high-band gain mapping parameters 660 to generate a decoded high-band reference channel 664.
  • the decoded high-band reference channel 664 is provided to the high-band channel mapper 620.
  • the high-band channel mapper 620 may be configured to designate the decoded high-band reference channel 664 or the decoded high-band non-reference channel 658 as the left high-band channel 330. For example, the high-band channel mapper 620 may determine whether the left high-band channel 330 is a reference channel (or non-reference channel) based on the high-band reference channel indicator bitstream 442 from the encoder 200. Using similar techniques, the high-band channel mapper 620 may be configured to designate the other of the decoded high-band reference channel 664 and the decoded high-band non-reference channel 658 as the right high-band channel 332.
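The channel mapping performed by the high-band channel mapper 620 can be sketched as a simple swap driven by the high-band reference channel indicator; the boolean-flag signature is an assumption for illustration.

```python
def map_high_band_channels(decoded_ref, decoded_non_ref, left_is_reference):
    # Assign the decoded reference/non-reference channels to (left, right)
    # according to the high-band reference channel indicator.
    if left_is_reference:
        return decoded_ref, decoded_non_ref
    return decoded_non_ref, decoded_ref
```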
  • the quantized spectral mapping parameters 466 may be used to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 514) having a spectral envelope that approximates the spectral envelope of a high-band channel (e.g., the non-reference high-band channel 460).
  • the quantized spectral mapping parameters 466 may be used at the decoder 300 to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 646) that approximates the spectral envelope of the high-band channel at the encoder 200.
  • reduced artifacts may occur when reconstructing the high-band at the decoder 300 because the high-band may have a similar spectral envelope as the low-band on the encoder-side.
  • the method 700 may be performed by the first device 104 of FIG. 1 .
  • the method 700 may be performed by the encoder 200.
  • the method 700 includes selecting, at an encoder of a first device, a left channel or a right channel as a non-reference target channel based on a high-band reference channel indicator, at 702.
  • the switch 424 may select the left channel 212 or the right channel 214 as the non-reference high-band channel 460 based on the high-band reference channel indicator 440.
  • the method 700 includes generating a synthesized non-reference high-band channel based on a non-reference high-band excitation corresponding to the non-reference target channel, at 704.
  • the LPC synthesis filter 410 may generate the synthesized non-reference high-band 458 by applying the quantized high-band LPCs 457 to the non-reference high-band excitation 456.
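The LPC synthesis step described above admits a compact sketch. The following simplified C (matching the floating-point pseudo-code style used for the examples later in this document) shows a generic all-pole synthesis filter; the filter order, the coefficient sign convention, and the function names are illustrative assumptions, not the codec's actual filter routine:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define LPC_ORDER 10  /* illustrative order */

/* All-pole LPC synthesis 1/A(z): filter an excitation through quantized
 * LPC coefficients to produce a synthesized (high-band) channel.
 * Assumed convention: A(z) = 1 + a[0] z^-1 + ... + a[LPC_ORDER-1] z^-LPC_ORDER. */
static void lpc_synthesis(const double *exc, double *out, size_t n,
                          const double a[LPC_ORDER]) {
    for (size_t i = 0; i < n; i++) {
        double y = exc[i];
        /* subtract the predicted contribution of previously synthesized samples */
        for (size_t k = 0; k < LPC_ORDER && k < i; k++)
            y -= a[k] * out[i - 1 - k];
        out[i] = y;
    }
}
```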
  • the method 700 also includes generating a high-band portion of the non-reference target channel.
  • the method 700 also includes estimating one or more spectral mapping parameters based on the synthesized non-reference high-band channel and a high-band portion of the non-reference target channel, at 706.
  • the spectral mapping estimator 414 may estimate the spectral mapping parameters 462 based on the synthesized non-reference high-band 458 and the non-reference high-band channel 460.
  • the one or more spectral mapping parameters are estimated based on a first autocorrelation value of the non-reference target channel at lag index one and a second autocorrelation value of the non-reference target channel at lag index zero.
  • the one or more spectral mapping parameters may include a particular spectral mapping parameter of at least two spectral mapping parameter candidates.
  • the particular spectral mapping parameter may correspond to a spectral mapping parameter of a previous frame if the at least two spectral mapping parameter candidates are non-real candidates.
  • the particular spectral mapping parameter may correspond to a spectral mapping parameter of a previous frame if each spectral mapping parameter candidate of the at least two spectral mapping parameter candidates has an absolute value that is greater than one.
  • the particular spectral mapping parameter may correspond to a spectral mapping parameter candidate having an absolute value less than one if only one spectral mapping parameter candidate of the at least two spectral mapping parameter candidates has an absolute value less than one.
  • the particular spectral mapping parameter may correspond to a spectral mapping parameter candidate having a smallest value if more than one of the at least two spectral mapping parameter candidates have an absolute value less than one.
  • the particular spectral mapping parameter may correspond to a spectral mapping parameter of a previous frame if more than one of the at least two spectral mapping parameter candidates have an absolute value less than one.
  • the method 700 also includes applying the one or more spectral mapping parameters to the synthesized non-reference high-band channel to generate a spectrally shaped synthesized non-reference high-band channel, at 708.
  • Applying the one or more spectral parameters may correspond to filtering the synthesized non-reference high-band channel based on a spectral mapping filter.
  • the spectrally shaped synthesized non-reference high-band channel may have a spectral envelope that is similar to a spectral envelope of the non-reference target channel.
  • for example, the spectral mapping applicator 502 may apply the quantized spectral mapping parameters 466 to the synthesized non-reference high-band 458 to generate the spectrally shaped synthesized non-reference high-band 514.
  • the spectrally shaped synthesized non-reference high-band 514 may have a spectral envelope that is similar to a spectral envelope of the non-reference high-band channel 460.
  • the spectrally shaped synthesized non-reference high-band channel may be used to estimate a gain mapping parameter.
  • the method 700 also includes generating an encoded bitstream based on the one or more spectral mapping parameters, at 710.
  • the spectral mapping quantizer 416 may generate the high-band spectral mapping bitstream 464 based on the spectral mapping parameters 462.
  • the method 700 further includes transmitting the encoded bitstream to a second device, at 712.
  • the transmitter 110 may transmit the ICBWE bitstream 242 (that includes the high-band spectral mapping bitstream 464) to the second device 106.
  • the method 700 may enable improved high-band estimation for audio encoding and audio decoding.
  • the quantized spectral mapping parameters 466 may be used to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 514) having a spectral envelope that approximates the spectral envelope of a high-band channel (e.g., the non-reference high-band channel 460).
  • the quantized spectral mapping parameters 466 may be used at the decoder 300 to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 646) that approximates the spectral envelope of the high-band channel at the encoder 200.
  • reduced artifacts may occur when reconstructing the high-band at the decoder 300 because the high-band may have a similar spectral envelope as the low-band on the encoder-side.
  • a method 800 of extracting spectral mapping parameters is shown.
  • the method 800 may be performed by the second device 106 of FIG. 1 .
  • the method 800 may be performed by the decoder 300.
  • the method 800 includes generating, at a decoder of a device, a reference channel and a non-reference target channel from a received bitstream, at 802.
  • the bitstream may be received from an encoder of a second device.
  • the decoder 300 may generate a non-reference channel from the low-band bitstream 246.
  • the reference channel and the non-reference target channel may be upmixed channels generated at the decoder 300.
  • the decoder 300 may generate the left and right channels without generating the reference channel and the non-reference target channel.
  • the method 800 also includes generating a synthesized non-reference high-band channel based on a non-reference high-band excitation corresponding to the non-reference target channel, at 804.
  • the LPC synthesis filter 604 may generate the synthesized non-reference high-band 642 by applying the dequantized high-band LPCs 640 to the non-reference high-band excitation 638.
  • the method 800 further includes extracting one or more spectral mapping parameters from a received spectral mapping bitstream, at 806.
  • the spectral mapping bitstream may be received from the encoder of the second device.
  • the spectral mapping dequantizer 608 may extract the dequantized spectral mapping bitstream 644 from the high-band spectral mapping bitstream 464.
  • the method 800 also includes generating a spectrally shaped non-reference high-band channel by applying the one or more spectral mapping parameters to the synthesized non-reference high-band channel, at 808.
  • the spectrally shaped synthesized non-reference high-band channel may have a spectral envelope that is similar to a spectral envelope of the non-reference target channel.
  • the spectral mapping applicator 606 may apply the dequantized spectral mapping bitstream 644 to the synthesized non-reference high-band to generate the spectrally shaped synthesized non-reference high-band 646.
  • the spectrally shaped synthesized non-reference high-band 646 may have a spectral envelope that is similar to a spectral envelope of the non-reference target channel.
  • the method 800 also includes generating an output signal based at least on the spectrally shaped non-reference high-band channel, the reference channel, and the non-reference target channel, at 810.
  • the decoder 300 may generate at least one of the output signals 126, 128 based on the spectrally shaped synthesized non-reference high-band 646.
  • the method 800 further includes rendering the output signal at a playback device, at 812.
  • the loudspeakers 142, 144 may render and output the output signals 126, 128, respectively.
  • the method 800 may enable improved high-band estimation for audio encoding and audio decoding.
  • the quantized spectral mapping parameters 466 may be used to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 514) having a spectral envelope that approximates the spectral envelope of a high-band channel (e.g., the non-reference high-band channel 460).
  • the quantized spectral mapping parameters 466 may be used at the decoder 300 to generate a synthesized high-band channel (e.g., the spectrally shaped synthesized non-reference high-band 646) that approximates the spectral envelope of the high-band channel at the encoder 200.
  • reduced artifacts may occur when reconstructing the high-band at the decoder 300 because the high-band may have a similar spectral envelope as the low-band on the encoder-side.
  • the encoder 900 may include or correspond to the encoder 200 of FIG. 1 or the mid channel BWE encoder 206 of FIG. 2B .
  • the encoder 900 includes the LPC estimator 251, the LPC quantizer 252, the high-band excitation generator 299 (including the non-linear BWE generator 253, the multiplier 255, the summer 257, the random noise generator 254, the noise envelope modulator 256, and the multiplier 258), the LPC synthesis filter 259, the high-band gain shape estimator 260, the high-band gain shape quantizer 261, the high-band gain shape scaler 262, the high-band gain frame estimator 263, the high-band gain frame quantizer 264, the multiplexer 265, a non harmonic high band detector 906, a high band mixing gains estimator 912, and a noise envelope control parameter estimator 916. Additionally, in some implementations, the encoder 900 also includes a non harmonic high band flag modifier 922.
  • the non harmonic high band detector 906 is configured to generate the non harmonic HB flag (x), (e.g., the multi-source flag) 910.
  • the non harmonic HB flag (e.g., the multi-source flag, x) 910 may have a value that indicates a harmonic metric of a high band signal, such as the high-band mid channel 292.
  • the non harmonic high band detector 906 may receive low band voicing (w) 902, a previous frame's gain frame 904, and the high-band mid channel 292, and the non harmonic high band detector 906 may determine the non harmonic HB flag (e.g., the multi-source flag, x) 910 based on the low band voicing (w) 902, the previous frame's gain frame 904, and the high-band mid channel 292, as further described herein.
  • the high band mixing gains estimator 912 is configured to receive low band voicing factors (z) 908 and the non harmonic HB flag (x) 910.
  • the high band mixing gains estimator 912 is configured to generate mixing gains (e.g., a first gain "Gain(1)" (encoder) and a second gain “Gain(2)" (encoder)) based on the low band voicing factors (z) 908 and the non harmonic HB flag (x) 910, as further described herein. It is noted that mixing at a high band excitation generator of the decoder is performed based on Gain(1) (decoder) and the Gain(2) (decoder), as described with reference to FIG. 10 .
  • the low-band excitation 232 is non-linearly extended by the non-linear BWE generator 253 to generate the harmonic high-band excitation 237.
  • the noise envelope control parameter estimator 916 is configured to receive low band voice factors (z) 914 and the non harmonic HB flag (x) 910.
  • the low band voice factors (z) 914 may be the same as or different from the low band voicing factors (z) 908.
  • the noise envelope control parameter estimator 916 is configured to generate a noise envelope control parameter(s) 918 (encoder) based on the low band voice factors (z) 914 and the non harmonic HB flag (x) 910.
  • the noise envelope control parameter estimator 916 is configured to provide the noise envelope control parameter(s) 918 (encoder) to the noise envelope modulator 256.
  • a "parameter (encoder)" refers to a parameter used by an encoder
  • a “parameter (decoder)” refers to a parameter used by a decoder.
  • Envelope modulated noise (e.g., modulated noise 482 (encoder)) is used for generating the noisy component of the high-band excitation 276.
  • an envelope used by the noise envelope modulator 256 (to generate the modulated noise 482 (encoder)) may be extracted based on the harmonic high-band excitation 237.
  • the envelope modulation is performed by the noise envelope modulator 256 by applying a low pass filter on the absolute values of the harmonic high-band excitation 237.
  • the low pass filter parameters are determined based on the noise envelope control parameter(s) 918 (encoder) determined by the noise envelope control parameter estimator 916.
  • the decoder may determine a noise envelope control parameter (decoder) based on low band voice factors and a non harmonic HB flag, such as the non harmonic HB flag (x) 910, the modified non harmonic HB flag (y) 920, or another non harmonic HB flag.
  • the gain-adjusted harmonic high-band excitation 273 may not be generated or the Gain(1) (encoder) may be set to a value of zero.
  • if the flag indicates that the high-band is non harmonic, the noise envelope control parameter(s) 918 indicate that the envelope to be applied to the noise 274 is to be a fast-varying envelope (e.g., the noise envelope modulator 256 can use a small length of samples - the noise envelope estimation process for each sample is less heavily reliant on the absolute value of the harmonic HB excitation's corresponding sample).
  • otherwise, the noise envelope control parameter(s) 918 indicate that the envelope to be applied to the noise 274 is to be a slow-varying envelope (e.g., the noise envelope modulator 256 can use a large length of samples - the noise envelope estimation process for each sample is more heavily reliant on the absolute value of the harmonic HB excitation's corresponding sample).
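A minimal sketch of the fast-versus-slow envelope control described above, assuming a one-pole low-pass over the absolute excitation values; the smoothing-factor values and function names are illustrative assumptions, not the actual noise envelope modulator 256:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Illustrative mapping from the non harmonic HB flag to a noise envelope
 * control parameter: a larger smoothing factor yields a fast-varying
 * envelope (non harmonic high band), a smaller one a slow-varying
 * envelope.  The numeric values are placeholders. */
static double noise_env_alpha(int non_harmonic_flag) {
    return non_harmonic_flag ? 0.9 : 0.1;
}

/* Modulate white noise with an envelope tracked from the absolute values
 * of the harmonic high-band excitation via a one-pole low-pass filter. */
static void modulate_noise(const double *harm_exc, const double *noise,
                           double *out, size_t n, double alpha) {
    double env = 0.0;
    for (size_t i = 0; i < n; i++) {
        env = (1.0 - alpha) * env + alpha * fabs(harm_exc[i]);
        out[i] = env * noise[i];
    }
}
```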
  • the flag indicates whether multiple audio sources are associated with the high-band mid signal.
  • the non harmonic flag or the multi-source flag (x) is used to control the noise envelope control parameter estimators 916, 1016 and the Gain(1) and Gain(2) used for the high-band excitation generation 299, 362.
  • the noise envelope modulator 256 may apply the envelope (e.g., based on the noise envelope control parameter(s) 918) to the noise 274 to generate the modulated noise 482 (encoder).
  • the high-band excitation 276 (e.g., a mixed HB excitation determined based on the harmonic high-band excitation 237, Gain(1) (encoder), the modulated noise 482 (encoder), and Gain(2) (encoder)) is used for further processing.
  • the encoder 900 may estimate and quantize one or more LPCs to be applied to the high-band excitation 276 to generate the synthesized high-band mid channel 277.
  • high band gain shapes and high band gain frame are further extracted and quantized for transmission to the decoder, such as the decoder 300 of FIG. 1 .
  • the non harmonic high band flag modifier 922 is configured to receive the high-band gain frame parameters 282 and the non harmonic HB flag (x) 910.
  • the non harmonic high band flag modifier 922 is configured to generate a modified non harmonic HB flag (y) 920 based on the high-band gain frame parameters 282 and the non harmonic HB flag (x) 910.
  • the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920 may indicate the same harmonic metric for the high-band (e.g., the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920 may have the same value).
  • the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920 may indicate different harmonic metrics for the high-band (e.g., the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920 may have different values).
  • although modification of the non harmonic HB flag (x) 910 is described as being based on the high-band gain frame parameters 282 (e.g., pre-quantized HB gain frame parameters), in other implementations the non harmonic HB flag (x) 910 may be modified based on the high-band gain frame bitstream 283 (e.g., quantized HB gain frame parameters) or based on both the high-band gain frame bitstream 283 and the high-band gain frame parameters 282. Additionally, it is noted that modification of the non harmonic HB flag (x) 910 is optional. In some implementations, such as stereo operation implementations, the encoder 900 (e.g., a TD-BWE encoder) outputs one or more other parameters for use in the ICBWE as described with reference to FIGS. 2B and 11 .
  • the decoder 1000 may include or correspond to the decoder 300 of FIG. 1 or the ICBWE decoder 306 of FIG. 3 .
  • the decoder 1000 includes the LPC dequantizer 360, the high-band excitation generator 362, the LPC synthesis filter 364, the high-band gain shape dequantizer 366, the high-band gain shape scaler 368, the high-band gain frame dequantizer 370, the high-band gain frame scaler 372, a high band mixing gains estimator 1012, and a noise envelope control parameter estimator 1016.
  • the decoder 1000 is a TD-BWE decoder used for mid signal high band coding (e.g., mid channel BWE decoding).
  • the decoder 1000 is configured to receive one or more bitstreams.
  • the one or more bit streams may include the high-band LPC bitstream 272, the high-band gain shape bitstream 280 and the high-band gain frame bitstream 283.
  • the decoder 1000 is further configured to receive a modified non harmonic HB flag (y) 1020.
  • the modified non harmonic HB flag (e.g., the multi-source flag, y) 1020 may include or correspond to the non harmonic HB flag (x) 910 or the modified non harmonic HB flag (y) 920.
  • the decoder 1000 may receive the modified non harmonic HB flag (y) 920 (from the encoder 900) as the modified non harmonic HB flag (y) 1020.
  • the decoder 1000 may receive the non harmonic HB flag (x) 910 (from the encoder 900) and may generate the modified non harmonic HB flag (y) 1020.
  • the decoder 1000 may include a non harmonic high band flag modifier, such as the non harmonic high band flag modifier 922 of FIG. 9 , and may receive the non harmonic HB flag (x) 910.
  • the decoder 1000 may also receive a high band gain frame parameter, such as the high-band gain frame parameters 282 from the encoder 900, and the decoder 1000 may determine the non harmonic HB flag (y) 1020 based on the high band gain frame parameter and the non harmonic HB flag (x) 910.
  • the decoder 1000 is configured to generate the modified non harmonic HB flag (y) 1020 independent of the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920.
  • the decoder 1000 may also receive low band voice factors (z) 1014.
  • the low band voice factors (z) 1014 may include or correspond to the low band voice factors (z) 914 of FIG. 9 .
  • the decoder 1000 may receive the low band voice factors (z) 914 as the low band voice factors (z) 1014.
  • the decoder 1000 may calculate the low band voice factors (z) 1014 or may receive the low band voice factors (z) 1014 from another component, such as the low-band decoder 304, the mid channel BWE decoder 302, or the ICBWE decoder 306 of FIG. 3A .
  • the decoder 1000 may perform operations similar to those described with reference to the ICBWE decoder 306 of FIGS. 3A and 3B and similar to those described with reference to the encoder 900 of FIG. 9 .
  • the high band mixing gains estimator 1012 may perform operations similar to those described with reference to the high band mixing gains estimator 912 of FIG. 9 .
  • the high band mixing gains estimator 1012 may receive the low band voice factors (z) 1014 and the modified non harmonic HB flag (y) 1020.
  • based on the low band voice factors (z) 1014 and the modified non harmonic HB flag (y) 1020, the high band mixing gains estimator 1012 generates mixing gains (e.g., Gain(1) (decoder) and Gain(2) (decoder)), as further described herein.
  • the high-band excitation generator 362 may correspond to the high-band excitation generator 299 of FIG. 9 and perform operations similar to those described with respect to the high-band excitation generator 299 of FIG. 9 .
  • the noise envelope control parameter estimator 1016 may perform operations similar to the noise envelope control parameter estimator 916 of FIG. 9 .
  • the noise envelope control parameter estimator 1016 receives the low band voice factors (z) 1014 and the modified non harmonic HB flag (y) 1020.
  • the noise envelope control parameter estimator 1016 generates the noise envelope control parameter 1018 (decoder) based on the low band voice factors (z) 1014 and the modified non harmonic HB flag (y) 1020, similar to the generation of the noise envelope control parameter(s) 918 described with reference to FIG. 9 .
  • based on the modified non harmonic HB flag (y) 1020, the decoder 1000 generates a high-band excitation 380.
  • generation of the high-band excitation 380 may include the high-band excitation generator 362 generating modulated noise and performing a mixing operation to generate the high-band excitation 380.
  • the modulated noise may be generated based on the noise envelope control parameter 1018 (decoder).
  • the mixing operation may be performed based on Gain(1) (decoder) and Gain(2) (decoder), as described with reference to FIG. 9 .
  • based on the generated high-band excitation 380, decoder values of the gain frame, the gain shapes, and other parameters from the BWE bitstream are determined. Additionally, the decoder 1000 generates the decoded high-band mid channel 662. For example, the dequantized high-band LPCs 640, the dequantized high-band gain shape 648, and the dequantized high-band gain frame 652 are used to generate the decoded high-band mid channel 662.
  • because the modified non harmonic HB flag (y) 1020 used by the decoder 1000 may differ (in value for a particular frame) from the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920 used by the encoder 900, the high-band excitation 276 on which the gain frame and gain shapes are estimated at the encoder 900 may be different from the high-band excitation 380 on which the gain frame and gain shapes are applied at the decoder 1000.
  • the decoder 1000 (e.g., a TD-BWE decoder) also outputs some other parameters which are used in the ICBWE decoding in case of stereo operation, as described with reference to FIGS. 3A , 3B , and 6 .
  • envelope shape modulated noise for the ICBWE, the target high band channel, and the mid channel may be similar or may differ for the different channels. Also, mixing gains may differ for the mid channel, the ICBWE, and the target high band channel, and may be determined as described in FIGS. 11-12 .
  • BWE may be performed with different non-linear mixing, different non-linear configurations, etc., based on the value of the flag, such as the non harmonic HB flag (x) 910.
  • the value of the flag may indicate the presence of multiple sources or multiple objects, etc., that may correspond to different coding modes (e.g., voiced, unvoiced, background, etc.).
  • the non harmonic HB flag (x) 910 may be referred to as a multi-source flag.
  • enhanced coding and reproduction may be achieved by the encoder/decoder of FIGS. 9-12 .
  • a particular implementation of a third portion 1100 of an inter-channel bandwidth extension encoder of the encoder of FIG. 1 is shown.
  • the third portion 1100 is included in the ICBWE encoder 204.
  • the third portion 1100 includes a high band mixing gains estimator 1102.
  • the high band mixing gains estimator 1102 is configured to receive the mixing gains (e.g., Gain(1) (encoder) and Gain(2) (encoder)), described with reference to FIGS. 2B and 9 , and to receive the modified non harmonic HB flag (y) 920, described with reference to FIG. 9 .
  • the high band mixing gains estimator 1102 is configured to generate Gain(a) (encoder) and Gain(b) (encoder), which may be provided to the non-reference high-band excitation generator 408 of FIG. 4 .
  • the Gain(a) (encoder) and the Gain(b) (encoder) are determined based on the relative energies of the HB reference and non reference channels, the noise floor of the HB non reference channel, etc. Additionally, or alternatively, the Gain(a) (encoder) and the Gain(b) (encoder) may be the same as the Gain(1) (encoder) and the Gain(2) (encoder) described with reference to FIGS. 2B and 9 .
  • the Gain(a) (encoder) and Gain(b) (encoder) are average values of Gain(1) (encoder) and Gain(2) (encoder), respectively, estimated over multiple subframes of each processing frame, and these values are further modified based on the modified non harmonic HB flag (y) 920.
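A hedged sketch of the averaging and flag-based modification described above; the subframe count and the particular modification applied when the flag is set are assumptions for illustration:

```c
#include <assert.h>
#include <math.h>

#define NSUBFR 4  /* illustrative subframe count per processing frame */

/* Average a per-subframe gain track over the frame. */
static double avg4(const double g[NSUBFR]) {
    double s = 0.0;
    for (int i = 0; i < NSUBFR; i++) s += g[i];
    return s / NSUBFR;
}

/* Gain(a)/Gain(b): average the per-subframe Gain(1)/Gain(2) values, then
 * modify them based on the (modified) non harmonic HB flag; the specific
 * modification used here (biasing the mix toward noise) is a placeholder
 * for whatever the codec actually applies. */
static void frame_mixing_gains(const double g1[NSUBFR], const double g2[NSUBFR],
                               int non_harmonic_flag,
                               double *gain_a, double *gain_b) {
    *gain_a = avg4(g1);
    *gain_b = avg4(g2);
    if (non_harmonic_flag) {
        *gain_a *= 0.5;
        *gain_b = 1.0;
    }
}
```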
  • the high band mixing gains estimator 1102 may determine the values of Gain(a) (encoder) and Gain(b) (encoder) based on the non harmonic HB flag (x) 910.
  • portion 1200 of an inter-channel bandwidth extension decoder of the decoder of FIG. 1 is shown.
  • the portion 1200 is included in the ICBWE decoder 306.
  • the portion 1200 includes a high band mixing gains estimator 1202.
  • the high band mixing gains estimator 1202 is configured to receive the mixing gains (e.g., Gain(1) (decoder) and Gain(2) (decoder)), described with reference to FIGS. 3B and 10 , and to receive the modified non harmonic HB flag (y) 920, described with reference to FIGS. 9 and 10 .
  • the high band mixing gains estimator 1202 is configured to generate Gain(a) (decoder) and Gain(b) (decoder).
  • the Gain(a) (decoder) and the Gain(b) (decoder) may be provided to the non-reference high-band excitation generator 602 of FIG. 6 .
  • the Gain(a) (decoder) and Gain(b) (decoder) are average values of Gain(1) (decoder) and Gain(2) (decoder), respectively, estimated over multiple subframes of each processing frame, and these values are further modified based on the modified non harmonic HB flag (y) 1020.
  • the high band mixing gains estimator 1202 may determine the values of Gain(a) (decoder) and Gain(b) (decoder) based on an equivalent of the non harmonic HB flag (x), either transmitted from an encoder or estimated at the ICBWE decoder 306 itself.
  • the following example is provided along with pseudo-code related to generation, use, and modification of the flag (e.g., the non harmonic HB flag (x) 910), the modified flag (e.g., the modified non harmonic HB flag (y) 920), or both.
  • An average of the LB voicing is determined based on a strength of correlation of the LB signal at the pitch lag. Voicing is different from voice factors: a voice factor is a parameter of the algebraic code-excited linear prediction (ACELP) coding method for the mid LB which signifies the ratio of the mixture of the adaptive codebook gain and the fixed codebook gain. Additionally, a previous (e.g., most recent) frame's gain frame may be retrieved.
  • the HB energy ratio, the average of the LB voicing, and the previous frame's gain frame may be used to calculate the likelihood (denoted pu below) of the HB being non harmonic based on a Gaussian Mixture Model (GMM) with pre-computed mean and covariance components for non harmonic HB signals. Additionally, the ratio, the average of the LB voicing, and the previous frame's gain frame may be used to calculate the likelihood (denoted pv below) of the HB being harmonic based on a Gaussian Mixture Model with pre-computed mean and covariance components for harmonic HB signals. Based on these likelihoods (pu and pv), different possible relations between these likelihoods may be classified as varying levels of harmonicity of HB.
  • examples below depict illustrative pseudo-code (e.g., simplified C-code in floating point) that may be compiled and stored in a memory, such as the memory 153 of the first device 104 or a memory of the second device 106 of FIG. 1 , or the memory 1832 of FIG. 18 .
  • the pseudo-code illustrates a possible implementation of aspects described herein.
  • the pseudo-code includes comments which are not part of the executable code.
  • a beginning of a comment is indicated by a forward slash and asterisk (e.g., "/*") and an end of the comment is indicated by an asterisk and a forward slash (e.g., "*/").
  • a comment "COMMENT" may appear in the pseudo-code as /* COMMENT */.
  • the "&&" operator indicates a logical AND operation.
  • the "||" operator indicates a logical OR operation.
  • the ">" operator represents "greater than"
  • the "<" operator indicates "less than".
  • the term "f" following a number indicates a floating point (e.g., decimal) number format.
  • "*" may represent a multiplication operation
  • "+" or "sum" may represent an addition operation
  • "abs" may represent an absolute value operation
  • "avg" may represent an average operation
  • "++" may indicate an increment
  • "-" may indicate a subtraction operation
  • "/" may represent a division operation.
  • Example 1A is presented below which classifies different possible relations between likelihoods as varying levels of harmonicity of a high-band.
  • the operations of Example 1A are performed by the non harmonic high band detector 906 of FIG. 9 .
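Since the pseudo-code of Example 1A is not reproduced here, the following illustrative simplified-C sketch shows one way the relations between the likelihoods pu and pv could be classified into varying levels of harmonicity; the threshold values are assumptions:

```c
#include <assert.h>

/* Map the relation between the non-harmonic likelihood pu and the
 * harmonic likelihood pv to a harmonicity level for the high band:
 * 0 = harmonic, 1 = weakly non harmonic, 2 = strongly non harmonic.
 * The 4x margin is an illustrative threshold. */
static int harmonicity_level(double pu, double pv) {
    if (pu > 4.0 * pv) return 2;   /* clearly non harmonic */
    if (pu > pv)       return 1;   /* weakly non harmonic  */
    return 0;                      /* harmonic             */
}
```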
  • Example 1B is presented below which classifies different possible relations between likelihoods as one of two different levels of harmonicity of a high band.
  • the non-harmonic HB flag may indicate harmonic or non harmonic.
  • the operations of Example 1B are performed by the non harmonic high band detector 906 of FIG. 9 .
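Similarly, a hypothetical sketch of the two-level classification of Example 1B, which reduces the likelihood comparison to a single binary flag (the 2x margin is an assumption):

```c
#include <assert.h>

/* Binary multi-source / non harmonic HB flag: set only when the
 * non-harmonic likelihood pu clearly dominates the harmonic
 * likelihood pv. */
static int ms_flag(double pu, double pv) {
    return (pu > 2.0 * pv) ? 1 : 0;
}
```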
  • Example 2 is presented below, which extracts the noise envelope based on the noise envelope control parameter and applies it to the white noise signal.
  • Example 2 also includes operations to determine a noise envelope control parameter, such as the noise envelope control parameter(s) 918 (encoder) or the noise envelope control parameter 1018 (decoder).
  • the operations of Example 2 are performed by the noise envelope control parameter estimator 916 and the noise envelope modulator 256 of FIG. 9 or the noise envelope control parameter estimator 1016 and the high-band excitation generator 362 of FIG. 10 .
  • although Example 2 includes a non harmonic flag having at least three possible values, in other implementations similar operations may be performed based on a non harmonic flag having two possible values. Additionally or alternatively, similar operations may be performed based on the multi-source flag MSFlag of Example 1B.
  • Control of how the noise envelope is estimated based on the Non_Harmonic_HB_Flag enables control of the envelope of the noise, which in effect controls the "buzziness" of the decoded high-band signal.
  • when implemented at a decoder, such as the decoder 300 or the decoder 1000, the Non Harmonic HB Flag is replaced by the received Non Harmonic HB Flag, which may be either the same as or a modified version of the non harmonic HB Flag. In other implementations, when implemented at the decoder, the Non Harmonic HB Flag is determined at the decoder.
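Example 2's envelope extraction and noise modulation can be sketched roughly as follows. The one-pole smoother and the parameter name are assumptions for illustration; the patent's exact envelope estimation is not reproduced here.

```python
import numpy as np

def modulate_noise(harmonic_excitation, noise_env_control, rng=None):
    """Estimate a temporal envelope from the harmonic high-band
    excitation and impose it on a white noise signal.

    noise_env_control in [0, 1): values near 1 smooth the envelope
    heavily so the noise tracks the excitation loosely (less "buzzy"
    output); values near 0 track the excitation closely."""
    rng = np.random.default_rng(0) if rng is None else rng
    env = np.abs(np.asarray(harmonic_excitation, dtype=float))
    for i in range(1, env.size):
        # One-pole smoothing of the rectified excitation; the control
        # parameter sets how flat the resulting envelope is.
        env[i] = noise_env_control * env[i - 1] + (1.0 - noise_env_control) * env[i]
    noise = rng.standard_normal(env.size)
    return noise * env
```

A larger control parameter yields a flatter envelope, which reduces the "buzziness" the surrounding text describes.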
  • Example 3 is presented below in which the excitation mixing (e.g., gains) is based on the Non Harmonic HB Flag.
  • the operations of Example 3 are performed by the high-band excitation generator 299 of FIG. 9 or the high-band excitation generator 362 of FIG. 10 .
  • Although Example 3 includes a non harmonic flag having at least three possible values, in other implementations similar operations may be performed based on a non harmonic flag having two possible values. Additionally or alternatively, similar operations may be performed based on the multi-source flag MSFlag of Example 1B.
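The flag-driven excitation mixing of Example 3 can be sketched as follows. The specific gain values per flag level are assumptions for illustration; only the energy-preserving relation between the two gains is a standard choice.

```python
import math

def mixing_gains(flag):
    """Map the non-harmonic HB flag (0 = strong harmonic, 1 = weak
    harmonic, 2 = strong non-harmonic; values are assumptions) to the
    mixing gains Gain(1) and Gain(2). The gains are chosen
    energy-preserving: g1**2 + g2**2 == 1."""
    g1 = {0: 0.9, 1: 0.6, 2: 0.2}[flag]  # gain on the harmonic excitation
    g2 = math.sqrt(1.0 - g1 * g1)        # gain on the modulated noise
    return g1, g2

def mix_excitation(harmonic_exc, modulated_noise, flag):
    """Combine the harmonic high-band excitation and the modulated
    noise using the flag-dependent gains."""
    g1, g2 = mixing_gains(flag)
    return [g1 * h + g2 * n for h, n in zip(harmonic_exc, modulated_noise)]
```

As the flag moves toward strong non-harmonic, the noise gain grows and the harmonic gain shrinks, so the decoded high band sounds noisier rather than buzzy.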
  • the method 1300 may be performed by the first device 104 of FIG. 1 .
  • the method 1300 may be performed by the encoder 200, such as at the encoder 900 of FIG. 9 (e.g., a mid channel BWE encoder).
  • the method 1300 includes receiving an audio signal at an encoder, at 1302.
  • the audio signal may correspond to the mid channel 222 of FIG. 2 that is received at the encoder 900.
  • the audio signal may correspond to an audio signal received via the first audio channel 130 or the second audio channel 132 of FIG. 1 .
  • the method 1300 includes generating a high band signal based on the received audio signal, at 1304.
  • the high band signal may correspond to the high-band mid channel 292 of FIG. 2 .
  • the method 1300 also includes determining a first flag value indicating a harmonic metric of the high band signal, at 1306.
  • the first flag value may correspond to a value of the non harmonic HB flag (x) 910 of FIG. 9 .
  • the harmonic metric may be determined to have a value of strong harmonic, weak harmonic, or strong non-harmonic.
  • the harmonic metric may be determined to have a value of harmonic or non harmonic.
  • an encoded version of the high band signal may be transmitted, at 1308.
  • the encoded version of the high band signal may correspond to the high-band mid channel bitstream 244, the ICBWE bitstream 242, the down-mix bitstream 216, or any combination thereof, of FIG. 2 .
  • the method 1300 may also include generating a low band signal based on the received audio signal (e.g., the low-band mid channel 294 of FIG. 2A ) and determining the flag value at least partially based on a low band voicing value (e.g., the low band voicing (w) 902 of FIG. 9 ) of the low band signal.
  • a gain frame value (e.g., the high-band gain frame parameters 282 of FIG. 9 ) may be determined for a first frame of the audio signal, and the first flag value corresponding to a second frame that follows the first frame may be determined at least partially based on the gain frame value of the first frame (e.g., the previous frame's gain frame 904 of FIG. 9 ).
  • the first flag value may be determined at least partially based on a ratio of an energy metric of a frame of the high band signal (e.g., the high-band mid channel 292 of FIG. 9 ) to a multi-frame energy metric of the high-band signal, such as described with reference to the non harmonic high band detector 906 of FIG. 9 .
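The energy-ratio cue mentioned above can be sketched as follows. The exact metric of the non harmonic high band detector 906 is not reproduced; this form, and its interpretation, are assumptions for illustration.

```python
def frame_to_longterm_energy_ratio(hb_frame, past_frame_energies):
    """Ratio of the current high-band frame's energy to the average
    energy over several past frames (a multi-frame energy metric).
    A ratio near 1 suggests a stationary, noise-like high band; large
    swings suggest harmonic or transient content."""
    frame_energy = sum(s * s for s in hb_frame)
    avg_energy = sum(past_frame_energies) / len(past_frame_energies)
    # Guard against an all-silent history.
    return frame_energy / max(avg_energy, 1e-12)
```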
  • a high band excitation signal may be generated based on a harmonically extended low band excitation signal and further based on the first flag value to generate a synthesized version of the high band signal, such as the scaled synthesized high-band mid channel 281 of FIG. 9 generated using the high-band excitation 276 that is based on the harmonic high-band excitation 237 and using mixing gains and noise envelope control parameter(s) 918 that are based on the non harmonic HB flag (x) 910.
  • the encoder may modify the first flag value based on a gain frame parameter corresponding to the synthesized version exceeding a threshold, such as at the non harmonic high band flag modifier 922.
  • the method 1300 may be performed at a stereo encoder that receives the audio signal (e.g., the first audio channel 130) and a second audio signal (e.g., the second audio channel 132) and generates a mid signal (e.g., the mid channel 222) based on the audio signal and the second audio signal.
  • the high band signal may correspond to a high-band portion of the mid signal (e.g., the high-band mid channel 292 of FIG. 2 and FIG. 9 ).
  • the first flag value may be used to generate the high-band excitation 276 in the BWE encoder of FIG. 9 .
  • the first flag value may be used to generate a non-reference high band excitation signal at least partially based on the first flag value during an inter-channel band width extension (ICBWE) encoding operation (e.g., the non-reference high-band excitation 638 of FIG. 6 generated using mixing gains from the high band mixing gains estimator 1102 of FIG. 11 ).
  • the method 1300 may enable improved encoding accuracy based on the first flag value indicating a harmonic metric of the high band signal.
  • the first flag value may be used to control generation of the high-band excitation 276, such as depicted with reference to the high-band excitation generator 299 of FIG. 9 .
  • Enhanced encoding accuracy may enable improved accuracy of audio playback at a decoding device, such as the second device 106 of FIG. 1 .
  • the method 1400 may be performed by the first device 104 of FIG. 1 .
  • the method 1400 may be performed by the encoder 200, such as at the encoder 900 of FIG. 9 (e.g., a mid channel BWE encoder).
  • the method 1400 includes determining a gain frame parameter corresponding to a frame of a high band signal, at 1402.
  • the gain frame parameter may correspond to one or more of the high-band gain frame parameters 282 of FIG. 9 .
  • the gain frame parameter may be generated by generating a high-band excitation signal (e.g., the high-band excitation 276 of FIG. 9 ) based on a low-band excitation signal and based on a flag (e.g., the non harmonic HB flag (x) 910 of FIG. 9 ), generating a synthesized version of the high-band signal (e.g., the scaled synthesized high-band mid channel 281 of FIG. 9 ) based on the high-band excitation signal, and comparing the frame of the high-band signal to a frame of the synthesized version of the high-band signal (e.g., to generate the high-band gain frame parameters 282).
  • the method 1400 includes comparing the gain frame parameter to a threshold, at 1404.
  • the non harmonic high band flag modifier 922 may compare one or more of the high-band gain frame parameters to a threshold amount.
  • a relatively large value of the high-band gain frame parameter may indicate that a frame of the high band signal that is predicted to be strongly harmonic is instead non-harmonic.
  • the method 1400 includes, in response to the gain frame parameter being greater than the threshold, modifying a flag (e.g., the non harmonic HB flag (x) 910 of FIG. 9 ) that corresponds to the frame and that indicates a harmonic metric of the high band signal, at 1406.
  • the flag may be modified from having a first value indicating the high band signal is harmonic to having a second value indicating the high band signal is non-harmonic.
  • the method 1400 further includes transmitting the modified flag, at 1408.
  • the modified flag (e.g., the modified non harmonic HB flag (y) 920 of FIG. 9 ) may be transmitted to the second device 106 via the high-band mid channel bitstream 244, the ICBWE bitstream 242, the down-mix bitstream 216, or any combination thereof, of FIG. 2 .
  • the method 1400 may enable improved encoding accuracy by correcting flag values that are determined to incorrectly indicate a harmonic metric of the high band.
  • the modified flag value may be used in additional encoding, such as to determine mixing gain values for inter-channel BWE encoding, as described with reference to FIGS. 2, 6, and 11 .
  • Sending the modified flag value to a decoder may enable the decoder to generate a more accurate synthesized version of an audio signal at the decoder.
  • Enhanced decoding accuracy may enable improved accuracy of audio playback at a decoding device.
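The flag-correction step of method 1400 can be sketched as follows. The flag encoding (0 = harmonic, 1 = non-harmonic) and the threshold value are assumptions for illustration.

```python
def maybe_modify_flag(flag, gain_frame_parameter, threshold=3.0):
    """If the gain frame parameter exceeds the threshold, the
    synthesized high band badly under-predicted the target frame's
    energy, so a frame marked harmonic is re-marked non-harmonic;
    otherwise the flag is passed through unchanged."""
    if flag == 0 and gain_frame_parameter > threshold:
        return 1  # correct the prediction: treat the frame as non-harmonic
    return flag
```

The corrected flag is then what gets transmitted to the decoder in place of the originally determined value.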
  • the method 1500 may be performed by the first device 104 of FIG. 1 .
  • the method 1500 may be performed by the encoder 200, such as at the encoder 900 of FIG. 9 (e.g., a mid channel BWE encoder).
  • the method 1500 includes receiving at least a first audio signal and a second audio signal at an encoder, at 1502.
  • the first audio signal may correspond to the left channel of FIG. 2 and the second audio signal may correspond to the right channel of FIG. 2 .
  • the method 1500 includes performing a downmix operation on the first audio signal and the second audio signal to generate a mid signal, at 1504.
  • the mid signal may correspond to the mid channel 222 of FIG. 2 .
  • the downmix operation may be performed by the downmixer 202 of FIG. 2 .
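A minimal mid/side downmix sketch is given below; mid = (L + R) / 2 is a standard choice assumed here for illustration, and the patent's downmixer 202 may differ (e.g., by applying inter-channel alignment first).

```python
def downmix(left, right):
    """Combine left and right channels into a mid channel carrying
    the common content and a side channel carrying the difference."""
    mid = [(l + r) * 0.5 for l, r in zip(left, right)]
    side = [(l - r) * 0.5 for l, r in zip(left, right)]
    return mid, side
```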
  • the method 1500 includes generating a low-band mid signal and a high-band mid signal based on the mid signal, at 1506.
  • the low-band mid signal may correspond to the low-band mid channel 294 of FIG. 2
  • the high-band mid signal may correspond to the high-band mid channel 292 of FIG. 2 .
  • the low-band mid signal corresponds to a low frequency portion of the mid signal
  • the high-band mid signal corresponds to a high frequency portion of the mid signal.
  • the method 1500 includes determining, based at least partially on a voicing value of the low band signal and a gain value corresponding to the high-band mid signal, a value of a multi-source flag associated with the high-band mid signal, at 1508.
  • the flag may correspond to a value of the non harmonic HB flag (x) 910 of FIG. 9 , which may be referred to as a multi-source flag.
  • the multi-source flag indicates whether multiple audio sources are associated with the high-band mid signal.
  • the value of the flag may be based on the low band voicing (w) 902 and the previous frame's gain frame 904 of FIG. 9 .
  • the method 1500 includes generating a high-band mid excitation signal based at least in part on the multi-source flag, at 1510.
  • the high-band mid excitation signal may include or correspond to the high-band excitation 276 of FIG. 9 .
  • the encoder may be configured to generate the high band excitation signal by combining a non-linear harmonic excitation signal (e.g., the harmonic high-band excitation 237) and modulated noise (e.g., the modulated noise 482), and the encoder may control mixing of the non-linear harmonic excitation signal and the modulated noise based on the multi-source flag.
  • the encoder may be configured to set a value of at least one of a first gain associated with the non-linear harmonic excitation signal (e.g., Gain(1) of FIG. 9 ) and a second gain associated with the modulated noise (e.g., Gain(2) of FIG. 9 ) based on the multi-source flag.
  • the encoder may be configured to generate modulated noise based on the non-linear harmonic excitation signal (e.g., the harmonic high-band excitation 237) and further based on a noise envelope control parameter (e.g., the noise envelope control parameter(s) 918 of FIG. 9 ).
  • the noise envelope control parameter may be at least partially based on the multi-source flag (e.g., the noise envelope control parameter estimator 916 is responsive to the non harmonic HB flag (x) 910), and the encoder may be configured to generate the high-band mid excitation signal at least partially based on the modulated noise (e.g., via applying Gain (2) to the modulated noise 482 at the multiplier 258 and combining with an output of the multiplier 255 of FIG. 9 to generate the high-band excitation 276).
  • the noise envelope control parameter may be further based on a low band voice factor, such as one or more of the low band voice factors (z) 914 of FIG. 9 .
  • the method 1500 includes generating a bitstream based at least in part on the high-band mid excitation signal, at 1512.
  • the bitstream may correspond to the high-band mid channel bitstream 244, the ICBWE bitstream 242, the down-mix bitstream 216, or any combination thereof, of FIG. 2A .
  • the method 1500 further includes transmitting the bitstream and the multi-source flag from the encoder to a device, at 1514.
  • the bitstream may correspond to the high-band mid channel bitstream 244, the ICBWE bitstream 242, the down-mix bitstream 216, or any combination thereof, of FIG. 2A , and the bitstream and the multi-source flag may be transmitted to the second device 106 (e.g., a decoder) of FIG. 1 .
  • the method 1500 may enable improved encoding accuracy based on the flag indicating a harmonic metric of the high band signal that is used to control generation of the high-band excitation 276, such as depicted with reference to the high-band excitation generator 299 of FIG. 9 .
  • Enhanced encoding accuracy may enable improved accuracy of audio playback at a decoding device, such as the second device 106 of FIG. 1 .
  • a method 1600 of audio signal decoding is shown.
  • the method 1600 may be performed by the second device 106 of FIG. 1 .
  • the method 1600 may be performed by the decoder 300, such as at the decoder 1000 of FIG. 10 (e.g., a mid channel BWE decoder).
  • the method 1600 includes receiving a bitstream corresponding to an encoded version of an audio signal, at 1602.
  • the decoder 300 may receive the bitstream including the low-band bitstream 246, the high-band mid channel bitstream 244, the ICBWE bitstream 242, the down-mix bitstream 216, or any combination thereof.
  • the method 1600 also includes generating a high band excitation signal based on a low band excitation signal and further based on a first flag value indicating a harmonic metric of a high band signal, where the high band signal corresponds to a high band portion of the audio signal, at 1604.
  • the harmonic metric may have a value of strong harmonic, weak harmonic, or strong non-harmonic, such as described with reference to the non harmonic HB flag (x) 910 and the modified non harmonic HB flag (y) 920, 1020 of FIG. 9 and FIG. 10 .
  • the harmonic metric may have a value of harmonic or non-harmonic, as described herein.
  • the bitstream includes the flag value.
  • the mid channel BWE encoder illustrated in FIG. 9 may determine the modified non harmonic HB flag (y) 920 and may transmit the modified non harmonic HB flag (y) 920 (e.g., via data in the bitstream indicating a value of the modified non harmonic HB flag (y) 920) to the decoder 300.
  • the decoder determines the flag value at least partially based on a low band voicing value of a low band signal, where the low band signal corresponds to a low band portion of the audio signal.
  • the mid channel BWE decoder depicted in FIG. 10 may include the non harmonic high band detector 906 and the non harmonic high band flag modifier 922 of FIG. 9 .
  • the bitstream includes a first flag value (e.g., the non harmonic HB flag (x) 910) and the decoder determines a gain frame parameter corresponding to a frame of the high band signal and modifies the first flag value to generate the flag value in response to the gain frame parameter being greater than a threshold (e.g., the decoder of FIG. 10 receives the non harmonic HB flag (x) 910 from an encoder and includes the non harmonic high band flag modifier 922 to generate the modified non harmonic HB flag (y) 1020).
  • the high band excitation signal may be generated by non-linearly extending the low band excitation signal and combining the non-linearly extended low band excitation signal with modulated noise, such as at the high-band excitation generator 362 of FIG. 10 functioning in a similar manner as described with reference to the high-band excitation generator 299 of FIG. 9 .
  • the method 1600 may include setting a value of at least one of a first gain associated with the non-linearly extended low band excitation signal and a second gain associated with the modulated noise based on the first flag value, such as Gain(1) and Gain(2) output by the high band mixing gains estimator 1012 and input to the high-band excitation generator 362 of FIG. 10 .
  • the modulated noise may be generated by non-linearly extending the low band excitation signal and by modulating a noise signal based on the non-linearly extended low band excitation signal and further based on a noise envelope control parameter.
  • the noise envelope control parameter may be at least partially based on the first flag value, such as noise envelope control parameter 1018 of FIG. 10 generated by the noise envelope control parameter estimator 1016 based on the modified non harmonic HB flag (y) 920.
  • the noise envelope control parameter may be further based on the low band voice factor (z) 1014 received at the noise envelope control parameter estimator 1016.
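The non-linear extension step mentioned above can be sketched with one common choice of non-linearity, assumed here for illustration (the patent's exact non-linear function is not reproduced):

```python
import numpy as np

def nonlinear_extend(low_band_excitation):
    """An absolute-value non-linearity regenerates harmonic structure
    above the original band; the mean (DC component) introduced by the
    rectification is removed afterwards so only the harmonic structure
    remains."""
    x = np.abs(np.asarray(low_band_excitation, dtype=float))
    return x - x.mean()
```

The extended excitation would then be combined with modulated noise under the flag-dependent gains, as the surrounding text describes for the high-band excitation generator 362.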
  • a synthesized version of the high band signal may be generated based on the high band excitation signal.
  • the high-band excitation signal may be used to generate the decoded high-band mid channel 662 of FIG. 3B , FIG. 6 and FIG. 10 .
  • the decoded high-band mid channel 662 may be used to generate the left high-band channel 330 and the right high-band channel 332.
  • the synthesized version of the high band signal may be combined with a synthesized version of a low band signal (e.g., the left low-band channel 334 or the right low-band channel 336) to generate a synthesized version of the audio signal (e.g., the left channel 350 or the right channel 352).
  • the decoder may be a stereo decoder and may generate the high band excitation signal during an inter-channel bandwidth extension (ICBWE) operation, such as the non-reference high-band excitation 638 of the ICBWE decoder 306 of FIG. 6 .
  • the method 1600 may enable improved accuracy of synthesized audio signals where the original audio signal has a non-harmonic high band. Enhanced accuracy may enable an improved user experience during audio playback at a decoding device, such as the second device 106 of FIG. 1 .
  • referring to FIG. 17 , a block diagram of a particular illustrative example of a device (e.g., a wireless communication device) is depicted and generally designated 1700.
  • the device 1700 may have fewer or more components than illustrated in FIG. 17 .
  • the device 1700 may correspond to the first device 104 of FIG. 1 or the second device 106 of FIG. 1 .
  • the device 1700 may perform one or more operations described with reference to systems and methods of FIGS. 1-16 .
  • the device 1700 includes a processor 1706 (e.g., a central processing unit (CPU)).
  • the device 1700 may include one or more additional processors 1710 (e.g., one or more digital signal processors (DSPs)).
  • the processors 1710 may include a media (e.g., speech and music) coder-decoder (CODEC) 1708, and an echo canceller 1712.
  • the CODEC 1708 may include the decoder 300, the encoder 200, or a combination thereof.
  • the encoder 200 may include the ICBWE encoder 204, and the decoder 300 may include the ICBWE decoder 306.
  • the encoder 200 may be configured to generate the non harmonic HB flag (x) 910.
  • the encoder 200 is configured to modify the non harmonic HB flag (x) 910 to generate the modified non harmonic HB flag (y) 920.
  • the encoder 200 may be configured to use the non harmonic HB flag (x) 910, the modified non harmonic HB flag (y) 920, or both, as described herein with reference to at least FIGS. 1 and 9-16 .
  • the decoder 300 may be configured to receive or generate a non harmonic HB flag, a modified non harmonic HB flag, or both.
  • the decoder 300 may be configured to use the non harmonic HB flag, the modified non harmonic HB flag, or both, as described herein with reference to at least FIGS. 1 and 9-16 .
  • the device 1700 may include a memory 153 and a CODEC 1734.
  • the CODEC 1708 is illustrated as a component of the processors 1710 (e.g., dedicated circuitry and/or executable programming code), in other implementations one or more components of the CODEC 1708, such as the decoder 300, the encoder 200, or a combination thereof, may be included in the processor 1706, the CODEC 1734, another processing component, or a combination thereof.
  • the device 1700 may include the transmitter 110 coupled to an antenna 1742.
  • the device 1700 may include a display 1728 coupled to a display controller 1726.
  • One or more speakers 1748 may be coupled to the CODEC 1734.
  • One or more microphones 1746 may be coupled, via the input interfaces 112, to the CODEC 1734.
  • the speakers 1748 may include the first loudspeaker 142, the second loudspeaker 144 of FIG. 1 , or a combination thereof.
  • the microphones 1746 may include the first microphone 146, the second microphone 148 of FIG. 1 , or a combination thereof.
  • the CODEC 1734 may include a digital-to-analog converter (DAC) 1702 and an analog-to-digital converter (ADC) 1704.
  • the memory 153 may include instructions 191 executable by the processor 1706, the processors 1710, the CODEC 1734, another processing unit of the device 1700, or a combination thereof, to perform one or more operations described with reference to FIGS. 1-16 .
  • One or more components of the device 1700 may be implemented via dedicated hardware (e.g., circuitry), by a processor executing instructions to perform one or more tasks, or a combination thereof.
  • the memory 153 or one or more components of the processor 1706, the processors 1710, and/or the CODEC 1734 may be a memory device, such as a random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
  • the memory device may include instructions (e.g., the instructions 191) that, when executed by a computer (e.g., a processor in the CODEC 1734, the processor 1706, and/or the processors 1710), may cause the computer to perform one or more operations described with reference to FIGS. 1-16 .
  • the memory 153 or the one or more components of the processor 1706, the processors 1710, and/or the CODEC 1734 may be a non-transitory computer-readable medium that includes instructions (e.g., the instructions 191) that, when executed by a computer (e.g., a processor in the CODEC 1734, the processor 1706, and/or the processors 1710), cause the computer to perform one or more operations described with reference to FIGS. 1-16 .
  • the device 1700 may be included in a system-in-package or system-on-chip device 1722 (e.g., a mobile station modem (MSM)).
  • the processor 1706, the processors 1710, the display controller 1726, the memory 153, the CODEC 1734, and the transmitter 110 are included in a system-in-package or the system-on-chip device 1722.
  • an input device 1730, such as a touchscreen and/or keypad, and a power supply 1744 are coupled to the system-on-chip device 1722.
  • each of the display 1728, the input device 1730, the speakers 1748, the microphones 1746, the antenna 1742, and the power supply 1744 can be coupled to a component of the system-on-chip device 1722, such as an interface or a controller.
  • the device 1700 may include a wireless telephone, a mobile communication device, a mobile phone, a smart phone, a cellular phone, a laptop computer, a desktop computer, a computer, a tablet computer, a set top box, a personal digital assistant (PDA), a display device, a television, a gaming console, a music player, a radio, a video player, an entertainment unit, a communication device, a fixed location data unit, a personal media player, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, a decoder system, an encoder system, or any combination thereof.
  • referring to FIG. 18 , a block diagram of a particular illustrative example of a base station 1800 is depicted.
  • the base station 1800 may have more components or fewer components than illustrated in FIG. 18 .
  • the base station 1800 may include the first device 104 or the second device 106 of FIG. 1 .
  • the base station 1800 may operate according to one or more of the methods or systems described with reference to FIGS. 1-16 .
  • the base station 1800 may be part of a wireless communication system.
  • the wireless communication system may include multiple base stations and multiple wireless devices.
  • the wireless communication system may be a Long Term Evolution (LTE) system, a Code Division Multiple Access (CDMA) system, a Global System for Mobile Communications (GSM) system, a wireless local area network (WLAN) system, or some other wireless system.
  • a CDMA system may implement Wideband CDMA (WCDMA), CDMA 1X, Evolution-Data Optimized (EVDO), Time Division Synchronous CDMA (TD-SCDMA), or some other version of CDMA.
  • the wireless devices may also be referred to as user equipment (UE), a mobile station, a terminal, an access terminal, a subscriber unit, a station, etc.
  • the wireless devices may include a cellular phone, a smartphone, a tablet, a wireless modem, a personal digital assistant (PDA), a handheld device, a laptop computer, a smartbook, a netbook, a tablet, a cordless phone, a wireless local loop (WLL) station, a Bluetooth device, etc.
  • the wireless devices may include or correspond to the device 1700 of FIG. 17 .
  • the base station 1800 includes a processor 1806 (e.g., a CPU).
  • the base station 1800 may include a transcoder 1810.
  • the transcoder 1810 may include an audio CODEC 1808.
  • the transcoder 1810 may include one or more components (e.g., circuitry) configured to perform operations of the audio CODEC 1808.
  • the transcoder 1810 may be configured to execute one or more computer-readable instructions to perform the operations of the audio CODEC 1808.
  • the audio CODEC 1808 is illustrated as a component of the transcoder 1810, in other examples one or more components of the audio CODEC 1808 may be included in the processor 1806, another processing component, or a combination thereof.
  • a decoder 1838 (e.g., a vocoder decoder) may be included in a receiver data processor 1864.
  • an encoder 1836 (e.g., a vocoder encoder) may be included in a transmission data processor 1882.
  • the transcoder 1810 may function to transcode messages and data between two or more networks.
  • the transcoder 1810 may be configured to convert message data and audio data from a first format (e.g., a digital format) to a second format.
  • the decoder 1838 may decode encoded signals having a first format and the encoder 1836 may encode the decoded signals into encoded signals having a second format.
  • the transcoder 1810 may be configured to perform data rate adaptation. For example, the transcoder 1810 may down-convert a data rate or up-convert the data rate without changing a format of the audio data. To illustrate, the transcoder 1810 may down-convert 64 kbit/s signals into 16 kbit/s signals.
  • the audio CODEC 1808 may include the encoder 1836 and the decoder 1838.
  • the encoder 1836 may include the encoder 200 of FIG. 1 .
  • the decoder 1838 may include the decoder 300 of FIG. 1 .
  • the encoder 1836 may be configured to generate the non harmonic HB flag (x) 910. Additionally, in some implementations, the encoder 1836 is configured to modify the non harmonic HB flag (x) 910 to generate the modified non harmonic HB flag (y) 920.
  • the encoder 1836 may be configured to use the non harmonic HB flag (x) 910, the modified non harmonic HB flag (y) 920, or both, as described herein with reference to at least FIGS. 1 and 9-16 .
  • the decoder 1838 may be configured to receive or generate a non harmonic HB flag (x) 910, a modified non harmonic HB flag (y) 920, or both.
  • the decoder 1838 may be configured to use the non harmonic HB flag (x) 910, the modified non harmonic HB flag (y) 920, or both, as described herein with reference to at least FIGS. 1 and 9-16 .
  • the base station 1800 may include a memory 1832.
  • the memory 1832 such as a computer-readable storage device, may include instructions.
  • the instructions may include one or more instructions that are executable by the processor 1806, the transcoder 1810, or a combination thereof, to perform one or more operations described with reference to the methods and systems of FIGS. 1-16 .
  • the base station 1800 may include multiple transmitters and receivers (e.g., transceivers), such as a first transceiver 1852 and a second transceiver 1854, coupled to an array of antennas.
  • the array of antennas may include a first antenna 1842 and a second antenna 1844.
  • the array of antennas may be configured to wirelessly communicate with one or more wireless devices, such as the device 1700 of FIG. 17 .
  • the second antenna 1844 may receive a data stream 1814 (e.g., a bitstream) from a wireless device.
  • the data stream 1814 may include messages, data (e.g., encoded speech data), or a combination thereof.
  • the base station 1800 may include a network connection 1860, such as a backhaul connection.
  • the network connection 1860 may be configured to communicate with a core network or one or more base stations of the wireless communication network.
  • the base station 1800 may receive a second data stream (e.g., messages or audio data) from a core network via the network connection 1860.
  • the base station 1800 may process the second data stream to generate messages or audio data and provide the messages or the audio data to one or more wireless devices via one or more antennas of the array of antennas or to another base station via the network connection 1860.
  • the network connection 1860 may be a wide area network (WAN) connection, as an illustrative, non-limiting example.
  • the core network may include or correspond to a Public Switched Telephone Network (PSTN), a packet backbone network, or both.
  • the base station 1800 may include a media gateway 1870 that is coupled to the network connection 1860 and the processor 1806.
  • the media gateway 1870 may be configured to convert between media streams of different telecommunications technologies.
  • the media gateway 1870 may convert between different transmission protocols, different coding schemes, or both.
  • the media gateway 1870 may convert from pulse-code modulation (PCM) signals to Real-Time Transport Protocol (RTP) signals, as an illustrative, non-limiting example.
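The PCM-to-RTP direction mentioned above can be sketched as wrapping a fixed 12-byte RTP header (per RFC 3550) around a block of PCM samples. The function name and packet sizing below are illustrative, not taken from the patent:

```python
import struct

def rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
               payload_type: int = 0) -> bytes:
    """Build a minimal RTP packet (RFC 3550) around a PCM payload.

    payload_type 0 is PCMU (G.711 mu-law); no CSRCs, padding,
    or header extensions are used in this sketch.
    """
    version = 2
    first_byte = version << 6            # V=2, P=0, X=0, CC=0
    second_byte = payload_type & 0x7F    # M=0, PT in low 7 bits
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

# 20 ms of 8 kHz G.711 audio = 160 payload bytes per packet
pcm_frame = bytes(160)
pkt = rtp_packet(pcm_frame, seq=1, timestamp=160, ssrc=0x1234)
```

A gateway would emit one such packet per codec frame, advancing `seq` by one and `timestamp` by the number of samples per frame.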
  • the media gateway 1870 may convert data between packet switched networks (e.g., a Voice Over Internet Protocol (VoIP) network, an IP Multimedia Subsystem (IMS), a fourth generation (4G) wireless network, such as LTE, WiMax, and UMB, etc.), circuit switched networks (e.g., a PSTN), and hybrid networks (e.g., a second generation (2G) wireless network, such as GSM, GPRS, and EDGE, a third generation (3G) wireless network, such as WCDMA, EV-DO, and HSPA, etc.).
  • the media gateway 1870 may include a transcoder and may be configured to transcode data when codecs are incompatible.
  • the media gateway 1870 may transcode between an Adaptive Multi-Rate (AMR) codec and a G.711 codec, as an illustrative, non-limiting example.
  • the media gateway 1870 may include a router and a plurality of physical interfaces.
  • the media gateway 1870 may also include a controller (not shown).
  • the media gateway controller may be external to the media gateway 1870, external to the base station 1800, or both.
  • the media gateway controller may control and coordinate operations of multiple media gateways.
  • the media gateway 1870 may receive control signals from the media gateway controller and may function to bridge between different transmission technologies and may add service to end-user capabilities and connections.
  • the base station 1800 may include a demodulator 1862 that is coupled to the transceivers 1852, 1854, a receiver data processor 1864, and the processor 1806, and the receiver data processor 1864 may be coupled to the processor 1806.
  • the demodulator 1862 may be configured to demodulate modulated signals received from the transceivers 1852, 1854 and to provide demodulated data to the receiver data processor 1864.
  • the receiver data processor 1864 may be configured to extract a message or audio data from the demodulated data and send the message or the audio data to the processor 1806.
  • the base station 1800 may include a transmission data processor 1882 and a transmission multiple input-multiple output (MIMO) processor 1884.
  • the transmission data processor 1882 may be coupled to the processor 1806 and the transmission MIMO processor 1884.
  • the transmission MIMO processor 1884 may be coupled to the transceivers 1852, 1854 and the processor 1806. In some implementations, the transmission MIMO processor 1884 may be coupled to the media gateway 1870.
  • the transmission data processor 1882 may be configured to receive the messages or the audio data from the processor 1806 and to code the messages or the audio data based on a coding scheme, such as CDMA or orthogonal frequency-division multiplexing (OFDM), as an illustrative, non-limiting example.
  • the transmission data processor 1882 may provide the coded data to the transmission MIMO processor 1884.
  • the coded data may be multiplexed with other data, such as pilot data, using CDMA or OFDM techniques to generate multiplexed data.
  • the multiplexed data may then be modulated (i.e., symbol mapped) by the transmission data processor 1882 based on a particular modulation scheme (e.g., binary phase-shift keying ("BPSK"), quadrature phase-shift keying ("QPSK"), M-ary phase-shift keying ("M-PSK"), M-ary quadrature amplitude modulation ("M-QAM"), etc.) to generate modulation symbols.
  • the coded data and other data may be modulated using different modulation schemes.
  • the data rate, coding, and modulation for each data stream may be determined by instructions executed by processor 1806.
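The symbol mapping step described above can be illustrated with Gray-coded BPSK and QPSK constellation tables. These maps are a generic textbook convention, not the exact tables used by any particular air-interface standard:

```python
import math

# Gray-coded constellation maps (illustrative)
BPSK = {0: 1 + 0j, 1: -1 + 0j}
QPSK = {  # unit-energy, Gray-coded
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
    (1, 0): complex(1, -1) / math.sqrt(2),
}

def map_qpsk(bits):
    """Map an even-length bit sequence to QPSK modulation symbols,
    two bits per symbol."""
    assert len(bits) % 2 == 0
    return [QPSK[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

symbols = map_qpsk([0, 0, 1, 1])  # two symbols from four bits
```

Higher-order schemes such as M-QAM follow the same pattern with larger lookup tables (log2(M) bits per symbol).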
  • the transmission MIMO processor 1884 may be configured to receive the modulation symbols from the transmission data processor 1882 and may further process the modulation symbols and may perform beamforming on the data. For example, the transmission MIMO processor 1884 may apply beamforming weights to the modulation symbols. The beamforming weights may correspond to one or more antennas of the array of antennas from which the modulation symbols are transmitted.
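Applying beamforming weights, as described above, amounts to multiplying each modulation symbol by a per-antenna complex weight, yielding one transmit stream per antenna. The weights below are illustrative placeholders:

```python
def beamform(symbols, weights):
    """Apply per-antenna complex beamforming weights to each modulation
    symbol, producing one transmit stream per antenna element."""
    return [[w * s for s in symbols] for w in weights]

# two antennas; the second weight applies a 90-degree phase shift
streams = beamform([1 + 0j, -1 + 0j], [1 + 0j, 0 + 1j])
```

In a real MIMO processor the weight vector is chosen per user or per subcarrier from channel-state feedback; here it is fixed for clarity.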
  • the second antenna 1844 of the base station 1800 may receive a data stream 1814.
  • the second transceiver 1854 may receive the data stream 1814 from the second antenna 1844 and may provide the data stream 1814 to the demodulator 1862.
  • the demodulator 1862 may demodulate modulated signals of the data stream 1814 and provide demodulated data to the receiver data processor 1864.
  • the receiver data processor 1864 may extract audio data from the demodulated data and provide the extracted audio data to the processor 1806.
  • the processor 1806 may provide the audio data to the transcoder 1810 for transcoding.
  • the decoder 1838 of the transcoder 1810 may decode the audio data from a first format into decoded audio data and the encoder 1836 may encode the decoded audio data into a second format.
  • the encoder 1836 may encode the audio data using a higher data rate (e.g., up-convert) or a lower data rate (e.g., down-convert) than received from the wireless device.
  • the audio data may not be transcoded.
  • the transcoding operations may be performed by multiple components of the base station 1800.
  • decoding may be performed by the receiver data processor 1864 and encoding may be performed by the transmission data processor 1882.
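The down-convert step of the transcoding path above can be sketched as sample-rate reduction of the decoded PCM. This is a deliberately naive decimator, assuming decoded audio at 16 kHz being halved to 8 kHz; a production transcoder would apply a proper low-pass filter before decimating:

```python
def down_convert(samples, factor=2):
    """Naive down-conversion of decoded PCM by decimation.

    Averages each group of `factor` adjacent samples as a crude
    anti-alias step; a real transcoder low-pass filters first.
    """
    out = []
    for i in range(0, len(samples) - factor + 1, factor):
        out.append(sum(samples[i:i + factor]) / factor)
    return out

pcm_16k = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
pcm_8k = down_convert(pcm_16k)
```

The re-encode at the lower rate would then run on `pcm_8k`; the decode and encode stages themselves are codec-specific and are omitted here.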
  • the processor 1806 may provide the audio data to the media gateway 1870 for conversion to another transmission protocol, coding scheme, or both.
  • the media gateway 1870 may provide the converted data to another base station or core network via the network connection 1860.
  • Encoded audio data generated at the encoder 1836 may be provided to the transmission data processor 1882 or the network connection 1860 via the processor 1806.
  • the transcoded audio data from the transcoder 1810 may be provided to the transmission data processor 1882 for coding according to a modulation scheme, such as OFDM, to generate the modulation symbols.
  • the transmission data processor 1882 may provide the modulation symbols to the transmission MIMO processor 1884 for further processing and beamforming.
  • the transmission MIMO processor 1884 may apply beamforming weights and may provide the modulation symbols to one or more antennas of the array of antennas, such as the first antenna 1842 via the first transceiver 1852.
  • the base station 1800 may provide a transcoded data stream 1816, which corresponds to the data stream 1814 received from the wireless device, to another wireless device.
  • the transcoded data stream 1816 may have a different encoding format, data rate, or both, than the data stream 1814.
  • the transcoded data stream 1816 may be provided to the network connection 1860 for transmission to another base station or a core network.
  • one or more components of the systems and devices disclosed herein may be integrated into a decoding system or apparatus (e.g., an electronic device, a CODEC, or a processor therein), into an encoding system or apparatus, or both.
  • one or more components of the systems and devices disclosed herein may be integrated into a wireless telephone, a tablet computer, a desktop computer, a laptop computer, a set top box, a music player, a video player, an entertainment unit, a television, a game console, a navigation device, a communication device, a personal digital assistant (PDA), a fixed location data unit, a personal media player, or another type of device.
  • a first apparatus includes means for receiving an audio signal.
  • the means for receiving may include the encoder 200 of FIGS. 1 , 2A , or 17 , the filterbank 290 of FIG. 2A , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the first apparatus may also include means for generating a high band signal based on the received audio signal.
  • the means for generating the high band signal based on the received audio signal may include the encoder 200 of FIGS. 1 , 2A , or 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the first apparatus may also include means for determining a first flag value indicating a harmonic metric of the high band signal.
  • the means for determining the first flag value may include the encoder 200 of FIGS. 1 , 2A , and 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the non harmonic high band detector 906 of FIG. 9 , the non harmonic high band flag modifier 922 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the first apparatus may also include means for transmitting an encoded version of the high band signal.
  • the means for transmitting may include the transmitter 110 of FIGS. 1 and 17 , the first transceiver 1852 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • a second apparatus includes means for determining a gain frame parameter corresponding to a frame of a high-band signal.
  • the means for determining the gain frame parameter may include the encoder 200 of FIGS. 1 , 2A , or 17 , the filterbank 290 of FIG. 2A , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the high-band gain frame estimator 263 of FIG. 2B or FIG. 9 , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the second apparatus may also include means for comparing a gain frame parameter to a threshold.
  • the means for comparing a gain frame parameter to a threshold may include the encoder 200 of FIGS. 1 , 2A , or 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the non harmonic high band flag modifier 922 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the second apparatus may also include means for modifying a flag in response to the gain frame parameter being greater than the threshold, the flag corresponding to the frame and indicating a harmonic metric of the high band signal.
  • the means for modifying the flag may include the encoder 200 of FIGS. 1 , 2A , or 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the non harmonic high band flag modifier 922 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the second apparatus may also include means for transmitting an encoded version of the high band signal.
  • the means for transmitting may include the transmitter 110 of FIGS. 1 and 17 , the first transceiver 1852 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
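The gain-frame check performed by the second apparatus (compare a gain frame parameter to a threshold and modify the flag when it is exceeded) can be sketched as follows. The energy-ratio formulation and the threshold value are illustrative assumptions, not the patented formulas:

```python
def update_non_harmonic_flag(flag: bool, target_frame, synth_frame,
                             threshold: float = 4.0) -> bool:
    """Sketch of the gain-frame check: derive a gain from the energy
    ratio between the target high-band frame and its synthesized
    version; if that gain exceeds the threshold, modify (here: clear)
    the non-harmonic HB flag, otherwise leave it unchanged.
    """
    target_energy = sum(x * x for x in target_frame)
    synth_energy = sum(x * x for x in synth_frame) or 1e-9
    gain_frame = (target_energy / synth_energy) ** 0.5
    return False if gain_frame > threshold else flag

# large mismatch between target and synthesis trips the threshold
flag = update_non_harmonic_flag(True, [8.0] * 10, [1.0] * 10)
```

Whether the modification sets or clears the flag is an implementation choice; the clearing direction here is only one plausible reading.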
  • a third apparatus includes means for receiving at least a first audio signal and a second audio signal.
  • the means for receiving may include the encoder 200 of FIGS. 1 , 2A , or 17 , the down-mixer 202, the filterbank 290 of FIG. 2A , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for performing a downmix operation on the first audio signal and the second audio signal to generate a mid signal.
  • the means for performing the downmix operation may include the encoder 200 of FIGS. 1 , 2A , or 17 , the down-mixer 202 of FIG. 2A , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for generating a low-band mid signal and a high-band mid signal based on the mid signal.
  • the means for generating the low-band mid signal and the high-band mid signal may include the encoder 200 of FIGS. 1 , 2A , or 17 , the filterbank 290 of FIG. 2A , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for determining, based at least partially on a voicing value of the low-band mid signal and a gain value corresponding to the high-band mid signal, a value of a multi-source flag associated with the high-band mid signal.
  • the means for determining the value of the multi-source flag may include the encoder 200 of FIGS. 1 , 2A , and 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the non harmonic high band detector 906 of FIG. 9 , the non harmonic high band flag modifier 922 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for generating a high-band mid excitation signal based at least in part on the multi-source flag.
  • the means for generating the high-band mid excitation signal may include the encoder 200 of FIGS. 1 , 2A , and 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , high-band excitation generator 299 of FIG. 2B or FIG. 9 , the multiplier 255, the multiplier 258, the summer 257, the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for generating a bitstream based at least in part on the high-band mid excitation signal.
  • the means for generating the bitstream may include the encoder 200 of FIGS. 1 , 2A , and 17 , the mid channel BWE encoder 206 of FIGS. 2A or 2B , the ICBWE encoder 204 of FIGS. 1 or 2A , the encoder 900 of FIG. 9 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the encoder 1836 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the third apparatus may also include means for transmitting the bitstream and the multi-source flag to a device.
  • the means for transmitting may include the transmitter 110 of FIGS. 1 and 17 , the first transceiver 1852 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
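The first two steps of the third apparatus (downmix to a mid signal, then split into low-band and high-band parts) can be sketched with a simple averaging downmix and a moving-average filterbank stand-in. Both choices are illustrative; the actual codec would use its own downmix weighting and proper QMF analysis filters:

```python
def downmix(left, right):
    """Mid channel as the average of the two input channels (one common
    down-mix convention; the codec may use a different weighting)."""
    return [(l + r) / 2 for l, r in zip(left, right)]

def split_bands(mid, taps=2):
    """Crude filterbank stand-in: a short moving average serves as the
    low band, and the residual serves as the high band."""
    low = []
    for i in range(len(mid)):
        window = mid[max(0, i - taps + 1): i + 1]
        low.append(sum(window) / len(window))
    high = [m - lo for m, lo in zip(mid, low)]
    return low, high

mid = downmix([1.0, 3.0], [1.0, 1.0])
low, high = split_bands(mid)
```

Note that `low + high` reconstructs `mid` sample-by-sample, which is the property a real analysis/synthesis filterbank preserves (up to delay).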
  • a fourth apparatus includes means for receiving a bitstream corresponding to an encoded version of an audio signal.
  • the means for receiving may include the decoder 300 of FIGS. 1 , 3A , or 17 , the mid channel BWE decoder 302 of FIGs. 3A or 3B , the ICBWE decoder 306 of FIGS. 3A or 6 , the decoder 1000 of FIG. 10 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the decoder 1838 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • the fourth apparatus may also include means for generating a high band excitation signal based on a low band excitation signal and further based on a first flag value indicating a harmonic metric of a high band signal, where the high band signal corresponds to a high band portion of the audio signal.
  • the means for generating the high band excitation signal may include the decoder 300 of FIGS. 1 , 3A , or 17 , the mid channel BWE decoder 302 of FIGs. 3A or 3B , the ICBWE decoder 306 of FIGS. 3A or 6 , the decoder 1000 of FIG. 10 , the high-band excitation generator 362 of FIGs. 3B or 10 , the CODEC 1708 of FIG. 17 , the processor 1706 of FIG. 17 , the instructions 191 executable by a processor, the CODEC 1808 or the decoder 1838 of FIG. 18 , one or more other devices, circuits, or any combination thereof.
  • a software module may reside in random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable disk, or a compact disc read-only memory (CD-ROM).
  • An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device.
  • the memory device may be integral to the processor.
  • the processor and the storage medium may reside in an application-specific integrated circuit (ASIC).
  • the ASIC may reside in a computing device or a user terminal.
  • the processor and the storage medium may reside as discrete components in a computing device or a user terminal.

Claims (15)

  1. A device comprising:
    a multi-channel encoder configured to:
    receive at least a first audio signal and a second audio signal;
    perform a down-mix operation on the first audio signal and the second audio signal to generate a mid signal;
    generate a low-band mid signal and a high-band mid signal based on the mid signal, wherein the low-band mid signal corresponds to a low-frequency portion of the mid signal and the high-band mid signal corresponds to a high-frequency portion of the mid signal;
    determine, based at least in part on a voicing value corresponding to the low-band mid signal and a gain value corresponding to the high-band mid signal, a value of a non-harmonic high-band (HB) flag associated with the high-band mid signal, wherein the non-harmonic HB flag indicates a harmonic metric of the high-band mid signal;
    generate a high-band mid excitation signal based at least in part on the non-harmonic HB flag; and
    generate a bitstream based at least in part on the high-band mid excitation signal; and
    a transmitter configured to transmit the bitstream and the non-harmonic HB flag to a second device.
  2. The device of claim 1, wherein the non-harmonic HB flag corresponds to whether multiple audio sources are associated with the high-band mid signal; or wherein the value of the non-harmonic HB flag is further based on an energy metric of a frame of the high-band mid signal and a multi-frame energy metric of the high-band mid signal.
  3. The device of claim 1, wherein the multi-channel encoder is further configured to:
    generate a non-linear harmonic excitation based on a low-band excitation signal, the low-band excitation signal based on the low-band mid signal;
    generate a modulated noise based on the non-linear harmonic excitation; and control, based on the non-harmonic HB flag, mixing of the non-linear harmonic excitation and the modulated noise to generate the high-band mid excitation signal.
  4. The device of claim 3, wherein the multi-channel encoder is further configured to generate the modulated noise by determining an envelope based on the non-linear harmonic excitation and one or more filter parameters and applying the envelope to a noise signal to generate the modulated noise.
  5. The device of claim 4, wherein the one or more filter parameters are based on the non-harmonic HB flag and one or more low-band voicing factors; or
    wherein the multi-channel encoder is configured to generate the high-band mid excitation signal by combining the non-linear harmonic excitation and the modulated noise.
  6. The device of claim 4, wherein the multi-channel encoder is configured to apply the envelope to the noise signal by applying a low-pass filter to the noise signal, and wherein coefficients of the low-pass filter are based at least in part on the one or more filter parameters.
  7. The device of claim 4, wherein the multi-channel encoder is configured to generate the high-band mid excitation signal by combining the non-linear harmonic excitation and the modulated noise, wherein the multi-channel encoder is further configured to apply a first gain to the non-linear harmonic excitation prior to generating the high-band mid excitation signal, and wherein the first gain is based on the non-harmonic HB flag and one or more low-band voicing factors.
  8. The device of claim 7, wherein the multi-channel encoder is further configured to apply a second gain to the modulated noise prior to generating the high-band mid excitation signal, and wherein the second gain is based on the non-harmonic HB flag and the one or more low-band voicing factors.
  9. The device of claim 1, wherein the multi-channel encoder is further configured to:
    determine a gain frame parameter corresponding to a frame of the high-band mid signal;
    compare the gain frame parameter to a threshold; and
    in response to the gain frame parameter being greater than the threshold, modify the value of the non-harmonic HB flag.
  10. The device of claim 9, wherein the multi-channel encoder is further configured to:
    generate a synthesized version of the high-band mid signal based on the high-band mid excitation signal; and
    compare the frame of the high-band mid signal to a frame of the synthesized version of the high-band mid signal to generate the gain frame parameter.
  11. The device of claim 1, wherein the multi-channel encoder includes a stereo encoder that generates a non-reference high-band excitation signal based at least in part on the non-harmonic HB flag during an inter-channel bandwidth extension (ICBWE) encoding operation.
  12. The device of claim 1, wherein the multi-channel encoder and the transmitter are integrated into a mobile device; or the device of claim 1, wherein the multi-channel encoder and the transmitter are integrated into a base station.
  13. A method comprising:
    receiving at least a first audio signal and a second audio signal at a multi-channel encoder;
    performing a down-mix operation on the first audio signal and the second audio signal to generate a mid signal;
    generating a low-band mid signal and a high-band mid signal based on the mid signal, wherein the low-band mid signal corresponds to a low-frequency portion of the mid signal and the high-band mid signal corresponds to a high-frequency portion of the mid signal;
    determining, based at least in part on a voicing value corresponding to the low-band mid signal and a gain value corresponding to the high-band mid signal, a value of a non-harmonic HB flag associated with the high-band mid signal, wherein the non-harmonic HB flag indicates a harmonic metric of the high-band mid signal;
    generating a high-band mid excitation signal based at least in part on the non-harmonic HB flag;
    generating a bitstream based at least in part on the high-band mid excitation signal; and
    transmitting the bitstream and the non-harmonic HB flag from the multi-channel encoder to a device.
  14. The method of claim 13, further comprising:
    generating a non-linear harmonic excitation based on a low-band excitation signal, the low-band excitation signal based on the low-band mid signal;
    generating a modulated noise based on the non-linear harmonic excitation; and controlling, based on the non-harmonic HB flag, mixing of the non-linear harmonic excitation and the modulated noise to generate the high-band mid excitation signal,
    wherein generating the modulated noise comprises:
    determining an envelope based on the non-linear harmonic excitation and the one or more filter parameters; and
    applying the envelope to a noise signal to generate the modulated noise.
  15. The method of claim 13, wherein determining the value of the non-harmonic HB flag, generating the high-band mid excitation signal, and generating the bitstream are performed at a mobile device; or wherein determining the value of the non-harmonic HB flag, generating the high-band mid excitation signal, and generating the bitstream are performed at a base station.
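The flag-controlled mixing of claims 3 and 4 (harmonic excitation plus envelope-modulated noise, with the non-harmonic HB flag steering the mix) can be sketched as below. The envelope model, the mixing weights, and the energy-preserving gain relation are all illustrative assumptions, not the patented gain formulas:

```python
import random

def high_band_excitation(harmonic_exc, non_harmonic_flag: bool,
                         voicing: float):
    """Mix a non-linear harmonic excitation with modulated noise.

    The noise is crudely envelope-modulated by scaling each noise
    sample with the magnitude of the corresponding harmonic sample.
    When the non-harmonic HB flag is set, the mix is steered toward
    noise; otherwise the harmonic gain tracks the low-band voicing.
    """
    rng = random.Random(0)  # seeded for reproducibility
    noise = [rng.uniform(-1, 1) * abs(h) for h in harmonic_exc]
    g_harm = 0.1 if non_harmonic_flag else voicing
    g_noise = (1 - g_harm ** 2) ** 0.5  # energy-preserving complement
    return [g_harm * h + g_noise * n for h, n in zip(harmonic_exc, noise)]

exc = high_band_excitation([1.0, -0.5, 0.25], non_harmonic_flag=True,
                           voicing=0.9)
```

With the flag set, the output is dominated by the modulated noise, which is the intended behavior for non-harmonic high-band content in a multi-source environment.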
EP18724649.1A 2017-04-21 2018-04-19 Nicht-harmonische spracherkennung und bandbreitenerweiterung in einer mehrquellenumgebung Active EP3613042B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201762488654P 2017-04-21 2017-04-21
US15/956,645 US10825467B2 (en) 2017-04-21 2018-04-18 Non-harmonic speech detection and bandwidth extension in a multi-source environment
PCT/US2018/028338 WO2018195299A1 (en) 2017-04-21 2018-04-19 Non-harmonic speech detection and bandwidth extension in a multi-source environment

Publications (2)

Publication Number Publication Date
EP3613042A1 EP3613042A1 (de) 2020-02-26
EP3613042B1 true EP3613042B1 (de) 2022-09-21

Family

ID=63852843

Family Applications (1)

Application Number Title Priority Date Filing Date
EP18724649.1A Active EP3613042B1 (de) 2017-04-21 2018-04-19 Nicht-harmonische spracherkennung und bandbreitenerweiterung in einer mehrquellenumgebung

Country Status (9)

Country Link
US (1) US10825467B2 (de)
EP (1) EP3613042B1 (de)
KR (1) KR102308966B1 (de)
CN (1) CN110537222B (de)
AU (1) AU2018256414B2 (de)
BR (1) BR112019021903A2 (de)
SG (1) SG11201908390UA (de)
TW (1) TWI775838B (de)
WO (1) WO2018195299A1 (de)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10957331B2 (en) 2018-12-17 2021-03-23 Microsoft Technology Licensing, Llc Phase reconstruction in a speech decoder
US10847172B2 (en) * 2018-12-17 2020-11-24 Microsoft Technology Licensing, Llc Phase quantization in a speech encoder
KR102570480B1 (ko) * 2019-01-04 2023-08-25 삼성전자주식회사 오디오 신호 처리 방법 및 이를 지원하는 전자 장치
JP2022543292A (ja) * 2019-08-05 2022-10-11 シュアー アクイジッション ホールディングス インコーポレイテッド 送信アンテナダイバーシティ無線オーディオシステム
US10978083B1 (en) 2019-11-13 2021-04-13 Shure Acquisition Holdings, Inc. Time domain spectral bandwidth replication
KR20210073975A (ko) * 2019-12-11 2021-06-21 삼성전자주식회사 화자를 인식하는 방법 및 장치
CN112562686B (zh) * 2020-12-10 2022-07-15 青海民族大学 一种使用神经网络的零样本语音转换语料预处理方法
CN113763980B (zh) * 2021-10-30 2023-05-12 成都启英泰伦科技有限公司 一种回声消除方法

Family Cites Families (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7330814B2 (en) * 2000-05-22 2008-02-12 Texas Instruments Incorporated Wideband speech coding with modulated noise highband excitation system and method
SE519976C2 (sv) * 2000-09-15 2003-05-06 Ericsson Telefon Ab L M Kodning och avkodning av signaler från flera kanaler
SE0004163D0 (sv) * 2000-11-14 2000-11-14 Coding Technologies Sweden Ab Enhancing perceptual performance of high frequency reconstruction coding methods by adaptive filtering
ATE331280T1 (de) * 2001-11-23 2006-07-15 Koninkl Philips Electronics Nv Bandbreitenvergrösserung für audiosignale
BRPI0517780A2 (pt) * 2004-11-05 2011-04-19 Matsushita Electric Ind Co Ltd aparelho de decodificação escalável e aparelho de codificação escalável
KR100707174B1 (ko) * 2004-12-31 2007-04-13 삼성전자주식회사 광대역 음성 부호화 및 복호화 시스템에서 고대역 음성부호화 및 복호화 장치와 그 방법
MX2007012187A (es) * 2005-04-01 2007-12-11 Qualcomm Inc Sistemas, metodos y aparatos para deformacion en tiempo de banda alta.
ES2358125T3 (es) * 2005-04-01 2011-05-05 Qualcomm Incorporated Procedimiento y aparato para un filtrado de antidispersión de una señal ensanchada de excitación de predicción de velocidad de ancho de banda.
TWI324336B (en) * 2005-04-22 2010-05-01 Qualcomm Inc Method of signal processing and apparatus for gain factor smoothing
JP5100380B2 (ja) * 2005-06-29 2012-12-19 パナソニック株式会社 スケーラブル復号装置および消失データ補間方法
CN101273404B (zh) * 2005-09-30 2012-07-04 松下电器产业株式会社 语音编码装置以及语音编码方法
US8135047B2 (en) * 2006-07-31 2012-03-13 Qualcomm Incorporated Systems and methods for including an identifier with a packet associated with a speech signal
US8005678B2 (en) * 2006-08-15 2011-08-23 Broadcom Corporation Re-phasing of decoder states after packet loss
CN101548318B (zh) * 2006-12-15 2012-07-18 松下电器产业株式会社 编码装置、解码装置以及其方法
KR101355376B1 (ko) * 2007-04-30 2014-01-23 삼성전자주식회사 고주파수 영역 부호화 및 복호화 방법 및 장치
KR100970446B1 (ko) * 2007-11-21 2010-07-16 한국전자통신연구원 주파수 확장을 위한 가변 잡음레벨 결정 장치 및 그 방법
RU2443028C2 (ru) * 2008-07-11 2012-02-20 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Устройство и способ расчета параметров расширения полосы пропускания посредством управления фреймами наклона спектра
ES2539304T3 (es) * 2008-07-11 2015-06-29 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Un aparato y un método para generar datos de salida por ampliación de ancho de banda
EP2144230A1 (de) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audiokodierungs-/Audiodekodierungsschema geringer Bitrate mit kaskadierten Schaltvorrichtungen
EP2144231A1 (de) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audiokodierungs-/-dekodierungschema geringer Bitrate mit gemeinsamer Vorverarbeitung
US9037474B2 (en) * 2008-09-06 2015-05-19 Huawei Technologies Co., Ltd. Method for classifying audio signal into fast signal or slow signal
US8532998B2 (en) * 2008-09-06 2013-09-10 Huawei Technologies Co., Ltd. Selective bandwidth extension for encoding/decoding audio/speech signal
CN101763856B (zh) * 2008-12-23 2011-11-02 Huawei Technologies Co., Ltd. Signal classification processing method, classification processing apparatus, and coding system
CO6440537A2 (es) * 2009-04-09 2012-05-15 Fraunhofer Ges Forschung Apparatus and method for generating a synthesis audio signal and for encoding an audio signal
TWI556227B (zh) * 2009-05-27 2016-11-01 Dolby International AB System and method for generating a high-frequency component of a signal from its low-frequency component, and set-top box, computer program product, software program, and storage medium therefor
MX2012010415A (es) * 2010-03-09 2012-10-03 Fraunhofer Ges Forschung Apparatus and method for processing an input audio signal using cascaded filter banks
US20120029926A1 (en) * 2010-07-30 2012-02-02 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for dependent-mode coding of audio signals
KR20120016709A (ko) * 2010-08-17 2012-02-27 Samsung Electronics Co., Ltd. Apparatus and method for improving call quality in a portable terminal
WO2012040897A1 (en) * 2010-09-28 2012-04-05 Huawei Technologies Co., Ltd. Device and method for postprocessing decoded multi-channel audio signal or decoded stereo signal
CN102737636B (zh) * 2011-04-13 2014-06-04 Huawei Technologies Co., Ltd. Audio coding method and apparatus
WO2013035257A1 (ja) * 2011-09-09 2013-03-14 Panasonic Corporation Encoding apparatus, decoding apparatus, encoding method, and decoding method
JP5817499B2 (ja) * 2011-12-15 2015-11-18 Fujitsu Ltd. Decoding apparatus, encoding apparatus, encoding/decoding system, decoding method, encoding method, decoding program, and encoding program
US9129600B2 (en) * 2012-09-26 2015-09-08 Google Technology Holdings LLC Method and apparatus for encoding an audio signal
ES2753228T3 (es) * 2012-11-05 2020-04-07 Panasonic Ip Corp America Speech audio encoding device, speech audio decoding device, speech audio encoding method, and speech audio decoding method
CN105976830B (zh) * 2013-01-11 2019-09-20 Huawei Technologies Co., Ltd. Audio signal encoding and decoding methods, and audio signal encoding and decoding apparatus
EP2950308B1 (de) * 2013-01-22 2020-02-19 Panasonic Corporation Bandwidth extension parameter generator, encoder, decoder, bandwidth extension parameter generation method, encoding method, and decoding method
BR112015017632B1 (pt) * 2013-01-29 2022-06-07 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Apparatus and method for generating a frequency-enhanced signal using temporal smoothing of subbands
KR101732059B1 (ko) * 2013-05-15 2017-05-04 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding an audio signal
FR3007563A1 (fr) * 2013-06-25 2014-12-26 France Telecom Improved frequency band extension in an audio-frequency signal decoder
FR3008533A1 (fr) * 2013-07-12 2015-01-16 Orange Optimized scale factor for frequency band extension in an audio-frequency signal decoder
US9620134B2 (en) * 2013-10-10 2017-04-11 Qualcomm Incorporated Gain shape estimation for improved tracking of high-band temporal characteristics
US10083708B2 (en) * 2013-10-11 2018-09-25 Qualcomm Incorporated Estimation of mixing factors to generate high-band excitation signal
CN105765655A (zh) * 2013-11-22 2016-07-13 Qualcomm Incorporated Selective phase compensation in high-band coding
US10163447B2 (en) * 2013-12-16 2018-12-25 Qualcomm Incorporated High-band signal modeling
US9564141B2 (en) * 2014-02-13 2017-02-07 Qualcomm Incorporated Harmonic bandwidth extension of audio signals
US9542955B2 (en) * 2014-03-31 2017-01-10 Qualcomm Incorporated High-band signal coding using multiple sub-bands
US9583115B2 (en) * 2014-06-26 2017-02-28 Qualcomm Incorporated Temporal gain adjustment based on high-band signal characteristic
US9984699B2 (en) * 2014-06-26 2018-05-29 Qualcomm Incorporated High-band signal coding using mismatched frequency ranges
US9886963B2 (en) * 2015-04-05 2018-02-06 Qualcomm Incorporated Encoder selection
US10341770B2 (en) * 2015-09-30 2019-07-02 Apple Inc. Encoded audio metadata-based loudness equalization and dynamic equalization during DRC
US10109284B2 (en) 2016-02-12 2018-10-23 Qualcomm Incorporated Inter-channel encoding and decoding of multiple high-band audio signals

Also Published As

Publication number Publication date
WO2018195299A1 (en) 2018-10-25
KR20190139872A (ko) 2019-12-18
SG11201908390UA (en) 2019-11-28
TWI775838B (zh) 2022-09-01
US10825467B2 (en) 2020-11-03
KR102308966B1 (ko) 2021-10-05
EP3613042A1 (de) 2020-02-26
TW201842494A (zh) 2018-12-01
AU2018256414A1 (en) 2019-10-03
BR112019021903A2 (pt) 2020-05-26
CN110537222B (zh) 2023-07-28
US20180308505A1 (en) 2018-10-25
AU2018256414B2 (en) 2022-05-19
CN110537222A (zh) 2019-12-03

Similar Documents

Publication Publication Date Title
EP3613042B1 (de) Non-harmonic speech detection and bandwidth extension in a multi-source environment
US10872613B2 (en) Inter-channel bandwidth extension spectral mapping and adjustment
EP3692525B1 (de) Decoding of audio signals
US11430452B2 (en) Encoding or decoding of audio signals
US10593341B2 (en) Coding of multiple audio signals
EP3692527B1 (de) Decoding of audio signals
EP3649639B1 (de) Time-domain inter-channel prediction
CN110800051B (zh) High-band residual prediction with time-domain inter-channel bandwidth extension
EP3692528B1 (de) Decoding of audio signals
US10573326B2 (en) Inter-channel bandwidth extension

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20191121

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20210316

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20220407

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602018040881

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1520353

Country of ref document: AT

Kind code of ref document: T

Effective date: 20221015

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG9D

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20220921

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221221

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1520353

Country of ref document: AT

Kind code of ref document: T

Effective date: 20220921

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20221222

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230123

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20230223

Year of fee payment: 6

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20230121

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20230315

Year of fee payment: 6

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602018040881

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20230223

Year of fee payment: 6

26N No opposition filed

Effective date: 20230622

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230419

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20230430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20220921

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230419

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20240314

Year of fee payment: 7