US20040138876A1 - Method and apparatus for artificial bandwidth expansion in speech processing - Google Patents
- Publication number
- US20040138876A1
- Authority
- US
- United States
- Prior art keywords
- speech
- signal
- speech signals
- segments
- sibilant
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
Definitions
- the present invention relates generally to a method and device for quality improvement in an electrically reproduced speech signal and, more particularly, to the quality improvement by expanding the bandwidth of sound.
- Speech signals are traditionally transmitted in a telecommunications system in narrowband, containing frequencies in the range of 300 Hz to 3.4 kHz with a sampling rate of 8 kHz, in accordance with the Nyquist theorem.
- humans perceive speech more naturally if the bandwidth of the transmitted sound is wider (e.g., up to 8 kHz). Because of the limited frequency range, the quality of speech so transmitted is undesirable as the sound is somewhat unnatural.
- the new wideband transmission standards such as the AMR (adaptive multi-rate) wideband speech codec, can carry frequencies up to 7 kHz.
- the wideband-capable terminal or the wideband network will not offer any advantages regarding the naturalness of the transmitted speech because the upper frequency content is already missing in the transmission.
- H. Yasukawa (“Quality Enhancement of Band Limited Speech by Filtering and Multirate Techniques”, Proc. Int. Conf. on Spoken Language Proc., pp.
- EP10064648 discloses a method of speech bandwidth expansion wherein the missing frequency components of the upper band of speech (e.g., between 4 kHz and 8 kHz) are generated at the receiver using a codebook.
- the codebook contains frequency vectors of different spectral characteristics, all of which cover the same upper band. Expanding the frequency range corresponds to selecting the optimal vector and adding into it the received spectral components of lower band (e.g., from 0 to 4 kHz).
- According to the first aspect of the present invention, there is provided a method of improving speech in a plurality of signal segments having speech signals in a time domain.
- the method is characterized by
- the upsampling is carried out by inserting a value between adjacent signal samples in the signal segment, and the inserted value is zero.
- the speech signals include a time waveform having a plurality of crossing points on a time axis, and said at least one characteristic of the speech signals is indicative of the number of crossing points in a signal segment.
- each of the signal segments comprises a number of signal samples, and said at least one characteristic of the signal segments is indicative of a ratio of the number of crossing points in the signal segment and the number of signal samples in said signal segment.
- At least one signal characteristic of the speech signals is indicative of a ratio of an energy of a second derivative of the speech signals and an energy in the speech signals.
- the plurality of classes include a voiced sound and a stop consonant
- the speech signals are classified as the voiced sound if the ratio is smaller than a predetermined value and
- the speech signals are classified as the stop consonant if the ratio is greater than the predetermined value.
- the plurality of classes include a sibilant class and a non-sibilant class, and
- the speech signals are classified as the sibilant class if the ratio is greater than a predetermined value
- the speech signals are classified as the non-sibilant class if the ratio is smaller than or equal to the predetermined value.
- said at least one signal characteristic of the speech signals is indicative of a further ratio of an energy of a second derivative of the speech signals and an energy in the speech signals, and the speech signals are classified as the sibilant class if the further ratio is also greater than a further predetermined value.
- each of the speech spectra has a first spectral portion in a lower frequency range and a second spectral portion in a higher frequency range, and the second spectral portion is enhanced for providing the modified transformed segments if the speech signals are classified as the sibilant class and the second spectral portion is attenuated for providing the modified transformed segments if the speech signals are classified as the non-sibilant class.
- each of the speech spectra has a first spectral portion in a lower frequency range and a second spectral portion in a higher frequency range, and smoothing the second spectral portion by an averaging operation prior to converting the modified transformed segments into the speech data in the time domain.
- a network device in a telecommunications network wherein the network device is capable of
- the network device receives data indicative of speech, and partitioning the received data into a plurality of signal segments having speech signals in a time domain.
- the network device is characterized by
- an upsampling module for upsampling the signal segments for providing upsampled segments in the time domain
- a transform module for converting the upsampled segments into a plurality of transformed segments having speech spectra in a frequency domain
- a classification algorithm for classifying the speech signals into a plurality of classes based on at least one signal characteristic of the speech signals
- an inverse transform module for converting the modified transformed segments into speech data in the time domain.
- each of the signal segments comprises a number of signal samples for sampling a waveform having a plurality of crossing points on a time axis
- the classification algorithm is adapted to classify the speech signals based on a ratio of the number of crossing points and the number of signal samples in at least one signal segment.
- the classification algorithm is also adapted to classify the speech signals based on a ratio of an energy of a second derivative in the speech signal and an energy in at least one signal segment.
- the plurality of classes include a sibilant class and a non-sibilant class, and each of the speech spectra has a first spectral portion in a lower frequency range and a second spectral portion in a higher frequency range, said device characterized in that the adjustment algorithm is adapted to
- the adjustment algorithm is also adapted to smooth the second spectral portion by an averaging operation.
- a sound classification algorithm for use in a speech decoder, wherein speech data in the speech decoder is partitioned into a plurality of signal segments having speech signals in a time domain and each signal segment includes a number of signal samples, and wherein the speech signals include a time waveform having a plurality of crossing points on a time axis.
- the classification algorithm is characterized by
- the speech signals are classified into a sibilant class and a non-sibilant class, and the speech signals are classified as the sibilant class if the ratio is greater than a predetermined value.
- the classifying is also based on a further ratio of an energy of a second derivative of the speech signal and an energy in said at least one signal segment.
- the speech signals are classified into a sibilant class and a non-sibilant class, and the speech signals are classified as the sibilant class if the ratio is greater than a first predetermined value and the further ratio is greater than a second predetermined value.
- the first predetermined value can be substantially equal to 0.6
- the second predetermined value can be substantially equal to 8.
- a spectral adjustment algorithm for use in a speech decoder capable of
- the speech signals in at least two consecutive signal segments are classified as the sibilant class, said at least two consecutive signal segments including a leading segment and at least one following segment, wherein the second speech spectral portion in the leading segment is enhanced by a first factor, and the second speech spectral portion in said at least one following segment is enhanced by a second factor smaller than the first factor.
- FIG. 1 is a block diagram showing part of the speech decoder, according to the present invention.
- FIG. 2 is a plot showing an enhanced FFT spectrum of a speech frame after zero insertion.
- FIG. 3 a is a plot showing an FFT spectrum of a voiced-sound frame after zero insertion.
- FIG. 3 b is a plot showing an attenuation curve for modifying the FFT spectrum of a voiced-sound frame.
- FIG. 3 c is a plot showing the FFT spectrum of FIG. 3 a after being attenuated according to the attenuation curve as shown in FIG. 3 b.
- FIG. 4 a is a plot showing an FFT spectrum of a stop-consonant frame after zero insertion.
- FIG. 4 b is a plot showing an attenuation curve for modifying the FFT spectrum of a stop-consonant frame.
- FIG. 4 c is a plot showing the FFT spectrum of FIG. 4 a after being attenuated according to the attenuation curve as shown in FIG. 4 b.
- FIG. 5 a is a plot showing a different attenuation curve for modifying the FFT spectrum of a stop-consonant frame.
- FIG. 5 b is a plot showing the FFT spectrum of FIG. 4 a after being attenuated according to the attenuation curve as shown in FIG. 5 a.
- FIG. 6 is a plot showing two different amplification curves for enhancing the amplitude of a first sibilant frame and that of the following sibilant frames.
- FIG. 7 a is a plot showing an FFT spectrum of a sibilant frame after zero insertion.
- FIG. 7 b is a plot showing the FFT spectrum of FIG. 7 a after being amplified by an amplification curve similar to the curve as shown in FIG. 6.
- FIG. 8 a is a plot showing an FFT spectrum of a non-sibilant frame after attenuation.
- FIG. 8 b is a plot showing the attenuated spectrum of FIG. 8 a after being modified by a moving average operation.
- FIG. 9 a is a schematic representation showing three windowed frames being processed by a frame cascading process.
- FIG. 9 b is a schematic representation showing a continuous sequence of frames as the result of frame cascading.
- FIG. 10 is a flowchart illustrating the method of speech sound quality improvement, according to the present invention.
- FIG. 11 is a block diagram showing a mobile terminal having a speech signal modification module, according to the present invention.
- FIG. 12 is a block diagram showing a telecommunications network including a plurality of base stations each of which uses a speech signal modification module, according to the present invention.
- the present invention makes use of the original narrowband speech signal (0-4 kHz) that is received by a receiver, and generates a new speech signal by artificially expanding the bandwidth of the received speech in order to improve the naturalness of the speech sound, based on the new speech signal. With no additional information to be transmitted, the present invention generates new upper frequency components based on the characteristics of the transmitted speech signal.
- FIG. 1 shows a part of a speech decoder 10 , according to the present invention.
- the input signal comprises a continuous sequence of samples at a typical sample frequency of 8 kHz.
- the input signal is divided by a framing block 12 into windows or frames, the edges of which are overlapping.
- the default size of the frame is 20 ms.
- each frame is windowed with a Hamming window of 30 ms (240 samples) so that each end of a frame overlaps with an adjacent frame by 5 ms.
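- The framing and windowing described above can be sketched as follows. This is a minimal illustration only: the function names, the zero-padding at the signal edges, and the use of the standard 0.54/0.46 Hamming coefficients are assumptions, not details taken from the specification.

```python
import math

def hamming(n):
    """Standard Hamming window of length n (0.54/0.46 coefficients assumed)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def split_into_windowed_frames(samples, rate=8000, hop_ms=20, win_ms=30):
    """Cut a signal into 30 ms Hamming-windowed frames that advance by 20 ms.

    Each window extends 5 ms past both ends of its 20 ms frame, so
    neighbouring windows overlap as described in the text.
    """
    hop = rate * hop_ms // 1000   # 160 samples at 8 kHz
    win = rate * win_ms // 1000   # 240 samples at 8 kHz
    pad = (win - hop) // 2        # 40 samples = 5 ms on each side
    # Assumption: pad with zeros so the first and last frames are complete.
    padded = [0.0] * pad + list(samples) + [0.0] * pad
    w = hamming(win)
    frames = []
    for start in range(0, len(padded) - win + 1, hop):
        frames.append([padded[start + i] * w[i] for i in range(win)])
    return frames
```

With 60 ms of input at 8 kHz (480 samples), this sketch yields three frames of 240 samples each.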
- In the aliasing block 14 , zeros are inserted between samples, typically one zero between two samples.
- the sampling frequency is doubled from 8 kHz to 16 kHz.
- an FFT (fast Fourier Transform)
- the length of the FFT is 1024. It should be noted that, after zero insertion, the enhanced FFT power spectrum has the original narrowband component in the range of 0-4 kHz and the mirror image of the same spectrum in the frequency range of 4 kHz to 8 kHz, as shown in FIG. 2.
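- The mirror image created by zero insertion can be checked numerically with a small sketch. The helper names are hypothetical, and a direct 64-point DFT on a 1 kHz test tone stands in for the 1024-point FFT named in the text, purely to keep the example short and dependency-free.

```python
import cmath
import math

def insert_zeros(frame):
    """Insert one zero after every sample, doubling the sampling rate."""
    out = []
    for x in frame:
        out.extend((x, 0.0))
    return out

def dft_magnitude(x):
    """Magnitude of the direct DFT (O(n^2); fine for a short demo)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

# A 1 kHz tone sampled at 8 kHz, 32 samples (exactly four periods).
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(32)]
upsampled = insert_zeros(tone)        # 64 samples at 16 kHz
mag = dft_magnitude(upsampled)        # bin spacing is 16000 / 64 = 250 Hz

peak_original = mag[4]                # 1 kHz: the original narrowband tone
peak_image = mag[28]                  # 7 kHz: its mirror image about 4 kHz
```

The two peaks have equal magnitude, which is the mirrored spectrum that the later spectrum adjustment then reshapes.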
- the enhanced FFT spectrum is modified by a speech signal modification module 20 , which comprises a sound classification algorithm 22 and a spectrum adjustment algorithm 24 .
- the sound classification algorithm 22 is used to classify the speech signals into a plurality of classes and then the spectrum adjustment algorithm 24 is used to modify the enhanced FFT spectrum based on the classification.
- the speech signals in the frames are first classified into two basic types: sibilant and non-sibilant.
- Sibilants are fricatives, such as /s/, /sh/ and /z/ that contain considerably more high frequency components than other phonemes.
- a fricative is a consonant characterized by the frictional passage of the expired breath through a narrowing at some point in a vocal tract.
- the non-sibilants are further classified into a voiced-sound type and a stop-consonant type.
- the spectrum envelope of a voiced-sound in the lower frequency band (0-4 kHz) decays with frequency whereas the spectrum envelope of a sibilant rises with frequencies in the same frequency band.
- the spectrum of a voiced-sound such as a vowel differs sufficiently from the spectrum of a sibilant, rendering it possible to separate sibilants from non-sibilants.
- the speech signal in each frame is separated based on two quotients, q 1 and q 2 , defined as $q_1 = N_Z / N_S$ and $q_2 = D_E / E_S$, where
- N Z is the number of zero-crossings in the speech signal frame or window in the time domain
- N S is the number of samples in the frame
- D E is the energy of the second derivative of the speech signal in the time domain
- E S is the energy of the speech signal, which is the squared sum of the signal in the frame.
- q 1 is a measure indicative of the frequency content of the frame and q 2 is a measure related to the energy distribution with respect to frequencies in the frame.
- the quotients q 1 and q 2 are simple to compute.
- the quotients are compared with two separate limiting values c 1 and c 2 in order to distinguish a sibilant from a non-sibilant. If q 1 >c 1 and q 2 >c 2 , then the frame is considered as that of a sibilant. Otherwise, the frame is considered as that of a non-sibilant.
- the limiting values c 1 and c 2 can be chosen as 0.6 and 8, respectively.
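- A minimal sketch of the two quotients and the threshold test follows. The function names are hypothetical, and the second derivative is approximated by the discrete second difference x[n+1] - 2x[n] + x[n-1], which is an assumption; the text does not say how the derivative is computed.

```python
def zero_crossings(frame):
    """Count sign changes between consecutive samples (N_Z)."""
    count = 0
    for a, b in zip(frame, frame[1:]):
        if (a >= 0) != (b >= 0):
            count += 1
    return count

def quotients(frame):
    """Return (q1, q2) = (N_Z / N_S, D_E / E_S) for one frame."""
    n_s = len(frame)
    n_z = zero_crossings(frame)
    # Assumed discrete approximation of the second derivative.
    second_diff = [frame[i + 1] - 2 * frame[i] + frame[i - 1]
                   for i in range(1, n_s - 1)]
    d_e = sum(d * d for d in second_diff)   # energy of the second derivative
    e_s = sum(x * x for x in frame)         # energy of the signal
    q1 = n_z / n_s
    q2 = d_e / e_s if e_s > 0 else 0.0      # silent frame: treat q2 as 0
    return q1, q2

def is_sibilant_candidate(frame, c1=0.6, c2=8.0):
    """Apply the limiting values c1 and c2 given in the text."""
    q1, q2 = quotients(frame)
    return q1 > c1 and q2 > c2
```

A frame that alternates in sign on every sample passes both tests, while a constant frame passes neither.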
- the duration of a fricative is longer than the duration of other consonants in speech.
- the duration of a sibilant is usually longer than the duration of a fricative (such as /f/ and /h/) that is not a sibilant.
- a third criterion is used to sort out sibilants from the speech signal: only a speech segment that has at least two consecutive frames that are considered as fricatives is processed as a sibilant. To that end, when one frame meets the requirement of q 1 >c 1 and q 2 >c 2 , the sound classification algorithm 22 further examines at least one following frame to determine whether the requirement of q 1 >c 1 and q 2 >c 2 is also met.
- the non-sibilant frames are further separated into frames with a voiced-sound and frames with a stop consonant based on the quotient q 1 .
- Stop consonants are unvoiced consonants such as /k/, /p/ and /t/. For example, if q 1 is greater than 0.4, then the frame can be considered as that of a stop consonant. Otherwise, the frame is that of a voiced sound.
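- The three-way classification, including the rule that a sibilant needs at least two consecutive qualifying frames, might be sketched as below. The function name, the label strings and the batch-style interface (all frames available at once) are assumptions; a real-time decoder would need one frame of look-ahead instead.

```python
def classify_frames(quotient_pairs, c1=0.6, c2=8.0, stop_limit=0.4, min_run=2):
    """Label each frame 'sibilant', 'stop' or 'voiced' from its (q1, q2) pair.

    A frame is a sibilant only when it belongs to a run of at least
    `min_run` consecutive frames with q1 > c1 and q2 > c2. All other
    frames are non-sibilants, split on q1 alone into stop consonants
    (q1 > stop_limit) and voiced sounds.
    """
    flags = [q1 > c1 and q2 > c2 for q1, q2 in quotient_pairs]
    labels = []
    i = 0
    n = len(flags)
    while i < n:
        if flags[i]:
            j = i
            while j < n and flags[j]:
                j += 1
            if j - i >= min_run:
                labels.extend(["sibilant"] * (j - i))
                i = j
                continue
            # A run that is too short falls through to the non-sibilant branch.
        q1 = quotient_pairs[i][0]
        labels.append("stop" if q1 > stop_limit else "voiced")
        i += 1
    return labels
```

An isolated frame that exceeds both limits is therefore not treated as a sibilant; in this sketch it is labelled by its q 1 value like any other non-sibilant frame.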
- the criteria used for sound classification as described above are based on experimental facts, and they can be varied somewhat to change the recognition characteristics of the method. For example, if the limiting values c 1 and/or c 2 are made smaller, e.g. 0.3 and 5, the method is more likely to detect all sibilants, but at the same time there are more false sibilants detected. Conversely, if c 1 and/or c 2 are made larger, e.g. 0.9 and 12, there are fewer false sibilants detected, but at the same time the method is less likely to detect all sibilants.
- the duration D threshold can also be varied with similar consequences, e.g., between 30 ms and 90 ms.
- the spectrum adjustment algorithm 24 is used to modify the amplitude of the enhanced FFT spectrum in the corresponding zero-inserted frames.
- the enhanced FFT spectrum covers a frequency range of 0 to 8 kHz.
- the lower half of the frequency range has the original narrowband FFT spectrum and the higher half of the frequency range has the mirror image of the same spectrum. It is preferred that only the spectrum in the higher frequency band is modified and the lower frequency band is left unaltered.
- the FFT spectrum in the higher frequency range is modified such that the amplitude is attenuated more as the frequency increases.
- the amplitude of the enhanced FFT spectrum of a voiced sound frame is attenuated based on two parameters: attnlg and kx, which are calculated as follows:
- L max is the maximum level of the spectrum from 0-4 kHz and L ave is the average level of the spectrum from 2-3.4 kHz. From these two parameters a step function having steps at intervals of 1 kHz can be formed in order to attenuate the amplitude spectrum from 4-8 kHz, and each step is obtained by increasing the attenuation gradually to the maximum attenuation given by
- w is a weight factor that is proportional to the frequency of the maximal spectral component.
- the amplitude of the step function between 0-4 kHz is 0 dB.
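- The shape of such a step function can be illustrated with a short sketch. The maximum attenuation is passed in as a hypothetical parameter `max_atten_db` because its formula is not reproduced here, and the equal spacing of the four step levels is an assumption; only the 0 dB level below 4 kHz and the 1 kHz step interval come from the text.

```python
def step_attenuation_db(freq_hz, max_atten_db, start_hz=4000, stop_hz=8000,
                        step_hz=1000):
    """Gain in dB (zero or negative) of a staircase attenuation curve.

    Below start_hz the gain is 0 dB. Above it the attenuation grows by one
    equal step per step_hz until it reaches max_atten_db (equal step sizes
    are an assumption of this sketch).
    """
    if freq_hz < start_hz:
        return 0.0
    n_steps = (stop_hz - start_hz) // step_hz
    step = min(int((freq_hz - start_hz) // step_hz) + 1, n_steps)
    return -max_atten_db * step / n_steps

def apply_gain_db(magnitude, gain_db):
    """Scale a linear amplitude by a gain expressed in dB."""
    return magnitude * 10 ** (gain_db / 20)
```

With a maximum attenuation of 40 dB, for example, the sketch gives 0 dB at 3 kHz, -10 dB at 4.5 kHz and -40 dB just below 8 kHz.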
- A typical amplitude spectrum of a voiced-sound frame is shown in FIG. 3 a and an exemplary attenuation step function is shown in FIG. 3 b. After being attenuated by the step function, the amplitude spectrum is shown in FIG. 3 c.
- the amplitude spectrum of each frame is attenuated in a similar fashion except that
- A typical amplitude spectrum of a stop-consonant frame is shown in FIG. 4 a.
- An exemplary attenuation step function is shown in FIG. 4 b. After being attenuated by the step function, the amplitude spectrum is shown in FIG. 4 c.
- the attenuation is carried out in a more gradual manner, as shown in FIGS. 5 a - 5 b.
- In FIG. 5 a, the attenuation of the amplitude of the spectrum starts at 4 kHz and the attenuation curve has the shape of a logarithmic function.
- FIG. 5 b is the amplitude spectrum of FIG. 4 a after being attenuated by the attenuation curve of FIG. 5 a.
- the envelope of the amplitude of the FFT spectrum after zero insertion of a sibilant frame increases from 0 to 4 kHz and decreases from 4 kHz to 8 kHz. It is desirable to modify the spectrum so that the amplitude of the spectrum in the higher frequency range is increased with frequencies.
- a speech segment that has at least two consecutive frames that meet the requirement of q 1 >c 1 and q 2 >c 2 is processed as a sibilant.
- the amplitude of the enhanced FFT spectrum between 0-4.8 kHz is kept unchanged while the amplitude of the spectrum between 4.8 kHz and 8 kHz is enhanced by a logarithmic function attslidelg as follows:
- UV is the dB-value of the difference in the amplitude spectrum in the frequency range 0.3 kHz-3 kHz (the difference can be calculated from the mean values of a number of samples at the two ends of the frequency range, for example)
- f is the frequency in Hz
- the amplified spectrum is shown in FIG. 7 c.
- the original spectrum is shown in FIG. 7 a and the used amplification curve is shown in FIG. 7 b.
- the purpose of using the moving average operation at the higher band (4 kHz-8 kHz) is to make the sound more natural by removing the harmonic structure.
- the moving average operation is the average of the amplitude spectrum over a number of samples and the number of samples is increased with the frequency range.
- the moving average is also carried out by the spectrum adjustment algorithm 24 . For example, in the frequency range of 4 kHz-5 kHz, no averaging is carried out. In the frequency range of 5 kHz-6 kHz, the amplitude of the spectrum is averaged over 5 samples. In the frequency range of 6 kHz-7 kHz, the amplitude of the spectrum is averaged over 9 samples.
- FIG. 8 a is an amplitude spectrum of a frame before moving average operation.
- FIG. 8 b is the amplitude spectrum after moving average operation.
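- The band-dependent moving average can be sketched as follows. The function name and the `bands` parameter are hypothetical. Only the two averaging widths stated in the text (5 samples for 5 kHz-6 kHz, 9 samples for 6 kHz-7 kHz) are used as defaults; bins outside the listed bands are left unchanged, and any width for 7 kHz-8 kHz would have to be supplied by the caller.

```python
def smooth_upper_band(mag, bin_hz, bands=((5000, 6000, 5), (6000, 7000, 9))):
    """Moving-average an amplitude spectrum with a band-dependent width.

    mag    : amplitude per FFT bin, bin k centred at k * bin_hz
    bands  : (low_hz, high_hz, width) triples; width is the odd number of
             bins averaged around each bin in that band
    """
    out = list(mag)
    for k in range(len(mag)):
        f = k * bin_hz
        for low, high, width in bands:
            if low <= f < high:
                half = width // 2
                a = max(0, k - half)
                b = min(len(mag), k + half + 1)
                # Average over the original (unsmoothed) neighbours.
                out[k] = sum(mag[a:b]) / (b - a)
                break
    return out
```

For a 1024-point FFT at 16 kHz the bin spacing is 15.625 Hz, so the 5 kHz band starts at bin 320 and the 6 kHz band at bin 384.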
- an inverse Fast Fourier Transform (IFFT) module 30 is used to convert the spectrum back to the time domain by inverse Fast Fourier Transform (IFFT).
- An IFFT having a length of 1024 is calculated from each frame. From the transform results, the first 480 samples (30 ms) form the time domain representation of the frame. The energy of each frame has changed after frequency expansion due to the addition of new spectral components to the signal. Furthermore, the change of energy varies from frame to frame.
- an energy adjustment module 32 is used to adjust the energy of the wideband frame to the same level as it was in the original narrowband frame.
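- One way to sketch the energy adjustment is to rescale the wideband frame so that its mean-square level equals that of the narrowband frame. Comparing mean-square values rather than total energy is an assumption made here because the wideband frame holds twice as many samples; the function name is hypothetical.

```python
import math

def match_level(wideband, narrowband):
    """Scale the wideband frame to the narrowband frame's mean-square level."""
    p_wide = sum(x * x for x in wideband) / len(wideband)
    p_narrow = sum(x * x for x in narrowband) / len(narrowband)
    if p_wide == 0:
        return list(wideband)          # nothing to scale in a silent frame
    gain = math.sqrt(p_narrow / p_wide)
    return [gain * x for x in wideband]
```

Because the gain is computed per frame, it tracks the frame-to-frame variation in how much energy the added upper band contributes.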
- an unwindowing module 34 is used to compensate the windowing that was carried out in the computation of the FFT by multiplying all the processed frames by an inverse Hamming window.
- the length of the inverse window is 30 ms, 480 samples.
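- Unwindowing can be sketched as a sample-wise division by the Hamming window. The helper names are hypothetical and the standard 0.54/0.46 coefficients are assumed; with those coefficients the window never falls below 0.08, so the division is always defined.

```python
import math

def hamming(n):
    """Standard Hamming window of length n (0.54/0.46 coefficients assumed)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def unwindow(frame):
    """Multiply a frame by the inverse Hamming window of the same length."""
    w = hamming(len(frame))
    return [x / wi for x, wi in zip(frame, w)]
```

Applied to a 480-sample frame that was windowed and not otherwise changed, this recovers the original samples up to rounding error.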
- a frame cascading module 36 is used to put the frames together by overlapping. It should be noted that the length of the windowed frame at this stage is 30 ms with a sample frequency of 16 kHz as compared to the actual frame of 20 ms.
- When the windowed frames are cascaded, it is preferred that the first 50 samples and last 50 samples of the 20 ms middle section of the windowed frame are averaged with samples in the adjacent frames, as shown in FIG. 9 a. The averaging operation is used to avoid sudden jumps between actual frames.
- the continuous sequence of frames comprises a continuous sequence of samples with a sample frequency of 16 kHz.
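- Frame cascading can be sketched under the following assumptions: each processed frame has 480 samples (30 ms at 16 kHz), frames advance by 320 samples (20 ms), and the 20 ms middle section therefore starts 80 samples into each frame. The function name and the equal-weight average are hypothetical choices for illustration.

```python
def cascade_frames(frames, frame_len=480, hop=320, blend=50):
    """Join overlapping frames into one continuous sample sequence.

    Only the middle `hop` samples of each frame are kept. The first and
    last `blend` samples of that middle section are averaged with the
    samples of the neighbouring frame that cover the same time instants.
    """
    edge = (frame_len - hop) // 2              # 80 samples = 5 ms at 16 kHz
    out = []
    for k, frame in enumerate(frames):
        middle = list(frame[edge:edge + hop])
        if k > 0:
            prev = frames[k - 1]
            for j in range(blend):
                # The previous frame's tail overlaps the start of this middle.
                middle[j] = 0.5 * (middle[j] + prev[edge + hop + j])
        if k + 1 < len(frames):
            nxt = frames[k + 1]
            for j in range(blend):
                # The next frame's head overlaps the end of this middle.
                i = hop - blend + j
                middle[i] = 0.5 * (middle[i] + nxt[edge - blend + j])
        out.extend(middle)
    return out
```

The output contains 320 samples per input frame, that is, a continuous 16 kHz signal with the frame boundaries smoothed.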
- the method of artificially expanding the bandwidth of a received speech signal is illustrated in the flowchart 100 , as shown in FIG. 10.
- the upsampled frames are converted at step 102 into transformed frames in the frequency domain by an FFT module (see FIG. 1). It is decided at step 104 whether the transformed frames are indicative of a sibilant or a non-sibilant by the sound classification module (see FIG. 1) using the zero crossings, duration and energy information in the corresponding speech frame in the time domain.
- If a transformed frame is that of a non-sibilant, it is decided at step 120 whether the frame is that of a voiced sound or a stop-consonant. If the frame is that of a voiced sound, then the FFT spectrum of the speech frame is attenuated according to an attenuation curve at step 122 . If the frame is that of a stop-consonant, then the FFT spectrum is attenuated according to another attenuation curve at step 124 . However, if the speech segment associated with the transformed frames in the frequency domain is a sibilant as decided at step 104 , then the FFT spectrum of those transformed frames is modified at step 112 or 114 depending on whether the frame is a first frame, as decided at step 110 .
- the modified speech frames are converted back to a plurality of speech frames in the time domain by an inverse FFT module at step 130 , and the energy of these speech frames in the time domain is adjusted by an energy adjustment module at step 140 for further processing.
- the speech frames in the time domain are upsampled by inserting a zero between adjacent samples of the original signal, thereby doubling the sampling frequency and the bandwidth of the digital speech signal. Consequently, aliased frequency components between 4 kHz and 8 kHz are created in the speech frames if the original sampling frequency is 8 kHz.
- the level of the aliased frequency components is adjusted using an adaptive algorithm based on the classification of the speech segment. Adjustment of the aliased frequency components is computed from the original narrowband of the FFT spectrum of the up-sampled speech signal.
- inverse Fourier Transform is used to convert the adjusted spectrum into the time domain in order to produce a new speech sound with a bandwidth of 300 Hz to 7.7 kHz if the original speech signal is transmitted with frequency components between 300 Hz and 3.4 kHz.
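- The upper limit of the expanded band follows from the mirror image about 4 kHz: a component at frequency f in the narrowband signal reappears at 8 kHz minus f after zero insertion. A small sketch (the function name is hypothetical) makes the arithmetic explicit.

```python
def image_band(low_hz, high_hz, original_rate_hz=8000):
    """Return the band occupied by the mirror image after zero insertion."""
    return (original_rate_hz - high_hz, original_rate_hz - low_hz)

# A 300 Hz - 3.4 kHz telephone band is mirrored to 4.6 kHz - 7.7 kHz,
# so the combined signal spans 300 Hz to 7.7 kHz.
print(image_band(300, 3400))   # prints (4600, 7700)
```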
- FIG. 11 shows a block diagram of a mobile terminal 200 according to one exemplary embodiment of the invention.
- the mobile terminal 200 comprises parts typical of the terminal, such as a microphone 201 , keypad 207 , display 206 , earphone 214 , transmit/receive switch 208 , antenna 209 and control unit 205 .
- FIG. 11 shows transmitter and receiver blocks 204 , 211 typical of a mobile terminal.
- the transmitter block 204 comprises a coder 221 for coding the speech signal.
- the transmitter block 204 also comprises operations required for channel coding, deciphering and modulation as well as RF functions, which have not been drawn in FIG. 11 for clarity.
- the receiver block 211 also comprises a decoding block 220 according to the invention.
- Decoding block 220 comprises a speech signal modification module 222 , similar to the speech signal modification module 20 shown in FIG. 1.
- the signal to be received is taken from the antenna via the transmit/receive switch 208 to the receiver block 211 , which demodulates the received signal and decodes the deciphering and the channel coding.
- the speech signal modification module 222 artificially expands the received signal in order to improve the quality of the speech.
- the resulting speech signal is taken via the D/A converter 212 to an amplifier 213 and further to an earphone 214 .
- the control unit 205 controls the operation of the mobile terminal 200 , reads the control commands given by the user from the keypad 207 and gives messages to the user by means of the display 206 .
- the speech signal modification module 20 can also be used in a telecommunication network 300 , such as an ordinary telephone network, or a mobile station network, such as the GSM network.
- FIG. 12 shows an example of a block diagram of such a telecommunication network.
- the telecommunication network 300 can comprise telephone exchanges or corresponding switching systems 360 , to which ordinary telephones 370 , base stations 340 , base station controllers 350 and other central devices 355 of telecommunication networks are coupled.
- Mobile terminal 330 can establish connection to the telecommunication network via the base stations 340 .
- a decoding block 320 which includes a speech signal modification module 322 similar to the modification module 20 shown in FIG.
- the speech signal modification module 322 can be applied at a transcoder which is used to transcode speech arriving from the PSTN (Public switched telephone network) or PLMN (Public land mobile network) like GSM or IS-95 to a 3G mobile network.
- the transcoding typically takes place from a narrowband signal representation in PCM (Pulse code modulation) to, e.g., WB-AMR (Wideband adaptive multirate), so that the mobile terminal 330 does not need to carry out the speech signal modification.
- the decoding block 320 can also be placed in the base station controller 350 or other central or switching device 355 , for example.
- the speech signal modification module 322 can be used to improve the quality of the speech by artificially expanding the bandwidth of received speech signals in the base station or the base station controller.
- the speech signal modification module 322 can also be used in personal computers, Voice-over-IP, and the like.
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Telephone Function (AREA)
- Time-Division Multiplex Systems (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/341,332 US20040138876A1 (en) | 2003-01-10 | 2003-01-10 | Method and apparatus for artificial bandwidth expansion in speech processing |
KR1020057012616A KR100726960B1 (ko) | 2003-01-10 | 2004-01-09 | 음성 처리에서의 인위적인 대역폭 확장 방법 및 장치 |
CNA2004800019784A CN1735926A (zh) | 2003-01-10 | 2004-01-09 | 语音处理中用于人工扩展带宽的方法和设备 |
PCT/IB2004/000030 WO2004064039A2 (en) | 2003-01-10 | 2004-01-09 | Method and apparatus for artificial bandwidth expansion in speech processing |
EP04701060A EP1581929A4 (en) | 2003-01-10 | 2004-01-09 | METHOD AND APPARATUS FOR ARTIFICIALLY EXTENDING BANDWIDTH IN VOICE PROCESSING |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/341,332 US20040138876A1 (en) | 2003-01-10 | 2003-01-10 | Method and apparatus for artificial bandwidth expansion in speech processing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040138876A1 (en) | 2004-07-15 |
Family
ID=32711503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/341,332 Abandoned US20040138876A1 (en) | 2003-01-10 | 2003-01-10 | Method and apparatus for artificial bandwidth expansion in speech processing |
Country Status (5)
Country | Link |
---|---|
US (1) | US20040138876A1 (en) |
EP (1) | EP1581929A4 (en) |
KR (1) | KR100726960B1 (ko) |
CN (1) | CN1735926A (zh) |
WO (1) | WO2004064039A2 (en) |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050267741A1 (en) * | 2004-05-25 | 2005-12-01 | Nokia Corporation | System and method for enhanced artificial bandwidth expansion |
US20060245565A1 (en) * | 2005-04-27 | 2006-11-02 | Cisco Technology, Inc. | Classifying signals at a conference bridge |
US20060280271A1 (en) * | 2003-09-30 | 2006-12-14 | Matsushita Electric Industrial Co., Ltd. | Sampling rate conversion apparatus, encoding apparatus decoding apparatus and methods thereof |
US20070014344A1 (en) * | 2005-07-14 | 2007-01-18 | Altera Corporation, A Corporation Of Delaware | Programmable receiver equalization circuitry and methods |
EP1801787A1 (en) * | 2005-12-23 | 2007-06-27 | QNX Software Systems (Wavemakers), Inc. | Bandwidth extension of narrowband speech |
US20080177532A1 (en) * | 2007-01-22 | 2008-07-24 | D.S.P. Group Ltd. | Apparatus and methods for enhancement of speech |
WO2008101324A1 (en) * | 2007-02-23 | 2008-08-28 | Qnx Software Systems (Wavemakers), Inc. | High-frequency bandwidth extension in the time domain |
US20080288094A1 (en) * | 2004-07-23 | 2008-11-20 | Mitsugi Fukushima | Auto Signal Output Device |
US20090030699A1 (en) * | 2007-03-14 | 2009-01-29 | Bernd Iser | Providing a codebook for bandwidth extension of an acoustic signal |
US20090110208A1 (en) * | 2007-10-30 | 2009-04-30 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
KR100915733B1 (ko) * | 2005-07-13 | 2009-09-04 | 지멘스 악티엔게젤샤프트 | Method and apparatus for artificial expansion of the bandwidth of speech signals |
WO2010003539A1 (en) * | 2008-07-11 | 2010-01-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal synthesizer and audio signal encoder |
US20100114583A1 (en) * | 2008-09-25 | 2010-05-06 | Lg Electronics Inc. | Apparatus for processing an audio signal and method thereof |
US20110238426A1 (en) * | 2008-10-08 | 2011-09-29 | Guillaume Fuchs | Audio Decoder, Audio Encoder, Method for Decoding an Audio Signal, Method for Encoding an Audio Signal, Computer Program and Audio Signal |
US20110282675A1 (en) * | 2009-04-09 | 2011-11-17 | Frederik Nagel | Apparatus and Method for Generating a Synthesis Audio Signal and for Encoding an Audio Signal |
CN102307323A (zh) * | 2009-04-20 | 2012-01-04 | 华为技术有限公司 | Method for correcting channel delay parameters of a multichannel signal |
EP2407966A1 (en) * | 2010-07-15 | 2012-01-18 | Fujitsu Limited | Method and Apparatuses for bandwidth expansion for voice communication |
US20130275126A1 (en) * | 2011-10-11 | 2013-10-17 | Robert Schiff Lee | Methods and systems to modify a speech signal while preserving aural distinctions between speech sounds |
CN104269173A (zh) * | 2014-09-30 | 2015-01-07 | 武汉大学深圳研究院 | Switched-mode audio bandwidth extension apparatus and method |
US8976971B2 (en) | 2009-04-20 | 2015-03-10 | Huawei Technologies Co., Ltd. | Method and apparatus for adjusting channel delay parameter of multi-channel signal |
US9025779B2 (en) | 2011-08-08 | 2015-05-05 | Cisco Technology, Inc. | System and method for using endpoints to provide sound monitoring |
US20150170655A1 (en) * | 2013-12-15 | 2015-06-18 | Qualcomm Incorporated | Systems and methods of blind bandwidth extension |
EP2806423A4 (en) * | 2012-01-20 | 2015-06-24 | Panasonic Ip Corp America | SPEECH DECODING DEVICE AND SPEECH DECODING METHOD |
US9076433B2 (en) | 2009-04-09 | 2015-07-07 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating a synthesis audio signal and for encoding an audio signal |
US9177569B2 (en) | 2007-10-30 | 2015-11-03 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
US20160372125A1 (en) * | 2015-06-18 | 2016-12-22 | Qualcomm Incorporated | High-band signal generation |
US9591121B2 (en) | 2014-08-28 | 2017-03-07 | Samsung Electronics Co., Ltd. | Function controlling method and electronic device supporting the same |
US9640192B2 (en) | 2014-02-20 | 2017-05-02 | Samsung Electronics Co., Ltd. | Electronic device and method of controlling electronic device |
US20170372719A1 (en) * | 2016-06-22 | 2017-12-28 | Dolby Laboratories Licensing Corporation | Sibilance Detection and Mitigation |
US10043534B2 (en) | 2013-12-23 | 2018-08-07 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US10043535B2 (en) | 2013-01-15 | 2018-08-07 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US10045135B2 (en) | 2013-10-24 | 2018-08-07 | Staton Techiya, Llc | Method and device for recognition and arbitration of an input connection |
US10522156B2 (en) | 2009-04-02 | 2019-12-31 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for generating a representation of a bandwidth-extended signal on the basis of an input signal representation using a combination of a harmonic bandwidth-extension and a non-harmonic bandwidth-extension |
US10847170B2 (en) | 2015-06-18 | 2020-11-24 | Qualcomm Incorporated | Device and method for generating a high-band signal from non-linearly processed sub-ranges |
CN114534130A (zh) * | 2020-11-25 | 2022-05-27 | 深圳市安联消防技术有限公司 | Method for eliminating airflow noise in a breathing mask |
US12009003B2 (en) | 2022-08-19 | 2024-06-11 | Qualcomm Incorporated | Device and method for generating a high-band signal from non-linearly processed sub-ranges |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100905585B1 (ko) * | 2007-03-02 | 2009-07-02 | 삼성전자주식회사 | Method and apparatus for controlling bandwidth extension of a speech signal |
CN102629470B (zh) * | 2011-02-02 | 2015-05-20 | Jvc建伍株式会社 | Consonant-interval detection device and consonant-interval detection method |
ES2659001T3 (es) * | 2013-01-29 | 2018-03-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates |
KR102483990B1 (ko) * | 2021-01-05 | 2023-01-04 | 국방과학연구소 | Adaptive beamforming method and active sonar device using the same |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5323337A (en) * | 1992-08-04 | 1994-06-21 | Loral Aerospace Corp. | Signal detector employing mean energy and variance of energy content comparison for noise detection |
US20010044722A1 (en) * | 2000-01-28 | 2001-11-22 | Harald Gustafsson | System and method for modifying speech signals |
US6336092B1 (en) * | 1997-04-28 | 2002-01-01 | Ivl Technologies Ltd | Targeted vocal transformation |
US6418412B1 (en) * | 1998-10-05 | 2002-07-09 | Legerity, Inc. | Quantization using frequency and mean compensated frequency input data for robust speech recognition |
US20020128839A1 (en) * | 2001-01-12 | 2002-09-12 | Ulf Lindgren | Speech bandwidth extension |
US6507820B1 (en) * | 1999-07-06 | 2003-01-14 | Telefonaktiebolaget Lm Ericsson | Speech band sampling rate expansion |
US20030050786A1 (en) * | 2000-08-24 | 2003-03-13 | Peter Jax | Method and apparatus for synthetic widening of the bandwidth of voice signals |
US20030093279A1 (en) * | 2001-10-04 | 2003-05-15 | David Malah | System for bandwidth extension of narrow-band speech |
US20030187663A1 (en) * | 2002-03-28 | 2003-10-02 | Truman Michael Mead | Broadband frequency translation for high frequency regeneration |
US6708145B1 (en) * | 1999-01-27 | 2004-03-16 | Coding Technologies Sweden Ab | Enhancing perceptual performance of sbr and related hfr coding methods by adaptive noise-floor addition and noise substitution limiting |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6311154B1 (en) * | 1998-12-30 | 2001-10-30 | Nokia Mobile Phones Limited | Adaptive windows for analysis-by-synthesis CELP-type speech coding |
Date | Office | Application | Publication | Status |
---|---|---|---|---|
2003-01-10 | US | US10/341,332 | US20040138876A1 (en) | Not active: Abandoned |
2004-01-09 | CN | CNA2004800019784A | CN1735926A (zh) | Active: Pending |
2004-01-09 | EP | EP04701060A | EP1581929A4 (en) | Not active: Ceased |
2004-01-09 | KR | KR1020057012616A | KR100726960B1 (ko) | Not active: IP Right Cessation |
2004-01-09 | WO | PCT/IB2004/000030 | WO2004064039A2 (en) | Active: Application Filing |
Cited By (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120221342A1 (en) * | 2003-09-30 | 2012-08-30 | Panasonic Corporation | Decoding apparatus and decoding method |
US20060280271A1 (en) * | 2003-09-30 | 2006-12-14 | Matsushita Electric Industrial Co., Ltd. | Sampling rate conversion apparatus, encoding apparatus decoding apparatus and methods thereof |
US8195471B2 (en) | 2003-09-30 | 2012-06-05 | Panasonic Corporation | Sampling rate conversion apparatus, coding apparatus, decoding apparatus and methods thereof |
US8374884B2 (en) * | 2003-09-30 | 2013-02-12 | Panasonic Corporation | Decoding apparatus and decoding method |
US7756711B2 (en) * | 2003-09-30 | 2010-07-13 | Panasonic Corporation | Sampling rate conversion apparatus, encoding apparatus decoding apparatus and methods thereof |
US8712768B2 (en) | 2004-05-25 | 2014-04-29 | Nokia Corporation | System and method for enhanced artificial bandwidth expansion |
US20050267741A1 (en) * | 2004-05-25 | 2005-12-01 | Nokia Corporation | System and method for enhanced artificial bandwidth expansion |
US8160887B2 (en) * | 2004-07-23 | 2012-04-17 | D&M Holdings, Inc. | Adaptive interpolation in upsampled audio signal based on frequency of polarity reversals |
US20080288094A1 (en) * | 2004-07-23 | 2008-11-20 | Mitsugi Fukushima | Auto Signal Output Device |
US20060245565A1 (en) * | 2005-04-27 | 2006-11-02 | Cisco Technology, Inc. | Classifying signals at a conference bridge |
US7852999B2 (en) * | 2005-04-27 | 2010-12-14 | Cisco Technology, Inc. | Classifying signals at a conference bridge |
KR100915733B1 (ko) * | 2005-07-13 | 2009-09-04 | 지멘스 악티엔게젤샤프트 | Method and apparatus for artificial expansion of the bandwidth of speech signals |
US7697600B2 (en) * | 2005-07-14 | 2010-04-13 | Altera Corporation | Programmable receiver equalization circuitry and methods |
US20070014344A1 (en) * | 2005-07-14 | 2007-01-18 | Altera Corporation, A Corporation Of Delaware | Programmable receiver equalization circuitry and methods |
US7546237B2 (en) * | 2005-12-23 | 2009-06-09 | Qnx Software Systems (Wavemakers), Inc. | Bandwidth extension of narrowband speech |
EP1801787A1 (en) * | 2005-12-23 | 2007-06-27 | QNX Software Systems (Wavemakers), Inc. | Bandwidth extension of narrowband speech |
US20070150269A1 (en) * | 2005-12-23 | 2007-06-28 | Rajeev Nongpiur | Bandwidth extension of narrowband speech |
US20080177532A1 (en) * | 2007-01-22 | 2008-07-24 | D.S.P. Group Ltd. | Apparatus and methods for enhancement of speech |
WO2008090541A3 (en) * | 2007-01-22 | 2008-09-25 | Dsp Group Ltd | Apparatus and methods for enhancement of speech |
EP2144232A3 (en) * | 2007-01-22 | 2010-08-25 | DSP Group Ltd. | Apparatus and methods for enhancement of speech |
US8229106B2 (en) | 2007-01-22 | 2012-07-24 | D.S.P. Group, Ltd. | Apparatus and methods for enhancement of speech |
WO2008090541A2 (en) * | 2007-01-22 | 2008-07-31 | Dsp Group Ltd. | Apparatus and methods for enhancement of speech |
WO2008101324A1 (en) * | 2007-02-23 | 2008-08-28 | Qnx Software Systems (Wavemakers), Inc. | High-frequency bandwidth extension in the time domain |
US8190429B2 (en) * | 2007-03-14 | 2012-05-29 | Nuance Communications, Inc. | Providing a codebook for bandwidth extension of an acoustic signal |
US20090030699A1 (en) * | 2007-03-14 | 2009-01-29 | Bernd Iser | Providing a codebook for bandwidth extension of an acoustic signal |
US20090110208A1 (en) * | 2007-10-30 | 2009-04-30 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
US8321229B2 (en) | 2007-10-30 | 2012-11-27 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
US10255928B2 (en) | 2007-10-30 | 2019-04-09 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
EP2056294A3 (en) * | 2007-10-30 | 2010-02-17 | Samsung Electronics Co., Ltd. | Apparatus, Medium and Method to Encode and Decode High Frequency Signal |
US9818429B2 (en) | 2007-10-30 | 2017-11-14 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
US9177569B2 (en) | 2007-10-30 | 2015-11-03 | Samsung Electronics Co., Ltd. | Apparatus, medium and method to encode and decode high frequency signal |
US8731948B2 (en) | 2008-07-11 | 2014-05-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio signal synthesizer for selectively performing different patching algorithms |
US10522168B2 (en) | 2008-07-11 | 2019-12-31 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio signal synthesizer and audio signal encoder |
WO2010003539A1 (en) * | 2008-07-11 | 2010-01-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal synthesizer and audio signal encoder |
US10014000B2 (en) | 2008-07-11 | 2018-07-03 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio signal encoder and method for generating a data stream having components of an audio signal in a first frequency band, control information and spectral band replication parameters |
US20100114583A1 (en) * | 2008-09-25 | 2010-05-06 | Lg Electronics Inc. | Apparatus for processing an audio signal and method thereof |
US8831958B2 (en) * | 2008-09-25 | 2014-09-09 | Lg Electronics Inc. | Method and an apparatus for a bandwidth extension using different schemes |
US8494865B2 (en) | 2008-10-08 | 2013-07-23 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio decoder, audio encoder, method for decoding an audio signal, method for encoding an audio signal, computer program and audio signal |
US20110238426A1 (en) * | 2008-10-08 | 2011-09-29 | Guillaume Fuchs | Audio Decoder, Audio Encoder, Method for Decoding an Audio Signal, Method for Encoding an Audio Signal, Computer Program and Audio Signal |
US10909994B2 (en) | 2009-04-02 | 2021-02-02 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for generating a representation of a bandwidth-extended signal on the basis of an input signal representation using a combination of a harmonic bandwidth-extension and a non-harmonic bandwidth-extension |
US9697838B2 (en) | 2009-04-02 | 2017-07-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for generating a representation of a bandwidth-extended signal on the basis of an input signal representation using a combination of a harmonic bandwidth-extension and a non-harmonic bandwidth-extension |
US10522156B2 (en) | 2009-04-02 | 2019-12-31 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for generating a representation of a bandwidth-extended signal on the basis of an input signal representation using a combination of a harmonic bandwidth-extension and a non-harmonic bandwidth-extension |
US20110282675A1 (en) * | 2009-04-09 | 2011-11-17 | Frederik Nagel | Apparatus and Method for Generating a Synthesis Audio Signal and for Encoding an Audio Signal |
US9076433B2 (en) | 2009-04-09 | 2015-07-07 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating a synthesis audio signal and for encoding an audio signal |
US8386268B2 (en) * | 2009-04-09 | 2013-02-26 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating a synthesis audio signal using a patching control signal |
CN102307323A (zh) * | 2009-04-20 | 2012-01-04 | 华为技术有限公司 | Method for correcting channel delay parameters of a multichannel signal |
US8976971B2 (en) | 2009-04-20 | 2015-03-10 | Huawei Technologies Co., Ltd. | Method and apparatus for adjusting channel delay parameter of multi-channel signal |
US9070372B2 (en) * | 2010-07-15 | 2015-06-30 | Fujitsu Limited | Apparatus and method for voice processing and telephone apparatus |
US20120016669A1 (en) * | 2010-07-15 | 2012-01-19 | Fujitsu Limited | Apparatus and method for voice processing and telephone apparatus |
EP2407966A1 (en) * | 2010-07-15 | 2012-01-18 | Fujitsu Limited | Method and Apparatuses for bandwidth expansion for voice communication |
US9025779B2 (en) | 2011-08-08 | 2015-05-05 | Cisco Technology, Inc. | System and method for using endpoints to provide sound monitoring |
US20130275126A1 (en) * | 2011-10-11 | 2013-10-17 | Robert Schiff Lee | Methods and systems to modify a speech signal while preserving aural distinctions between speech sounds |
EP2806423A4 (en) * | 2012-01-20 | 2015-06-24 | Panasonic Ip Corp America | SPEECH DECODING DEVICE AND SPEECH DECODING METHOD |
US9390721B2 (en) | 2012-01-20 | 2016-07-12 | Panasonic Intellectual Property Corporation Of America | Speech decoding device and speech decoding method |
US10622005B2 (en) | 2013-01-15 | 2020-04-14 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US10043535B2 (en) | 2013-01-15 | 2018-08-07 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US10425754B2 (en) | 2013-10-24 | 2019-09-24 | Staton Techiya, Llc | Method and device for recognition and arbitration of an input connection |
US11595771B2 (en) | 2013-10-24 | 2023-02-28 | Staton Techiya, Llc | Method and device for recognition and arbitration of an input connection |
US11089417B2 (en) | 2013-10-24 | 2021-08-10 | Staton Techiya Llc | Method and device for recognition and arbitration of an input connection |
US10820128B2 (en) | 2013-10-24 | 2020-10-27 | Staton Techiya, Llc | Method and device for recognition and arbitration of an input connection |
US10045135B2 (en) | 2013-10-24 | 2018-08-07 | Staton Techiya, Llc | Method and device for recognition and arbitration of an input connection |
US20150170655A1 (en) * | 2013-12-15 | 2015-06-18 | Qualcomm Incorporated | Systems and methods of blind bandwidth extension |
US9524720B2 (en) | 2013-12-15 | 2016-12-20 | Qualcomm Incorporated | Systems and methods of blind bandwidth extension |
US11551704B2 (en) | 2013-12-23 | 2023-01-10 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US10043534B2 (en) | 2013-12-23 | 2018-08-07 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US11741985B2 (en) | 2013-12-23 | 2023-08-29 | Staton Techiya Llc | Method and device for spectral expansion for an audio signal |
US10636436B2 (en) | 2013-12-23 | 2020-04-28 | Staton Techiya, Llc | Method and device for spectral expansion for an audio signal |
US9640192B2 (en) | 2014-02-20 | 2017-05-02 | Samsung Electronics Co., Ltd. | Electronic device and method of controlling electronic device |
US9591121B2 (en) | 2014-08-28 | 2017-03-07 | Samsung Electronics Co., Ltd. | Function controlling method and electronic device supporting the same |
CN104269173A (zh) * | 2014-09-30 | 2015-01-07 | 武汉大学深圳研究院 | Switched-mode audio bandwidth extension apparatus and method |
US10847170B2 (en) | 2015-06-18 | 2020-11-24 | Qualcomm Incorporated | Device and method for generating a high-band signal from non-linearly processed sub-ranges |
US20160372125A1 (en) * | 2015-06-18 | 2016-12-22 | Qualcomm Incorporated | High-band signal generation |
US11437049B2 (en) | 2015-06-18 | 2022-09-06 | Qualcomm Incorporated | High-band signal generation |
US9837089B2 (en) * | 2015-06-18 | 2017-12-05 | Qualcomm Incorporated | High-band signal generation |
US10867620B2 (en) * | 2016-06-22 | 2020-12-15 | Dolby Laboratories Licensing Corporation | Sibilance detection and mitigation |
US20170372719A1 (en) * | 2016-06-22 | 2017-12-28 | Dolby Laboratories Licensing Corporation | Sibilance Detection and Mitigation |
CN114534130A (zh) * | 2020-11-25 | 2022-05-27 | 深圳市安联消防技术有限公司 | Method for eliminating airflow noise in a breathing mask |
US12009003B2 (en) | 2022-08-19 | 2024-06-11 | Qualcomm Incorporated | Device and method for generating a high-band signal from non-linearly processed sub-ranges |
Also Published As
Publication number | Publication date |
---|---|
EP1581929A4 (en) | 2007-10-31 |
KR20050089874A (ko) | 2005-09-08 |
WO2004064039A3 (en) | 2004-11-25 |
KR100726960B1 (ko) | 2007-06-14 |
WO2004064039A2 (en) | 2004-07-29 |
EP1581929A2 (en) | 2005-10-05 |
CN1735926A (zh) | 2006-02-15 |
Similar Documents
Publication | Title |
---|---|
US20040138876A1 (en) | Method and apparatus for artificial bandwidth expansion in speech processing |
JP3653826B2 (ja) | Speech decoding method and apparatus |
US6704711B2 (en) | System and method for modifying speech signals |
US8311842B2 (en) | Method and apparatus for expanding bandwidth of voice signal |
EP0993670B1 (en) | Method and apparatus for speech enhancement in a speech communication system |
US6889182B2 (en) | Speech bandwidth extension |
US7813931B2 (en) | System for improving speech quality and intelligibility with bandwidth compression/expansion |
CN1750124B (zh) | Bandwidth extension of band-limited audio signals |
RU2146394C1 (ru) | Method and apparatus for variable-rate vocoding at a reduced coding rate |
US8219389B2 (en) | System for improving speech intelligibility through high frequency compression |
US8447617B2 (en) | Method and system for speech bandwidth extension |
KR100574031B1 (ko) | Speech synthesis method and apparatus, and speech bandwidth expansion method and apparatus |
WO2002056301A1 (en) | Speech bandwidth extension |
JP4040126B2 (ja) | Speech decoding method and apparatus |
EP1008984A2 (en) | Wideband speech synthesis from a narrowband speech signal |
WO2014129233A1 (ja) | Speech enhancement device |
US20010027390A1 (en) | Speech decoder and a method for decoding speech |
JP3183104B2 (ja) | Noise reduction device |
JP3360423B2 (ja) | Speech enhancement device |
GB2343822A (en) | Using LSP to alter frequency characteristics of speech |
JP3896654B2 (ja) | Speech signal interval detection method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: NOKIA CORPORATION, FINLAND. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: KALLIO, L.; ALKU, P.; KAYHKO, K.; AND OTHERS; REEL/FRAME: 014038/0727; SIGNING DATES FROM 20030422 TO 20030428 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION |