WO2016127506A1 - Voice processing method, voice processing device, and terminal - Google Patents

Voice processing method, voice processing device, and terminal

Info

Publication number
WO2016127506A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice information
voice
preset
collected
information
Prior art date
Application number
PCT/CN2015/078091
Other languages
English (en)
French (fr)
Inventor
尹宾
卢纯
Original Assignee
宇龙计算机通信科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 宇龙计算机通信科技(深圳)有限公司
Publication of WO2016127506A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L21/0224: Processing in the time domain

Definitions

  • the present invention relates to the field of terminal technologies, and in particular, to a voice processing method, a voice processing device, and a terminal.
  • Current terminals generally use dual-microphone noise reduction to reduce voice noise.
  • During a call, or while executing other voice recognition commands, the terminal can use the noise-reduction microphone to filter out the captured noise, so that clear voice quality is still maintained in a noisy environment.
  • However, dual-microphone noise reduction can only filter noisy background sound; it cannot effectively filter the human voice in background voice information, such as music, in a quiet environment. When the background voice is loud, it usually interferes with the call or with voice recognition. For example, if the car stereo is on while driving and the user answers a call, the vocals in the music will interfere heavily with the call.
  • In addition, dual-microphone noise reduction, by its nature, also lowers the volume of the voice, which may affect the quality of the call or of voice recognition.
  • The present invention is based on the above problems and proposes a new technical solution that reduces the noise in the voice while preserving voice quality.
  • An aspect of the present invention provides a voice processing method for a terminal, including: collecting voice information in a preset noise filtering mode; determining whether the collected voice information has a portion matching preset voice information; and, when it is determined that the voice information has a portion matching the preset voice information, synchronizing the preset voice information with the collected voice information and eliminating the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information.
  • In this technical solution, voice information may be collected during a call or voice recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise. For example, when a user makes a call in a car in which a song is playing, the mobile phone can collect a segment of the song and check whether that song is stored on the phone.
  • If the phone stores the song, the spectrum of the song is eliminated synchronously, so that the background noise caused by the song, and only that noise, is eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • In addition, "matching" in this technical solution means being identical, or having a similarity greater than a predetermined value.
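The definition of "matching" above (identical, or similarity greater than a predetermined value) can be illustrated with a minimal sketch. The patent does not say which similarity measure is used; cosine similarity over raw samples, the function names, and the 0.9 threshold are all assumptions made for illustration.

```python
import math

def similarity(a, b):
    """Cosine similarity of two equal-length sample sequences, in [-1, 1].

    Assumption: the patent does not name a similarity measure; cosine
    similarity is used here because it ignores overall loudness.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent sequence is treated as dissimilar
    return dot / (norm_a * norm_b)

def matches(captured, preset, threshold=0.9):
    """True if the sequences are identical or their similarity exceeds threshold.

    `threshold` stands in for the "predetermined value"; 0.9 is hypothetical.
    """
    return list(captured) == list(preset) or similarity(captured, preset) > threshold
```

A scale-invariant measure suits this setting because the music picked up by the microphone is usually quieter than the stored copy.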
  • Preferably, before the voice information is collected, the method further includes: when a call command or a voice recognition command is detected, entering the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • In this technical solution, the preset noise filtering mode is enabled only while the terminal is on a call or performing voice recognition. This reduces the terminal's power consumption and avoids the continuous collection and matching of voice information that would result from leaving the mode on for a long time, which in turn avoids degrading the terminal's performance.
  • Preferably, determining whether the collected voice information has a portion matching the preset voice information specifically includes: acquiring the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • In this technical solution, the terminal may compare the collected voice information with voice information it stores itself, or with voice information on a server such as the connected cloud.
  • The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information as it continues to collect it. This achieves synchronous noise elimination without affecting the user's normal call or voice recognition.
  • Preferably, the method further includes: when it is determined that the collected voice information has no portion matching the preset voice information, re-collecting the voice information, so that whether to perform the noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information in the terminal or the cloud, the voice information may be re-collected and matched again, until matching succeeds or the call or voice recognition ends. The whole matching process takes very little time, generally no more than 10 s.
  • Preferably, the method further includes: when a call end command or a voice recognition termination command is detected, exiting the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • In this technical solution, after the call ends or voice recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.
  • Another aspect of the present invention provides a voice processing apparatus for a terminal, comprising: a voice collection unit, which collects voice information in a preset noise filtering mode; a determining unit, which determines whether the collected voice information has a portion matching preset voice information; and a noise reduction processing unit, which, when it is determined that the voice information has a portion matching the preset voice information, synchronizes the preset voice information with the collected voice information and eliminates the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information.
  • In this technical solution, voice information may be collected during a call or voice recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise. For example, when a user makes a call in a car in which a song is playing, the mobile phone can collect a segment of the song and check whether that song is stored on the phone.
  • If the phone stores the song, the spectrum of the song is eliminated synchronously, so that the background noise caused by the song, and only that noise, is eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • In addition, "matching" in this technical solution means being identical, or having a similarity greater than a predetermined value.
  • Preferably, the apparatus further includes: an opening unit, which, before the voice information is collected and when a call command or a voice recognition command is detected, enters the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • In this technical solution, the preset noise filtering mode is enabled only while the terminal is on a call or performing voice recognition. This reduces the terminal's power consumption and avoids the continuous collection and matching of voice information that would result from leaving the mode on for a long time, which in turn avoids degrading the terminal's performance.
  • Preferably, the determining unit is specifically configured to: acquire the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determine whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • In this technical solution, the terminal may compare the collected voice information with voice information it stores itself, or with voice information on a server such as the connected cloud.
  • The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information as it continues to collect it. This achieves synchronous noise elimination without affecting the user's normal call or voice recognition.
  • Preferably, the voice collection unit is further configured to: when it is determined that the collected voice information has no portion matching the preset voice information, re-collect the voice information, so that whether to perform the noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information in the terminal or the cloud, the voice information may be re-collected and matched again, until matching succeeds or the call or voice recognition ends. The whole matching process takes very little time, generally no more than 10 s.
  • Preferably, the apparatus further includes: a closing unit, which, when a call end command or a voice recognition termination command is detected, exits the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • In this technical solution, after the call ends or voice recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.
  • An embodiment of the third aspect of the present invention provides a terminal, the terminal including a communication bus, a transceiver device, a memory, and a processor, wherein:
  • the communication bus is configured to implement connection and communication between the transceiver device, the memory, and the processor;
  • the memory stores a set of program code, and the processor calls the program code stored in the memory to perform the following operations:
  • the transceiver device is configured to collect voice information in a preset noise filtering mode;
  • the processor is configured to determine whether the collected voice information has a portion that matches preset voice information;
  • the processor is further configured to: when it is determined that the voice information has a portion that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information.
  • the processor is further configured to perform the following step:
  • when a call command or a voice recognition command is detected, entering the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • the processor determining whether the collected voice information has a portion that matches the preset voice information specifically includes:
  • acquiring the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • the processor is further configured to perform the following step:
  • when it is determined that the collected voice information has no portion matching the preset voice information, re-collecting the voice information, so that whether to perform the noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • the processor is further configured to perform the following step:
  • when a call end command or a voice recognition termination command is detected, exiting the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • With the above technical solution, the background noise caused by the preset voice information, and only that noise, can be eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • FIG. 1 shows a flowchart of a voice processing method in accordance with one embodiment of the present invention;
  • FIG. 2 shows a block diagram of a voice processing apparatus in accordance with one embodiment of the present invention;
  • FIG. 3 shows a block diagram of a terminal in accordance with one embodiment of the present invention;
  • FIG. 4A shows a flowchart of a voice processing method in accordance with another embodiment of the present invention.
  • FIG. 4B is a schematic diagram showing song matching from the cloud in FIG. 4A;
  • FIG. 4C is a schematic diagram showing song matching from the local terminal in FIG. 4A.
  • FIG. 1 shows a flow chart of a speech processing method in accordance with one embodiment of the present invention.
  • A voice processing method for a terminal includes:
  • Step 102: Collect voice information in a preset noise filtering mode.
  • Step 104: Determine whether the collected voice information has a portion that matches preset voice information.
  • Step 106: When it is determined that the voice information has a portion matching the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information.
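Steps 104 and 106 can be sketched as follows, under stated assumptions. The patent speaks of eliminating the "spectrum" of the preset voice information without giving an algorithm; this sketch instead works directly on samples in the time domain, aligning by normalized correlation and subtracting a least-squares-scaled copy of the reference. The function names and the 0.8 threshold are hypothetical.

```python
import math

def best_alignment(captured, preset):
    """Slide `captured` over `preset`; return (offset, normalized correlation)."""
    n = len(captured)
    best_offset, best_score = 0, -1.0
    cap_norm = math.sqrt(sum(x * x for x in captured))
    for offset in range(len(preset) - n + 1):
        window = preset[offset:offset + n]
        win_norm = math.sqrt(sum(x * x for x in window))
        if cap_norm == 0.0 or win_norm == 0.0:
            continue  # silence cannot be aligned
        score = sum(c * w for c, w in zip(captured, window)) / (cap_norm * win_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

def reduce_noise(captured, preset, threshold=0.8):
    """Match (step 104), then synchronize and subtract the reference (step 106).

    Returns the cleaned samples, or None when no portion of `preset` matches.
    """
    offset, score = best_alignment(captured, preset)
    if score <= threshold:
        return None  # no matching portion; the caller may re-collect
    ref = preset[offset:offset + len(captured)]
    # Least-squares gain: how loudly the reference appears in the capture.
    gain = sum(c * r for c, r in zip(captured, ref)) / sum(r * r for r in ref)
    return [c - gain * r for c, r in zip(captured, ref)]
```

When the capture contains only an attenuated copy of part of the preset audio, the result is close to silence; when speech is mixed in, the residual approximates the speech.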
  • In this technical solution, voice information may be collected during a call or voice recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise. For example, when a user makes a call in a car in which a song is playing, the mobile phone can collect a segment of the song and check whether that song is stored on the phone.
  • If the phone stores the song, the spectrum of the song is eliminated synchronously, so that the background noise caused by the song, and only that noise, is eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • In addition, "matching" in this technical solution means being identical, or having a similarity greater than a predetermined value.
  • Preferably, before step 102, the method further includes: when a call command or a voice recognition command is detected, entering the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • In this technical solution, the preset noise filtering mode is enabled only while the terminal is on a call or performing voice recognition. This reduces the terminal's power consumption and avoids the continuous collection and matching of voice information that would result from leaving the mode on for a long time, which in turn avoids degrading the terminal's performance.
  • Preferably, step 104 specifically includes: acquiring the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • In this technical solution, the terminal may compare the collected voice information with voice information it stores itself, or with voice information on a server such as the connected cloud.
  • The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information as it continues to collect it. This achieves synchronous noise elimination without affecting the user's normal call or voice recognition.
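The lookup across the predetermined locations can be sketched as a search over one or more libraries of voice segments. This is an illustration only: the data layout (a mapping from a track identifier to a list of segments), the source names, and the order in which sources are searched are assumptions, and the matching predicate is passed in by the caller.

```python
def find_matching_segment(captured, sources, is_match):
    """Search each source in order for a segment that matches `captured`.

    sources:  list of (source_name, library) pairs, where library maps a
              hypothetical track id to a list of voice segments.
    is_match: predicate deciding whether two segments match.
    Returns (source_name, track_id, segment_index), or None if nothing matches.
    """
    for source_name, library in sources:
        for track_id, segments in library.items():
            for index, segment in enumerate(segments):
                if is_match(captured, segment):
                    return source_name, track_id, index
    return None

# Usage: try the terminal's own storage first, then the cloud (assumed order).
local = {"song_a": [[1, 2], [3, 4]]}
cloud = {"song_b": [[5, 6], [7, 8]]}
hit = find_matching_segment(
    [7, 8],
    [("local", local), ("cloud", cloud)],
    lambda a, b: a == b,
)
```

The returned segment index is what lets the terminal synchronize the stored audio with the capture at the matching position.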
  • Preferably, the method further includes: when it is determined that the collected voice information has no portion matching the preset voice information, re-collecting the voice information, so that whether to perform noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information in the terminal or the cloud, the voice information may be re-collected and matched again, until matching succeeds or the call or voice recognition ends.
  • The entire matching process takes very little time, generally no more than 10 s.
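The re-collection behaviour can be sketched as a loop that stops on a successful match or when the call or voice recognition ends. The text only observes that matching generally takes no more than 10 s; treating that figure as a hard timeout is an assumption of this sketch, as are the callback names.

```python
import time

def match_with_retry(collect, try_match, call_active, timeout_s=10.0,
                     clock=time.monotonic):
    """Re-collect and re-match until success, end of call, or timeout.

    collect:     returns a freshly collected voice segment.
    try_match:   returns a match result, or None when nothing matches.
    call_active: returns False once the call or voice recognition has ended.
    timeout_s:   assumed upper bound, taken from the "no more than 10 s" remark.
    clock:       injectable time source, which keeps the loop testable.
    """
    deadline = clock() + timeout_s
    while call_active() and clock() < deadline:
        segment = collect()
        result = try_match(segment)
        if result is not None:
            return result
    return None
```

Injecting `clock` rather than calling `time.monotonic` directly means the timeout path can be exercised without waiting.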
  • Preferably, the method further includes: when a call end command or a voice recognition termination command is detected, exiting the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • In this technical solution, after the call ends or voice recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.
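The way the preset noise filtering mode follows the call or voice recognition function can be sketched as a small state holder. The class name and command strings are hypothetical; the patent describes the behaviour, not an interface.

```python
class NoiseFilterSession:
    """Keeps the noise filtering mode in step with the call/recognition function."""

    START_COMMANDS = ("call", "voice_recognition")
    END_COMMANDS = ("end_call", "stop_voice_recognition")

    def __init__(self):
        self.function_on = False      # call or voice recognition in progress
        self.filter_mode_on = False   # preset noise filtering mode

    def handle(self, command):
        """Apply a command; unknown commands leave the state unchanged."""
        if command in self.START_COMMANDS:
            # Enter the filtering mode at the same time the function turns on.
            self.function_on = True
            self.filter_mode_on = True
        elif command in self.END_COMMANDS:
            # Exit the filtering mode at the same time the function turns off.
            self.function_on = False
            self.filter_mode_on = False
```

Because the two flags always change together, the mode is never left running outside a call or recognition session, which is the stated power-saving goal.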
  • FIG. 2 shows a block diagram of a speech processing device in accordance with one embodiment of the present invention.
  • In this technical solution, voice information may be collected during a call or voice recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise. For example, when a user makes a call in a car in which a song is playing, the mobile phone can collect a segment of the song and check whether that song is stored on the phone.
  • If the phone stores the song, the spectrum of the song is eliminated synchronously, so that the background noise caused by the song, and only that noise, is eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • In addition, "matching" in this technical solution means being identical, or having a similarity greater than a predetermined value.
  • Preferably, the apparatus further includes: an opening unit 208, which, before the voice information is collected and when a call command or a voice recognition command is detected, enters the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • In this technical solution, the preset noise filtering mode is enabled only while the terminal is on a call or performing voice recognition. This reduces the terminal's power consumption and avoids the continuous collection and matching of voice information that would result from leaving the mode on for a long time, which in turn avoids degrading the terminal's performance.
  • Preferably, the determining unit 204 is specifically configured to: acquire the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determine whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • In this technical solution, the terminal may compare the collected voice information with voice information it stores itself, or with voice information on a server such as the connected cloud.
  • The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information as it continues to collect it. This achieves synchronous noise elimination without affecting the user's normal call or voice recognition.
  • Preferably, the voice collection unit 202 is further configured to: when it is determined that the collected voice information has no portion matching the preset voice information, re-collect the voice information, so that whether to perform noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information in the terminal or the cloud, the voice information may be re-collected and matched again, until matching succeeds or the call or voice recognition ends. The whole matching process takes very little time, generally no more than 10 s.
  • Preferably, the apparatus further includes: a closing unit 210, which, when a call end command or a voice recognition termination command is detected, exits the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • In this technical solution, after the call ends or voice recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.
  • FIG. 3 shows a block diagram of a terminal in accordance with one embodiment of the present invention.
  • A terminal includes: at least one transceiver device 303, at least one processor 301 (for example, a CPU), a memory 304, and at least one communication bus 302.
  • The communication bus 302 is used to connect the transceiver device 303, the processor 301, and the memory 304.
  • The memory 304 may be high-speed RAM or non-volatile memory, such as disk storage.
  • The memory 304 is further configured to store a set of program code, and the transceiver device 303 and the processor 301 are configured to call the program code stored in the memory 304 to perform the following operations:
  • the transceiver device 303 is configured to collect voice information in a preset noise filtering mode;
  • the processor 301 is configured to determine whether the collected voice information has a portion that matches preset voice information;
  • the processor 301 is further configured to: when it is determined that the voice information has a portion that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information.
  • the processor 301 is further configured to perform the following step:
  • when a call command or a voice recognition command is detected, entering the preset noise filtering mode at the same time as the call function or the voice recognition function is turned on.
  • the processor 301 determining whether the collected voice information has a portion that matches the preset voice information specifically includes:
  • acquiring the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the acquired preset voice information matches the collected voice information.
  • the processor 301 is further configured to perform the following step:
  • when it is determined that the collected voice information has no portion matching the preset voice information, re-collecting the voice information, so that whether to perform the noise reduction processing on the re-collected voice information can be determined from the re-collected voice information.
  • the processor 301 is further configured to perform the following step:
  • when a call end command or a voice recognition termination command is detected, exiting the preset noise filtering mode at the same time as the call function or the voice recognition function is turned off.
  • In this embodiment, voice information may be collected in a preset noise filtering mode, and it is determined whether the collected voice information has a portion matching the preset voice information. When it does, the preset voice information is synchronized with the collected voice information and the spectrum of the preset voice information is eliminated, so as to perform noise reduction processing on the collected voice information.
  • In this technical solution, voice information may be collected during a call or voice recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise. For example, when a user makes a call in a car in which a song is playing, the mobile phone can collect a segment of the song and check whether that song is stored on the phone.
  • If the phone stores the song, the spectrum of the song is eliminated synchronously, so that the background noise caused by the song, and only that noise, is eliminated. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
  • In addition, "matching" in this technical solution means being identical, or having a similarity greater than a predetermined value.
  • FIG. 4A shows a flow chart of a speech processing method in accordance with another embodiment of the present invention.
  • a voice processing method is applied to an application scenario in which a call or a voice recognition is performed while playing background music, and includes:
  • Step 402: Determine whether the preset noise filtering mode is enabled. If yes, proceed to step 404; if no, the process ends.
  • Step 404: Detect a call or the start of voice recognition.
  • Step 406: Detect whether background music is currently present. If yes, proceed to step 408; if no, the process ends.
  • Step 408: Collect the background music and match it in the terminal or in the cloud connected to the terminal. FIG. 4B shows a schematic diagram of song matching from the cloud in FIG. 4A, and FIG. 4C shows a schematic diagram of song matching from the local terminal in FIG. 4A.
  • The terminal can determine whether the terminal itself, or the cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or voice recognition proceeds, which eliminates the background noise.
  • Step 410: Determine whether the matching succeeded. If yes, proceed to step 412; if no, return to step 408 to collect the background music again.
  • If the voice information collected by the terminal cannot be matched to identical or similar preset voice information in the terminal or the cloud, the voice information may be re-collected and matched again, until matching succeeds or the call or voice recognition ends. The entire matching process takes very little time, generally no more than 10 s.
  • Step 412: Synchronize the background music with the matched music, and eliminate the spectrum of the matched music from the collected audio. Synchronously eliminating the spectrum of the background music eliminates the background noise caused by the background music, and only that noise. This avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.
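One way to read "eliminate the spectrum of the matched music" in step 412 is magnitude spectral subtraction, sketched below. This is a named substitute technique, not something the patent specifies: the reference is assumed to be already synchronized and level-matched to the capture, a naive O(n²) DFT is used to keep the example dependency-free, and all function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def spectral_subtract(captured, reference):
    """Subtract the reference's magnitude spectrum, keeping the captured phase."""
    cleaned = []
    for c, r in zip(dft(captured), dft(reference)):
        magnitude = max(abs(c) - abs(r), 0.0)  # never go below zero energy
        cleaned.append(cmath.rect(magnitude, cmath.phase(c)))
    return idft(cleaned)

# Usage: a "voice" tone in bin 1 plus synchronized "music" in bin 3.
n = 16
voice = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
music = [0.5 * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
captured = [v + m for v, m in zip(voice, music)]
cleaned = spectral_subtract(captured, music)  # approximately equal to `voice`
```

In this toy case the voice and the music occupy different frequency bins, so the subtraction recovers the voice almost exactly; real speech and music overlap in frequency, and a practical system would work frame by frame with an FFT.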
  • the matching in the technical solution refers to being identical, or the similarity is greater than a predetermined value.
  • The technical solution of the present invention can eliminate the background noise caused by the preset voice information, and only that noise. It avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

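A voice processing method, a voice processing device, and a terminal. The voice processing method includes: collecting voice information in a preset noise filtering mode (102); determining whether the collected voice information has a portion matching preset voice information (104); and, when it is determined that the voice information has a portion matching the preset voice information, synchronizing the preset voice information with the collected voice information and eliminating the spectrum of the preset voice information, so as to perform noise reduction processing on the collected voice information (106). The voice processing method can eliminate the background noise caused by the preset voice information, and only that noise. It avoids the problem with dual-microphone noise reduction, in which normal voice other than the noise is also eliminated, improves the accuracy of noise cancellation and the quality of the call, and improves the user experience.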

Description

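Voice processing method, voice processing device, and terminal
This application claims priority to Chinese Patent Application No. 201510066942.4, filed with the Chinese Patent Office on February 9, 2015 and entitled "Voice processing method, voice processing device, and terminal", the entire contents of which are incorporated herein by reference.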
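Technical Field
The present invention relates to the field of terminal technologies, and in particular to a voice processing method, a voice processing device, and a terminal.
Background
Current terminals generally use dual-microphone noise reduction to reduce voice noise. During a call, or while executing other voice recognition commands, the terminal can use the noise-reduction microphone to filter out the captured noise, so that clear voice quality is still maintained in a noisy environment.
However, dual-microphone noise reduction can only filter noisy background sound; it cannot effectively filter the human voice in background voice information, such as music, in a quiet environment. When the background voice is loud, it usually interferes with the call or with voice recognition. For example, if the car stereo is on while driving and the user answers a call, the vocals in the music will interfere heavily with the call. In addition, dual-microphone noise reduction, by its nature, also lowers the volume of the voice, which affects the quality of the call or of voice recognition.
A new technical solution is therefore needed that reduces the noise in the voice while preserving voice quality.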
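Summary of the Invention
Based on the above problems, the present invention proposes a new technical solution that reduces the noise in the voice while preserving voice quality.
In view of this, one aspect of the present invention provides a voice processing method for a terminal, including: collecting voice information in a preset noise filtering mode; determining whether the collected voice information contains a part that matches preset voice information; and, when it is determined that the voice information contains a part that matches the preset voice information, synchronizing the preset voice information with the collected voice information and eliminating the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
In this technical solution, voice information can be collected during a call or during speech recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise. For example, when a user makes a call in a car that is playing a song, the mobile phone can capture a segment of the song and check whether that song is stored on the phone. If it is, the phone synchronously eliminates the spectrum of the song, so that the background noise introduced by the song, and only that noise, is removed. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.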
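The matching criterion described above (identical, or similarity greater than a predetermined value) could be sketched as follows. This is only an illustration: the patent does not specify a similarity measure, so cosine similarity over raw samples, the function names `similarity` and `is_match`, and the default threshold of 0.9 are all assumptions.

```python
import math


def similarity(a, b):
    """Cosine similarity of two equal-length sample sequences, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A silent (all-zero) sequence has no direction; treat it as dissimilar.
    return dot / norm if norm else 0.0


def is_match(collected, preset, threshold=0.9):
    """Match = identical samples, or similarity above the (assumed) threshold."""
    if list(collected) == list(preset):
        return True
    return similarity(collected, preset) > threshold
```

A practical system would more likely compare audio fingerprints than raw samples; the sketch only makes the "identical or above a threshold" rule concrete.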
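In the above technical solution, preferably, before the collecting of voice information, the method further includes: when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In this technical solution, the preset noise filtering mode is enabled only while the terminal is in a call or performing speech recognition. This reduces the terminal's energy consumption and avoids continuously collecting and matching voice information because the mode is left on for long periods, which would degrade the terminal's performance.
In the above technical solution, preferably, the determining whether the collected voice information contains a part that matches preset voice information specifically includes: obtaining the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In this technical solution, the terminal can compare the collected voice information with voice information stored on the terminal itself, or with voice information on a connected server such as the cloud. The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information while continuing to collect it. Noise is thus cancelled synchronously, without affecting the user's normal call or speech recognition.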
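One way to picture the synchronization step is a sliding search for the position in the preset track where the collected segment fits best. The sketch below assumes a normalized cross-correlation search over raw samples; the patent does not name an alignment method, and `best_offset` is a hypothetical helper introduced only for illustration.

```python
import math


def best_offset(snippet, track):
    """Return (offset, score) of the window in `track` most similar to `snippet`.

    The score is the normalized cross-correlation, so a quieter or louder
    copy of the same audio still scores close to 1.0.
    """
    n = len(snippet)
    if n == 0 or n > len(track):
        raise ValueError("snippet must be non-empty and no longer than track")
    snip_norm = math.sqrt(sum(x * x for x in snippet))
    best = (0, -1.0)
    for off in range(len(track) - n + 1):
        window = track[off:off + n]
        win_norm = math.sqrt(sum(x * x for x in window))
        if snip_norm == 0 or win_norm == 0:
            continue  # silent window or snippet: correlation is undefined
        score = sum(x * y for x, y in zip(snippet, window)) / (snip_norm * win_norm)
        if score > best[1]:
            best = (off, score)
    return best
```

Once the offset is known, playback of the preset track can be advanced in step with the incoming audio so that later frames stay aligned.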
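In the above technical solution, preferably, the method further includes: when it is determined that the collected voice information contains no part that matches the preset voice information, collecting voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information on the terminal or in the cloud, the terminal can collect voice information again and retry the match until the match succeeds or the call or speech recognition ends. The whole matching process is short, generally taking no more than 10 s.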
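The retry behavior described above might look like the loop below. It is a sketch under stated assumptions: the callables `collect`, `find_match`, and `call_active` are hypothetical, and treating the "generally no more than 10 s" remark as a hard timeout is a design choice of this example, not something the text prescribes.

```python
import time


def match_with_retry(collect, find_match, call_active,
                     timeout_s=10.0, clock=time.monotonic):
    """Collect and match repeatedly until success, call end, or timeout.

    Returns the match result, or None if no match was found in time.
    """
    deadline = clock() + timeout_s
    while call_active() and clock() < deadline:
        snippet = collect()           # re-collect voice information
        result = find_match(snippet)  # None means "no matching preset audio"
        if result is not None:
            return result
    return None
```

Injecting `clock` keeps the loop testable without real waiting.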
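In the above technical solution, preferably, the method further includes: when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
In this technical solution, after the call ends or speech recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.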
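The mode lifecycle (enter on a call or speech recognition command, exit on the corresponding end command) can be sketched as a small state holder. The class name `NoiseFilterMode` and the event strings are assumptions made for this example; the patent only describes the behavior.

```python
class NoiseFilterMode:
    """Tracks whether the preset noise filtering mode is active."""

    START_EVENTS = {"call_start", "recognition_start"}
    END_EVENTS = {"call_end", "recognition_end"}

    def __init__(self):
        self.active = False

    def on_event(self, event):
        """Update the mode for a command event and return the new state."""
        if event in self.START_EVENTS:
            self.active = True    # enter the mode together with the function
        elif event in self.END_EVENTS:
            self.active = False   # exit the mode to save energy
        return self.active
```

Events that are neither start nor end commands leave the mode unchanged.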
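Another aspect of the present invention provides a voice processing device for a terminal, including: a voice collecting unit that collects voice information in a preset noise filtering mode; a determining unit that determines whether the collected voice information contains a part that matches preset voice information; and a noise reduction processing unit that, when it is determined that the voice information contains a part that matches the preset voice information, synchronizes the preset voice information with the collected voice information and eliminates the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
In this technical solution, voice information can be collected during a call or during speech recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise. For example, when a user makes a call in a car that is playing a song, the mobile phone can capture a segment of the song and check whether that song is stored on the phone. If it is, the phone synchronously eliminates the spectrum of the song, so that the background noise introduced by the song, and only that noise, is removed. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.
In the above technical solution, preferably, the device further includes: an enabling unit that, before the voice information is collected and when a call command or a speech recognition command is detected, enters the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In this technical solution, the preset noise filtering mode is enabled only while the terminal is in a call or performing speech recognition. This reduces the terminal's energy consumption and avoids continuously collecting and matching voice information because the mode is left on for long periods, which would degrade the terminal's performance.
In the above technical solution, preferably, the determining unit is specifically configured to: obtain the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determine whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In this technical solution, the terminal can compare the collected voice information with voice information stored on the terminal itself, or with voice information on a connected server such as the cloud. The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information while continuing to collect it. Noise is thus cancelled synchronously, without affecting the user's normal call or speech recognition.
In the above technical solution, preferably, the voice collecting unit is further configured to: when it is determined that the collected voice information contains no part that matches the preset voice information, collect voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information on the terminal or in the cloud, the terminal can collect voice information again and retry the match until the match succeeds or the call or speech recognition ends. The whole matching process is short, generally taking no more than 10 s.
In the above technical solution, preferably, the device further includes: a disabling unit that, when a call end command or a speech recognition termination command is detected, exits the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
In this technical solution, after the call ends or speech recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.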
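An embodiment of a third aspect of the present invention provides a terminal, the terminal including a communication bus, a transceiver, a memory, and a processor, where:
the communication bus is used to implement connection and communication among the transceiver, the memory, and the processor;
the memory stores a set of program code, and the processor invokes the program code stored in the memory to perform the following operations:
the transceiver is used to collect voice information in a preset noise filtering mode;
the processor is used to determine whether the collected voice information contains a part that matches preset voice information;
the processor is further used to, when it is determined that the voice information contains a part that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
In the above technical solution, preferably, the processor is further used to perform the following step:
when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In the above technical solution, preferably, the processor determining whether the collected voice information contains a part that matches preset voice information specifically includes:
obtaining the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and
determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In the above technical solution, preferably, the processor is further used to perform the following step:
when it is determined that the collected voice information contains no part that matches the preset voice information, collecting voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In the above technical solution, preferably, the processor is further used to perform the following step:
when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
With the above technical solutions, the background noise caused by the preset voice information, and only that noise, can be eliminated. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience.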
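Brief Description of the Drawings
FIG. 1 shows a flowchart of a voice processing method according to an embodiment of the present invention;
FIG. 2 shows a block diagram of a voice processing device according to an embodiment of the present invention;
FIG. 3 shows a block diagram of a terminal according to an embodiment of the present invention;
FIG. 4A shows a flowchart of a voice processing method according to another embodiment of the present invention;
FIG. 4B shows a schematic diagram of the song matching from the cloud in FIG. 4A;
FIG. 4C shows a schematic diagram of the song matching from the local terminal in FIG. 4A.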
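Detailed Description
To make the above objects, features, and advantages of the present invention easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another.
Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention may also be implemented in ways other than those described here, so the scope of protection of the present invention is not limited by the specific embodiments disclosed below.
FIG. 1 shows a flowchart of a voice processing method according to an embodiment of the present invention.
As shown in FIG. 1, a voice processing method for a terminal according to an embodiment of the present invention includes:
Step 102: collect voice information in a preset noise filtering mode.
Step 104: determine whether the collected voice information contains a part that matches preset voice information.
Step 106: when it is determined that the voice information contains a part that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.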
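To make the "eliminate the spectrum" part of step 106 concrete, the sketch below applies plain magnitude spectral subtraction to one frame that has already been aligned with the preset audio. This is an assumed technique chosen for illustration; the patent does not say how the spectrum is eliminated. The naive O(n²) DFT keeps the example dependency-free, and the names `dft`, `idft`, and `subtract_spectrum` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def subtract_spectrum(collected, reference):
    """Subtract the reference magnitude spectrum from an aligned frame.

    Per frequency bin, the reference magnitude is removed (floored at zero)
    while the phase of the collected signal is kept.
    """
    if len(collected) != len(reference):
        raise ValueError("frames must have equal length")
    cleaned = []
    for x, r in zip(dft(collected), dft(reference)):
        magnitude = max(abs(x) - abs(r), 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(x)))
    return idft(cleaned)
```

In this idealized case the speech and the music occupy different frequency bins, so the speech survives untouched; with overlapping spectra, magnitude subtraction only approximates the separation.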
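In this technical solution, voice information can be collected during a call or during speech recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise. For example, when a user makes a call in a car that is playing a song, the mobile phone can capture a segment of the song and check whether that song is stored on the phone. If it is, the phone synchronously eliminates the spectrum of the song, so that the background noise introduced by the song, and only that noise, is removed. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.
In the above technical solution, preferably, before step 102, the method further includes: when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In this technical solution, the preset noise filtering mode is enabled only while the terminal is in a call or performing speech recognition. This reduces the terminal's energy consumption and avoids continuously collecting and matching voice information because the mode is left on for long periods, which would degrade the terminal's performance.
In the above technical solution, preferably, step 104 specifically includes: obtaining the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In this technical solution, the terminal can compare the collected voice information with voice information stored on the terminal itself, or with voice information on a connected server such as the cloud. The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information while continuing to collect it. Noise is thus cancelled synchronously, without affecting the user's normal call or speech recognition.
In the above technical solution, preferably, the method further includes: when it is determined that the collected voice information contains no part that matches the preset voice information, collecting voice information again, so that whether to perform noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information on the terminal or in the cloud, the terminal can collect voice information again and retry the match until the match succeeds or the call or speech recognition ends. The whole matching process is short, generally taking no more than 10 s.
In the above technical solution, preferably, the method further includes: when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
In this technical solution, after the call ends or speech recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.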
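FIG. 2 shows a block diagram of a voice processing device according to an embodiment of the present invention.
As shown in FIG. 2, a voice processing device 200 for a terminal according to an embodiment of the present invention includes: a voice collecting unit 202 that collects voice information in a preset noise filtering mode; a determining unit 204 that determines whether the collected voice information contains a part that matches preset voice information; and a noise reduction processing unit 206 that, when it is determined that the voice information contains a part that matches the preset voice information, synchronizes the preset voice information with the collected voice information and eliminates the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
In this technical solution, voice information can be collected during a call or during speech recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise. For example, when a user makes a call in a car that is playing a song, the mobile phone can capture a segment of the song and check whether that song is stored on the phone. If it is, the phone synchronously eliminates the spectrum of the song, so that the background noise introduced by the song, and only that noise, is removed. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.
In the above technical solution, preferably, the device further includes: an enabling unit 208 that, before the voice information is collected and when a call command or a speech recognition command is detected, enters the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In this technical solution, the preset noise filtering mode is enabled only while the terminal is in a call or performing speech recognition. This reduces the terminal's energy consumption and avoids continuously collecting and matching voice information because the mode is left on for long periods, which would degrade the terminal's performance.
In the above technical solution, preferably, the determining unit 204 is specifically configured to: obtain the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and determine whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In this technical solution, the terminal can compare the collected voice information with voice information stored on the terminal itself, or with voice information on a connected server such as the cloud. The voice information collected by the terminal is only a voice segment. The terminal can determine whether the preset voice information contains the same segment and, if so, synchronize the collected voice with the preset voice at that segment. It then eliminates the spectrum of the preset voice information from the voice information while continuing to collect it. Noise is thus cancelled synchronously, without affecting the user's normal call or speech recognition.
In the above technical solution, preferably, the voice collecting unit 202 is further configured to: when it is determined that the collected voice information contains no part that matches the preset voice information, collect voice information again, so that whether to perform noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In this technical solution, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information on the terminal or in the cloud, the terminal can collect voice information again and retry the match until the match succeeds or the call or speech recognition ends. The whole matching process is short, generally taking no more than 10 s.
In the above technical solution, preferably, the device further includes: a disabling unit 210 that, when a call end command or a speech recognition termination command is detected, exits the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
In this technical solution, after the call ends or speech recognition stops, the preset noise filtering mode can be turned off to save energy and avoid degrading the terminal's performance.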
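FIG. 3 shows a block diagram of a terminal according to an embodiment of the present invention.
As shown in FIG. 3, a terminal according to an embodiment of the present invention includes: at least one transceiver 303, at least one processor 301 (for example, a CPU), a memory 304, and at least one communication bus 302.
The communication bus 302 is used to connect the transceiver 303, the processor 301, and the memory 304.
The memory 304 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. The memory 304 is further used to store a set of program code, and the transceiver 303 and the processor 301 are used to invoke the program code stored in the memory 304 to perform the following operations:
the transceiver 303 is used to collect voice information in a preset noise filtering mode;
the processor 301 is used to determine whether the collected voice information contains a part that matches preset voice information;
the processor 301 is further used to, when it is determined that the voice information contains a part that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
In the above technical solution, preferably, the processor 301 is further used to perform the following step:
when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
In the above technical solution, preferably, the processor 301 determining whether the collected voice information contains a part that matches preset voice information specifically includes:
obtaining the preset voice information from a predetermined location, where the predetermined location includes a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information includes at least one voice segment; and
determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
In the above technical solution, preferably, the processor 301 is further used to perform the following step:
when it is determined that the collected voice information contains no part that matches the preset voice information, collecting voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
In the above technical solution, preferably, the processor 301 is further used to perform the following step:
when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.
In this way, voice information can be collected in the preset noise filtering mode, it can be determined whether the collected voice information contains a part that matches the preset voice information, and, when it does, the preset voice information can be synchronized with the collected voice information and its spectrum eliminated, so as to perform noise reduction on the collected voice information.
In this technical solution, voice information can be collected during a call or during speech recognition, and it is determined whether the terminal, or a cloud connected to the terminal, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise. For example, when a user makes a call in a car that is playing a song, the mobile phone can capture a segment of the song and check whether that song is stored on the phone. If it is, the phone synchronously eliminates the spectrum of the song, so that the background noise introduced by the song, and only that noise, is removed. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.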
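FIG. 4A shows a flowchart of a voice processing method according to another embodiment of the present invention.
As shown in FIG. 4A, a voice processing method according to another embodiment of the present invention is applied to a scenario in which a call or speech recognition takes place while background music is playing, and includes:
Step 402: determine whether the preset noise filtering mode is enabled. If so, go to step 404; if not, end the process.
Step 404: detect a call or the start of speech recognition.
Step 406: detect whether background music is currently present. If so, go to step 408; if not, end the process.
Step 408: collect the background music and match it on the terminal or in a cloud connected to the terminal. FIG. 4B shows a schematic diagram of the song matching from the cloud in FIG. 4A, and FIG. 4C shows a schematic diagram of the song matching from the local terminal in FIG. 4A. The terminal can determine whether it, or a cloud connected to it, holds preset voice information that matches the collected voice information. If so, the spectrum of the preset voice information is eliminated while the call or speech recognition proceeds, thereby removing the background noise.
Step 410: determine whether the match succeeded. If so, go to step 412; if not, return to step 408 and collect the background music again. Specifically, if the voice information collected by the terminal cannot be matched to identical or similar preset voice information on the terminal or in the cloud, the terminal can collect voice information again and retry the match until the match succeeds or the call or speech recognition ends. The whole matching process is short, generally taking no more than 10 s.
Step 412: synchronize the background music with the matched music, and eliminate the spectrum of the matched music from the collected audio. Synchronously eliminating the spectrum of the background music removes the background noise caused by that music, and only that noise. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience. In addition, "matching" in this technical solution means being identical or having a similarity greater than a predetermined value.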
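The control flow of steps 402 to 412 can be summarized in a short sketch. All callables passed in (`has_background_music`, `collect`, `find_match`, `call_active`, `cancel`) are hypothetical stand-ins for terminal functions, and the returned labels are an invention of this example; only the branching follows the flow described above.

```python
def process_call(mode_enabled, has_background_music, collect,
                 find_match, call_active, cancel):
    """Walk steps 402-412 and return a label describing how the flow ended."""
    if not mode_enabled:                    # step 402: mode not enabled
        return "mode_disabled"
    # Step 404: this function is assumed to be invoked once a call or
    # speech recognition has been detected.
    if not has_background_music():          # step 406: nothing to cancel
        return "no_background_music"
    while call_active():
        snippet = collect()                 # step 408: collect background music
        matched = find_match(snippet)       # step 408: match locally or in cloud
        if matched is not None:             # step 410: match succeeded
            cancel(matched)                 # step 412: sync and eliminate spectrum
            return "cancelled"
    return "call_ended"                     # call ended before any match
```

A failed match simply loops back to collection, which mirrors the return from step 410 to step 408.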
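The technical solution of the present invention has been described in detail above with reference to the accompanying drawings. With this technical solution, the background noise caused by the preset voice information, and only that noise, can be eliminated. This avoids the problem, seen with dual-microphone noise reduction, that normal speech other than the noise is also eliminated, and it improves the accuracy of noise cancellation, the call quality, and the user experience.
The above are merely preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.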

Claims (15)

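  1. A voice processing method for a terminal, characterized by comprising:
    collecting voice information in a preset noise filtering mode;
    determining whether the collected voice information contains a part that matches preset voice information;
    when it is determined that the voice information contains a part that matches the preset voice information, synchronizing the preset voice information with the collected voice information and eliminating the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
  2. The voice processing method according to claim 1, characterized in that, before the collecting of voice information, the method further comprises:
    when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
  3. The voice processing method according to claim 2, characterized in that the determining whether the collected voice information contains a part that matches preset voice information specifically comprises:
    obtaining the preset voice information from a predetermined location, wherein the predetermined location comprises a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information comprises at least one voice segment; and
    determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
  4. The voice processing method according to claim 3, characterized by further comprising:
    when it is determined that the collected voice information contains no part that matches the preset voice information, collecting the voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
  5. The voice processing method according to any one of claims 2 to 4, characterized by further comprising:
    when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.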
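  6. A voice processing device for a terminal, characterized by comprising:
    a voice collecting unit that collects voice information in a preset noise filtering mode;
    a determining unit that determines whether the collected voice information contains a part that matches preset voice information;
    a noise reduction processing unit that, when it is determined that the voice information contains a part that matches the preset voice information, synchronizes the preset voice information with the collected voice information and eliminates the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
  7. The voice processing device according to claim 6, characterized by further comprising:
    an enabling unit that, before the voice information is collected and when a call command or a speech recognition command is detected, enters the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
  8. The voice processing device according to claim 7, characterized in that the determining unit is specifically configured to:
    obtain the preset voice information from a predetermined location, wherein the predetermined location comprises a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information comprises at least one voice segment; and determine whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
  9. The voice processing device according to claim 8, characterized in that the voice collecting unit is further configured to:
    when it is determined that the collected voice information contains no part that matches the preset voice information, collect the voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
  10. The voice processing device according to any one of claims 7 to 9, characterized by further comprising:
    a disabling unit that, when a call end command or a speech recognition termination command is detected, exits the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.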
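  11. A terminal, characterized in that the terminal comprises a communication bus, a transceiver, a memory, and a processor, wherein:
    the communication bus is used to implement connection and communication among the transceiver, the memory, and the processor;
    the memory stores a set of program code, and the processor invokes the program code stored in the memory to perform the following operations:
    the transceiver is used to collect voice information in a preset noise filtering mode;
    the processor is used to determine whether the collected voice information contains a part that matches preset voice information;
    the processor is further used to, when it is determined that the voice information contains a part that matches the preset voice information, synchronize the preset voice information with the collected voice information and eliminate the spectrum of the preset voice information, so as to perform noise reduction on the collected voice information.
  12. The terminal according to claim 11, characterized in that the processor is further used to perform the following step:
    when a call command or a speech recognition command is detected, entering the preset noise filtering mode at the same time as enabling the call function or the speech recognition function.
  13. The terminal according to claim 12, characterized in that the processor determining whether the collected voice information contains a part that matches preset voice information specifically comprises:
    obtaining the preset voice information from a predetermined location, wherein the predetermined location comprises a cloud connected to the terminal and/or a storage device of the terminal, and the preset voice information comprises at least one voice segment; and
    determining whether any of the at least one voice segment in the obtained preset voice information matches the collected voice information.
  14. The terminal according to claim 13, characterized in that the processor is further used to perform the following step:
    when it is determined that the collected voice information contains no part that matches the preset voice information, collecting the voice information again, so that whether to perform the noise reduction on the re-collected voice information is determined according to the re-collected voice information.
  15. The terminal according to any one of claims 12 to 14, characterized in that the processor is further used to perform the following step:
    when a call end command or a speech recognition termination command is detected, exiting the preset noise filtering mode at the same time as disabling the call function or the speech recognition function.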
PCT/CN2015/078091 2015-02-09 2015-04-30 Voice processing method, voice processing device, and terminal WO2016127506A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510066942.4A CN104599675A (zh) 2015-02-09 2015-02-09 Voice processing method, voice processing device, and terminal
CN201510066942.4 2015-02-09

Publications (1)

Publication Number Publication Date
WO2016127506A1 true WO2016127506A1 (zh) 2016-08-18

Family

ID=53125408

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/078091 WO2016127506A1 (zh) 2015-02-09 2015-04-30 Voice processing method, voice processing device, and terminal

Country Status (2)

Country Link
CN (1) CN104599675A (zh)
WO (1) WO2016127506A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2020017517A1 (ja) * 2018-07-20 2021-08-02 株式会社ソニー・インタラクティブエンタテインメント 音声信号処理システム、及び音声信号処理装置

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105338170A (zh) * 2015-09-23 2016-02-17 广东小天才科技有限公司 一种滤除背景噪声的方法及装置
CN107028524A (zh) * 2015-12-08 2017-08-11 太琦科技股份有限公司 语音控制型洗浴系统及其操作方法
CN107240403B (zh) * 2016-03-28 2021-08-27 阿里巴巴集团控股有限公司 声波传输方法及装置
CN106328137A (zh) * 2016-08-19 2017-01-11 镇江惠通电子有限公司 语音控制方法、装置及系统
CN106453761B (zh) * 2016-10-31 2019-10-15 北京小米移动软件有限公司 语音信号的处理方法及装置
CN107819964B (zh) * 2017-11-10 2021-04-06 Oppo广东移动通信有限公司 提高通话质量的方法、装置、终端和计算机可读存储介质
CN108173740A (zh) * 2017-11-30 2018-06-15 维沃移动通信有限公司 一种语音通信的方法和装置
CN108881652B (zh) * 2018-07-11 2021-02-26 北京大米科技有限公司 回音检测方法、存储介质和电子设备
CN109215688B (zh) * 2018-10-10 2020-12-22 麦片科技(深圳)有限公司 同场景音频处理方法、装置、计算机可读存储介质及系统
CN109389979B (zh) * 2018-12-05 2022-05-20 广东美的制冷设备有限公司 语音交互方法、语音交互系统以及家用电器

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10124084A (ja) * 1996-10-18 1998-05-15 Oki Electric Ind Co Ltd 音声処理装置
EP0996110B1 (en) * 1998-10-20 2005-08-24 Canon Kabushiki Kaisha Method and apparatus for speech activity detection
JP2000194392A (ja) * 1998-12-25 2000-07-14 Sharp Corp 騒音適応型音声認識装置及び騒音適応型音声認識プログラムを記録した記録媒体
CN101859567A (zh) * 2009-04-10 2010-10-13 比亚迪股份有限公司 一种语音背景噪声的消除方法和装置
CN102354499A (zh) * 2011-07-25 2012-02-15 中兴通讯股份有限公司 降低噪音的方法和设备
US20130332165A1 (en) * 2012-06-06 2013-12-12 Qualcomm Incorporated Method and systems having improved speech recognition
CN103514884A (zh) * 2012-06-26 2014-01-15 华为终端有限公司 通话音降噪方法及终端
WO2014000658A1 (zh) * 2012-06-28 2014-01-03 腾讯科技(深圳)有限公司 消除噪音的方法和装置、以及移动终端
CN102969003A (zh) * 2012-11-15 2013-03-13 东莞宇龙通信科技有限公司 摄像声音提取方法及装置
CN103888580A (zh) * 2014-03-31 2014-06-25 宇龙计算机通信科技(深圳)有限公司 一种终端录音过程中降噪处理方法及终端
CN104517607A (zh) * 2014-12-16 2015-04-15 佛山市顺德区美的电热电器制造有限公司 滤除语音控制电器中的噪声的方法及语音控制电器

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2020017517A1 (ja) * 2018-07-20 2021-08-02 株式会社ソニー・インタラクティブエンタテインメント 音声信号処理システム、及び音声信号処理装置
JP7158480B2 (ja) 2018-07-20 2022-10-21 株式会社ソニー・インタラクティブエンタテインメント 音声信号処理システム、及び音声信号処理装置
US11694705B2 (en) 2018-07-20 2023-07-04 Sony Interactive Entertainment Inc. Sound signal processing system apparatus for avoiding adverse effects on speech recognition

Also Published As

Publication number Publication date
CN104599675A (zh) 2015-05-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15881667

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.01.2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15881667

Country of ref document: EP

Kind code of ref document: A1