WO2016201767A1 - Voice control method, apparatus, and computer storage medium - Google Patents

Voice control method, apparatus, and computer storage medium

Info

Publication number
WO2016201767A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
voice signal
terminal
user
preset
Prior art date
Application number
PCT/CN2015/085287
Other languages
English (en)
French (fr)
Inventor
陈建江
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016201767A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/26: Devices for calling a subscriber
    • H04M1/27: Devices whereby a plurality of signals may be stored simultaneously

Definitions

  • The present invention relates to the field of intelligent terminals, and in particular to a voice control method, apparatus, and computer storage medium.
  • At present, more and more intelligent terminals use voice control to open their applications.
  • In existing voice control methods, a user who wants to open an application must first start the voice recognition function, either by speaking a preset voice wake-up word or by touching an on-screen button, and must then speak the preset voice phrase that opens the corresponding application.
  • For example, to open the map through voice control, the user first starts the voice recognition function by saying "Hello, Xiaoxing" (the preset voice wake-up word); after the terminal's answering tone confirms that voice recognition is on, the user says "Open Map" (the preset voice phrase for opening the map application), and the terminal opens the map.
  • To solve the above technical problem, the embodiments of the present invention aim to provide a voice control method and apparatus that simplify the process of starting a terminal application through voice control and improve the user experience.
  • In a first aspect, an embodiment of the present invention provides a voice control method, including: a terminal acquires a voice signal of a user; the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which that voice intensity belongs; the terminal performs a corresponding operation according to the voice intensity interval to which the voice signal belongs.
  • The voice intensity interval includes at least one of the following: a strong voice intensity interval, a general voice intensity interval, and a weak voice intensity interval, where the voice intensity of the strong interval is greater than that of the general interval, and the voice intensity of the general interval is greater than that of the weak interval.
  • When the voice signal belongs to the strong voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal determines whether the voice signal is a preset voice command, and if it is, the terminal performs the operation corresponding to the voice signal.
  • Before the terminal determines whether the voice signal is a preset voice command, the method further includes: the terminal performs noise detection and, when it determines that the voice signal is not noise, determines whether the voice signal is a preset voice command.
  • Before the terminal determines whether the voice signal is a preset voice command, the method further includes: the terminal determines whether the processing time of the voice signal is less than a preset time threshold and, when the processing time is less than the time threshold, determines whether the voice signal is a preset voice command.
  • When the voice signal belongs to the general voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal determines whether it is in a user close-range operation state; if the terminal is in the user close-range operation state and determines that the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
  • Before the terminal determines whether it is in the user close-range operation state, the method further includes: the terminal performs noise detection and, when it determines that the voice signal is not noise, determines whether it is in the user close-range operation state.
  • Before the terminal determines whether it is in the user close-range operation state, the method further includes: the terminal determines whether the processing time of the voice signal is less than a preset time threshold and, when the processing time is less than the time threshold, determines whether it is in the user close-range operation state.
  • The terminal determining whether it is in the user close-range operation state includes: the terminal captures current picture information, and if a face feature is recognized from the current picture information, the terminal determines that it is in the user close-range operation state.
  • Alternatively, the terminal determining whether it is in the user close-range operation state includes: the terminal uses a gravity sensor to determine whether the angle between the vertical line of the terminal display and the gravity line is greater than a preset angle, and if it is, the terminal determines that it is in the user close-range operation state.
  • When the voice signal belongs to the weak voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal does not process the voice signal.
  • In a second aspect, an embodiment of the present invention provides a voice control apparatus, including an acquiring unit, a detecting unit, and an executing unit, where: the acquiring unit is configured to acquire a voice signal of a user; the detecting unit is configured to detect the voice intensity of the voice signal acquired by the acquiring unit and to determine the voice intensity interval to which that voice intensity belongs; the executing unit is configured to perform a corresponding operation according to the voice intensity interval determined by the detecting unit.
  • The voice intensity interval includes at least one of the following: a strong voice intensity interval, a general voice intensity interval, and a weak voice intensity interval, where the voice intensity of the strong interval is greater than that of the general interval, and the voice intensity of the general interval is greater than that of the weak interval.
  • When the detecting unit determines that the voice signal belongs to the strong voice intensity interval, the executing unit is further configured to: determine whether the voice signal is a preset voice command, and if it is, perform the operation corresponding to the voice signal.
  • The executing unit is further configured to: when noise detection determines that the voice signal is not noise, determine whether the voice signal is a preset voice command, and if it is, perform the operation corresponding to the voice signal.
  • The executing unit is further configured to: determine whether the processing time of the voice signal is less than a preset time threshold; when the processing time is less than the time threshold, determine whether the voice signal is a preset voice command; and if it is, perform the operation corresponding to the voice signal.
  • When the detecting unit determines that the voice signal belongs to the general voice intensity interval, the executing unit is further configured to: determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit is further configured to: when noise detection determines that the voice signal is not noise, determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit is further configured to: determine whether the processing time of the voice signal is less than a preset time threshold; when the processing time is less than the time threshold, determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit is further configured to: capture current picture information, and if a face feature is recognized from the current picture information, determine that the terminal is in the user close-range operation state.
  • The executing unit is further configured to: use the gravity sensor to determine whether the angle between the vertical line of the terminal display and the gravity line is greater than a preset angle, and if it is, determine that the terminal is in the user close-range operation state.
  • When the detecting unit determines that the voice signal belongs to the weak voice intensity interval, the executing unit is further configured to: not process the voice signal.
  • In a third aspect, an embodiment of the present invention provides a computer storage medium.
  • The computer storage medium provided by the embodiment of the present invention stores a computer program, and the computer program is used to execute the voice control method.
  • Embodiments of the present invention provide a voice control method, apparatus, and computer storage medium in which a terminal acquires a user's voice signal, detects the voice intensity of the voice signal, determines the voice intensity interval to which that intensity belongs, and performs a corresponding operation according to that interval. This removes the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, simplifies the process of starting a terminal application through voice control, and improves the user experience.
  • FIG. 1 is a schematic flowchart of a voice control method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a voice strength interval according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a detailed embodiment of a voice control method according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention.
  • FIG. 1 shows a voice control method according to an embodiment of the present invention.
  • Referring to FIG. 1, the method includes:
  • S101: The terminal acquires a voice signal of the user.
  • The voice control method provided by the embodiment of the present invention applies to scenarios in which the user operates the terminal at close range, such as holding the terminal, or placing it on a desktop and operating it from nearby.
  • S102: The terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which that voice intensity belongs.
  • In such scenarios, the voice intensity of the voice signal detected by the terminal varies within a fixed range. For example, if the user speaks 5-20 cm from the terminal microphone, the detected voice intensity is about 80-100 dB. The terminal therefore detects the voice intensity of the voice signal and determines which preset voice intensity interval it falls in, so that it can handle voice signals differently for different intervals; this provides the basis for letting the user open a terminal application quickly.
  • The voice intensity interval includes at least one of the following: a strong voice intensity interval, a general voice intensity interval, and a weak voice intensity interval, where the voice intensity of the strong interval is greater than that of the general interval, and the voice intensity of the general interval is greater than that of the weak interval.
  • FIG. 2 shows a schematic diagram of the voice intensity intervals.
  • The strong voice intensity interval is the interval in which the voice intensity is greater than V1;
  • the general voice intensity interval is the interval in which the voice intensity is less than or equal to V1 and greater than V2;
  • the weak voice intensity interval is the interval in which the voice intensity is less than or equal to V2.
  • V1 and V2 are preset voice intensity thresholds used to determine whether to trigger terminal voice control, and V1 is greater than V2.
  • V1 and V2 are set by the terminal according to the range over which the detected voice intensity varies when the user operates at close range. For example, if that range is 80-100 dB, V1 can be set between 85 and 95 dB and V2 between 70 and 80 dB.
  • The specific values of V1 and V2 can be set according to the actual situation; the embodiment of the present invention does not specifically limit them.
  • S103: The terminal performs a corresponding operation according to the voice intensity interval to which the voice signal belongs.
  • The terminal presets the operations corresponding to the different voice intensity intervals. For the strong voice intensity interval, the voice intensity is large enough to indicate that the user is speaking close to the terminal microphone, so voice control can be triggered directly and the operation corresponding to the voice signal performed. This omits the step of starting voice recognition through a voice wake-up word or a voice recognition button, so the user can open applications by voice more quickly.
  • For the general voice intensity interval, an auxiliary judgment is needed: voice control is triggered and the operation corresponding to the voice signal is performed only when the terminal is additionally determined to be in the user close-range operation state, which improves the accuracy of voice control. For the weak voice intensity interval, where the voice intensity is very low, the voice signal can be ignored.
  • When the voice signal belongs to the strong voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal determines whether the voice signal is a preset voice command, and if it is, the terminal performs the operation corresponding to the voice signal.
  • The preset voice command is used by the terminal to open the corresponding application.
  • For example, the user presets the voice command "Open Mobile Map" for the terminal to open the map application.
  • The preset voice commands may also include the contact names stored in the terminal address book. When the user speaks a contact name from the address book, such as "Zhang San", into the terminal microphone, the terminal retrieves information such as Zhang San's phone number so that the user can place a call or perform a similar operation.
  • Before the terminal determines whether the voice signal is a preset voice command, the method further includes: the terminal performs noise detection and, when it determines that the voice signal is not noise, determines whether the voice signal is a preset voice command.
  • Performing noise detection on the voice signal to determine whether it is a typical known noise or a human voice is a conventional technique in the communication field, so its implementation is not described here.
  • Noise detection rules out cases in which the terminal is triggered by environmental noise, which avoids misoperation and improves the accuracy of voice control.
  • Before the terminal determines whether the voice signal is a preset voice command, the method further includes: the terminal determines whether the processing time of the voice signal is less than a preset time threshold and, when the processing time is less than the preset time threshold, determines whether the voice signal is a preset voice command.
  • In this way, the terminal performs the operation corresponding to the voice signal when the processing time is less than the preset time threshold, which meets the user's requirements for voice control of the terminal and improves the user experience.
  • When the voice signal belongs to the general voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal determines whether it is in the user close-range operation state; if the terminal is in that state and determines that the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
  • The user close-range operation state is a state in which the user operates the terminal from a short distance; it is used as an auxiliary condition for deciding whether the terminal triggers voice control.
  • Before the terminal determines whether it is in the user close-range operation state, the method further includes: the terminal performs noise detection and, when it determines that the voice signal is not noise, determines whether it is in the user close-range operation state.
  • Before the terminal determines whether it is in the user close-range operation state, the method further includes: the terminal determines whether the processing time of the voice signal is less than a preset time threshold and, when the processing time is less than the preset time threshold, determines whether it is in the user close-range operation state.
  • The terminal determining whether it is in the user close-range operation state includes: the terminal captures current picture information, and if a face feature is recognized from the current picture information, the terminal determines that it is in the user close-range operation state.
  • Capturing the current picture information and recognizing a face feature from it is existing face recognition technology, so its implementation is not described here.
  • Alternatively, the terminal determining whether it is in the user close-range operation state includes: the terminal uses the gravity sensor to determine whether the angle between the vertical line of the terminal display and the gravity line is greater than a preset angle, and if it is, the terminal determines that it is in the user close-range operation state.
  • The vertical line of the terminal display is perpendicular to the display and points outward from the terminal; the gravity line points vertically downward; the preset angle is used to determine whether the terminal display is facing vertically upward, either horizontally or at an incline.
  • The preset angle can be set according to the actual situation.
  • For example, the preset angle is 135 degrees; the embodiment of the present invention does not specifically limit it.
  • When the terminal determines through the gravity sensor that the angle between the vertical line of the terminal display and the gravity line is greater than the preset angle, the terminal display is facing vertically upward, either horizontally or at an incline. Because the display faces upward in this way in most cases where a user is operating the terminal, the terminal can then be determined to be in the user close-range operation state.
  • When the voice signal belongs to the weak voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval includes: the terminal does not process the voice signal.
  • FIG. 3 is a flowchart of a detailed embodiment of a voice control method according to an embodiment of the present invention. Referring to FIG. 3, the method includes:
  • S301: The terminal acquires a voice signal of the user.
  • S302: The terminal detects the voice intensity V of the voice signal.
  • S303: The terminal determines, according to the voice intensity V, the voice intensity interval to which the voice signal belongs. If the voice signal belongs to the weak voice intensity interval, the method returns to step S301; if it belongs to the general voice intensity interval, step S304 is performed; if it belongs to the strong voice intensity interval, step S305 is performed.
  • For example, the terminal presets V1 to 90 dB and V2 to 70 dB. If V ≤ V2, the voice signal belongs to the weak voice intensity interval; if V2 < V ≤ V1, it belongs to the general voice intensity interval; if V > V1, it belongs to the strong voice intensity interval.
  • S304: The terminal captures the current picture information and determines whether a face feature can be recognized from it. If so, the method proceeds to step S305; if not, it returns to step S301.
  • S305: The terminal determines through noise detection whether the voice signal is noise. If so, the method returns to step S301; if not, it proceeds to step S306.
  • S306: The terminal determines whether the processing time of the voice signal is less than a preset time threshold. If not, the method returns to step S301; if so, it proceeds to step S307.
  • S307: The terminal determines whether the voice signal is a preset voice command. If not, the method returns to step S301; if so, it proceeds to step S308.
  • S308: The terminal parses the voice signal and performs the corresponding operation according to the parsed result.
  • The embodiment of the present invention provides a voice control method in which a terminal acquires a user's voice signal, detects the voice intensity of the voice signal, determines the voice intensity interval to which that intensity belongs, and performs a corresponding operation according to that interval. This removes the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, so the user can open applications by voice more quickly, which improves the user experience.
  • FIG. 4 is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention.
  • Referring to FIG. 4, the voice control apparatus 40 includes an acquiring unit 401, a detecting unit 402, and an executing unit 403, where:
  • the acquiring unit 401 is configured to acquire a voice signal of the user;
  • the detecting unit 402 is configured to detect the voice intensity of the voice signal acquired by the acquiring unit 401 and to determine the voice intensity interval to which that voice intensity belongs;
  • the executing unit 403 is configured to perform a corresponding operation according to the voice intensity interval determined by the detecting unit 402.
  • The voice intensity interval includes at least one of the following: a strong voice intensity interval, a general voice intensity interval, and a weak voice intensity interval, where the voice intensity of the strong interval is greater than that of the general interval, and the voice intensity of the general interval is greater than that of the weak interval.
  • When the detecting unit 402 determines that the voice signal belongs to the strong voice intensity interval, the executing unit 403 is further configured to: determine whether the voice signal is a preset voice command, and if it is, perform the operation corresponding to the voice signal.
  • The executing unit 403 is further configured to: when noise detection determines that the voice signal is not noise, determine whether the voice signal is a preset voice command, and if it is, perform the operation corresponding to the voice signal.
  • The executing unit 403 is further configured to: determine whether the processing time of the voice signal is less than a preset time threshold; when the processing time is less than the time threshold, determine whether the voice signal is a preset voice command; and if it is, perform the operation corresponding to the voice signal.
  • When the detecting unit 402 determines that the voice signal belongs to the general voice intensity interval, the executing unit 403 is further configured to: determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit 403 is further configured to: when noise detection determines that the voice signal is not noise, determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit 403 is further configured to: determine whether the processing time of the voice signal is less than a preset time threshold; when the processing time is less than the time threshold, determine whether the terminal is in the user close-range operation state; if the terminal is in that state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  • The executing unit 403 is further configured to: capture current picture information, and if a face feature is recognized from the current picture information, determine that the terminal is in the user close-range operation state.
  • The executing unit 403 is further configured to: use the gravity sensor to determine whether the angle between the vertical line of the terminal display and the gravity line is greater than a preset angle, and if it is, determine that the terminal is in the user close-range operation state.
  • When the detecting unit 402 determines that the voice signal belongs to the weak voice intensity interval, the executing unit 403 is further configured to: not process the voice signal.
  • In practice, each unit of the voice control apparatus may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) in the voice control apparatus.
  • The apparatus, if implemented in the form of a software function module and sold or used as a standalone product, may also be stored in a computer-readable storage medium.
  • On this basis, the technical solution of the embodiments of the present invention may, in essence, be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disc, and other media that can store program code.
  • The embodiment of the present invention further provides a computer storage medium in which a computer program is stored, the computer program being used to execute the voice control method of the embodiment of the present invention.
  • Embodiments of the present invention can be provided as a method, a system, or a computer program product. Accordingly, the present invention can take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • In the embodiments of the present invention, the terminal acquires the voice signal of the user, detects the voice intensity of the voice signal, determines the voice intensity interval to which that intensity belongs, and performs the corresponding operation according to that interval.
  • This removes the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, simplifies the process of starting a terminal application through voice control, and improves the user experience.

Abstract

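A voice control method and apparatus, and a computer storage medium. The method comprises: a terminal acquires a voice signal of a user (S101); the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which the voice intensity of the voice signal belongs (S102); and the terminal performs a corresponding operation according to the voice intensity interval to which the voice signal belongs (S103).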

Description

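Voice Control Method and Apparatus, and Computer Storage Medium
Technical Field
The present invention relates to the field of intelligent terminals, and in particular to a voice control method, a voice control apparatus, and a computer storage medium.
Background
At present, more and more intelligent terminals use voice control to open their applications. In existing voice control methods, a user who wants to open an application on the terminal must first start the voice recognition function, either by speaking a preset voice wake-up word or by touching an on-screen button, and must then speak the preset voice input that opens the application. For example, to open the map by voice control, the user first says "Hello, Xiaoxing" (the preset voice wake-up word) to start the voice recognition function; after hearing the terminal's response tone, which confirms that voice recognition is on, the user says "Open Map" (the preset voice wake-up word for opening the map application), and the terminal opens the map. As can be seen, the start-up procedure of existing voice control methods is cumbersome and the user experience is poor.
Summary
To solve the above technical problem, embodiments of the present invention aim to provide a voice control method and apparatus that simplify the process of starting a terminal application through voice control and improve the user experience.
The technical solutions of the embodiments of the present invention are implemented as follows: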
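In a first aspect, an embodiment of the present invention provides a voice control method, comprising: a terminal acquires a voice signal of a user; the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which the voice intensity of the voice signal belongs; and the terminal performs a corresponding operation according to the voice intensity interval to which the voice signal belongs.
In the above embodiment, the voice intensity interval comprises at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
In the above embodiment, when the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises: the terminal determines whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
In the above embodiment, before the terminal determines whether the voice signal is a preset voice command, the method further comprises: when the terminal determines through noise detection that the voice signal is not noise, the terminal determines whether the voice signal is a preset voice command.
In the above embodiment, before the terminal determines whether the voice signal is a preset voice command, the method further comprises: the terminal determines whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, the terminal determines whether the voice signal is a preset voice command.
In the above embodiment, when the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises: the terminal determines whether it is in a user close-range operation state, and if the terminal is in the user close-range operation state and determines that the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
In the above embodiment, before the terminal determines whether it is in the user close-range operation state, the method further comprises: when the terminal determines through noise detection that the voice signal is not noise, the terminal determines whether it is in the user close-range operation state.
In the above embodiment, before the terminal determines whether it is in the user close-range operation state, the method further comprises: the terminal determines whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, the terminal determines whether it is in the user close-range operation state.
In the above embodiment, the terminal determining whether it is in the user close-range operation state comprises: the terminal captures current image information, and if facial features are recognized from the current image information, the terminal determines that it is in the user close-range operation state.
In the above embodiment, the terminal determining whether it is in the user close-range operation state comprises: the terminal uses a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, the terminal determines that it is in the user close-range operation state.
In the above embodiment, when the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises: the terminal does not process the voice signal.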
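In a second aspect, an embodiment of the present invention provides a voice control apparatus, comprising an acquisition unit, a detection unit, and an execution unit, wherein: the acquisition unit is configured to acquire a voice signal of a user; the detection unit is configured to detect the voice intensity of the voice signal acquired by the acquisition unit and to determine the voice intensity interval to which the voice intensity of the voice signal belongs; and the execution unit is configured to perform a corresponding operation according to the voice intensity interval, determined by the detection unit, to which the voice signal belongs.
In the above embodiment, the voice intensity interval comprises at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
In the above embodiment, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the execution unit is further configured to: determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, the execution unit is further configured to: when it determines through noise detection that the voice signal is not noise, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, the execution unit is further configured to: determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the execution unit is further configured to: determine whether the terminal is in a user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, the execution unit is further configured to: when it determines through noise detection that the voice signal is not noise, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, the execution unit is further configured to: determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
In the above embodiment, the execution unit is further configured to: capture current image information, and if facial features are recognized from the current image information, determine that the terminal is in the user close-range operation state.
In the above embodiment, the execution unit is further configured to: use a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, determine that the terminal is in the user close-range operation state.
In the above embodiment, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the execution unit is further configured to: not process the voice signal.
In a third aspect, an embodiment of the present invention provides a computer storage medium storing a computer program, the computer program being used to execute the above voice control method.
Embodiments of the present invention provide a voice control method and apparatus, and a computer storage medium. The terminal acquires the user's voice signal; the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which it belongs; and the terminal performs the corresponding operation according to that interval. This omits the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, simplifies the process of starting a terminal application through voice control, and improves the user experience.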
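Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a voice control method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of voice intensity intervals provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a detailed embodiment of a voice control method provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a voice control apparatus provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments.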
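FIG. 1 shows a voice control method provided by an embodiment of the present invention. Referring to FIG. 1, the method includes:
S101: The terminal acquires a voice signal of a user.
It should be noted that the voice control method provided by the embodiments of the present invention applies to scenarios in which the user operates the terminal at close range, for example when the user holds the terminal in hand, or when the user places the terminal on a desk and operates it from nearby.
S102: The terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which the voice intensity of the voice signal belongs.
It should be noted that when the user operates the terminal at close range, the voice intensity of the voice signal detected by the terminal varies within a fixed range. For example, if the user speaks 5-20 cm from the terminal's microphone, the voice intensity detected by the terminal is roughly 80-100 dB. By detecting the voice intensity of the voice signal and determining which preset voice intensity interval it falls in, the terminal can analyze the voice signal accurately for each interval, which provides the basis for letting the user open terminal applications quickly.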
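As a rough illustration of how a terminal might obtain a voice intensity value in S102, the sketch below estimates a level in decibels from one frame of audio samples. The function name `estimate_intensity_db`, the normalized sample format, and the `calibration_offset_db` parameter are assumptions made for illustration only; the specification does not state how the intensity is measured, and relating a digital level to the 80-100 dB figures above would require device-specific microphone calibration.

```python
import math
from typing import Sequence


def estimate_intensity_db(samples: Sequence[float],
                          calibration_offset_db: float = 0.0) -> float:
    """Estimate the level of one audio frame in decibels.

    `samples` are assumed to be normalized to the range [-1.0, 1.0].
    `calibration_offset_db` is a hypothetical device-specific constant that
    shifts the digital level (dB relative to full scale) toward a sound
    pressure level; it is not taken from the patent.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    # Root-mean-square amplitude of the frame.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(rms) + calibration_offset_db
```

A real terminal would likely smooth this value over several frames before comparing it with any threshold; that choice is also outside what the specification describes.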
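Exemplarily, the voice intensity interval includes at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
For example, FIG. 2 shows a schematic diagram of voice intensity intervals. As shown in FIG. 2, the strong voice intensity interval is the interval in which the voice intensity is greater than V1; the normal voice intensity interval is the interval in which the voice intensity is less than or equal to V1 and greater than V2; and the weak voice intensity interval is the interval in which the voice intensity is less than or equal to V2. V1 and V2 are preset voice intensity thresholds used to decide whether to trigger the terminal's voice control, with V1 greater than V2. The terminal sets V1 and V2 according to the range over which the detected voice intensity varies when the user operates the terminal at close range. For example, if the detected voice intensity varies within 80-100 dB, V1 can be set between 85 and 95 dB and V2 between 70 and 80 dB. The specific values of V1 and V2 can be set according to the actual situation, and the embodiments of the present invention place no specific limitation on them.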
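The three-way split described for FIG. 2 can be sketched as a small classification function. The names `IntensityInterval` and `classify_intensity` are hypothetical, and the default thresholds (V1 = 90 dB, V2 = 70 dB) are simply the example values used in the detailed embodiment; a real device would choose its own.

```python
from enum import Enum


class IntensityInterval(Enum):
    WEAK = "weak"      # V <= V2
    NORMAL = "normal"  # V2 < V <= V1
    STRONG = "strong"  # V > V1


def classify_intensity(v_db: float,
                       v1_db: float = 90.0,
                       v2_db: float = 70.0) -> IntensityInterval:
    """Map a measured voice intensity to one of the three intervals."""
    if v1_db <= v2_db:
        raise ValueError("V1 must be greater than V2")
    if v_db > v1_db:
        return IntensityInterval.STRONG
    if v_db > v2_db:
        return IntensityInterval.NORMAL
    return IntensityInterval.WEAK
```

Note that the boundaries follow the text exactly: a value equal to V1 falls in the normal interval, and a value equal to V2 falls in the weak interval.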
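S103: The terminal performs a corresponding operation according to the voice intensity interval to which the voice signal belongs.
It should be noted that the terminal presets the operation corresponding to each voice intensity interval. For the interval with strong voice intensity, the intensity is high enough to indicate that the user is speaking close to the terminal's microphone, so voice control can be triggered directly and the operation corresponding to the voice signal performed. This omits the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, and makes opening an application by voice faster. For the interval in which the voice intensity is not very high, an auxiliary check is needed: voice control is triggered, and the operation corresponding to the voice signal performed, only after the terminal has further confirmed that it is in the user close-range operation state, which improves the accuracy of voice control. For the interval in which the voice intensity is very weak, the voice signal can be ignored.
Exemplarily, when the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs includes: the terminal determines whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
The preset voice command is used by the terminal to open the corresponding application. For example, the user presets the voice command "Open phone map", which the terminal uses to open the map application. As another example, the user's preset voice commands include the contact names stored in the terminal's address book; when the user says a contact name from the address book, such as "Zhang San", into the terminal's microphone, the terminal retrieves information about "Zhang San", such as the phone number, so that the user can place a call or perform other operations.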
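One simple way to picture the preset-command check is as a lookup of the recognized text in two tables, one for application commands and one for address-book contacts. This is only a sketch under assumptions: the function `match_preset_command`, the action labels, and the idea that speech has already been converted to text are all hypothetical, and the specification does not say how matching is implemented.

```python
from typing import Dict, Optional, Tuple


def match_preset_command(recognized_text: str,
                         app_commands: Dict[str, str],
                         contacts: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (action, target) if the text is a preset command, else None.

    `app_commands` maps a spoken phrase to an application identifier.
    `contacts` maps a contact name to stored contact information.
    Both tables are illustrative stand-ins for the terminal's presets.
    """
    text = recognized_text.strip()
    if text in app_commands:
        return ("open_app", app_commands[text])
    if text in contacts:
        return ("show_contact", contacts[text])
    return None  # not a preset command; the terminal keeps listening
```

With `app_commands = {"Open phone map": "map"}`, the phrase "Open phone map" yields `("open_app", "map")`, while any phrase in neither table yields `None`.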
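Preferably, before the terminal determines whether the voice signal is a preset voice command, the method further includes: when the terminal determines through noise detection that the voice signal is not noise, the terminal determines whether the voice signal is a preset voice command.
Performing noise detection on a voice signal to determine whether it is typical known noise or human speech is a conventional technique in the communications field, so its implementation is not described here.
It should be noted that performing noise detection on the voice signal rules out the case in which the terminal is triggered by noise in its environment, which avoids erroneous operations and improves the accuracy of voice control.
Preferably, before the terminal determines whether the voice signal is a preset voice command, the method further includes: the terminal determines whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the preset time threshold, the terminal determines whether the voice signal is a preset voice command.
It should be noted that the terminal performs the operation corresponding to the voice signal only when the processing time is less than the preset time threshold, which satisfies the timeliness requirement of voice control and improves the user experience.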
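The processing-time check can be sketched with a monotonic clock. The helper name `within_time_threshold`, the use of seconds, and the choice to measure from the moment the signal was acquired are assumptions for illustration; the specification only requires that the processing time be less than a preset threshold.

```python
import time
from typing import Optional


def within_time_threshold(started_at: float,
                          threshold_s: float,
                          now: Optional[float] = None) -> bool:
    """Return True if the elapsed processing time is below the threshold.

    `started_at` is a time.monotonic() reading taken when the voice signal
    was acquired. `now` can be passed explicitly, which makes the check easy
    to test; by default the current monotonic time is used.
    """
    if now is None:
        now = time.monotonic()
    return (now - started_at) < threshold_s
```

A monotonic clock is used here rather than wall-clock time so that adjustments to the system clock cannot distort the measured duration.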
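Exemplarily, when the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs includes: the terminal determines whether it is in a user close-range operation state, and if the terminal is in the user close-range operation state and determines that the voice signal is a preset voice command, the terminal performs the operation corresponding to the voice signal.
The user close-range operation state is the state in which the user operates the terminal from nearby; it serves as an auxiliary check on whether the terminal should trigger voice control.
Preferably, before the terminal determines whether it is in the user close-range operation state, the method further includes: when the terminal determines through noise detection that the voice signal is not noise, the terminal determines whether it is in the user close-range operation state.
Preferably, before the terminal determines whether it is in the user close-range operation state, the method further includes: the terminal determines whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the preset time threshold, the terminal determines whether it is in the user close-range operation state.
Preferably, the terminal determining whether it is in the user close-range operation state includes: the terminal captures current image information, and if facial features are recognized from the current image information, the terminal determines that it is in the user close-range operation state.
It should be noted that capturing current image information and recognizing facial features from it is an existing face recognition technique, so its implementation is not described here.
Preferably, the terminal determining whether it is in the user close-range operation state includes: the terminal uses a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, the terminal determines that it is in the user close-range operation state.
Here, the line perpendicular to the terminal display is normal to the display and points outward from the terminal; the gravity line points vertically downward; and the preset angle is used to determine whether the display faces upward, either level or tilted. The preset angle can be set according to the actual situation, for example 135 degrees, and the embodiments of the present invention place no specific limitation on it.
It should be noted that when the gravity sensor shows that the angle between the line perpendicular to the terminal display and the gravity line is greater than the preset angle, the terminal can conclude that the display faces upward, either level or tilted. Because the display faces upward in this way in most cases when a user is operating the terminal, the terminal can then conclude that it is in the user close-range operation state.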
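The angle test above amounts to computing the angle between two vectors. The sketch below assumes the gravity sensor yields a vector, in the device's own coordinate frame, that points toward the ground, and that the display's outward normal is the device's +z axis; both conventions, and the function names, are assumptions for illustration and differ between platforms.

```python
import math
from typing import Sequence


def display_gravity_angle_deg(gravity: Sequence[float],
                              display_normal: Sequence[float] = (0.0, 0.0, 1.0)) -> float:
    """Angle in degrees between the display's outward normal and gravity."""
    dot = sum(g * n for g, n in zip(gravity, display_normal))
    norm_g = math.sqrt(sum(g * g for g in gravity))
    norm_n = math.sqrt(sum(n * n for n in display_normal))
    if norm_g == 0.0 or norm_n == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_g * norm_n)))
    return math.degrees(math.acos(cos_angle))


def is_close_range_by_tilt(gravity: Sequence[float],
                           preset_angle_deg: float = 135.0) -> bool:
    """True if the display faces upward enough to suggest close-range use."""
    return display_gravity_angle_deg(gravity) > preset_angle_deg
```

Under these conventions, a terminal lying face up on a desk gives an angle of about 180 degrees, a terminal held upright gives 90 degrees, and a face-down terminal gives 0 degrees, so only upward-facing orientations exceed the 135-degree example threshold.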
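Exemplarily, when the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs includes: the terminal does not process the voice signal.
FIG. 3 shows a flowchart of a detailed embodiment of a voice control method provided by an embodiment of the present invention. Referring to FIG. 3, the method includes:
S301: The terminal acquires a voice signal of a user.
S302: The terminal detects the voice intensity V of the voice signal.
S303: The terminal determines, from the voice intensity V, the voice intensity interval to which the voice signal belongs. If the voice signal belongs to the weak voice intensity interval, the terminal returns to step S301; if the voice signal belongs to the normal voice intensity interval, the terminal performs step S304; if the voice signal belongs to the strong voice intensity interval, the terminal performs step S305.
Here, the terminal presets V1 to 90 dB and V2 to 70 dB. If V ≤ V2, the voice signal belongs to the weak voice intensity interval; if V2 < V ≤ V1, the voice signal belongs to the normal voice intensity interval; if V > V1, the voice signal belongs to the strong voice intensity interval.
S304: The terminal captures current image information and determines whether facial features can be recognized from it. If so, the terminal performs step S305; if not, the terminal returns to step S301.
S305: The terminal determines through noise detection whether the voice signal is noise. If so, the terminal returns to step S301; if not, the terminal performs step S306.
S306: The terminal determines whether the processing time for the voice signal is less than the preset time threshold. If not, the terminal returns to step S301; if so, the terminal performs step S307.
S307: The terminal determines whether the voice signal is a preset voice command. If not, the terminal returns to step S301; if so, the terminal performs step S308.
S308: The terminal parses the voice signal and performs the corresponding operation according to the parsing result.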
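The branching of steps S303 to S308 can be condensed into one decision function. This is a sketch only: the function name and return labels are hypothetical, and the face check, noise check, timing, and command check are passed in as already-computed values because the specification leaves their implementations open. The thresholds are the 90 dB and 70 dB example values from S303.

```python
V1_DB = 90.0  # example threshold from the embodiment
V2_DB = 70.0  # example threshold from the embodiment


def decide_next_step(intensity_db: float,
                     face_detected: bool,
                     is_noise: bool,
                     processing_time_s: float,
                     time_threshold_s: float,
                     is_preset_command: bool) -> str:
    """Return "execute_S308" or "return_to_S301" for one voice signal."""
    # S303: weak interval, ignore the signal.
    if intensity_db <= V2_DB:
        return "return_to_S301"
    # S303 -> S304: normal interval needs the close-range (face) check.
    if intensity_db <= V1_DB and not face_detected:
        return "return_to_S301"
    # S305: discard noise.
    if is_noise:
        return "return_to_S301"
    # S306: processing must finish within the time threshold.
    if not processing_time_s < time_threshold_s:
        return "return_to_S301"
    # S307: only preset commands are executed.
    if not is_preset_command:
        return "return_to_S301"
    # S308: parse the signal and perform the operation.
    return "execute_S308"
```

Written this way, it is easy to see that the strong interval skips only the face check of S304; the noise, timing, and command checks apply to both the strong and normal intervals.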
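An embodiment of the present invention provides a voice control method in which the terminal acquires the user's voice signal; the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which it belongs; and the terminal performs the corresponding operation according to that interval. This omits the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, makes opening an application by voice faster, and improves the user experience.
FIG. 4 shows a schematic structural diagram of a voice control apparatus provided by an embodiment of the present invention. Referring to FIG. 4, the voice control apparatus 40 includes an acquisition unit 401, a detection unit 402, and an execution unit 403, wherein:
the acquisition unit 401 is configured to acquire a voice signal of a user;
the detection unit 402 is configured to detect the voice intensity of the voice signal acquired by the acquisition unit 401 and to determine the voice intensity interval to which the voice intensity of the voice signal belongs;
the execution unit 403 is configured to perform a corresponding operation according to the voice intensity interval, determined by the detection unit 402, to which the voice signal belongs.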
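The division into three units can be pictured as three pluggable functions wired together. The class `VoiceControlApparatus`, its field names, and the string-based interface between units are assumptions chosen to keep the sketch short; the specification describes the units functionally and notes that they may be realized on a CPU, DSP, or FPGA.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class VoiceControlApparatus:
    # Acquisition unit 401: returns one frame of the user's voice signal.
    acquire: Callable[[], Sequence[float]]
    # Detection unit 402: maps a signal to the name of its intensity interval.
    detect: Callable[[Sequence[float]], str]
    # Execution unit 403: acts on the interval and signal, returns a result label.
    execute: Callable[[str, Sequence[float]], str]

    def run_once(self) -> str:
        """Pass one signal through acquisition, detection, and execution."""
        signal = self.acquire()
        interval = self.detect(signal)
        return self.execute(interval, signal)
```

Keeping the units as separate callables mirrors the apparatus claims, in which the detection unit consumes the output of the acquisition unit and the execution unit consumes the result of the detection unit.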
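Exemplarily, the voice intensity interval includes at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
Exemplarily, when the detection unit 402 determines that the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the execution unit 403 is further configured to: determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, the execution unit 403 is further configured to: when it determines through noise detection that the voice signal is not noise, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, the execution unit 403 is further configured to: determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, when the detection unit 402 determines that the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the execution unit 403 is further configured to: determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, the execution unit 403 is further configured to: when it determines through noise detection that the voice signal is not noise, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, the execution unit 403 is further configured to: determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
Exemplarily, the execution unit 403 is further configured to: capture current image information, and if facial features are recognized from the current image information, determine that the terminal is in the user close-range operation state.
Exemplarily, the execution unit 403 is further configured to: use a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, determine that the terminal is in the user close-range operation state.
Exemplarily, when the detection unit 402 determines that the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the execution unit 403 is further configured to: not process the voice signal.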
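In practical applications, each unit module of the voice control apparatus can be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) in the voice control apparatus.
If the voice control apparatus of the embodiments of the present invention is implemented in the form of software functional modules and sold or used as an independent product, it can also be stored in a computer-readable storage medium. On this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present invention further provides a computer storage medium in which a computer program is stored, the computer program being used to execute the voice control method of the embodiments of the present invention.
Those skilled in the art should understand that embodiments of the present invention can be provided as a method, a system, or a computer program product. Accordingly, the present invention can take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection.
Industrial Applicability
In the technical solution of the embodiments of the present invention, the terminal acquires the user's voice signal; the terminal detects the voice intensity of the voice signal and determines the voice intensity interval to which it belongs; and the terminal performs the corresponding operation according to that interval. This omits the step in which the user starts the voice recognition function with a voice wake-up word or a voice recognition button, simplifies the process of starting a terminal application through voice control, and improves the user experience.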

Claims (23)

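  1. A voice control method, comprising:
    a terminal acquiring a voice signal of a user;
    the terminal detecting the voice intensity of the voice signal and determining the voice intensity interval to which the voice intensity of the voice signal belongs; and
    the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs.
  2. The method according to claim 1, wherein the voice intensity interval comprises at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
  3. The method according to claim 2, wherein, when the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises:
    the terminal determining whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, the terminal performing the operation corresponding to the voice signal.
  4. The method according to claim 3, wherein, before the terminal determines whether the voice signal is a preset voice command, the method further comprises:
    when the terminal determines through noise detection that the voice signal is not noise, the terminal determining whether the voice signal is a preset voice command.
  5. The method according to claim 3, wherein, before the terminal determines whether the voice signal is a preset voice command, the method further comprises:
    the terminal determining whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, the terminal determining whether the voice signal is a preset voice command.
  6. The method according to claim 2, wherein, when the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises:
    the terminal determining whether it is in a user close-range operation state, and if the terminal is in the user close-range operation state and determines that the voice signal is a preset voice command, the terminal performing the operation corresponding to the voice signal.
  7. The method according to claim 6, wherein, before the terminal determines whether it is in the user close-range operation state, the method further comprises:
    when the terminal determines through noise detection that the voice signal is not noise, the terminal determining whether it is in the user close-range operation state.
  8. The method according to claim 6, wherein, before the terminal determines whether it is in the user close-range operation state, the method further comprises:
    the terminal determining whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, the terminal determining whether it is in the user close-range operation state.
  9. The method according to claim 6, wherein the terminal determining whether it is in the user close-range operation state comprises:
    the terminal capturing current image information, and if facial features are recognized from the current image information, the terminal determining that it is in the user close-range operation state.
  10. The method according to claim 6, wherein the terminal determining whether it is in the user close-range operation state comprises:
    the terminal using a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, the terminal determining that it is in the user close-range operation state.
  11. The method according to claim 2, wherein, when the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the terminal performing a corresponding operation according to the voice intensity interval to which the voice signal belongs comprises:
    the terminal not processing the voice signal.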
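  12. A voice control apparatus, comprising an acquisition unit, a detection unit, and an execution unit, wherein:
    the acquisition unit is configured to acquire a voice signal of a user;
    the detection unit is configured to detect the voice intensity of the voice signal acquired by the acquisition unit and to determine the voice intensity interval to which the voice intensity of the voice signal belongs; and
    the execution unit is configured to perform a corresponding operation according to the voice intensity interval, determined by the detection unit, to which the voice signal belongs.
  13. The apparatus according to claim 12, wherein the voice intensity interval comprises at least one of the following: a strong voice intensity interval, a normal voice intensity interval, and a weak voice intensity interval, wherein the voice intensity of the strong voice intensity interval is greater than that of the normal voice intensity interval, and the voice intensity of the normal voice intensity interval is greater than that of the weak voice intensity interval.
  14. The apparatus according to claim 13, wherein, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the strong voice intensity interval, the execution unit is further configured to:
    determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
  15. The apparatus according to claim 14, wherein the execution unit is further configured to:
    when it determines through noise detection that the voice signal is not noise, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
  16. The apparatus according to claim 14, wherein the execution unit is further configured to:
    determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the voice signal is a preset voice command, and if the voice signal is a preset voice command, perform the operation corresponding to the voice signal.
  17. The apparatus according to claim 13, wherein, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the normal voice intensity interval, the execution unit is further configured to:
    determine whether the terminal is in a user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  18. The apparatus according to claim 17, wherein the execution unit is further configured to:
    when it determines through noise detection that the voice signal is not noise, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  19. The apparatus according to claim 17, wherein the execution unit is further configured to:
    determine whether the processing time for the voice signal is less than a preset time threshold, and when the processing time is less than the time threshold, determine whether the terminal is in the user close-range operation state, and if the terminal is in the user close-range operation state and the voice signal is determined to be a preset voice command, perform the operation corresponding to the voice signal.
  20. The apparatus according to claim 17, wherein the execution unit is further configured to:
    capture current image information, and if facial features are recognized from the current image information, determine that the terminal is in the user close-range operation state.
  21. The apparatus according to claim 17, wherein the execution unit is further configured to:
    use a gravity sensor to determine whether the angle between the line perpendicular to the terminal display and the gravity line is greater than a preset angle, and if that angle is greater than the preset angle, determine that the terminal is in the user close-range operation state.
  22. The apparatus according to claim 13, wherein, when the detection unit determines that the voice intensity interval to which the voice signal belongs is the weak voice intensity interval, the execution unit is further configured to: not process the voice signal.
  23. A computer storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the voice control method according to any one of claims 1 to 11.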
PCT/CN2015/085287 2015-06-15 2015-07-28 Voice control method and apparatus, and computer storage medium WO2016201767A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510330981.0 2015-06-15
CN201510330981.0A CN106254612A (zh) 2015-06-15 2015-06-15 Voice control method and apparatus

Publications (1)

Publication Number Publication Date
WO2016201767A1 true WO2016201767A1 (zh) 2016-12-22

Family

ID=57544669

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/085287 WO2016201767A1 (zh) 2015-06-15 2015-07-28 Voice control method and apparatus, and computer storage medium

Country Status (2)

Country Link
CN (1) CN106254612A (zh)
WO (1) WO2016201767A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109599103A (zh) * 2018-11-16 2019-04-09 广州小鹏汽车科技有限公司 Vehicle control method, apparatus, and system, computer-readable storage medium, and automobile

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
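CN106653021B (zh) * 2016-12-27 2020-06-02 上海智臻智能网络科技股份有限公司 Voice wake-up control method, apparatus, and terminal
CN108447472B (zh) 2017-02-16 2022-04-05 腾讯科技(深圳)有限公司 Voice wake-up method and apparatus
CN107910003A (zh) * 2017-12-22 2018-04-13 智童时刻(厦门)科技有限公司 Voice interaction method and voice control system for smart devices
CN109147764A (zh) * 2018-09-20 2019-01-04 百度在线网络技术(北京)有限公司 Voice interaction method, apparatus, and device, and computer-readable medium
CN109841214B (zh) 2018-12-25 2021-06-01 百度在线网络技术(北京)有限公司 Voice wake-up processing method, apparatus, and storage medium
CN109584878A (zh) * 2019-01-14 2019-04-05 广东小天才科技有限公司 Voice wake-up method and system
CN110097875B (zh) * 2019-06-03 2022-09-02 清华大学 Electronic device, method, and medium for voice interaction wake-up based on microphone signals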

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
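JP2007243673A (ja) * 2006-03-09 2007-09-20 Nec Corp In-vehicle information providing system and program providing method
CN102355562A (zh) * 2011-09-16 2012-02-15 青岛海信移动通信技术股份有限公司 Volume control method and device thereof
CN103049186A (zh) * 2012-09-24 2013-04-17 共青城赛龙通信技术有限责任公司 Electronic device with voice-controlled key function and voice-controlled key method
CN103472994A (zh) * 2013-09-06 2013-12-25 乐得科技有限公司 Method, apparatus, and system for implementing operation control based on voice
CN104200816A (zh) * 2014-07-31 2014-12-10 广东美的制冷设备有限公司 Voice control method and system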

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
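CN101729664A (zh) * 2009-12-10 2010-06-09 深圳华为通信技术有限公司 Method for starting a terminal, and terminal
CN102868827A (zh) * 2012-09-15 2013-01-09 潘天华 Method for controlling the start of mobile phone applications using voice commands
JP6167605B2 (ja) * 2013-03-28 2017-07-26 株式会社デンソー Speech recognition system
CN104461597A (zh) * 2013-09-24 2015-03-25 腾讯科技(深圳)有限公司 Application start control method and apparatus
CN104660792A (zh) * 2013-11-21 2015-05-27 腾讯科技(深圳)有限公司 Method and apparatus for waking up an application
CN104615359B (zh) * 2015-02-13 2018-05-29 小米科技有限责任公司 Method and apparatus for voice operation of application software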

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
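CN109599103A (zh) * 2018-11-16 2019-04-09 广州小鹏汽车科技有限公司 Vehicle control method, apparatus, and system, computer-readable storage medium, and automobile
CN109599103B (zh) * 2018-11-16 2021-02-19 广州小鹏汽车科技有限公司 Vehicle control method, apparatus, and system, computer-readable storage medium, and automobile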

Also Published As

Publication number Publication date
CN106254612A (zh) 2016-12-21

Similar Documents

Publication Publication Date Title
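WO2016201767A1 (zh) Voice control method and apparatus, and computer storage medium
US10943584B2 (en) Speech recognition method, speech wakeup apparatus, speech recognition apparatus, and terminal
KR101752119B1 (ko) Hotword detection on multiple devices
KR101981878B1 (ko) Control of an electronic device based on direction of speech
JP6489563B2 (ja) Volume adjustment method, system, device, and program
US20160019886A1 (en) Method and apparatus for recognizing whisper
WO2015180447A1 (zh) Alarm method, terminal, and storage medium
WO2017032030A1 (zh) Volume adjustment method and user terminal
US9781240B2 (en) Method and user terminal for performing call using voice recognition
US20180152163A1 (en) Noise control method and device
TW201626365A (zh) Speaker identification and unsupervised speaker adaptation techniques
TW201337722A (zh) Music playback device and control method thereof
KR101559364B1 (ko) Mobile device performing face-to-face interaction monitoring, interaction monitoring method using the same, interaction monitoring system including the same, and interaction monitoring mobile application executed thereby
WO2014183529A1 (zh) Method, apparatus, and storage medium for switching the call mode of a mobile terminal
US11178280B2 (en) Input during conversational session
WO2017008378A1 (zh) Information prompting method, apparatus, and terminal
WO2014154077A1 (zh) Method and apparatus for ending a call, and mobile terminal
WO2016082344A1 (zh) Voice control method, apparatus, and storage medium
US9319513B2 (en) Automatic un-muting of a telephone call
CN105376418A (zh) Incoming call information processing method, apparatus, and system
CN106126179B (zh) Information processing method and electronic device
TW201423587A (zh) Called-party prompting system and method
CN107680592A (zh) Voice recognition method for a mobile terminal, mobile terminal, and storage medium
CN104572007A (zh) Volume adjustment method for a terminal
CN104571856A (zh) Terminal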

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 15895340; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 EP: PCT application non-entry in European phase
Ref document number: 15895340; Country of ref document: EP; Kind code of ref document: A1