WO2018035986A1 - Voice control method, apparatus, and computer storage medium - Google Patents

Voice control method, apparatus, and computer storage medium

Info

Publication number
WO2018035986A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
call
voice information
module
information
Prior art date
Application number
PCT/CN2016/105489
Other languages
English (en)
French (fr)
Inventor
李腾飞
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Priority to US16/319,950 (US20190228770A1)
Publication of WO2018035986A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6016 Substation equipment, e.g. for use by subscribers including speech amplifiers in the receiver circuit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6033 Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
    • H04M1/6041 Portable telephones adapted for handsfree use
    • H04M1/605 Portable telephones adapted for handsfree use involving control of the receiver volume to provide a dual operational mode at close or far distance from the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • The present invention relates to the field of communications, and in particular to a voice control method, apparatus, and computer storage medium.
  • With the spread of electronic devices that have communication functions, wireless communication has become an everyday activity. However, the quality of wireless communication is affected by various factors, such as the device, the network, and the environment, which may make the voice signal too loud or too quiet. The user then has to adjust the volume, and may also need to perform other operations during the call, which degrades the communication experience.
  • The embodiments of the present invention provide a voice control method, apparatus, and computer storage medium, to at least solve the technical problem in the related art that voice control cannot be performed during a call.
  • A voice control method is provided, including: detecting, during a call, whether first voice information occurs; collecting second voice information after the first voice information is acquired; and controlling the call process according to the second voice information.
  • Before detecting whether the first voice information occurs, the method further includes: collecting voiceprint information of the call user; and determining that the voiceprint information of the call user matches pre-stored voiceprint information.
  • Controlling the call process according to the second voice information includes at least one of the following: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a custom operation.
  • After the second voice information is collected, the method further includes: determining whether third voice information is received within a preset time; and, when the result is yes, stopping collection of the second voice information.
  • When the result is no, the method includes: instructing that control of the call process according to the second voice information be stopped.
  • A voice control apparatus is provided, including: a detecting module, configured to detect, during a call, whether first voice information occurs; a first collecting module, configured to collect second voice information after the detecting module acquires the first voice information; and a control module, configured to control the call process according to the second voice information collected by the first collecting module.
  • The apparatus further includes: a second collecting module, configured to collect voiceprint information of the call user before the detecting module detects whether the first voice information occurs; and a determining module, configured to determine that the voiceprint information of the call user matches pre-stored voiceprint information.
  • The control module is configured to perform at least one of the following controls on the call process according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a custom operation.
  • The apparatus further includes: a determining module, configured to determine, after the first collecting module collects the second voice information, whether third voice information is received within a preset time; and a first processing module, configured to control the first collecting module to stop collecting the second voice information when the result is yes.
  • The apparatus further includes: a second processing module, configured to instruct the control module, when the determining module's result is no, to stop controlling the call process according to the second voice information.
  • A computer storage medium is provided that stores computer-executable instructions for performing the voice control method according to the embodiment of the present invention.
  • According to the embodiments of the present invention, whether first voice information occurs is detected during a call; after the first voice information is acquired, second voice information is collected; and the call process is controlled according to the second voice information. This solves the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.
  • FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a voice control method according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of a voice control method according to an embodiment of the present invention;
  • FIG. 3 is a block diagram of the structure of a voice control apparatus according to an embodiment of the present invention;
  • FIG. 4 is a schematic flowchart of a method according to an embodiment of the present invention;
  • FIG. 5 illustrates the interaction between the subsystems of an apparatus according to an embodiment of the present invention.
  • FIG. 1 is a hardware structural block diagram of a mobile terminal for a voice control method according to an embodiment of the present invention.
  • The mobile terminal 10 may include one or more processors 102 (only one is shown; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 configured to store data, and a transmission device 106 configured for communication. Those skilled in the art will understand that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the above electronic device.
  • The mobile terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
  • The memory 104 can be configured to store software programs and modules of application software, such as the program instructions/modules corresponding to the voice control method in the embodiment of the present invention. The processor 102 performs various functional applications and data processing, that is, implements the above method, by running the software programs and modules stored in the memory 104.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic storage device, flash memory, or other non-volatile solid state memory.
  • The memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The transmission device 106 is configured to receive or transmit data via a network.
  • Specific examples of the above network may include a wireless network provided by the communication provider of the mobile terminal 10.
  • In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • In another example, the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • FIG. 2 is a flowchart of a voice control method according to an embodiment of the present invention. As shown in FIG. 2, the process includes the following steps:
  • Step S202: during the call, detect whether the first voice information occurs;
  • Step S204: after the first voice information is acquired, collect the second voice information;
  • Step S206: control the call process according to the second voice information.
  • Through the above steps, whether the first voice information occurs is detected during the call, the second voice information is collected after the first voice information is acquired, and the call process is controlled according to the second voice information. This solves the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.
  • The foregoing steps may be executed by a terminal that supports voice-based human-computer interaction, such as a mobile phone, but the executing entity is not limited thereto.
  • the voice control method in this embodiment further includes:
  • Controlling the call process according to the second voice information includes at least one of the following: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and other custom controls, e.g. volume screens, screen captures, opening applications, and more.
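  • The controls listed above can be pictured as a lookup from a recognized command to an action on the call. The following is a minimal sketch under stated assumptions: the `CallState` fields, the command strings, and the 0 to 10 volume range are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CallState:
    """Hypothetical snapshot of a call; all field names are illustrative."""
    receive_volume: int = 5   # assumed range 0..10
    transmit_volume: int = 5  # assumed range 0..10
    recording: bool = False
    active: bool = True


def _clamp(level: int) -> int:
    """Keep a volume level inside the assumed 0..10 range."""
    return max(0, min(10, level))


def raise_receive_volume(call: CallState) -> None:
    call.receive_volume = _clamp(call.receive_volume + 1)


def lower_receive_volume(call: CallState) -> None:
    call.receive_volume = _clamp(call.receive_volume - 1)


def start_recording(call: CallState) -> None:
    call.recording = True


def stop_recording(call: CallState) -> None:
    call.recording = False


def end_call(call: CallState) -> None:
    call.active = False


# Assumed command phrases; a real device would let the user configure them.
COMMAND_ACTIONS: Dict[str, Callable[[CallState], None]] = {
    "increase volume": raise_receive_volume,
    "reduce volume": lower_receive_volume,
    "start recording": start_recording,
    "stop recording": stop_recording,
    "end call": end_call,
}


def control_call(call: CallState, second_voice_info: str) -> bool:
    """Apply the action for the recognized command; return False if unknown."""
    action = COMMAND_ACTIONS.get(second_voice_info.strip().lower())
    if action is None:
        return False
    action(call)
    return True
```

  • A custom operation would be one more entry in the table, which is why a dictionary is used here rather than a chain of conditionals.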
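  • After the second voice information is collected, the method further includes: determining whether third voice information is received within a preset time; and, when the result is yes, stopping collection of the second voice information.
  • When the result is no, the voice control method in this embodiment further includes: instructing that control of the call process according to the second voice information be stopped.
  • In that case, detection of the first voice information can continue.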
  • The first, second, and third voice information in this embodiment may be preset specific phrases; for example, the first voice information may be preset as "HELLO", "Slightly wait, voice control", and so on.
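  • The interplay of the first, second, and third voice information, including the preset time, can be sketched as a two-phase session. This is an illustrative sketch only: the class name, the phrase-by-phrase input, the use of seconds, and passing timestamps in explicitly are assumptions made here for clarity and testability, not details from the patent.

```python
from enum import Enum, auto
from typing import List, Optional


class Phase(Enum):
    DETECTING = auto()   # S202: waiting for the first voice information
    COLLECTING = auto()  # S204: gathering the second voice information


class VoiceControlSession:
    """Sketch of S202 to S206 with a preset time for the third voice information."""

    def __init__(self, first: str, third: str, preset_time: float) -> None:
        self.first = first              # phrase that opens voice control
        self.third = third              # phrase that closes the command
        self.preset_time = preset_time  # seconds (assumed unit)
        self.phase = Phase.DETECTING
        self._started_at = 0.0
        self._collected: List[str] = []

    def feed(self, phrase: str, now: float) -> Optional[str]:
        """Process one recognized phrase heard at time `now`.

        Returns the collected second voice information when it should be
        acted on (S206); otherwise returns None.
        """
        timed_out = now - self._started_at > self.preset_time
        if self.phase is Phase.COLLECTING and timed_out:
            # The third voice information did not arrive in time: drop the
            # pending command and resume detecting the first voice information.
            self._reset()
        if self.phase is Phase.DETECTING:
            if phrase == self.first:
                self.phase = Phase.COLLECTING
                self._started_at = now
            return None
        if phrase == self.third:
            command = " ".join(self._collected)
            self._reset()
            return command
        self._collected.append(phrase)
        return None

    def _reset(self) -> None:
        self.phase = Phase.DETECTING
        self._collected = []
```

  • For example, feeding "HELLO", then "increase volume", then "execute" within the preset time yields "increase volume"; if "execute" arrives too late, nothing is returned and the session goes back to detecting.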
  • The method according to the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied as a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the various embodiments of the present invention.
  • A voice control apparatus is also provided. It is used to implement the foregoing embodiments and preferred implementations, and what has already been described is not repeated.
  • As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function.
  • Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 3 is a structural block diagram of a voice control apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus includes:
  • The detecting module 30 is configured to detect, during the call, whether the first voice information occurs;
  • The first collecting module 32 is configured to collect the second voice information after the detecting module 30 acquires the first voice information;
  • The control module 34 is configured to control the call process according to the second voice information collected by the first collecting module 32.
  • The voice control apparatus of this embodiment further includes: a second collecting module, configured to collect voiceprint information of the call user before the detecting module 30 detects whether the first voice information occurs;
  • The determining module is configured to determine that the voiceprint information of the call user matches the pre-stored voiceprint information.
  • The control module 34 is configured to perform at least one of the following controls on the call process according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, and ending the current call.
  • The voice control apparatus of this embodiment further includes: a determining module, configured to determine, after the first collecting module 32 collects the second voice information, whether the third voice information is received within a preset time;
  • The first processing module is configured to control the first collecting module 32 to stop collecting the second voice information when the determining module's result is yes.
  • The voice control apparatus further includes a second processing module, configured to instruct the control module 34, when the determining module's result is no, to stop controlling the call process according to the second voice information.
  • Each of the above modules may be implemented in software or hardware. For the latter, this may be done in, but is not limited to, the following ways: the modules are all located in the same processor, or the modules are distributed across different processors in any combination.
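  • The text does not say how the determining module decides that a caller's voiceprint matches the pre-stored one. One common way to compare two voiceprints is cosine similarity between fixed-length feature vectors; the sketch below assumes that representation, and the function names and the 0.8 threshold are hypothetical choices rather than anything stated in the patent.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("voiceprint vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no speaker information
    return dot / (norm_a * norm_b)


def matches_stored_voiceprint(
    sample: Sequence[float],
    stored: Sequence[float],
    threshold: float = 0.8,  # assumed value; would need tuning in practice
) -> bool:
    """Return True when the sample is close enough to the stored voiceprint."""
    return cosine_similarity(sample, stored) >= threshold
```

  • Only when this check passes would the detecting module go on to look for the first voice information, which keeps the other party or bystanders from triggering commands.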
  • This embodiment is an optional embodiment of the present invention, used to describe the present application in detail in conjunction with a specific scenario:
  • The embodiment provides a method and device for voice control during a call. By detecting a "voice control command opener" and a "voice control command terminator", it obtains the user's "call process voice control command" during the call so as to automatically adjust the call volume, providing a better and more convenient communication experience.
  • The device mainly includes a main control subsystem, a wireless signal transceiver subsystem, a memory subsystem, a voice signal sending subsystem, a voice signal receiving subsystem, a human-computer interaction interface subsystem, and a voice recognition control subsystem.
  • The main control subsystem handles the processing and encoding of each signal and the device's various operations, and centrally manages the wireless signal transceiver, memory, voice signal sending, voice signal receiving, human-computer interaction interface, and voice recognition control subsystems.
  • the wireless signal transceiver subsystem is configured to send and receive wireless radio frequency signals, and complete establishment and maintenance of communication links.
  • the memory subsystem is used to store data such as software configuration of the communication device and various function configuration parameters.
  • The voice signal sending subsystem is responsible for capturing the voice signal uttered by the user.
  • The voice signal receiving subsystem plays the voice of the communication partner to the user.
  • The human-computer interaction interface subsystem handles the user's operations on the device, such as making or answering a call.
  • The voice recognition control subsystem completes the voiceprint setting, recognizes the voice command that the user issues through the voice signal sending subsystem, and then feeds back the required response operation to the main control subsystem.
  • The method for automatically adjusting the call volume includes: collecting the user's voice in advance through the voice signal sending subsystem and the voice recognition control subsystem to obtain the user's voiceprint information, and setting the user who matches this voiceprint information as the master user; at the same time, setting the "voice control command opener", "voice control command terminator", and "call process voice control command" used during a call as specific phrases.
  • The "voice control command opener", "voice control command terminator", and "call process voice control command" can each be any of several different specific phrases, but they must all differ from one another.
  • The response operation of a "call process voice control command" may be one of several preset function operations (such as adjusting the receive volume, starting recording, or adjusting the transmit volume), or a function operation defined by the user.
  • The "call process voice control commands" for different response operations must also be distinct from one another.
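  • The requirement that openers, terminators, and commands all differ from one another can be checked when the user configures them. The following sketch shows one way to do that; the function names and the normalization rule (case-insensitive, whitespace-collapsed) are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def _normalize(phrase: str) -> str:
    """Compare phrases case-insensitively and ignore extra whitespace."""
    return " ".join(phrase.lower().split())


def find_conflicts(
    openers: Iterable[str],
    terminators: Iterable[str],
    commands: Dict[str, str],  # command phrase -> response operation name
) -> List[str]:
    """Return every normalized phrase that is configured more than once."""
    seen = set()
    conflicts: List[str] = []
    for phrase in [*openers, *terminators, *commands]:
        key = _normalize(phrase)
        if key in seen and key not in conflicts:
            conflicts.append(key)
        seen.add(key)
    return conflicts
```

  • An empty result means the configuration satisfies the distinctness rule; any returned phrase would be ambiguous at recognition time and should be rejected during setup.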
  • FIG. 4 is a schematic flowchart of a method according to the embodiment of the present invention, including:
  • Step 1: The voice signal sending subsystem collects the user's voice and sends it to the voice recognition control subsystem;
  • Step 2: The voice recognition control subsystem sets the master user according to the user's voiceprint, and guides the user to set the "voice control command opener", "call process voice control command", and "voice control command terminator";
  • Step 3: The human-computer interaction interface subsystem accepts the user's communication request (making or answering a call) and passes it to the main control subsystem;
  • Step 4: The main control subsystem responds to the user's communication request and controls the wireless signal transceiver subsystem to establish and maintain the wireless communication;
  • Step 5: The main control subsystem reads the configuration parameters in the memory subsystem and sets the working state of each subsystem for the call;
  • Step 6: The voice signal sending subsystem sends the captured user voice to the call link and, at the same time, to the voice recognition control subsystem;
  • Step 7: The voice recognition control subsystem locks onto the master user according to the voiceprint information and, on detecting that the master user has issued a "voice control command opener", starts recognizing the "call process voice control command";
  • Step 8: Within the preset time, if the voice recognition control subsystem detects that the master user has issued a "voice control command terminator", it recognizes the "call process voice control command" spoken before the terminator; if no "voice control command terminator" from the master user is detected within the preset time, it stops recognizing the "call process voice control command", makes no response, and continues to detect the "voice control command opener";
  • Step 9: The voice recognition control subsystem reports the response operation required by the recognized "call process voice control command" to the main control subsystem;
  • Step 10: The main control subsystem adjusts and controls the working state of each subsystem and completes the response operation corresponding to the "call process voice control command".
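  • Steps 7 and 8 amount to pulling out whatever the master user says between the opener and the terminator. Assuming the speech has already been turned into a text transcript (the patent does not describe that stage), a minimal sketch of the extraction could look like this; the function name and the punctuation trimming are illustrative choices, and the preset time from Step 8 is not modeled.

```python
from typing import Optional


def extract_command(transcript: str, opener: str, terminator: str) -> Optional[str]:
    """Return the text between the first opener and the next terminator.

    Returns None when the opener is missing, or when no terminator follows
    it, in which case the caller keeps listening for an opener.
    """
    text = transcript.lower()
    opener_key = opener.lower()
    start = text.find(opener_key)
    if start == -1:
        return None
    body_start = start + len(opener_key)
    end = text.find(terminator.lower(), body_start)
    if end == -1:
        return None
    # Trim separators that a recognizer might leave around the command.
    return text[body_start:end].strip(" ,.:;")
```

  • With the opener "Slightly wait, voice control" and the terminator "execute", the transcript "Slightly wait, voice control: increase volume, execute." yields "increase volume", which is what Step 9 would report to the main control subsystem.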
  • FIG. 5 is an embodiment of an apparatus system interaction according to an embodiment of the present invention, including:
  • The main control subsystem handles the processing and encoding of each signal and the device's various operations, and centrally manages subsystems such as wireless signal transceiving, memory, voice signal sending, voice signal receiving, and the human-computer interaction interface.
  • The wireless signal transceiver subsystem transmits and receives radio frequency signals to establish and maintain communication links.
  • The memory subsystem stores the communication device's software configuration, various parameters, and other data.
  • The human-computer interaction interface subsystem receives the user's communication requests to the device.
  • The voice signal sending subsystem is responsible for capturing voice signals from the user.
  • The voice signal receiving subsystem plays the voice signal of the communication partner.
  • The voice recognition control subsystem completes the voiceprint setting, recognizes the voice command that the user issues through the voice signal sending subsystem, and then feeds back the required response operation to the main control subsystem.
  • The voice signal sending subsystem sends the user's voice to the voice recognition control subsystem to complete the voiceprint setting and lock that user as the master user. Then, following the guidance of the voice recognition control subsystem, the user sets the "voice control command opener" to "Slightly wait, voice control"; sets one "call process voice control command" to "increase volume", whose response operation is to increase the call volume; sets another "call process voice control command" to "reduce volume", whose response operation is to reduce the call volume; and sets the "voice control command terminator" to "execute".
  • The main control subsystem first reads the audio output configuration parameters in the memory to set the volume of the voice signal receiving subsystem.
  • The voice signal sending subsystem simultaneously sends the user's voice to the voice recognition control subsystem.
  • When user A feels that user B's received voice is too quiet to hear clearly, user A says: "Slightly wait, voice control: increase volume, execute."
  • The voice recognition control subsystem determines from the voiceprint information that user A is the master user. After detecting the "voice control command opener" "Slightly wait, voice control", it starts recognizing the "call process voice control command"; after then detecting the "voice control command terminator" "execute", it stops recognizing. In the process it recognizes the "call process voice control command" "increase volume" and reports the response operation corresponding to that command, increasing the call volume, to the main control subsystem. The main control subsystem adjusts the audio output configuration parameter to raise the call volume of the voice signal receiving subsystem, so that the volume of user B's received voice signal increases.
  • In a quiet environment, when user A feels that user B's received voice is too loud, for example because it would disturb others, user A speaks the opener, the "reduce volume" command, and the terminator.
  • The voice recognition control subsystem determines from the voiceprint information that user A is the master user. After detecting the "voice control command opener" "Slightly wait, voice control", it starts recognizing the "call process voice control command"; after then detecting the "voice control command terminator" "execute", it stops recognizing. In the process it recognizes the "call process voice control command" "reduce volume" and reports the response operation corresponding to that command, reducing the call volume, to the main control subsystem. The main control subsystem adjusts the audio output configuration parameter to lower the call volume of the voice signal receiving subsystem, so that the volume of user B's received voice signal decreases.
  • The main control subsystem adjusts the audio output configuration parameter and automatically adjusts the volume of the received voice, which both ensures the communication quality and improves the user experience.
  • By detecting and recognizing the user's voice information during the call, the control operation the user wants is completed automatically according to the recognition result, which ensures communication quality and improves the user experience.
  • Embodiments of the present invention also provide a storage medium.
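  • The foregoing storage medium may be configured to store program code for performing the following steps: detecting, during a call, whether first voice information occurs; collecting second voice information after the first voice information is acquired; and controlling the call process according to the second voice information.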
  • the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), and a mobile hard disk.
  • According to the program code stored in the storage medium, the processor detects, during the call, whether the first voice information occurs;
  • According to the program code stored in the storage medium, the processor collects the second voice information after the first voice information is acquired;
  • According to the program code stored in the storage medium, the processor controls the call process according to the second voice information.
  • The disclosed apparatus and method may be implemented in other manners.
  • The device embodiments described above are merely illustrative.
  • For example, the division into units is only a division by logical function.
  • In actual implementation there may be other ways of dividing; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of another form.
  • The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • Each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may separately serve as one unit, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • The foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes: a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another medium that can store program code.
  • the above-described integrated unit of the present invention may be stored in a computer readable storage medium if it is implemented in the form of a software function module and sold or used as a standalone product.
  • the technical solution of the embodiments of the present invention may be embodied in the form of a software product in essence or in the form of a software product stored in a storage medium, including a plurality of instructions.
  • a computer device (which may be a personal computer, server, or network device, etc.) is caused to perform all or part of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a mobile storage device, a ROM, a RAM, a magnetic disk, or an optical disk.
  • the technical solution of the embodiment of the present invention detects, during a call, whether first voice information appears; after the first voice information is acquired, second voice information is collected; and the call is controlled according to the second voice information. This can solve the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)

Abstract

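A voice control method, a device, and a computer storage medium, wherein the method comprises: during a call, detecting whether first voice information appears (S202); after the first voice information is acquired, collecting second voice information (S204); and controlling the call according to the second voice information (S206).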

Description

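Voice control method, device, and computer storage medium
Technical Field
The present invention relates to the field of communications, and in particular to a voice control method, a device, and a computer storage medium.
Background
With the spread of electronic devices that include communication functions, wireless communication has become a routine part of daily life. However, the quality of wireless communication is affected by various factors, such as the device, the network, and the environment. These factors may make the voice signal too loud or too quiet, so that the user has to adjust the volume, and the user may also need to perform other operations during the call, all of which degrade the communication experience.
In the related art, other processing functions can be performed during a call, but they require manual operation. This approach is not intelligent enough, the operations it requires are inconvenient during a call, and the voice control solutions in the related art cannot provide control during a call either.
No effective solution to the above problems in the related art has yet been found.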
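Summary
Embodiments of the present invention provide a voice control method, a device, and a computer storage medium, so as to at least solve the technical problem in the related art that voice control cannot be performed during a call.
According to one embodiment of the present invention, a voice control method is provided, including: during a call, detecting whether first voice information appears; after the first voice information is acquired, collecting second voice information; and controlling the call according to the second voice information.
In one implementation, before detecting whether the first voice information appears, the method further includes: collecting voiceprint information of the call user; and determining that the voiceprint information of the call user matches pre-stored voiceprint information.
In one implementation, at least one of the following controls is applied to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a user-defined operation.
In one implementation, after the second voice information is collected, the method further includes: determining whether third voice information is received within a preset time; and when the determination result is yes, stopping collection of the second voice information.
In one implementation, when the determination result is no, the method includes: instructing to stop controlling the call according to the second voice information.
According to another embodiment of the present invention, a voice control device is provided, including: a detection module, configured to detect, during a call, whether first voice information appears; a first collection module, configured to collect second voice information after the detection module acquires the first voice information; and a control module, configured to control the call according to the second voice information collected by the first collection module.
In one implementation, the device further includes: a second collection module, configured to collect voiceprint information of the call user before the detection module detects whether the first voice information appears; and a determination module, configured to determine that the voiceprint information of the call user matches pre-stored voiceprint information.
In one implementation, the control module is configured to apply at least one of the following controls to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a user-defined operation.
In one implementation, the device further includes: a judgment module, configured to determine, after the first collection module collects the second voice information, whether third voice information is received within a preset time; and a first processing module, configured to control the first collection module to stop collecting the second voice information when the determination result is yes.
In one implementation, the device includes: a second processing module, configured to instruct the control module to stop controlling the call according to the second voice information when the determination result of the judgment module is no.
According to yet another embodiment of the present invention, a computer storage medium is further provided. The computer storage medium stores computer-executable instructions, which are used to execute the voice control method described in the embodiments of the present invention.
Through the embodiments of the present invention, during a call, it is detected whether first voice information appears; after the first voice information is acquired, second voice information is collected; and the call is controlled according to the second voice information. This can solve the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.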
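Brief Description of the Drawings
The drawings described here are provided for a further understanding of the present invention and form a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a voice control method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a voice control method according to an embodiment of the present invention;
FIG. 3 is a structural block diagram of a voice control device according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of the method provided by this embodiment of the present invention;
FIG. 5 shows an embodiment of the interaction between the subsystems of the device according to an embodiment of the present invention.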
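Detailed Description
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
Embodiment 1
The method embodiment provided in Embodiment 1 of this application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a voice control method according to an embodiment of the present invention. As shown in FIG. 1, the mobile terminal 10 may include at least one processor 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 configured to store data, and a transmission device 106 configured for communication functions. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the mobile terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
The memory 104 may be configured to store software programs and modules of application software, such as the program instructions/modules corresponding to the voice control method in the embodiments of the present invention. The processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and data processing, that is, to implement the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic storage device, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 106 is configured to receive or send data via a network. A specific example of the above network may include a wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.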
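This embodiment provides a voice control method that runs on the above mobile terminal. FIG. 2 is a flowchart of the voice control method according to an embodiment of the present invention. As shown in FIG. 2, the flow includes the following steps:
Step S202: during a call, detect whether first voice information appears;
Step S204: after the first voice information is acquired, collect second voice information;
Step S206: control the call according to the second voice information.
Through the above steps, during a call, it is detected whether first voice information appears; after the first voice information is acquired, second voice information is collected; and the call is controlled according to the second voice information. This can solve the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.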
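Steps S202 to S206 can be pictured as a small two-state machine. The following is only a minimal sketch under stated assumptions, not the implementation of the embodiment: speech is assumed to arrive already recognized as text, the second voice information is assumed to be the single utterance that follows the first voice information, and the names `CallController`, `State`, `feed`, and the action table are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    DETECTING = auto()   # S202: waiting for the first voice information
    COLLECTING = auto()  # S204: capturing the second voice information


@dataclass
class CallController:
    wake_phrase: str   # first voice information, e.g. "HELLO"
    actions: dict      # hypothetical table: second voice information -> callable
    state: State = State.DETECTING
    log: list = field(default_factory=list)

    def feed(self, utterance: str) -> None:
        """Consume one recognized utterance spoken during the call."""
        if self.state is State.DETECTING:
            if utterance == self.wake_phrase:
                self.state = State.COLLECTING
            return
        # S206: control the call according to the second voice information.
        action = self.actions.get(utterance)
        if action is not None:
            self.log.append(action())
        self.state = State.DETECTING


controller = CallController("HELLO", {"volume up": lambda: "receive volume +1"})
for spoken in ["how are you", "HELLO", "volume up", "volume up"]:
    controller.feed(spoken)
print(controller.log)
```

Only the "volume up" that follows "HELLO" is acted on; the same words spoken as part of the ordinary conversation are ignored, which is the point of gating the command on the first voice information.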
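In one implementation, the steps above are executed by a terminal capable of human-machine interaction by voice, such as a mobile phone, but this is not limiting.
In one implementation, before detecting whether the first voice information appears, the voice control method of this embodiment further includes:
S11: collect voiceprint information of the call user;
S12: determine that the voiceprint information of the call user matches pre-stored voiceprint information. This identifies the master user for voice control. In scenarios with a higher security level, after the first voice information is acquired, the voiceprint information in the first voice information may additionally be checked against the pre-stored voiceprint information, and the subsequent steps continue only if they match.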
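The text does not say how the voiceprint comparison in S12 is carried out. As one illustrative possibility only, the sketch below assumes each voiceprint has already been reduced to a fixed-length feature vector and treats two voiceprints as matching when their cosine similarity reaches a threshold. The vector representation, the function names, and the threshold of 0.8 are all assumptions rather than anything specified in the source.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no voiceprint information
    return dot / (norm_a * norm_b)


def is_master_user(captured, enrolled, threshold=0.8):
    """S12 (assumed form): accept the speaker if the vectors are close enough."""
    return cosine_similarity(captured, enrolled) >= threshold
```

For the higher-security variant described above, the same check could be run a second time on the audio of the first voice information before any command is accepted.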
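In one implementation, at least one of the following controls is applied to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, or another user-defined control, such as lighting up the screen, taking a screenshot, or opening an application.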
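In one implementation, after the second voice information is collected, the method further includes:
S21: determine whether third voice information is received within a preset time;
S22: when the determination result is yes, stop collecting the second voice information, and control the call according to the second voice information collected so far. In the other branch, when the determination result is no, the voice control method of this embodiment further includes: instructing to stop controlling the call according to the second voice information. Collection of the first voice information may then continue.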
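The two branches of S21 and S22 can be sketched as a single function. This is a hedged illustration, assuming recognized utterances arrive as `(seconds since the first voice information, text)` pairs in time order; `collect_command`, `end_phrase`, and `timeout_s` are hypothetical names, and the preset time is whatever the device is configured with.

```python
def collect_command(events, end_phrase, timeout_s):
    """Collect second voice information until the third voice information arrives.

    events: time-ordered (elapsed_seconds, utterance) pairs.
    Returns the collected utterances if end_phrase is heard within timeout_s,
    otherwise None, meaning the call must not be controlled.
    """
    collected = []
    for elapsed, utterance in events:
        if elapsed > timeout_s:
            break  # S21 answered "no": the preset time has elapsed
        if utterance == end_phrase:
            return collected  # S22: stop collecting; the caller acts on this
        collected.append(utterance)
    return None  # no third voice information in time: discard and keep listening


print(collect_command([(0.5, "increase volume"), (1.2, "execute")], "execute", 5.0))
print(collect_command([(0.5, "increase volume"), (6.0, "execute")], "execute", 5.0))
```

Returning `None` rather than a partial list keeps the "no" branch explicit: nothing collected before a timeout is ever acted upon.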
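In one implementation, the first, second, and third voice information in this embodiment may be preset specific phrases; for example, the first voice information may be preset as "HELLO", "Wait, voice control", and so on.
From the description of the above implementations, a person skilled in the art can clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods of the embodiments of the present invention.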
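Embodiment 2
This embodiment further provides a voice control device, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 3 is a structural block diagram of a voice control device according to an embodiment of the present invention. As shown in FIG. 3, the device includes:
a detection module 30, configured to detect, during a call, whether first voice information appears;
a first collection module 32, configured to collect second voice information after the detection module 30 acquires the first voice information;
a control module 34, configured to control the call according to the second voice information collected by the first collection module 32.
In one implementation, the voice control device of this embodiment further includes: a second collection module, configured to collect voiceprint information of the call user before the detection module 30 detects whether the first voice information appears;
a determination module, configured to determine that the voiceprint information of the call user matches pre-stored voiceprint information.
In one implementation, the control module 34 is configured to apply at least one of the following controls to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, and ending the current call.
In one implementation, the voice control device of this embodiment further includes: a judgment module, configured to determine, after the first collection module 32 collects the second voice information, whether third voice information is received within a preset time;
a first processing module, configured to control the first collection module 32 to stop collecting the second voice information when the determination result of the judgment module is yes. Correspondingly, the voice control device further includes a second processing module, configured to instruct the control module 34 to stop controlling the call according to the second voice information when the determination result of the judgment module is no.
It should be noted that each of the above modules may be implemented by software or hardware. For the latter, this may be done in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are distributed, in any combination, across different processors.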
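Embodiment 3
This embodiment is an optional embodiment of the present invention and describes this application in detail with reference to a specific scenario:
This embodiment provides a method and a device for voice control during a call. By detecting a "voice control command start marker" and a "voice control command end marker", the user's "in-call voice control command" is obtained during the call and used to adjust the call volume automatically, providing a better and more convenient communication experience.
This embodiment describes a method and a device for voice control during a call. The device mainly includes a main control subsystem, a wireless signal transceiver subsystem, a memory subsystem, a voice signal transmitting subsystem, a voice signal receiving subsystem, a human-machine interface subsystem, and a voice recognition control subsystem. The main control subsystem handles processing and encoding of the signals, the various operations on the device, and unified management of the wireless signal transceiver, memory, voice signal transmitting, voice signal receiving, human-machine interface, and voice recognition control subsystems. The wireless signal transceiver subsystem sends and receives radio frequency signals and establishes and maintains the communication link. The memory subsystem stores data such as the communication device's software configuration and the configuration parameters of each function. The voice signal transmitting subsystem receives the voice signal produced by the user. The voice signal receiving subsystem delivers the other party's voice to the user. The human-machine interface subsystem handles the user's operations on the device, such as making and answering calls. The voice recognition control subsystem performs voiceprint setup, recognizes the voice commands issued by the user via the voice signal transmitting subsystem, and feeds the required response operation back to the main control subsystem.
The method for automatically adjusting call volume described in this embodiment includes the following. In advance, the user's voice is captured through the voice signal transmitting subsystem and the voice recognition control subsystem to obtain the user's voiceprint information, and the user matching this voiceprint information is set as the master user. At the same time, the "voice control command start marker", the "voice control command end marker", and the "in-call voice control command" used during a call are set to specific phrases. Each of the three may be set to several different specific phrases, but all of the phrases must be distinct from one another. The response operation of an "in-call voice control command" may be one of several preset function operations (such as adjusting the receive volume, starting recording, or adjusting the transmit volume) or a function operation defined by the user, and the "in-call voice control commands" for different response operations must also be distinct from one another. After communication is established, when the user utters "voice control command start marker" + "in-call voice control command" + "voice control command end marker" during the call, the voice recognition control subsystem detects the start marker, the end marker, and the "in-call voice control command" between them, recognizes the master user's "in-call voice control command", and reports the required response operation to the main control subsystem. The main control subsystem then adjusts and controls the subsystems to complete the response operation corresponding to the "in-call voice control command". Voice-controlled operation during a call is thereby achieved.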
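The requirement that start markers, end markers, and control commands all be distinct can be checked when the user configures them. The sketch below is one hypothetical way to do so; `validate_phrases` and the role labels are illustrative names, and storing commands as a dictionary from phrase to response operation is an assumption that already guarantees one operation per command phrase.

```python
def validate_phrases(start_phrases, end_phrases, commands):
    """Raise ValueError if any phrase is configured for more than one role.

    commands: dict mapping a command phrase to the name of its response
    operation (dict keys are unique, so each phrase has one operation).
    """
    groups = {
        "start marker": list(start_phrases),
        "end marker": list(end_phrases),
        "control command": list(commands),
    }
    seen = {}  # phrase -> role it was first configured for
    for role, phrases in groups.items():
        for phrase in phrases:
            if phrase in seen:
                raise ValueError(
                    f"{phrase!r} is configured as {seen[phrase]} and as {role}"
                )
            seen[phrase] = role


validate_phrases(
    ["Wait, voice control"],
    ["execute"],
    {"increase volume": "raise receive volume", "decrease volume": "lower receive volume"},
)
```

Rejecting a clash at configuration time avoids the ambiguous case where, for example, the end marker is also a command and the recognizer cannot tell which one was spoken.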
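This embodiment provides a method and a device for voice control during a call. FIG. 4 is a schematic flowchart of the method provided by this embodiment of the present invention, including:
Step 1: the voice signal transmitting subsystem captures the user's voice and sends it to the voice recognition control subsystem;
Step 2: the voice recognition control subsystem sets the master user's voice according to the user's voiceprint, and guides the user to set the "voice control command start marker", the "in-call voice control command", and the "voice control command end marker";
Step 3: the human-machine interface subsystem accepts the user's communication request (placing or answering a call) and passes it to the main control subsystem;
Step 4: the main control subsystem responds to the user's communication request and controls the wireless signal transceiver subsystem to establish and maintain the wireless communication;
Step 5: the main control subsystem reads the configuration parameters from the memory subsystem and sets the working state of each subsystem during the call;
Step 6: the voice signal transmitting subsystem sends the received user voice to the call link and, at the same time, to the voice recognition control subsystem;
Step 7: the voice recognition control subsystem locks onto the master user according to the voiceprint information and, on detecting that the master user has uttered the "voice control command start marker", begins recognizing the "in-call voice control command";
Step 8: if, within the default time, the voice recognition control subsystem detects that the master user has uttered the "voice control command end marker", it recognizes and processes the "in-call voice control command" uttered before the end marker; if no "voice control command end marker" from the master user is detected within the default time, it stops recognizing the "in-call voice control command", makes no response, and continues detecting the "voice control command start marker";
Step 9: the voice recognition control subsystem reports the response operation required by the recognized "in-call voice control command" to the main control subsystem;
Step 10: the main control subsystem adjusts and controls the working state of the subsystems to complete the response operation corresponding to the "in-call voice control command".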
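Steps 6 to 10 can be sketched end to end as a loop in the voice recognition control subsystem. This is an illustrative sketch only: it assumes each recognized utterance is tagged with a timestamp and a speaker identifier produced by some voiceprint matcher, and the names `run_recognition`, `master_id`, and `timeout_s` are hypothetical.

```python
def run_recognition(frames, master_id, start_marker, end_marker, timeout_s):
    """Return the commands that would be reported to the main control subsystem.

    frames: time-ordered (timestamp_s, speaker_id, utterance) tuples, i.e. the
    copy of the call audio forwarded in step 6, after speech recognition.
    """
    reported = []
    started_at = None  # time the start marker was heard; None while idle
    pending = []       # utterances heard since the start marker
    for timestamp, speaker, utterance in frames:
        if speaker != master_id:
            continue  # step 7: only the master user may issue commands
        if started_at is not None and timestamp - started_at > timeout_s:
            started_at, pending = None, []  # step 8: timed out, no response
        if started_at is None:
            if utterance == start_marker:
                started_at, pending = timestamp, []
            continue
        if utterance == end_marker:
            reported.extend(pending)  # step 9: report to the main control subsystem
            started_at, pending = None, []
        else:
            pending.append(utterance)
    return reported


frames = [
    (0, "B", "Wait, voice control"),   # other party: ignored
    (1, "A", "Wait, voice control"),
    (2, "A", "increase volume"),
    (3, "A", "execute"),
    (10, "A", "Wait, voice control"),
    (11, "A", "decrease volume"),
    (30, "A", "execute"),              # too late: the command is dropped
]
print(run_recognition(frames, "A", "Wait, voice control", "execute", 5))
```

Step 10 would then map each reported command to its response operation; that mapping is left out here because it depends on how the device stores its configuration.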
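FIG. 5 shows an embodiment of the interaction between the subsystems of the device according to an embodiment of the present invention, including:
The main control subsystem handles processing and encoding of the signals, the various operations on the device, and unified management of the wireless signal transceiver, memory, voice signal transmitting, voice signal receiving, human-machine interface, and other subsystems.
The wireless signal transceiver subsystem sends and receives radio frequency signals and establishes and maintains the communication link.
The memory subsystem stores data such as the communication device's software configuration and various parameters.
The human-machine interface subsystem receives and handles the user's communication requests to the device.
The voice signal transmitting subsystem receives the voice signal produced by the user.
The voice signal receiving subsystem delivers the other party's voice signal.
The voice recognition control subsystem performs voiceprint setup, recognizes the voice commands issued by the user via the voice signal transmitting subsystem, and feeds the required response operation back to the main control subsystem.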
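This embodiment is described below with reference to an application scenario:
User A wants to hold a voice call with user B through a communication device. Before the call, user A's voice is sent through the voice signal transmitting subsystem to the voice recognition control subsystem to complete voiceprint setup, which locks user A as the master user. Then, guided by the voice recognition control subsystem, user A sets the "voice control command start marker" to "Wait, voice control"; sets one "in-call voice control command" to "increase volume", whose response operation is to increase the call volume; sets another "in-call voice control command" to "decrease volume", whose response operation is to decrease the call volume; and sets the "voice control command end marker" to "execute". When the call has been established and begins, the main control subsystem first reads the audio output configuration parameters from the memory to set the volume of the voice signal receiving subsystem.
During the call, the voice signal transmitting subsystem also sends the user's speech to the voice recognition control subsystem. When user A finds user B's voice too quiet to hear clearly, user A says: "Wait, voice control: increase volume, execute." The voice recognition control subsystem determines from the voiceprint information that user A is the master user. After detecting the start marker "Wait, voice control", it begins recognizing the "in-call voice control command"; after detecting the end marker "execute", it stops recognizing. In the process it recognizes the command "increase volume" and reports the corresponding response operation, increasing the call volume, to the main control subsystem. The main control subsystem adjusts the audio output configuration parameters and raises the call volume of the voice signal receiving subsystem, so that user B's received voice signal becomes louder.
In a quiet environment, when user A finds user B's voice too loud, for example because it would disturb other people, user A says: "Wait, voice control: decrease volume, execute." The voice recognition control subsystem determines from the voiceprint information that user A is the master user. After detecting the start marker "Wait, voice control", it begins recognizing the "in-call voice control command"; after detecting the end marker "execute", it stops recognizing. In the process it recognizes the command "decrease volume" and reports the corresponding response operation, decreasing the call volume, to the main control subsystem. The main control subsystem adjusts the audio output configuration parameters and lowers the call volume of the voice signal receiving subsystem, so that user B's received voice signal becomes quieter.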
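The scenario can be traced with a short sketch that pulls the command out of user A's sentence and applies it to the receive volume. It is a simplified illustration under assumptions: the sentence is taken as an already recognized English transcript of the configured phrases, the volume is an integer from 0 to 10 changed one step at a time, and `extract_command`, `adjust_receive_volume`, and `VOLUME_STEPS` are hypothetical names.

```python
def extract_command(transcript, start_marker, end_marker):
    """Return the text between the start and end markers, or None if either is missing."""
    start = transcript.find(start_marker)
    if start == -1:
        return None
    body_begin = start + len(start_marker)
    end = transcript.find(end_marker, body_begin)
    if end == -1:
        return None
    # Drop the punctuation and spaces that surround the command itself.
    return transcript[body_begin:end].strip(" ,:.")


# Hypothetical response operations: one volume step up or down.
VOLUME_STEPS = {"increase volume": +1, "decrease volume": -1}


def adjust_receive_volume(volume, transcript, lowest=0, highest=10):
    """Apply the command found in the transcript, keeping the volume in range."""
    command = extract_command(transcript, "Wait, voice control", "execute")
    step = VOLUME_STEPS.get(command, 0)  # unknown or missing command: no change
    return max(lowest, min(highest, volume + step))


print(adjust_receive_volume(5, "Wait, voice control: increase volume, execute."))
print(adjust_receive_volume(5, "Wait, voice control: decrease volume, execute."))
```

Ordinary conversation that lacks the start marker leaves the volume untouched, and clamping to the range keeps repeated commands from pushing the setting past its limits.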
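This embodiment detects and recognizes the user's voice control command, reports the required response operation to the main control subsystem, and has the main control subsystem adjust the audio output configuration parameters so that the receive volume is adjusted automatically, which ensures communication quality while improving the user experience. More generally, by detecting and recognizing the user's voice information during a call and automatically carrying out the control operation the user wants according to the recognition result, it both ensures communication quality and improves the user experience.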
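Embodiment 4
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for executing the following steps:
S1: during a call, detect whether first voice information appears;
S2: after the first voice information is acquired, collect second voice information;
S3: control the call according to the second voice information.
In one implementation, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
In one implementation, in this embodiment, the processor, according to the program code stored in the storage medium, detects during a call whether first voice information appears;
In one implementation, in this embodiment, the processor, according to the program code stored in the storage medium, collects second voice information after the first voice information is acquired;
In one implementation, in this embodiment, the processor, according to the program code stored in the storage medium, controls the call according to the second voice information.
In one implementation, for specific examples in this embodiment, refer to the examples described in the above embodiments and optional implementations; they are not repeated here.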
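In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a mobile storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a mobile storage device, ROM, RAM, a magnetic disk, or an optical disk.
The above is only a specific implementation of the present invention, but the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.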
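Industrial Applicability
In the technical solution of the embodiments of the present invention, during a call, it is detected whether first voice information appears; after the first voice information is acquired, second voice information is collected; and the call is controlled according to the second voice information. This can solve the technical problem in the related art that voice control cannot be performed during a call, and provides a better and more convenient communication experience.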

Claims (11)

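  1. A voice control method, comprising:
    during a call, detecting whether first voice information appears;
    after the first voice information is acquired, collecting second voice information; and
    controlling the call according to the second voice information.
  2. The method according to claim 1, wherein before detecting whether the first voice information appears, the method further comprises:
    collecting voiceprint information of a call user; and
    determining that the voiceprint information of the call user matches pre-stored voiceprint information.
  3. The method according to claim 1, wherein at least one of the following controls is applied to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a user-defined operation.
  4. The method according to claim 1, wherein after the second voice information is collected, the method further comprises:
    determining whether third voice information is received within a preset time; and
    when the determination result is yes, stopping collection of the second voice information.
  5. The method according to claim 4, wherein when the determination result is no, the method comprises:
    instructing to stop controlling the call according to the second voice information.
  6. A voice control device, comprising:
    a detection module, configured to detect, during a call, whether first voice information appears;
    a first collection module, configured to collect second voice information after the detection module acquires the first voice information; and
    a control module, configured to control the call according to the second voice information collected by the first collection module.
  7. The device according to claim 6, wherein the device further comprises:
    a second collection module, configured to collect voiceprint information of a call user before the detection module detects whether the first voice information appears; and
    a determination module, configured to determine that the voiceprint information of the call user matches pre-stored voiceprint information.
  8. The device according to claim 6, wherein the control module is configured to apply at least one of the following controls to the call according to the second voice information: adjusting the receive volume, starting recording, ending recording, adjusting the transmit volume, ending the current call, and a user-defined operation.
  9. The device according to claim 6, wherein the device further comprises:
    a judgment module, configured to determine, after the first collection module collects the second voice information, whether third voice information is received within a preset time; and
    a first processing module, configured to control the first collection module to stop collecting the second voice information when the determination result is yes.
  10. The device according to claim 9, wherein the device comprises:
    a second processing module, configured to instruct the control module to stop controlling the call according to the second voice information when the determination result of the judgment module is no.
  11. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the voice control method according to any one of claims 1 to 5.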
PCT/CN2016/105489 2016-08-24 2016-11-11 Voice control method, device, and computer storage medium WO2018035986A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/319,950 US20190228770A1 (en) 2016-08-24 2016-11-11 Voice control method, device, and computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610718871.6A CN107785013A (zh) 2016-08-24 2016-08-24 Voice control method and device
CN201610718871.6 2016-08-24

Publications (1)

Publication Number Publication Date
WO2018035986A1 true WO2018035986A1 (zh) 2018-03-01

Family

ID=61246335

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/105489 WO2018035986A1 (zh) 2016-08-24 2016-11-11 Voice control method, device, and computer storage medium

Country Status (3)

Country Link
US (1) US20190228770A1 (zh)
CN (1) CN107785013A (zh)
WO (1) WO2018035986A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109087645A (zh) * 2018-10-24 2018-12-25 科大讯飞股份有限公司 Decoding network generation method, apparatus, device, and readable storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109151564B (zh) * 2018-09-03 2021-06-29 海信视像科技股份有限公司 Microphone-based device control method and apparatus
CN109510891B (zh) * 2018-12-29 2020-12-01 深圳市趣创科技有限公司 Voice-controlled recording device and method
CN110136710A (zh) * 2019-04-29 2019-08-16 上海力声特医学科技有限公司 Cochlear implant control method
US10839060B1 (en) * 2019-08-27 2020-11-17 Capital One Services, Llc Techniques for multi-voice speech recognition commands
JP2021066199A (ja) * 2019-10-17 2021-04-30 本田技研工業株式会社 Control device
CN110992947B (zh) * 2019-11-12 2022-04-22 北京字节跳动网络技术有限公司 Voice-based interaction method, apparatus, medium, and electronic device
JP7338489B2 (ja) * 2020-01-23 2023-09-05 トヨタ自動車株式会社 Audio signal control device, audio signal control system, and audio signal control program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103179282A (zh) * 2013-03-26 2013-06-26 东莞宇龙通信科技有限公司 Method, system, and mobile terminal for conveying information in a call state
US20140171149A1 (en) * 2012-12-17 2014-06-19 Electronics And Telecommunications Research Institute Apparatus and method for controlling mobile device by conversation recognition, and apparatus for providing information by conversation recognition during meeting
CN105049632A (zh) * 2015-08-17 2015-11-11 联想(北京)有限公司 Call volume adjustment method and electronic device
CN105657165A (zh) * 2015-12-30 2016-06-08 广东欧珀移动通信有限公司 Call volume adjustment method and apparatus
CN105760154A (zh) * 2016-01-27 2016-07-13 广东欧珀移动通信有限公司 Audio control method and apparatus

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
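CN103259908B (zh) * 2012-02-15 2017-06-27 联想(北京)有限公司 Mobile terminal and intelligent control method thereof
CN102780815A (zh) * 2012-06-29 2012-11-14 宇龙计算机通信科技(深圳)有限公司 Method for automatically hanging up a call and communication terminal
EP2784774A1 (en) * 2013-03-29 2014-10-01 Orange Telephone voice personnal assistant
WO2015149359A1 (zh) * 2014-04-04 2015-10-08 华为终端有限公司 Method for automatically adjusting volume, volume adjustment apparatus, and electronic device
CN104301522A (zh) * 2014-09-19 2015-01-21 联想(北京)有限公司 Information input method during communication and communication terminal
CN105100455A (zh) * 2015-07-06 2015-11-25 珠海格力电器股份有限公司 Method and apparatus for answering incoming calls by voice control
CN105245707A (zh) * 2015-09-28 2016-01-13 努比亚技术有限公司 Mobile terminal and information processing method
CN105719646A (zh) * 2016-01-22 2016-06-29 史唯廷 Voice-controlled music playback method and voice-controlled music playback apparatus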

Also Published As

Publication number Publication date
CN107785013A (zh) 2018-03-09
US20190228770A1 (en) 2019-07-25

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16914006; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 16914006; Country of ref document: EP; Kind code of ref document: A1)
Kind code of ref document: A1