WO2017071645A1 - Voice control method, device and system (语音控制方法、装置及系统) - Google Patents

Voice control method, device and system

Info

Publication number
WO2017071645A1
WO2017071645A1 PCT/CN2016/103785 CN2016103785W WO2017071645A1 WO 2017071645 A1 WO2017071645 A1 WO 2017071645A1 CN 2016103785 W CN2016103785 W CN 2016103785W WO 2017071645 A1 WO2017071645 A1 WO 2017071645A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
data
smart device
smart
voice control
Prior art date
Application number
PCT/CN2016/103785
Other languages
English (en)
French (fr)
Inventor
彭和清
黎家力
阮亚平
李辉
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2017071645A1 publication Critical patent/WO2017071645A1/zh

Links

Images

Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 - Programme-control systems
    • G05B19/02 - Programme-control systems electric
    • G05B19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/28 - Constructional details of speech recognition systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 - Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • the present invention relates to the field of intelligent control technologies, and in particular, to a voice control method, apparatus, and system.
  • the embodiment of the invention provides a voice control method, device and system, which can at least improve the accuracy and convenience of voice control.
  • a voice control method is provided, which is applied to multiple smart devices in the same network and includes: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice; the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; and when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction.
  • Optionally, when at least two smart devices each receive the user voice through a voice interface and each obtain voice data parsed from the user voice, after the at least two smart devices each identify a voice control instruction by comparing the voice data with the data in their locally stored voice information lists, the method further includes: when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with the data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
  • the voice interfaces that satisfy the preset condition include: voice interfaces that have received the user voice, or voice interfaces that have received the user voice and whose data correlation is greater than a threshold.
  • the smart device receiving user voice through at least one voice interface and obtaining voice data parsed from the user voice includes: the smart device receives user voice through a local voice interface and parses voice data from the user voice; and/or, the smart device receives user voice through a remote voice interface and receives, from a remote voice server, voice data parsed from the user voice.
  • the method further includes: each smart device joins the network through an intelligent management terminal and synchronizes the updated voice information list from the intelligent management terminal.
  • the method further includes: each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network.
  • the voice feature parameter data includes device name voice feature parameter data and voice control feature parameter data.
  • a voice control device is further provided, which is applied to a smart device and includes: at least one voice interface, configured to receive user voice; a data acquiring unit, configured to obtain voice data parsed from the user voice; a voice recognition unit, configured to identify a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; and an instruction driving unit, configured to, when the confidence level of the recognized voice control instruction is higher than a preset threshold, control the smart device to be controlled to execute the control command according to the voice control instruction.
  • the voice interface includes a local voice interface and/or a remote voice interface, and the data acquiring unit includes a data parsing unit and/or a data receiving unit, where the data parsing unit is configured to parse the voice data from the user voice, and the data receiving unit is configured to receive, from a remote voice server, the voice data parsed from the user voice.
  • a voice control system is further provided, which includes at least two smart devices as described above, where, when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than a preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with the data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
  • the system further includes: an intelligent management terminal, configured to set up the network in which the at least two smart devices are located and to synchronize the updated voice information list to the at least two smart devices.
  • a storage medium is also provided.
  • the storage medium is arranged to store program code for performing the above-described voice control method.
  • the voice control method provided by the embodiments of the present invention is applied to multiple smart devices in the same network: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice; the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction.
  • In this way, remote voice control is performed by identifying, through the smart device voice interfaces, voice control instructions whose confidence is higher than the preset threshold, which improves the accuracy and convenience of controlling smart devices by voice from a distance. Moreover, the embodiments of the present invention are simple and practical to implement.
  • Optionally, each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network.
  • In this way, the smart device names in the management network are configured through the smart devices' voice interfaces to achieve voice-based addressing of the smart devices, and voice control of the smart devices is then performed on the basis of that addressing.
  • Optionally, when the confidence levels of the voice control instructions recognized by at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with the data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to that instruction.
  • In this way, smart devices are remotely controlled by voice through the voice interfaces of multiple smart devices, which improves the accuracy and convenience of controlling smart devices by voice from a distance.
  • FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a voice control apparatus according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a voice control system according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a smart device joining a voice management network according to an embodiment of the present invention
  • FIG. 5 is a flowchart of configuring voice information of a smart device according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of driving a voice control instruction according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention. As shown in FIG. 1 , the voice control method provided in this embodiment is applied to multiple smart devices in the same network, and includes the following steps:
  • Step 101 The at least one smart device receives the user voice through the at least one voice interface, and obtains voice data parsed from the user voice.
  • Specifically, step 101 includes: the smart device receives user voice through a local voice interface and parses voice data from the user voice; and/or,
  • the smart device receives user voice through a remote voice interface and receives, from a remote voice server, voice data parsed from the user voice.
  • Step 102 The smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list.
  • the voice information list includes at least an address of each smart device in the network, a device name recording, voice feature parameter data, and semantic data.
  • the voice feature parameter data includes, for example, device name voice feature parameter data and voice control feature parameter data.
  • the voice information list includes, for example, a Medium Access Control (MAC) address data packet, a device type data packet, a device name recording data packet, a device name voice feature parameter data packet, a voice control feature parameter data packet, a semantic parsing data packet, and a device status flag for each smart device in the network; a sketch of one possible shape of such an entry is given below.
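  • For illustration only, the following is a minimal Python sketch of what one entry of such a voice information list might look like; the field names and types are assumptions inferred from the data packets listed above, not the patent's actual data format.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceInfoEntry:
    """One smart device's record in the locally stored voice information list."""
    mac_address: str                          # MAC address data packet
    ip_address: str                           # resolved from the MAC address via ARP
    device_type: str                          # device type data packet
    name_recording: bytes                     # raw "device name" recording
    name_features: list[float]                # device name voice feature parameters
    control_features: dict[str, list[float]] = field(default_factory=dict)
    # ^ voice control feature parameters, keyed by control command
    semantics: str = ""                       # parsed semantic form of the device name
    online: bool = False                      # device status flag

# The network's voice information list is then simply a collection of entries,
# keyed here by device name for convenient lookup.
voice_info_list: dict[str, VoiceInfoEntry] = {}
```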
  • the voice data parsed from the user voice includes, for example, device name voice feature data, voice control feature data, and semantic parsing data.
  • the voice control instruction includes the name of the smart device to be controlled and a control command.
  • Here, on the basis of the user voice recording, the voice data parsed from the user voice, and the voice information list, a voice control instruction with a certain confidence level is identified according to a preset voice network algorithm.
  • the confidence level indicates the degree of credibility.
  • For example, for each smart device, voice parameters or semantics are parsed from the user voice by means of existing voice recognition technology, the obtained voice parameters or semantics are compared with the data in the voice information list according to a preset algorithm, and the data combination with the highest confidence is determined to obtain the voice control instruction (a rough sketch of this matching step follows). Thereafter, whether the recognized voice control instruction is executed is determined by comparing its confidence level with a preset threshold.
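  • As a rough illustration of the idea only (not the patent's algorithm), the sketch below scores the parsed features against every (device, command) pair in the list, building on the VoiceInfoEntry sketch above; the cosine-similarity scoring and the 0.8 default threshold are assumptions.

```python
import math

def cosine(a, b):
    """Simple similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recognize(parsed_name_feat, parsed_cmd_feat, voice_info_list, threshold=0.8):
    """Return (device_name, command, confidence) for the best match,
    or None when no combination clears the preset threshold."""
    best = None
    for name, entry in voice_info_list.items():
        name_score = cosine(parsed_name_feat, entry.name_features)
        for command, cmd_feat in entry.control_features.items():
            confidence = name_score * cosine(parsed_cmd_feat, cmd_feat)
            if best is None or confidence > best[2]:
                best = (name, command, confidence)
    return best if best and best[2] > threshold else None
```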
  • Step 103: when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction.
  • Specifically, when the confidence level of the recognized voice control instruction is higher than the preset threshold, the smart device determines, according to its local voice information list, the address of the smart device to be controlled that corresponds to the voice control instruction; after establishing a connection with the smart device to be controlled, it sends the voice control instruction to that device and thereby controls it to execute the control command (a sketch of this dispatch step follows).
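  • A hedged sketch of the dispatch step: the controlling device looks up the target's address in its local list (reusing the VoiceInfoEntry sketch above) and pushes the instruction over a TCP connection. The port number and the newline-terminated plain-text payload are invented for illustration; the patent only states that a connection is established and the control command is sent.

```python
import socket

CONTROL_PORT = 5683  # hypothetical port; the patent does not specify one

def drive_command(voice_info_list, device_name, command):
    """Send a recognized control command to the smart device that should execute it."""
    target = voice_info_list[device_name]      # address comes from the local list
    with socket.create_connection((target.ip_address, CONTROL_PORT), timeout=3) as conn:
        conn.sendall(f"{device_name}:{command}\n".encode("utf-8"))
```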
  • the method further includes:
  • when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with the data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
  • Here, the voice interfaces that satisfy the preset condition include: voice interfaces that have received the user voice, or voice interfaces that have received the user voice and whose data correlation is greater than a threshold.
  • the voice interface is a local microphone or a remote microphone of the smart device.
  • the method further includes: each smart device joins the network through an intelligent management terminal and synchronizes the updated voice information list from the intelligent management terminal.
  • the method further includes: each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network.
  • an embodiment of the present invention further provides a voice control device, which is applied to a smart device and includes: at least one voice interface, configured to receive user voice; a data acquiring unit, configured to obtain voice data parsed from the user voice; a voice recognition unit, configured to identify a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; and an instruction driving unit, configured to, when the confidence level of the recognized voice control instruction is higher than a preset threshold, control the smart device to be controlled to execute the control command according to the voice control instruction.
  • the voice interface includes a local voice interface and/or a remote voice interface, and the data acquiring unit includes a data parsing unit and/or a data receiving unit, where the data parsing unit is configured to parse the voice data from the user voice, and the data receiving unit is configured to receive, from a remote voice server, the voice data parsed from the user voice.
  • the voice interface is for example a microphone.
  • FIG. 2 is a schematic diagram of a voice control apparatus according to an embodiment of the present invention.
  • the voice control apparatus provided in this embodiment includes a voice interface (such as a local microphone or a remote microphone), a data acquisition unit, a voice recognition unit, and an instruction driving unit.
  • the data acquiring unit includes a data parsing unit and/or a data receiving unit; the data parsing unit is configured to parse the voice data from the user voice and is composed, for example, of a voice data storage unit, a voice feature parsing unit, and a semantic parsing unit.
  • the voice data storage unit is configured to store the user voice; the voice feature analysis unit is configured to parse the voice feature data and the voice manipulation feature data from the stored user voice; and the semantic analysis unit is configured to parse the semantics.
  • the data receiving unit is configured to receive voice data parsed from the user voice from a remote voice server.
  • the data parsing unit is, for example, disposed on the remote voice server, and the voice data parsed from the user voice is sent by the remote voice server to the smart device.
  • In practical applications, the data parsing unit and the voice recognition unit are, for example, components with information processing capability such as a processor; the instruction driving unit is, for example, a component with information sending capability such as a transmitter; and the data receiving unit is, for example, a component with information receiving capability such as a receiver.
  • the embodiments of the present invention are not limited thereto.
  • the functions of the data parsing unit and the speech recognition unit are implemented, for example, by a processor executing a program/instruction stored in the memory.
  • this embodiment further provides a voice control system, including at least two smart devices as described above, where, when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than a preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with the data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to that instruction.
  • the voice interface comprises a local microphone and/or a remote microphone.
  • the system further includes an intelligent management terminal, configured to set up the network in which the at least two smart devices are located and to synchronize the updated voice information list to the at least two smart devices.
  • FIG. 3 is a schematic diagram of a voice control system according to an embodiment of the present invention.
  • the voice control system provided in this embodiment includes, for example, an intelligent management terminal and smart devices A to D.
  • the embodiments of the present invention do not limit the number of smart devices, as long as it is greater than or equal to two.
  • the smart devices are connected to the intelligent management terminal, and to one another, for example by wireless or wired connections.
  • FIG. 4 is a flowchart of a smart device joining a voice management network according to an embodiment of the present invention. As shown in FIG. 4, the process of the smart device joining the voice management network in this embodiment includes the following steps:
  • Step 401: start the application (APP) of the intelligent management terminal to enter the main standby interface of the management system, click the "Add Device" button of the intelligent management terminal application or issue the voice command "Add Device", and scan the QR code on the housing of the smart device.
  • the QR code includes, for example, device type data, MAC address data, and the password of the smart device's own Wireless Fidelity (WIFI) hotspot, as shown in Table 1.
  • the intelligent management terminal then automatically joins the smart device's own WIFI hotspot; when a smart device powers on without being connected to a WIFI network, it initially exists as a WIFI hotspot;
  • Table 1: contents of the QR code on the housing of the smart device (the table itself is provided as an image in the original publication)
  • Step 402: the intelligent management terminal APP displays all WIFI networks within range; the official WIFI network that the smart device needs to join is selected, the smart device is added to the selected WIFI network, and the smart device's default voice management list information is obtained, where the voice management list is shown in Table 2.
  • Step 403: each time the APP of the intelligent management terminal adds a smart device, it resolves the Internet Protocol (IP) address via the Address Resolution Protocol (ARP) according to the MAC address of each smart device in the voice management information list, establishes a Transmission Control Protocol (TCP)/IP connection, and then synchronizes the voice information lists of all smart devices in the network, so that every device's list stays up to date and identical (a minimal address-lookup sketch follows).
  • FIG. 5 is a flowchart of configuring voice information of a smart device according to an embodiment of the present invention. As shown in FIG. 5, the process of configuring voice information of a smart device in this embodiment includes the following steps:
  • Step 501 The user performs recording through a local MIC of the smart device or a remote MIC;
  • Step 502: the smart device, through a local or remote voice data parsing unit, records the recording (such as storing the "device name" recording), performs feature value extraction (such as extracting the "device name" voice feature parameters and the voice control feature parameters), and performs semantic parsing (such as parsing the device name), and then stores the above data in the local voice information list;
  • Step 503: the smart device synchronizes the local voice information lists of all online smart devices through the network. For example, the voice configuration information newly added by the smart device is synchronized to all smart devices in the network, so that the voice information list of every smart device in the network stays up to date and identical (one possible synchronization sketch follows).
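  • A minimal sketch of one way the synchronization in step 503 could work, building on the VoiceInfoEntry sketch above: the device that just updated its record pushes it to every online peer over TCP. The JSON wire format and the port are assumptions, and the binary recording and feature vectors, which a full system would also serialize (e.g. base64-encoded), are omitted here.

```python
import json
import socket

SYNC_PORT = 5684  # hypothetical synchronization port, not specified by the patent

def sync_voice_info(device_name, entry, voice_info_list):
    """Push one updated record to every other online device so all lists stay identical."""
    payload = json.dumps({
        "name": device_name,
        "mac": entry.mac_address,
        "type": entry.device_type,
        "semantics": entry.semantics,
    }).encode("utf-8")
    for peer_name, peer in voice_info_list.items():
        if peer_name == device_name or not peer.online:
            continue
        with socket.create_connection((peer.ip_address, SYNC_PORT), timeout=3) as conn:
            conn.sendall(payload)
```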
  • FIG. 6 is a flowchart of network driving of a voice control instruction according to an embodiment of the present invention.
  • For a better explanation, the following assumptions are made first: in the same network environment and the same space, when the user performs voice control, at least one MIC of at least one smart device will pick up the sound; the names of the n smart devices configured by voice in the network are Name_1, Name_2, ..., Name_n; the names of the m microphones carried by the smart devices are MIC_1, MIC_2, ..., MIC_m (m ≥ n, where n is an integer greater than or equal to 1), and the relationship between smart devices and microphones is one-to-one or one-to-many; CONF(MIC_i, Name_j) denotes the confidence with which the i-th microphone in the smart device network recognizes a call to the j-th smart device; VAD(MIC_i) > 0 indicates that someone is speaking at the i-th microphone in the smart device network; and CORR(MIC_i, MIC_j) denotes the data correlation between the i-th microphone and the j-th microphone in the smart device network (rough sketches of VAD and CORR follow).
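  • For concreteness, a rough sketch of how VAD(MIC_i) and CORR(MIC_i, MIC_j) might be computed from short audio frames; the energy-based voice activity detection and the normalized cross-correlation used here are common generic choices, not techniques specified by the patent.

```python
import numpy as np

def vad(frame, energy_threshold=1e-3):
    """Crude voice-activity score: mean energy above a floor means someone is speaking."""
    energy = float(np.mean(np.asarray(frame, dtype=np.float64) ** 2))
    return energy if energy > energy_threshold else 0.0

def corr(frame_i, frame_j):
    """Peak of the normalized cross-correlation between two microphones' frames."""
    a = np.asarray(frame_i, dtype=np.float64)
    b = np.asarray(frame_j, dtype=np.float64)
    a -= a.mean()
    b -= b.mean()
    xc = np.correlate(a, b, mode="full")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.max(np.abs(xc)) / denom) if denom else 0.0
```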
  • the network driving process of the voice control instruction in this embodiment includes the following steps:
  • Step 601 The user sends a smart device control command by voice, and the MIC of the multiple smart devices in the network in the same space receives the user voice;
  • Step 602: each smart device that receives the user voice compares the data parsed from the user voice with the data in its voice information list; when a voice control instruction whose confidence exceeds the preset threshold is obtained, the smart device that recognized that instruction establishes a network connection with the corresponding smart device to be controlled and drives it to execute the control command carried by the instruction; for example, the voice control command is "turn on the living room main light";
  • Specifically, when VAD(MIC_i) > 0 and CONF(MIC_i, Name_j) > a preset threshold P (P < 1, for example 0.8), the speaker is closest to microphone i and the call to smart device Name_j recognized by microphone i is trusted, so smart device Name_i establishes a TCP/IP connection with smart device Name_j and drives the device control command in the command list of smart device Name_j (a small selection sketch follows);
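  • A sketch of the step 602/603 decision, with the CONF and VAD values treated as already computed; the 0.8 default mirrors the example threshold P above, and the dictionary-based bookkeeping is an assumption made for illustration.

```python
def select_driver(conf, vad, threshold_p=0.8):
    """Pick which (microphone, device) pair should drive the command.

    conf: dict mapping (mic, device_name) -> CONF(MIC_i, Name_j)
    vad:  dict mapping mic -> VAD(MIC_i)
    Returns the trusted (mic, device_name) pair, or None, meaning no single
    microphone was confident enough and the devices should fall back to
    forming a microphone array (step 603).
    """
    candidates = [(c, mic, name) for (mic, name), c in conf.items()
                  if vad.get(mic, 0) > 0 and c > threshold_p]
    if not candidates:
        return None
    _, mic, name = max(candidates)
    return mic, name
```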
  • Step 603: when the confidence levels of the voice control instructions obtained by the multiple smart devices are all lower than the preset threshold, all smart device MICs with voice input in the network are mobilized to form a MIC array, the sound source is localized, and a beam pointing at the sound source is formed, so that a voice control instruction whose confidence is higher than the preset threshold can then be obtained; any one of the multiple smart devices may establish a connection with the smart device to be controlled according to that instruction and control it to execute the corresponding control command.
  • the smart device that establishes a connection with the smart device to be controlled according to the voice control instruction is, for example, a smart device that recognizes a voice control command with a higher confidence than a preset threshold.
  • For example, when the speaker is not particularly close to any single microphone, the smart devices, coordinating via User Datagram Protocol (UDP) broadcast, automatically combine all microphones with VAD(MIC_i) > 0, VAD(MIC_j) > 0, and CORR(MIC_i, MIC_j) > a threshold C (C < 1, e.g. 0.5) into a microphone array, localize the sound source, and form a beam pointing at the sound source, which enhances the captured voice and improves the recognition rate; the beamformed, enhanced voice is then used as the input to voice recognition so as to identify an enhanced voice control instruction (a minimal beamforming sketch follows).
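  • By way of illustration only, a minimal delay-and-sum beamformer over the microphones that passed the VAD and correlation checks; the patent does not commit to a particular beamforming method, and estimating the inter-microphone delays by cross-correlation is an assumption of this sketch.

```python
import numpy as np

def delay_and_sum(frames):
    """Align each microphone's frame to the first one and average them,
    boosting the speech arriving from the dominant source direction."""
    ref = np.asarray(frames[0], dtype=np.float64)
    aligned = [ref]
    for frame in frames[1:]:
        x = np.asarray(frame, dtype=np.float64)
        xc = np.correlate(x, ref, mode="full")
        delay = int(np.argmax(xc)) - (len(ref) - 1)   # samples by which x lags ref
        aligned.append(np.roll(x, -delay))
    return np.mean(aligned, axis=0)                   # the "enhanced voice"
```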
  • In summary, in the embodiments of the present invention, the smart device names in the management network are configured through the voice interfaces of the smart devices to achieve voice-based addressing of the smart devices, and the smart devices are remotely controlled by voice through the voice interfaces of multiple smart devices, which improves the accuracy and convenience of controlling smart devices by voice from a distance.
  • the method according to the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, which is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
  • each of the above modules may be implemented by software or hardware.
  • For the latter, this may be implemented in, but is not limited to, the following ways: the above modules are all located in the same processor, or the above modules are located in different processors in any combination.
  • the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented with program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from the one described here; or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • the voice control method, device, and system provided by the embodiments of the present invention have the following beneficial effects: remote voice control is performed by identifying, through the smart device voice interfaces, voice control instructions whose confidence is higher than a preset threshold, which improves the accuracy and convenience of controlling smart devices by voice from a distance; moreover, the implementation is simple and practical.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Manufacturing & Machinery (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Selective Calling Equipment (AREA)

Abstract

A voice control method, device, and system, applied to multiple smart devices in the same network, including: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice (101); the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list (102), where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction (103). The voice control method, device, and system can improve the accuracy and convenience of voice control.

Description

Voice control method, device, and system
Technical field
The present invention relates to the field of intelligent control technologies, and in particular to a voice control method, device, and system.
Background
With the growing need for convenience in work, home life, and travel, and with the development of intelligent management and control technologies, organizations, households, and individuals own and manage more and more smart devices, and the convenience of human-computer interaction makes the experience offered by smart devices increasingly suited to people's work and daily life. A smart device's microphone (MIC) and speaker are like a person's ears and mouth, used for listening and speaking. Apart from the eyes, the organs humans use most to interact with the real world are the ears and mouth; that is, most people perceive the world by speaking with the mouth and listening with the ears, and these two organs are the most basic tools with which human society understands nature and transforms the world. Most existing smart devices are already equipped with a MIC and a speaker. Although the related art can implement remote control by voice, the performance of existing solutions in long-distance voice control still needs to be improved.
Summary of the invention
Embodiments of the present invention provide a voice control method, device, and system, which can at least improve the accuracy and convenience of voice control.
According to an embodiment of the present invention, a voice control method is provided, applied to multiple smart devices in the same network, including: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice; the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction.
Optionally, when at least two smart devices each receive the user voice through a voice interface and each obtain voice data parsed from the user voice, after the at least two smart devices each identify a voice control instruction by comparing the voice data with data in their locally stored voice information lists, the method further includes: when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
Optionally, the voice interfaces that satisfy the preset condition include: voice interfaces that have received the user voice, or voice interfaces that have received the user voice and whose data correlation is greater than a threshold.
Optionally, the smart device receiving user voice through at least one voice interface and obtaining voice data parsed from the user voice includes:
the smart device receives user voice through a local voice interface and parses voice data from the user voice; and/or,
the smart device receives user voice through a remote voice interface and receives, from a remote voice server, voice data parsed from the user voice.
Optionally, the method further includes: each smart device joins the network through an intelligent management terminal and synchronizes the updated voice information list from the intelligent management terminal.
Optionally, the method further includes: each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network.
Optionally, the voice feature parameter data includes device name voice feature parameter data and voice control feature parameter data.
According to another embodiment of the present invention, a voice control device is further provided, applied to a smart device, including: at least one voice interface, configured to receive user voice; a data acquiring unit, configured to obtain voice data parsed from the user voice; a voice recognition unit, configured to identify a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; and an instruction driving unit, configured to, when the confidence level of the recognized voice control instruction is higher than a preset threshold, control the smart device to be controlled to execute the control command according to the voice control instruction.
Optionally, the voice interface includes a local voice interface and/or a remote voice interface, and the data acquiring unit includes a data parsing unit and/or a data receiving unit, where the data parsing unit is configured to parse voice data from the user voice, and the data receiving unit is configured to receive, from a remote voice server, voice data parsed from the user voice.
According to another embodiment of the present invention, a voice control system is further provided, including: at least two smart devices as described above, where, when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than a preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
Optionally, the system further includes: an intelligent management terminal, configured to set up the network in which the at least two smart devices are located and to synchronize the updated voice information list to the at least two smart devices.
According to yet another embodiment of the present invention, a storage medium is further provided. The storage medium is configured to store program code for performing the above voice control method.
The voice control method provided by the embodiments of the present invention is applied to multiple smart devices in the same network: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice; the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction. In this way, remote voice control is performed by identifying, through the smart device voice interfaces, voice control instructions whose confidence is higher than the preset threshold, which improves the accuracy and convenience of controlling smart devices by voice from a distance. Moreover, the embodiments of the present invention are simple and practical to implement.
Optionally, in the embodiments of the present invention, each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network. In this way, the smart device names in the management network are configured through the smart devices' voice interfaces to achieve voice-based addressing of the smart devices, and voice control of the smart devices is then performed on the basis of that addressing.
Optionally, when the confidence levels of the voice control instructions recognized by at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold. In this way, smart devices are remotely controlled by voice through the voice interfaces of multiple smart devices, which improves the accuracy and convenience of controlling smart devices by voice from a distance.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present invention and constitute a part of this application. The exemplary embodiments of the present invention and their description are used to explain the present invention and do not constitute an improper limitation of the present invention. In the drawings:
FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a voice control device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a voice control system according to an embodiment of the present invention;
FIG. 4 is a flowchart of a smart device joining a voice management network in an embodiment of the present invention;
FIG. 5 is a flowchart of configuring the voice information of a smart device in an embodiment of the present invention;
FIG. 6 is a flowchart of driving a voice control instruction in an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with each other.
It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
FIG. 1 is a flowchart of a voice control method according to an embodiment of the present invention. As shown in FIG. 1, the voice control method provided in this embodiment is applied to multiple smart devices in the same network and includes the following steps.
Step 101: at least one smart device receives user voice through at least one voice interface and obtains voice data parsed from the user voice.
Specifically, step 101 includes:
the smart device receives user voice through a local voice interface and parses voice data from the user voice; and/or,
the smart device receives user voice through a remote voice interface and receives, from a remote voice server, voice data parsed from the user voice.
Step 102: the smart device identifies a voice control instruction by comparing the voice data with data in a locally stored voice information list.
Here, the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network. The voice feature parameter data includes, for example, device name voice feature parameter data and voice control feature parameter data. In an embodiment, the voice information list includes, for example, a Medium Access Control (MAC) address data packet, a device type data packet, a device name recording data packet, a device name voice feature parameter data packet, a voice control feature parameter data packet, a semantic parsing data packet, and a device status flag for each smart device in the network.
The voice data parsed from the user voice includes, for example, device name voice feature data, voice control feature data, and semantic parsing data. The voice control instruction includes the name of the smart device to be controlled and a control command.
Here, on the basis of the user voice recording, the voice data parsed from the user voice, and the voice information list, a voice control instruction with a certain confidence level is identified according to a preset voice network algorithm, where the confidence level indicates the degree of credibility. For example, for each smart device, voice parameters or semantics are parsed from the user voice by means of existing voice recognition technology, the obtained voice parameters or semantics are compared with the data in the voice information list according to a preset algorithm, and the data combination with the highest confidence is determined to obtain the voice control instruction. Then, whether the recognized voice control instruction is executed is determined by comparing the confidence level with a preset threshold.
Step 103: when the confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controls the smart device to be controlled to execute the control command according to the voice control instruction.
Specifically, when the confidence level of the voice control instruction recognized by the smart device is higher than the preset threshold, the smart device determines, according to its local voice information list, the address of the smart device to be controlled that corresponds to the voice control instruction; after establishing a connection with the smart device to be controlled, it sends the voice control instruction to that device and, through the voice control instruction, controls it to execute the control command.
Further, when at least two smart devices each receive the user voice through a voice interface and each obtain voice data parsed from the user voice, after the at least two smart devices each identify a voice control instruction by comparing the voice data with data in their locally stored voice information lists, the method further includes:
when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold.
Here, the voice interfaces that satisfy the preset condition include: voice interfaces that have received the user voice, or voice interfaces that have received the user voice and whose data correlation is greater than a threshold. The voice interface is a local microphone or a remote microphone of the smart device.
Further, the method also includes: each smart device joins the network through an intelligent management terminal and synchronizes the updated voice information list from the intelligent management terminal.
Further, the method also includes: each smart device records a device name recording, parses the device name recording to obtain corresponding voice feature parameter data and semantic data, stores the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizes the updated voice information list to the other smart devices in the network.
In addition, an embodiment of the present invention further provides a voice control device, applied to a smart device, including: at least one voice interface, configured to receive user voice; a data acquiring unit, configured to obtain voice data parsed from the user voice; a voice recognition unit, configured to identify a voice control instruction by comparing the voice data with data in a locally stored voice information list, where the voice information list includes at least the address, device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction includes the name of the smart device to be controlled and a control command; and an instruction driving unit, configured to, when the confidence level of the recognized voice control instruction is higher than a preset threshold, control the smart device to be controlled to execute the control command according to the voice control instruction.
Here, the voice interface includes a local voice interface and/or a remote voice interface, and the data acquiring unit includes a data parsing unit and/or a data receiving unit, where the data parsing unit is configured to parse voice data from the user voice, and the data receiving unit is configured to receive, from a remote voice server, voice data parsed from the user voice. The voice interface is, for example, a microphone.
FIG. 2 is a schematic diagram of a voice control device according to an embodiment of the present invention. As shown in FIG. 2, the voice control device provided in this embodiment includes a voice interface (such as a local microphone or a remote microphone), a data acquiring unit, a voice recognition unit, and an instruction driving unit. The data acquiring unit includes a data parsing unit and/or a data receiving unit. The data parsing unit is configured to parse voice data from the user voice and is composed, for example, of a voice data storage unit, a voice feature parsing unit, and a semantic parsing unit. Specifically, the voice data storage unit is configured to store the user voice; the voice feature parsing unit is configured to parse voice feature data and voice control feature data from the stored user voice; and the semantic parsing unit is configured to parse out the semantics. The data receiving unit is configured to receive, from a remote voice server, voice data parsed from the user voice. However, this embodiment is not limited to this. When a remote voice interface is used to receive the user voice, the data parsing unit is, for example, arranged on the remote voice server, and the remote voice server sends the voice data parsed from the user voice to the smart device.
In practical applications, the data parsing unit and the voice recognition unit are, for example, components with information processing capability such as a processor; the instruction driving unit is, for example, a component with information sending capability such as a transmitter; and the data receiving unit is, for example, a component with information receiving capability such as a receiver. However, the embodiments of the present invention are not limited to this. The functions of the data parsing unit and the voice recognition unit are implemented, for example, by a processor executing programs/instructions stored in a memory.
In addition, this embodiment further provides a voice control system, including at least two smart devices as described above, where, when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than a preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of the voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls the smart device to be controlled to execute the corresponding control command according to the enhanced voice control instruction whose confidence is higher than the preset threshold. The voice interface includes a local microphone and/or a remote microphone.
Further, the system also includes an intelligent management terminal, configured to set up the network in which the at least two smart devices are located and to synchronize the updated voice information list to the at least two smart devices.
FIG. 3 is a schematic diagram of a voice control system according to an embodiment of the present invention. As shown in FIG. 3, the voice control system provided in this embodiment includes, for example, an intelligent management terminal and smart devices A to D. However, the embodiments of the present invention do not limit the number of smart devices, as long as it is greater than or equal to two. The smart devices are connected to the intelligent management terminal, and to one another, for example by wireless or wired connections.
The embodiments of the present invention are described in detail below.
FIG. 4 is a flowchart of a smart device joining a voice management network in an embodiment of the present invention. As shown in FIG. 4, the process of a smart device joining the voice management network in this embodiment includes the following steps.
Step 401: start the application (APP) of the intelligent management terminal to enter the main standby interface of the management system, click the "Add Device" button of the intelligent management terminal application or issue the voice command "Add Device", and scan the QR code on the housing of the smart device. The QR code includes, for example, device type data, MAC address data, and the password of the smart device's own Wireless Fidelity (WIFI) hotspot, as shown in Table 1. The intelligent management terminal automatically joins the smart device's own WIFI; when a smart device powers on without being connected to a WIFI network, it initially exists as a WIFI hotspot.
Table 1: Contents of the QR code on the housing of the smart device
(The contents of Table 1 are provided as an image in the original publication.)
Step 402: the intelligent management terminal APP displays all WIFI networks within range; the official WIFI network that the smart device needs to join is selected, the smart device is added to the selected WIFI network, and the smart device's default voice management list information is obtained, where the voice management list is shown in Table 2.
Table 2: Voice management list
(The contents of Table 2 are provided as an image in the original publication.)
Step 403: each time the APP of the intelligent management terminal adds a smart device, it resolves the Internet Protocol (IP) address via the Address Resolution Protocol (ARP) according to the MAC address of each smart device in the voice management information list, establishes a Transmission Control Protocol (TCP)/IP connection, and then synchronizes the voice information lists of all smart devices in the network. For example, the information of the newly added smart device is synchronized to all smart devices in the network, to ensure that the voice information list of every smart device in the network stays up to date and identical.
FIG. 5 is a flowchart of configuring the voice information of a smart device in an embodiment of the present invention. As shown in FIG. 5, the process of configuring the voice information of a smart device in this embodiment includes the following steps.
Step 501: the user makes a recording through a local MIC of the smart device or a remote MIC;
Step 502: the smart device, through a local or remote voice data parsing unit, records the recording (for example, storing the "device name" recording), performs feature value extraction (for example, extracting the "device name" voice feature parameters and the voice control feature parameters), and performs semantic parsing (for example, parsing the device name), and stores the above data in the local voice information list;
Step 503: the smart device synchronizes the local voice information lists of all online smart devices through the network. For example, the voice configuration information newly added by the smart device is synchronized to all smart devices in the network, to ensure that the voice information list of every smart device in the network stays up to date and identical.
FIG. 6 is a flowchart of the network-driven execution of a voice control instruction in an embodiment of the present invention. For a better explanation, the following assumptions are made first: in the same network environment and the same space, when the user performs voice control, at least one MIC of at least one smart device will pick up the sound; the names of the n smart devices configured by voice in the network are Name_1, Name_2, ..., Name_n; the names of the m microphones carried by the smart devices are MIC_1, MIC_2, ..., MIC_m (m ≥ n, where n is an integer greater than or equal to 1), and the relationship between smart devices and microphones is one-to-one or one-to-many; CONF(MIC_i, Name_j) denotes the confidence with which the i-th microphone in the smart device network recognizes a call to the j-th smart device; VAD(MIC_i) > 0 indicates that someone is speaking at the i-th microphone in the smart device network; CORR(MIC_i, MIC_j) denotes the data correlation between the i-th microphone and the j-th microphone in the smart device network.
As shown in FIG. 6, the network-driven process of a voice control instruction in this embodiment includes the following steps.
Step 601: the user issues a smart device control instruction by voice, and the MICs of multiple smart devices in the network in the same space receive the user voice;
Step 602: each smart device that receives the user voice compares the data parsed from the user voice with the data in its voice information list; when a voice control instruction whose confidence exceeds the preset threshold is obtained, the smart device that recognized the voice control instruction establishes a network connection with the smart device to be controlled that corresponds to the voice control instruction and drives the smart device to be controlled to execute the control command carried by the voice control instruction; for example, the voice control command is "turn on the living room main light";
Specifically, when VAD(MIC_i) > 0 and CONF(MIC_i, Name_j) > a preset threshold P (P < 1, for example 0.8), the speaker is closest to microphone i of the smart device and the call to smart device Name_j recognized by microphone i is trusted, so smart device Name_i establishes a TCP/IP connection with smart device Name_j and drives the device control command in the command list of smart device Name_j;
Step 603: when the confidence levels of the voice control instructions obtained by the multiple smart devices are all lower than the preset threshold, all smart device MICs with voice input in the network are mobilized to form a MIC array, the sound source is localized, and a beam pointing at the sound source is formed, so that a voice control instruction whose confidence is higher than the preset threshold is then formed to drive the corresponding smart device operation; any one of the multiple smart devices may establish a connection with the smart device to be controlled according to the voice control instruction and then control it to execute the corresponding control command. However, the embodiments of the present invention are not limited to this. The smart device that establishes the connection with the smart device to be controlled according to the voice control instruction is, for example, the smart device that recognized a voice control instruction with a confidence higher than the preset threshold.
For example, when the speaker is not particularly close to any single microphone, the smart devices, via User Datagram Protocol (UDP) broadcast, automatically combine all microphones with VAD(MIC_i) > 0, VAD(MIC_j) > 0, and CORR(MIC_i, MIC_j) > a threshold C (C < 1, for example 0.5) into a microphone array, localize the sound source, and form a beam pointing at the sound source, which enhances the captured voice and improves the recognition rate; the beamformed, enhanced voice is then used as the input to voice recognition so as to identify an enhanced voice control instruction.
In summary, in the embodiments of the present invention, the names of the smart devices in the management network are configured through the voice interfaces of the smart devices to achieve voice-based addressing of the smart devices, and the smart devices are remotely controlled by voice through the voice interfaces of multiple smart devices, thereby improving the accuracy and convenience of controlling smart devices by voice from a distance. Moreover, the solution of the embodiments of the present invention is simple and practical to implement.
From the description of the above embodiments, a person skilled in the art can clearly understand that the method according to the above embodiments may be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
It should be noted that each of the above modules may be implemented by software or hardware. For the latter, this may be implemented in, but is not limited to, the following ways: the above modules are all located in the same processor, or the above modules are located in different processors in any combination.
Obviously, a person skilled in the art should understand that the above modules or steps of the present invention may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases the steps shown or described may be performed in an order different from that described here, or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. In this way, the present invention is not limited to any specific combination of hardware and software.
The above has shown and described the basic principles, main features, and advantages of the present invention. The present invention is not limited by the above embodiments; the above embodiments and the description merely illustrate the principles of the present invention, and various changes and improvements may be made to the present invention without departing from the spirit and scope of the present invention, all of which fall within the scope of the claimed invention.
Industrial applicability
As described above, the voice control method, device, and system provided by the embodiments of the present invention have the following beneficial effects: remote voice control is performed by identifying, through smart device voice interfaces, voice control instructions whose confidence is higher than a preset threshold, which improves the accuracy and convenience of controlling smart devices by voice from a distance. Moreover, the implementation is simple and practical.

Claims (12)

  1. A voice control method, applied to multiple smart devices in the same network, comprising:
    at least one smart device receiving user voice through at least one voice interface and obtaining voice data parsed from the user voice;
    the smart device identifying a voice control instruction by comparing the voice data with data in a locally stored voice information list, wherein the voice information list comprises at least: an address, a device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction comprises a name of a smart device to be controlled and a control command;
    when a confidence level of the voice control instruction recognized by the smart device is higher than a preset threshold, the smart device controlling, according to the voice control instruction, the smart device to be controlled to execute the control command.
  2. The method according to claim 1, wherein, when at least two smart devices each receive the user voice through a voice interface and each obtain voice data parsed from the user voice, after the at least two smart devices each identify a voice control instruction by comparing the voice data with data in their locally stored voice information lists, the method further comprises: when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than the preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; and when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls, according to the enhanced voice control instruction whose confidence is higher than the preset threshold, the smart device to be controlled to execute the corresponding control command.
  3. The method according to claim 2, wherein the voice interfaces that satisfy the preset condition comprise: voice interfaces that have received the user voice, or voice interfaces that have received the user voice and whose data correlation is greater than a threshold.
  4. The method according to claim 1 or 2, wherein the smart device receiving user voice through at least one voice interface and obtaining voice data parsed from the user voice comprises:
    the smart device receiving user voice through a local voice interface and parsing voice data from the user voice; and/or,
    the smart device receiving user voice through a remote voice interface and receiving, from a remote voice server, voice data parsed from the user voice.
  5. The method according to claim 1, further comprising: each smart device joining the network through an intelligent management terminal and synchronizing the updated voice information list from the intelligent management terminal.
  6. The method according to claim 1, further comprising: each smart device recording a device name recording, parsing the device name recording to obtain corresponding voice feature parameter data and semantic data, storing the device name recording, voice feature parameter data, and semantic data in its local voice information list, and synchronizing the updated voice information list to the other smart devices in the network.
  7. The method according to claim 1, 2, or 6, wherein the voice feature parameter data comprises device name voice feature parameter data and voice control feature parameter data.
  8. A voice control device, applied to a smart device, comprising:
    at least one voice interface, configured to receive user voice;
    a data acquiring unit, configured to obtain voice data parsed from the user voice;
    a voice recognition unit, configured to identify a voice control instruction by comparing the voice data with data in a locally stored voice information list, wherein the voice information list comprises at least: an address, a device name recording, voice feature parameter data, and semantic data of each smart device in the network, and the voice control instruction comprises a name of a smart device to be controlled and a control command;
    an instruction driving unit, configured to, when a confidence level of the recognized voice control instruction is higher than a preset threshold, control, according to the voice control instruction, the smart device to be controlled to execute the control command.
  9. The device according to claim 8, wherein the voice interface comprises a local voice interface and/or a remote voice interface, and the data acquiring unit comprises a data parsing unit and/or a data receiving unit, wherein the data parsing unit is configured to parse voice data from the user voice, and the data receiving unit is configured to receive, from a remote voice server, voice data parsed from the user voice.
  10. A voice control system, comprising: at least two smart devices according to any one of claims 8 to 9, wherein, when the confidence levels of the voice control instructions recognized by the at least two smart devices are all lower than a preset threshold, the at least two smart devices obtain enhanced voice through a voice interface array composed of voice interfaces that satisfy a preset condition, and identify enhanced voice control instructions by comparing the enhanced voice data parsed from the enhanced voice with data in the locally stored voice information lists; and when the confidence level of one of the enhanced voice control instructions is higher than the preset threshold, one of the at least two smart devices controls, according to the enhanced voice control instruction whose confidence is higher than the preset threshold, the smart device to be controlled to execute the corresponding control command.
  11. The system according to claim 10, further comprising: an intelligent management terminal, configured to set up the network in which the at least two smart devices are located and to synchronize the updated voice information list to the at least two smart devices.
  12. A computer storage medium, configured to store a computer program for performing the voice control method according to any one of claims 1 to 7.
PCT/CN2016/103785 2015-10-28 2016-10-28 语音控制方法、装置及系统 WO2017071645A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510715912.1 2015-10-28
CN201510715912.1A CN106653008B (zh) 2015-10-28 2015-10-28 一种语音控制方法、装置及系统

Publications (1)

Publication Number Publication Date
WO2017071645A1 true WO2017071645A1 (zh) 2017-05-04

Family

ID=58629910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/103785 WO2017071645A1 (zh) 2015-10-28 2016-10-28 语音控制方法、装置及系统

Country Status (2)

Country Link
CN (1) CN106653008B (zh)
WO (1) WO2017071645A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019061382A1 (zh) * 2017-09-30 2019-04-04 陈银芳 基于智能音箱的家电语音控制方法及相关产品
CN109658937A (zh) * 2017-10-10 2019-04-19 苏州宝时得电动工具有限公司 智能割草机的语音控制方法、装置、系统和智能割草机
CN111739533A (zh) * 2020-07-28 2020-10-02 睿住科技有限公司 语音控制系统、方法与装置以及存储介质、语音设备
CN111782992A (zh) * 2020-09-04 2020-10-16 北京维数统计事务所有限公司 显示控制方法、装置、设备及可读存储介质
CN112331212A (zh) * 2020-10-27 2021-02-05 合肥飞尔智能科技有限公司 一种智能设备语音控制系统及方法
CN114678022A (zh) * 2022-03-25 2022-06-28 青岛海尔科技有限公司 终端设备的语音控制方法和装置、存储介质及电子设备

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107564518B (zh) * 2017-08-21 2021-10-22 百度在线网络技术(北京)有限公司 智能设备控制方法、装置及计算机设备
CN107577151A (zh) * 2017-08-25 2018-01-12 谢锋 一种语音识别的方法、装置、设备和存储介质
CN107766482B (zh) * 2017-10-13 2021-12-14 北京猎户星空科技有限公司 信息推送及发送方法、装置、电子设备、存储介质
CN107908116B (zh) * 2017-10-20 2021-05-11 深圳市艾特智能科技有限公司 语音控制方法、智能家居系统、存储介质和计算机设备
CN108170034B (zh) * 2017-12-29 2021-06-08 上海器魂智能科技有限公司 智能设备控制方法、装置、计算机设备和储存介质
TWI673673B (zh) * 2018-01-05 2019-10-01 華南商業銀行股份有限公司 智能語音交易系統
CN108183844B (zh) * 2018-02-06 2020-09-08 四川虹美智能科技有限公司 一种智能家电语音控制方法、装置及系统
CN108630201B (zh) * 2018-03-07 2020-09-29 北京墨丘科技有限公司 一种用于建立设备关联的方法和装置
US10755706B2 (en) * 2018-03-26 2020-08-25 Midea Group Co., Ltd. Voice-based user interface with dynamically switchable endpoints
CN109978170B (zh) * 2019-03-05 2020-04-28 浙江邦盛科技有限公司 一种基于多要素的移动设备识别方法
CN113012699B (zh) * 2021-05-07 2024-01-23 宇博科创(深圳)科技有限公司 基于离线语音的红外线遥控开关方法及系统
CN116095254B (zh) * 2022-05-30 2023-10-20 荣耀终端有限公司 音频处理方法和装置

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1307231A (zh) * 2000-02-02 2001-08-08 邦毅科技股份有限公司 用以操控设备的集中式语音识别遥控方法及系统
US20060047513A1 (en) * 2004-09-02 2006-03-02 Inventec Multimedia & Telecom Corporation Voice-activated remote control system and method
US20060229881A1 (en) * 2005-04-11 2006-10-12 Global Target Enterprise Inc. Voice recognition gateway apparatus
TW200912731A (en) * 2007-09-07 2009-03-16 Compal Communications Inc Voice control system and method
CN101599270A (zh) * 2008-06-02 2009-12-09 海尔集团公司 语音服务器及语音控制的方法
CN102255780A (zh) * 2010-05-20 2011-11-23 株式会社曙飞电子 家庭网络系统及其控制方法
CN102855872A (zh) * 2012-09-07 2013-01-02 深圳市信利康电子有限公司 基于终端及互联网语音交互的家电控制方法及系统
CN104885406A (zh) * 2012-12-18 2015-09-02 三星电子株式会社 用于在家庭网络系统中远程控制家庭设备的方法和设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7885816B2 (en) * 2003-12-08 2011-02-08 International Business Machines Corporation Efficient presentation of correction options in a speech interface based upon user selection probability
EP1581026B1 (en) * 2004-03-17 2015-11-11 Nuance Communications, Inc. Method for detecting and reducing noise from a microphone array
CN102760432B (zh) * 2012-07-06 2015-08-19 广东美的制冷设备有限公司 一种家电用声控遥控器及其控制方法
CN103700368B (zh) * 2014-01-13 2017-01-18 联想(北京)有限公司 用于语音识别的方法、语音识别装置和电子设备

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1307231A (zh) * 2000-02-02 2001-08-08 邦毅科技股份有限公司 用以操控设备的集中式语音识别遥控方法及系统
US20060047513A1 (en) * 2004-09-02 2006-03-02 Inventec Multimedia & Telecom Corporation Voice-activated remote control system and method
US20060229881A1 (en) * 2005-04-11 2006-10-12 Global Target Enterprise Inc. Voice recognition gateway apparatus
TW200912731A (en) * 2007-09-07 2009-03-16 Compal Communications Inc Voice control system and method
CN101599270A (zh) * 2008-06-02 2009-12-09 海尔集团公司 语音服务器及语音控制的方法
CN102255780A (zh) * 2010-05-20 2011-11-23 株式会社曙飞电子 家庭网络系统及其控制方法
CN102855872A (zh) * 2012-09-07 2013-01-02 深圳市信利康电子有限公司 基于终端及互联网语音交互的家电控制方法及系统
CN104885406A (zh) * 2012-12-18 2015-09-02 三星电子株式会社 用于在家庭网络系统中远程控制家庭设备的方法和设备

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019061382A1 (zh) * 2017-09-30 2019-04-04 陈银芳 基于智能音箱的家电语音控制方法及相关产品
CN109658937A (zh) * 2017-10-10 2019-04-19 苏州宝时得电动工具有限公司 智能割草机的语音控制方法、装置、系统和智能割草机
CN111739533A (zh) * 2020-07-28 2020-10-02 睿住科技有限公司 语音控制系统、方法与装置以及存储介质、语音设备
CN111782992A (zh) * 2020-09-04 2020-10-16 北京维数统计事务所有限公司 显示控制方法、装置、设备及可读存储介质
CN112331212A (zh) * 2020-10-27 2021-02-05 合肥飞尔智能科技有限公司 一种智能设备语音控制系统及方法
CN114678022A (zh) * 2022-03-25 2022-06-28 青岛海尔科技有限公司 终端设备的语音控制方法和装置、存储介质及电子设备

Also Published As

Publication number Publication date
CN106653008B (zh) 2021-02-02
CN106653008A (zh) 2017-05-10

Similar Documents

Publication Publication Date Title
WO2017071645A1 (zh) 语音控制方法、装置及系统
CN111989741B (zh) 具有动态可切换端点的基于语音的用户接口
JP6739907B2 (ja) 機器特定方法、機器特定装置及びプログラム
JP6516585B2 (ja) 制御装置、その方法及びプログラム
JP6902136B2 (ja) システムの制御方法、システム、及びプログラム
US20200286482A1 (en) Processing voice commands based on device topology
US11354089B2 (en) System and method for dialog interaction in distributed automation systems
US20210090567A1 (en) Method and apparatus for managing voice-based interaction in internet of things network system
US20220286317A1 (en) Apparatus, system and method for directing voice input in a controlling device
US9996316B2 (en) Mediation of wakeword response for multiple devices
JP6752870B2 (ja) 複数のウェイクワードを利用して人工知能機器を制御する方法およびシステム
KR20220024557A (ko) 자동화된 어시스턴트에 의한 응답 액션을 트리거하기 위한 핫 명령의 검출 및/또는 등록
CN104935615B (zh) 实现语音控制家电设备的系统及方法
CN112272819A (zh) 被动唤醒用户交互设备的方法和系统
US11586413B2 (en) Synchronous sounds for audio assistant on devices
US11057664B1 (en) Learning multi-device controller with personalized voice control
US10236016B1 (en) Peripheral-based selection of audio sources
WO2013071738A1 (zh) 一种个人专用生活协助装置和方法
CN112700770A (zh) 语音控制方法、音箱设备、计算设备和存储介质
CN114999496A (zh) 音频传输方法、控制设备及终端设备
JP2019184679A (ja) ネットワークシステム、サーバ、および情報処理方法
JP2019537071A (ja) 分散したマイクロホンからの音声の処理
CN108630201B (zh) 一种用于建立设备关联的方法和装置
CN111048081B (zh) 一种控制方法、装置、电子设备及控制系统
WO2019202852A1 (ja) 情報処理システム、クライアント装置、情報処理方法及び情報処理プログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16859080

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16859080

Country of ref document: EP

Kind code of ref document: A1