WO2019056700A1 - Control method, apparatus, terminal device, and medium for a voice dialogue robot - Google Patents

Control method, apparatus, terminal device, and medium for a voice dialogue robot

Info

Publication number
WO2019056700A1
WO2019056700A1 (PCT/CN2018/077043)
Authority
WO
WIPO (PCT)
Prior art keywords
voice
voice information
dialogue robot
information
voice dialogue
Prior art date
Application number
PCT/CN2018/077043
Other languages
English (en)
French (fr)
Inventor
黄伟淦
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2019056700A1 publication Critical patent/WO2019056700A1/zh

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00Manipulators not otherwise provided for
    • B25J11/0005Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present application belongs to the field of artificial intelligence technologies, and in particular, to a method, a device, a terminal device and a medium for controlling a voice dialogue robot.
  • voice dialogue robots on the market usually operate only independently. When multiple voice dialogue robots exist in the same environment but are far apart from one another, a user who needs to control several of them at the same time must walk to the position of each robot before performing voice control separately. This leads to relatively low control efficiency of the voice dialogue robot.
  • the embodiment of the present application provides a method, a device, a terminal device, and a medium for controlling a voice dialogue robot, so as to solve the problem that the control efficiency of the voice dialogue robot in the prior art is relatively low.
  • a first aspect of the embodiments of the present application provides a method for controlling a voice dialogue robot, including:
  • broadcasting a robot search signal, and, when response information based on the robot search signal is received, extracting an identification code of the voice dialogue robot from the response information;
  • establishing a connection with the voice dialogue robot based on the identification code;
  • acquiring first voice information sent by a user, and determining a control mode of the first voice information;
  • if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  • a second aspect of the embodiments of the present application provides a control device for a voice dialogue robot, the control device comprising means for performing the control method of the voice dialogue robot described in the above first aspect.
  • a third aspect of the embodiments of the present application provides a terminal device, including a memory and a processor, where the memory stores computer readable instructions executable on the processor, and the processor, when executing the computer readable instructions, implements the steps of the control method of the voice dialogue robot described in the first aspect.
  • a fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of the control method of the voice dialogue robot described in the first aspect.
  • by broadcasting a robot search signal, each voice dialogue robot within the signal search range can be detected, so the identification codes of voice dialogue robots located at positions different from the local voice dialogue robot can be acquired automatically, and a communication connection with each remote voice dialogue robot can be established. By determining the control mode of the voice information, it is ensured that, when the control mode of the voice information sent by the user is the broadcast mode, the voice information is synchronized to every connected voice dialogue robot.
  • this enables the user to perform voice control, based on a single voice message, on a plurality of voice dialogue robots located far apart. The user no longer needs to walk to the position of each voice dialogue robot to perform voice control; therefore, the embodiments of the present application improve the control efficiency of the voice dialogue robot.
  • FIG. 1 is a flowchart of an implementation of a method for controlling a voice dialogue robot according to an embodiment of the present application
  • FIG. 2 is a flowchart of a specific implementation of step S103 of the control method of a voice dialogue robot provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of an implementation of a method for controlling a voice dialogue robot according to another embodiment of the present application
  • FIG. 4 is a flowchart of an implementation of a method for controlling a voice dialogue robot according to another embodiment of the present application.
  • FIG. 5 is a flowchart of a specific implementation of a method for controlling a voice dialogue robot provided by an embodiment of the present application
  • FIG. 6 is a structural block diagram of a control apparatus of a voice dialogue robot according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • FIG. 1 shows an implementation flow of a control method of a voice dialogue robot provided by an embodiment of the present application, and the method flow includes steps S101 to S104.
  • the specific implementation principles of each step are as follows:
  • S101 Broadcast a robot search signal, and when receiving response information based on the robot search signal, extract an identification code of the voice dialogue robot from the response information.
  • in this embodiment, the execution subject of each step is a voice dialogue robot; this robot is referred to as the local voice dialogue robot to distinguish it from the remote voice dialogue robots.
  • the robot search signal is continuously issued.
  • a voice dialogue robot within the signal propagation range that receives the search signal issues response information based on the robot search signal. Since each such robot is at a position different from the local voice dialogue robot, each voice dialogue robot within the signal propagation range is called a remote voice dialogue robot.
  • the response message sent by the remote voice dialogue robot includes the identification code of the voice dialogue robot.
  • the identification code is used to uniquely identify a voice dialogue robot.
  • the identification code may be, for example, a PIN (Personal Identification Number) code.
  • the response information further includes a device name of the voice dialogue robot.
  • the device name is the name of the voice dialogue robot, which is preset by the manufacturer at the factory, or can be customized by the user.
  • the identification code and the device name included in the same response information are stored in one record of the data table to determine the correspondence between each identification code and its device name.
  • the identification code of the local voice dialogue robot and the device name are also stored in a record of the data table, and the record is marked as a local record.
  • S102 Establish a connection with the voice dialogue robot based on the identification code.
  • the local voice dialogue robot automatically pairs with each remote voice dialogue robot based on the received individual identification codes, and issues a link establishment request to each remote voice dialogue robot. After the link is successfully established, two-way data communication or voice communication can be performed between the local voice dialogue robot and the remote voice dialogue robot.
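The discovery and pairing flow of S101–S102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the transport, class names, and PIN-style identification codes are assumptions; only the table structure (one record per identification code and device name, with the local record marked) follows the text above.

```python
# Hypothetical sketch of S101-S102: the local robot broadcasts a search
# signal, collects responses carrying (identification code, device name),
# stores each pair as one record of a data table, and "connects" to every
# discovered remote robot based on its identification code.

class LocalRobot:
    def __init__(self, identification_code, device_name):
        # The local robot's own identification code and device name are
        # stored as a record marked as the local record.
        self.table = [{"id": identification_code, "name": device_name, "local": True}]
        self.connected = set()

    def on_response(self, response):
        # Each response carries the remote robot's identification code and
        # device name; store them together in one record of the data table.
        self.table.append({"id": response["id"], "name": response["name"], "local": False})

    def connect_all(self):
        # Pair with each remote robot based on its received identification code.
        for record in self.table:
            if not record["local"]:
                self.connected.add(record["id"])
        return self.connected

local = LocalRobot("PIN-0001", "Alice")
for resp in [{"id": "PIN-0002", "name": "Bob"}, {"id": "PIN-0003", "name": "Colly"}]:
    local.on_response(resp)
print(sorted(local.connect_all()))  # ['PIN-0002', 'PIN-0003']
```

After this step, the data table drives both the mode decision of S103 (device-name matching) and the multicast lookup of S105.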
  • S103 Acquire first voice information sent by the user, and determine a control mode of the first voice information.
  • voice information will be sent to the local voice dialogue robot where the user is currently located.
  • the voice information sent by the user and received by the local voice dialogue robot may be, for example, "play the third song in the song list".
  • control mode indicates how the local voice dialogue robot synchronizes voice information.
  • Control modes include stand-alone mode, multicast mode, and broadcast mode.
  • the stand-alone mode indicates that the local voice dialogue robot does not need to synchronize the voice information, that is, the control command matched with the voice information only needs to be executed by the local voice dialogue robot.
  • the multicast mode indicates that the local voice dialogue robot needs to synchronize the voice information to one or more remote voice dialogue robots.
  • the broadcast mode indicates that the local voice dialogue robot needs to synchronize the voice information to each remote voice dialogue robot that has been connected at the current time.
  • the control mode of the voice information can be preset by the user in the parameter information of the local voice dialogue robot.
  • the control mode of the voice information of the local voice dialogue robot is uniformly set to the broadcast mode.
  • the foregoing S103 specifically includes:
  • S1031 Parse the first voice information sent by the user to obtain keywords in the first voice information.
  • when the voice information sent by the user is received, the voice information is parsed by using a preset voice recognition algorithm.
  • the speech recognition process includes: framing the voice information by a preset frame length and frame shift to obtain M frame waveforms (M is an integer greater than zero); extracting the acoustic features of each frame waveform, such as MFCC (Mel-Frequency Cepstral Coefficients) features, to obtain an N-dimensional vector corresponding to each frame waveform; feeding each vector into a preset acoustic model, such as a hidden Markov model, to output the probability that the frame waveform corresponds to each state; and determining the state with the highest probability as the state corresponding to the frame waveform.
  • the recognized states are combined into text, and the text is segmented into words; each resulting word segment is a keyword of the voice information.
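The framing step above can be sketched in a few lines. This is only the first stage of the pipeline; real MFCC extraction and HMM scoring are omitted, and the sample values are illustrative.

```python
# Minimal sketch of the framing step: split a sampled waveform into M
# overlapping frames using a preset frame length and frame shift. A real
# recognizer would then compute MFCC features per frame and score them
# with an acoustic model such as an HMM.

def frame_signal(samples, frame_length, frame_shift):
    """Return the list of frames covering `samples`."""
    frames = []
    start = 0
    while start + frame_length <= len(samples):
        frames.append(samples[start:start + frame_length])
        start += frame_shift
    return frames

# 10 samples, frame length 4, shift 2 -> frames start at samples 0, 2, 4, 6
frames = frame_signal(list(range(10)), frame_length=4, frame_shift=2)
print(len(frames))  # 4
print(frames[1])    # [2, 3, 4, 5]
```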
  • the data table stores the device names of the remote voice dialogue robots. Each keyword obtained at the current time is compared with each device name in the data table to determine whether any keyword is the same as any device name in the data table.
  • if every keyword differs from every device name in the data table, the control mode of the voice information sent by the user is determined to be the broadcast mode.
  • determining the control mode per voice message in this way avoids requiring the user to uniformly set one control mode for all voice information in the parameter information of the local voice dialogue robot, which improves the flexibility of setting the control mode; based on these judgment rules, the user can issue voice information under different control modes, which improves the control flexibility of the voice dialogue robot.
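The mode-decision rules described here and in the later multicast discussion can be condensed into one function. A hedged sketch, assuming text keywords and a plain list of device names; the rule set mirrors the text (remote name → multicast, local name only → stand-alone, no name match → broadcast):

```python
# Sketch of the S103 control-mode decision: compare the parsed keywords
# against the device names in the data table.

def decide_control_mode(keywords, device_names, local_name):
    """Return 'multicast', 'standalone', or 'broadcast' per the matching rules."""
    remote_names = {name for name in device_names if name != local_name}
    if any(k in remote_names for k in keywords):
        return "multicast"   # a keyword names a remote robot
    if local_name in keywords:
        return "standalone"  # only the local device name is carried
    return "broadcast"       # no keyword matches any device name

device_names = ["Alice", "Bob", "Colly"]
print(decide_control_mode(["Bob", "play"], device_names, "Alice"))    # multicast
print(decide_control_mode(["Alice", "play"], device_names, "Alice"))  # standalone
print(decide_control_mode(["play", "song"], device_names, "Alice"))   # broadcast
```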
  • S104: If the control mode is the broadcast mode, synchronize the first voice information to the voice dialogue robots associated with the identification codes, so that each voice dialogue robot executes a control instruction matching the first voice information.
  • specifically, each stored identification code is read, and the voice information is synchronously transmitted to the remote voice dialogue robot associated with each identification code, so that every voice dialogue robot that receives the voice information can execute a control command matching it.
  • each remote voice dialogue robot that receives the voice information can in turn synchronize it to other remote voice dialogue robots based on the above steps S101 to S104, thereby expanding the propagation range of the voice information and realizing synchronous control of voice dialogue robots distributed over a wide range.
  • the user can perform voice control on a plurality of voice dialogue robots that are far apart from each other based on a piece of voice information sent by the user.
  • the user does not need to go to the position of each voice dialogue robot to perform voice control; therefore, the embodiments of the present application improve the control efficiency of the voice dialogue robot.
  • in another embodiment, the manner of synchronizing the voice information when the control mode is the multicast mode is further specified. As shown in FIG. 3, after the above S104, the method further includes:
  • S105: If the control mode is the multicast mode, search, in the data table storing the correspondence between identification codes and device names, for the identification code corresponding to the device name carried by the first voice information, where the correspondence between the identification code and the device name is obtained from the response information.
  • if a keyword is the same as the local device name only, the control mode of the voice information is the stand-alone mode; if a keyword is the same as a device name in the data table other than the local device name, the control mode of the voice information is determined to be the multicast mode, and the voice information sent by the user is determined to carry that device name.
  • when the control mode of the voice information is the multicast mode, the identification code corresponding to the device name carried by the voice information is read from the data table.
  • S106 Synchronize the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot performs a control instruction that matches the first voice information.
  • among the remote voice dialogue robots connected at the current time, the voice dialogue robots associated with the above identification codes are selected, and the voice information sent by the user is synchronized only to the selected voice dialogue robots.
  • the embodiment of the present application is applicable to a scenario in which a user needs to control several specified voice dialogue robots in an area. For example, five voice dialogue robots named Alice, Bob, Colly, Doggy, and Ella are distributed in the current area, and the user is currently located beside the robot named Alice.
  • the identification code corresponding to the device name is found by recognizing the device name carried by the voice information, and the voice information sent by the user is synchronized to each voice dialogue robot associated with that identification code.
  • by issuing voice information carrying different device names, the user can accurately and remotely control the designated voice dialogue robots, instead of broadcasting the voice information to all connected voice dialogue robots whenever a remote voice dialogue robot needs to be notified. Effective control of the voice dialogue robots is thereby achieved, and the transmission of invalid information is avoided.
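The multicast lookup of S105–S106 can be sketched as a simple name-to-code mapping over the data table built during discovery. The record layout and PIN-style codes are illustrative assumptions:

```python
# Sketch of S105-S106: in multicast mode, map each device name carried by
# the voice information to its identification code, then synchronize only
# to the robots associated with the found codes.

def lookup_ids(carried_names, table):
    """Return the identification codes for the carried device names."""
    name_to_id = {record["name"]: record["id"] for record in table}
    return [name_to_id[name] for name in carried_names if name in name_to_id]

table = [
    {"id": "PIN-0002", "name": "Bob"},
    {"id": "PIN-0003", "name": "Colly"},
    {"id": "PIN-0004", "name": "Doggy"},
]
print(lookup_ids(["Bob", "Doggy"], table))  # ['PIN-0002', 'PIN-0004']
```

Unknown device names are simply skipped here; the text does not specify the error behavior, so that choice is an assumption.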
  • control method of the voice dialogue robot further includes:
  • S107 Determine a function type of the second voice information if the second voice information synchronized by the voice dialogue robot is received.
  • the function type of voice information refers to the function realized by the voice dialogue robot after executing the control command matching the voice information.
  • the function types of voice information include, but are not limited to, timed reminders, music playback, and question answering.
  • for example, if the voice information instructs the robot to send a reminder when a preset time arrives, the function type of the voice information is a timed reminder.
  • upon receiving voice information synchronized by a remote voice dialogue robot, the local voice dialogue robot parses the voice information to determine its function type.
  • if the function type of the voice information is a timed reminder, the time information contained in the voice information is the reminding time corresponding to the voice information. When the system time of the local voice dialogue robot reaches the reminding time, the local voice dialogue robot detects its real-time position distance from the user.
  • the position distance may be detected as follows: based on a locator carried by the user, the position information reported by the locator is acquired in real time to determine the geographic location of the user; the distance between that location and the position of the local voice dialogue robot is calculated; and the calculated distance is determined as the position distance between the local voice dialogue robot and the user at the current time.
  • if the position distance is less than a preset threshold, the local voice dialogue robot issues prompt information so that the user can receive it.
  • the prompt information includes but is not limited to an audio prompt and a blinking prompt.
  • the local voice dialogue robot activates a built-in camera to scan a face existing in the imaging area.
  • the maximum imaging range of the camera is taken as the above preset threshold: if a face is detected within the maximum imaging range, it is determined that the distance between the user and the local voice dialogue robot is less than the preset threshold, and prompt information is issued.
  • further, the facial features of the detected face are compared with the preset facial features of the user to determine whether the person currently within the imaging range is the owner of the voice dialogue robot. If yes, it is determined that the distance between the user and the local voice dialogue robot is less than the preset threshold, and prompt information is issued; if not, it is determined that the distance is greater than the preset threshold, and no prompt information is issued.
  • after voice information of the timed-reminder type is received, whether the user is located near the local voice dialogue robot is determined at the reminding time by checking in real time whether the position distance between the user and the local voice dialogue robot is less than the preset threshold. If the user is not near the local voice dialogue robot, it is difficult for the user to receive the prompt information it sends. Therefore, the prompt information is issued only when the position distance is less than the preset threshold, which achieves a more effective prompting effect and also avoids simultaneous prompts from the multiple voice dialogue robots that receive the voice information, reducing their energy consumption. In addition, by recognizing the facial features of the detected face, the voice dialogue robot can accurately prompt its owner, improving the accuracy of the prompt.
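The locator-based branch of the reminder check reduces to a distance comparison. A minimal sketch, assuming plane coordinates stand in for the real positioning data; the threshold value is illustrative:

```python
# Sketch of the timed-reminder check: at the reminding time, compute the
# distance between the user's reported location and the robot's position,
# and issue the prompt only when it is below the preset threshold.
import math

def should_prompt(user_pos, robot_pos, threshold):
    """True when the user is within `threshold` of the robot."""
    distance = math.dist(user_pos, robot_pos)  # Euclidean distance
    return distance < threshold

print(should_prompt((1.0, 2.0), (4.0, 6.0), threshold=10.0))  # True (distance 5.0)
print(should_prompt((1.0, 2.0), (4.0, 6.0), threshold=3.0))   # False
```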
  • the foregoing S106 specifically includes:
  • S1061: Acquire the local device name, that is, the device name pre-stored by the local voice dialogue robot.
  • S1062 Delete, in the first voice information, a voice segment that includes the name of the local device.
  • specifically, the voice information sent by the user is recognized, and the voice segment containing the local device name is located; that segment is then deleted, so that the voice information sent by the user no longer carries the local device name.
  • S1063: Synchronize the first voice information, after the voice segment is deleted, to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information after the voice segment is deleted.
  • the remote voice dialogue robots to be synchronized are determined.
  • the voice information that no longer carries the name of the local device is sent to each voice dialogue robot that needs to be synchronized.
  • when the remote voice dialogue robots receive the synchronized voice information, they execute the above S101 to S106; that is, according to each device name carried by the voice information, they synchronize it to the voice dialogue robot corresponding to that device name. Because the voice segment containing the local device name has been deleted from the voice information, no remote voice dialogue robot will parse out the local device name when it receives the synchronized voice information, so the voice information is no longer synchronized back to its source, which improves the synchronization efficiency of the information.
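The name-stripping of S1061–S1063 can be sketched on text (standing in for the audio segment deletion the patent describes). A simplifying assumption here is that the device name appears as a whole word:

```python
# Sketch of S1061-S1063: remove the local device name from the voice
# information before synchronizing, so downstream robots never parse the
# source's name and cannot echo the message back to it.

def strip_local_name(message, local_name):
    """Drop every word equal to the local device name from the message."""
    return " ".join(word for word in message.split() if word != local_name)

msg = "Alice Bob play the third song"
print(strip_local_name(msg, "Alice"))  # Bob play the third song
```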
  • in another embodiment, when voice information synchronized by a voice dialogue robot is received, a built-in timer is started. If further synchronized voice information is received before the timing value reaches a first preset threshold, the similarity between the two pieces of voice information is calculated. If the similarity is greater than a second preset threshold, they are determined to be the same voice information actually sent by the user; at this time, the piece of voice information with the strongest signal strength is selected, and a control command matching that voice information is executed.
  • because multiple voice dialogue robots may synchronize the same voice information in broadcast mode or multicast mode, any remote voice dialogue robot may receive several pieces of voice information with different signal strengths but identical content. Selecting only one of them prevents the voice dialogue robot from repeatedly executing the same control command, and since the selected piece has the strongest signal strength, the recognition accuracy is improved when identifying the control command matching the voice information.
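The de-duplication rule above can be sketched for two messages arriving within the timing window. The patent does not specify the similarity measure, so `SequenceMatcher` is an assumed stand-in, and the signal values are illustrative dBm figures:

```python
# Sketch of the duplicate-filtering rule: if two synchronized messages are
# similar enough, treat them as the same utterance and keep only the copy
# with the strongest signal strength.
from difflib import SequenceMatcher

def pick_message(msg_a, msg_b, threshold=0.8):
    """Return the stronger-signal copy when the two messages match, else None."""
    similarity = SequenceMatcher(None, msg_a["text"], msg_b["text"]).ratio()
    if similarity > threshold:
        # Same utterance: keep the copy with the strongest signal.
        return max(msg_a, msg_b, key=lambda m: m["signal"])
    return None  # different utterances: both should be executed

a = {"text": "play the third song", "signal": -40}
b = {"text": "play the third song", "signal": -55}
print(pick_message(a, b))  # {'text': 'play the third song', 'signal': -40}
```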
  • FIG. 6 is a structural block diagram of the control apparatus of the voice dialogue robot provided by an embodiment of the present application. For convenience of description, only the parts related to the embodiment of the present application are shown.
  • the apparatus includes: a broadcasting unit 61, configured to broadcast a robot search signal, and, upon receiving response information based on the robot search signal, extract an identification code of a voice dialogue robot from the response information;
  • the connection unit 62 is configured to establish a connection with the voice dialogue robot based on the identification code;
  • the acquiring unit 63 is configured to acquire first voice information sent by the user, and determine a control mode of the first voice information;
  • the synchronization unit 64 is configured to synchronize the first voice information to the voice dialogue robot associated with the identification code if the control mode is a broadcast mode, so that the voice dialogue robot performs the A control instruction that matches a voice message.
  • in one embodiment, the broadcast unit 61 includes a broadcast subunit, configured to extract the identification code and the device name of the voice dialogue robot from the response information, and store the identification code and the device name in a preset data table.
  • the acquiring unit 63 includes: a parsing subunit, configured to parse the first voice information sent by the user to obtain keywords in the first voice information; and a determining subunit, configured to determine that the control mode of the first voice information is the broadcast mode if the keywords are different from each of the device names stored in the data table.
  • the control device of the voice dialogue robot further includes: a searching unit, configured to, if the control mode is the multicast mode, search the data table storing the correspondence between identification codes and device names for the identification code corresponding to the device name carried by the first voice information, where the correspondence is obtained from the response information; and a second synchronization unit, configured to synchronize the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  • the control device of the voice dialogue robot further includes: a determining unit, configured to determine the function type of the second voice information if second voice information synchronized by a voice dialogue robot is received; a detecting unit, configured to detect the position distance from the user when the reminding time corresponding to the second voice information arrives, if the function type is a timed reminder; and a prompting unit, configured to issue prompt information if the position distance is less than a preset threshold.
  • the second synchronization unit includes: an acquisition subunit, configured to acquire the local device name; a deletion subunit, configured to delete, from the first voice information, the voice segment containing the local device name; and a synchronization subunit, configured to synchronize the first voice information, after the voice segment is deleted, to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information after the voice segment is deleted.
  • the control device of the voice dialogue robot further includes: a timing unit, configured to control a built-in timer to start timing if third voice information synchronized by a voice dialogue robot is received; a calculating unit, configured to calculate the similarity between the third voice information and fourth voice information if the fourth voice information synchronized by a voice dialogue robot is received before the timing value reaches a first preset threshold; and a subunit configured to, if the similarity is greater than a second preset threshold, determine, from the third voice information and the fourth voice information, the piece of voice information with the stronger signal strength, so as to execute a control instruction matching that voice information.
  • FIG. 7 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • the terminal device 7 of this embodiment includes a processor 70 and a memory 71, in which computer readable instructions 72 operable on the processor 70 are stored, for example, a control program of a voice dialogue robot.
  • when the processor 70 executes the computer readable instructions 72, the steps in the above embodiments of the control method of the voice dialogue robot are implemented, such as steps S101 to S104 shown in FIG. 1.
  • the processor 70 when executing the computer readable instructions 72, implements the functions of the various modules/units in the various apparatus embodiments described above, such as the functions of the units 61-64 shown in FIG.
  • the computer readable instructions 72 may be partitioned into one or more modules/units that are stored in the memory 71 and executed by the processor 70, To complete this application.
  • the one or more modules/units may be a series of computer readable instruction segments capable of performing a particular function for describing the execution of the computer readable instructions 72 in the terminal device 7.
  • the terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the terminal device may include, but is not limited to, the processor 70 and the memory 71. It will be understood by those skilled in the art that FIG. 7 is only an example of the terminal device 7 and does not constitute a limitation on it; the terminal device may include more or fewer components than those illustrated, or combine some components, or have different components.
  • the terminal device may further include an input/output device, a network access device, a bus, and the like.
  • the processor 70 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general purpose processor may be a microprocessor or the processor or any conventional processor or the like.
  • the memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7.
  • the memory 71 may also be an external storage device of the terminal device 7, for example, a plug-in hard disk, a smart memory card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the terminal device 7.
  • the memory 71 is configured to store the computer readable instructions and other programs and data required by the terminal device.
  • the memory 71 can also be used to temporarily store data that has been output or is about to be output.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the computer readable storage medium includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes: a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, and the like, which can store program codes. .

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Manipulator (AREA)

Abstract

This disclosure provides a control method, an apparatus, a terminal device, and a medium for a voice dialogue robot, applicable to the field of artificial intelligence. The method includes: broadcasting a robot search signal and, upon receiving response information based on the robot search signal, extracting from the response information the identification code of a remote voice dialogue robot; establishing a connection with the voice dialogue robot based on the identification code; acquiring first voice information uttered by a user and determining a control mode of the first voice information; and, if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information. This scheme enables a user to voice-control, with a single piece of voice information, multiple voice dialogue robots located far apart from one another, without having to walk to the location of each robot before issuing voice control, thereby improving control efficiency.

Description

Control method, apparatus, terminal device, and medium for a voice dialogue robot
This application claims priority to Chinese patent application No. 201710864661.2, filed with the Chinese Patent Office on September 22, 2017 and entitled "Control method and terminal device for a voice dialogue robot", the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the field of artificial intelligence technologies, and in particular relates to a control method, an apparatus, a terminal device, and a medium for a voice dialogue robot.
Background
The development of voice dialogue robots has brought great convenience to people's lives. However, voice dialogue robots on the market usually operate only independently. When multiple voice dialogue robots exist in the same environment but are located far apart from one another, a user who needs to control several of them at once can issue voice control only after walking to the location of each robot in turn. This makes control of voice dialogue robots inefficient.
Technical Problem
In view of this, embodiments of the present application provide a control method, an apparatus, a terminal device, and a medium for a voice dialogue robot, to solve the problem in the prior art that control of voice dialogue robots is inefficient.
Technical Solution
A first aspect of the embodiments of the present application provides a control method for a voice dialogue robot, including:
broadcasting a robot search signal, and, upon receiving response information based on the robot search signal, extracting an identification code of a voice dialogue robot from the response information;
establishing a connection with the voice dialogue robot based on the identification code;
acquiring first voice information uttered by a user, and determining a control mode of the first voice information;
if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
A second aspect of the embodiments of the present application provides a control apparatus for a voice dialogue robot, the apparatus including units for performing the control method described in the first aspect.
A third aspect of the embodiments of the present application provides a terminal device, including a memory and a processor, where the memory stores computer readable instructions executable on the processor, and the processor, when executing the computer readable instructions, implements the steps of the control method described in the first aspect.
A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of the control method described in the first aspect.
Beneficial Effects
In the embodiments of the present application, broadcasting a robot search signal makes it possible to detect every voice dialogue robot within the search range, so that the identification codes of the voice dialogue robots located at positions different from the local voice dialogue robot can be acquired automatically and communication connections with the remote voice dialogue robots can be established. By determining the control mode of voice information, it is ensured that, when the control mode of voice information uttered by the user is the broadcast mode, the voice information can be synchronized to every connected voice dialogue robot, so that the user can voice-control multiple far-apart voice dialogue robots at the same time with a single utterance. The user no longer needs to walk to the location of each robot before issuing voice control; the embodiments therefore improve the control efficiency of voice dialogue robots.
Brief Description of the Drawings
FIG. 1 is a flowchart of an implementation of the control method for a voice dialogue robot provided by an embodiment of the present application;
FIG. 2 is a flowchart of a specific implementation of step S103 of the control method provided by an embodiment of the present application;
FIG. 3 is a flowchart of an implementation of the control method for a voice dialogue robot provided by another embodiment of the present application;
FIG. 4 is a flowchart of an implementation of the control method for a voice dialogue robot provided by yet another embodiment of the present application;
FIG. 5 is a flowchart of a specific implementation of step S106 of the control method provided by an embodiment of the present application;
FIG. 6 is a structural block diagram of the control apparatus for a voice dialogue robot provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the terminal device provided by an embodiment of the present application.
Embodiments of the Invention
To illustrate the technical solutions described in the present application, specific embodiments are described below.
FIG. 1 shows the implementation flow of the control method for a voice dialogue robot provided by an embodiment of the present application; the flow includes steps S101 to S104, whose specific implementation principles are as follows:
S101: Broadcast a robot search signal, and, upon receiving response information based on the robot search signal, extract an identification code of a voice dialogue robot from the response information.
In the embodiments of the present application, each step is performed by a voice dialogue robot, referred to as the local voice dialogue robot to distinguish it from the remote voice dialogue robots.
While running, the local voice dialogue robot continuously emits a robot search signal. When a voice dialogue robot within the signal propagation range detects the search signal, it sends out response information based on that signal. Because the voice dialogue robots within the propagation range are located at positions different from that of the local robot, they are referred to as remote voice dialogue robots.
The response information sent by a remote voice dialogue robot contains that robot's identification code. The identification code uniquely identifies one voice dialogue robot; it may, for example, be a PIN (Personal Identification Number) code.
Preferably, the response information also contains the device name of the voice dialogue robot. The device name is the robot's name, preset by the manufacturer at the factory or customized by the user.
The identification code and device name contained in the same response information are stored in one record of a data table, establishing the correspondence between each identification code and its device name. The identification code and device name of the local voice dialogue robot are also stored in a record of the data table, and that record is marked as the local record.
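By way of non-limiting illustration, the data table of records described above can be sketched as follows; the class name RobotRegistry and the PIN strings are illustrative and not part of the application:

```python
# Non-limiting sketch of the data table; class and variable names are illustrative.
class RobotRegistry:
    def __init__(self, local_id, local_name):
        # One record per robot: identification code -> device name.
        self.records = {local_id: local_name}
        self.local_id = local_id  # this record is marked as the local record

    def add_response(self, ident_code, device_name):
        """Store the identification code and device name carried by one response."""
        self.records[ident_code] = device_name

    def find_id_by_name(self, device_name):
        """Return the identification code recorded for a device name, or None."""
        for ident_code, name in self.records.items():
            if name == device_name:
                return ident_code
        return None

registry = RobotRegistry("PIN-0001", "Alice")
registry.add_response("PIN-0002", "Bob")
registry.add_response("PIN-0003", "Ella")
print(registry.find_id_by_name("Bob"))  # PIN-0002
```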
S102: Establish a connection with the voice dialogue robot based on the identification code.
Based on each received identification code, the local voice dialogue robot automatically pairs with each remote voice dialogue robot and sends it a link establishment request. Once the link is established, bidirectional data or voice communication can take place between the local robot and the remote robots.
S103: Acquire first voice information uttered by a user, and determine a control mode of the first voice information.
When the user needs to control a voice dialogue robot, the user utters voice information to the local voice dialogue robot at the user's current location. The voice information received by the local robot may be, for example, "Play the third song in the playlist".
In the embodiments of the present application, different voice information corresponds to different control modes. The control mode indicates how the local voice dialogue robot synchronizes the voice information, and includes a standalone mode, a multicast mode, and a broadcast mode.
In the standalone mode, the local voice dialogue robot does not synchronize the voice information; that is, the control instruction matching the voice information is executed only by the local robot.
In the multicast mode, the local voice dialogue robot synchronizes the voice information to one or more remote voice dialogue robots.
In the broadcast mode, the local voice dialogue robot synchronizes the voice information to every remote voice dialogue robot currently connected.
The control mode of voice information may be preset by the user in the parameter information of the local voice dialogue robot; for example, the control mode of all voice information of the local robot may be uniformly set to the broadcast mode in the parameter information.
As an embodiment of the present application, as shown in FIG. 2, the above S103 specifically includes:
S1031: Parse the first voice information uttered by the user to obtain keywords in the first voice information.
In the embodiments of the present application, upon receiving voice information uttered by the user, the voice information is parsed with a preset speech recognition algorithm.
Specifically, the speech recognition process includes: framing the voice information with a preset frame length and frame shift to obtain M frames of waveform (M being an integer greater than zero); and extracting the acoustic features of each frame, such as MFCCs (Mel-Frequency Cepstral Coefficients), to obtain an N-dimensional vector for each frame. Since the pronunciation of a word is composed of phonemes, and the speech unit finer than the phoneme is the state, with one phoneme containing three states, in the embodiments of the present application the N-dimensional vector of each frame is fed into a pre-trained acoustic model, such as a hidden Markov model, which outputs the probability of the frame corresponding to each state; the state with the highest probability is taken as the state of that frame. By determining the state of each frame, every three consecutive states are combined into a phoneme, and several phonemes are then combined to output the words corresponding to the voice information, thereby converting the voice information into text.
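For illustration only, the framing and state-grouping steps described above can be sketched as below. This is a non-limiting sketch: it shows only the framing and the three-states-per-phoneme grouping, not an actual MFCC extractor or hidden Markov model, and the phoneme lookup table is an assumed stand-in:

```python
import numpy as np

def frame_signal(signal, frame_len, frame_shift):
    """Split a 1-D waveform into M overlapping frames of length frame_len."""
    n = 1 + max(0, (len(signal) - frame_len) // frame_shift)
    return np.stack([signal[i * frame_shift: i * frame_shift + frame_len]
                     for i in range(n)])

def states_to_words(frame_states, phoneme_table):
    """Combine every three consecutive states into a phoneme, then map
    phonemes to words via an assumed lookup table."""
    phonemes = [tuple(frame_states[i:i + 3])
                for i in range(0, len(frame_states) - 2, 3)]
    return [phoneme_table.get(p, "?") for p in phonemes]
```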
After the voice information is converted into the corresponding text and the text is segmented, each resulting word segment is a keyword of the voice information.
S1032: If none of the keywords is identical to any of the device names stored in the data table, determine that the control mode of the first voice information is the broadcast mode.
The data table generated in S101 stores the device names of the remote voice dialogue robots. Each keyword obtained at the current moment is compared with each device name in the data table to determine whether any keyword is identical to any device name in the table.
If none of the keywords in the voice information is identical to any device name in the data table, the control mode of the voice information uttered by the user is determined to be the broadcast mode.
In the embodiments of the present application, by parsing the keywords in the voice information in real time and determining the control mode to be the broadcast mode when no keyword matches any device name in the data table, a personalized setting of the control mode of voice information is achieved. The user is no longer limited to uniformly setting the control mode of all voice information in the parameter information of the local robot, which increases the flexibility of setting the control mode; the user can thus utter voice information under different control modes based on the determination rules, improving the flexibility of controlling the voice dialogue robots.
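The mode-determination rules above can be sketched, by way of non-limiting illustration, as a single function. The precedence of remote device names over the local name is an assumption made here so that an utterance naming both the local robot and remote robots is treated as multicast, consistent with the multicast example given later:

```python
def control_mode(keywords, device_names, local_name):
    """Decide the control mode from utterance keywords (illustrative sketch;
    remote device names are assumed to take precedence over the local name)."""
    remote = set(device_names) - {local_name}
    if any(k in remote for k in keywords):
        return "multicast"   # a remote device name is carried in the utterance
    if local_name in keywords:
        return "standalone"  # only the local robot is addressed
    return "broadcast"       # no device name matched at all
```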
S104: If the control mode is the broadcast mode, synchronize the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
When the control mode of the voice information is the broadcast mode, each stored identification code is read from the data table, and the voice information is synchronized to the remote voice dialogue robot associated with each identification code, so that every robot receiving the voice information can execute the control instruction matching it.
Preferably, each voice dialogue robot that receives the voice information may likewise, based on the above steps S101 to S104, synchronize the voice information to other remote voice dialogue robots, thereby extending the propagation range of the voice information and achieving synchronized control of voice dialogue robots distributed even farther away.
The embodiments of the present application enable the user to voice-control, with a single utterance, multiple voice dialogue robots located far apart. The user no longer needs to walk to each robot's location before issuing voice control; the embodiments therefore improve the control efficiency of voice dialogue robots.
On the basis of the above embodiments, as another embodiment of the present application, the manner of synchronizing voice information when the control mode is the multicast mode is further specified. As shown in FIG. 3, after the above S104, the method further includes:
S105: If the control mode is the multicast mode, look up, in the data table storing the correspondence between identification codes and device names, the identification code corresponding to the device name carried in the first voice information, the correspondence between identification codes and device names being obtained from the response information.
For any keyword in the voice information, if it is identical to the device name of the local voice dialogue robot, the control mode of the voice information is determined to be the standalone mode; if it is identical to any device name in the data table other than the local device name, the control mode is determined to be the multicast mode, and the voice information uttered by the user is determined to carry that device name.
When the control mode of the voice information is the multicast mode, the identification code corresponding to the device name carried in the voice information is read from the data table.
S106: Synchronize the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
Based on the identification codes read at the current moment, the voice dialogue robots associated with those identification codes are selected from the remote voice dialogue robots currently connected, and the voice information uttered by the user is synchronized only to the selected robots.
The embodiments of the present application apply to scenarios in which the user needs to control several designated voice dialogue robots within an area. For example, suppose five voice dialogue robots, named Alice, Bob, Colly, Doggy, and Ella, are distributed in the current area, and the user is at Alice's location. When the user wants Alice, Bob, and Ella to play Leehom Wang's songs at the same time, the user may say to Alice: "Alice, play Leehom Wang's songs together with Bob and Ella". Alice, having received the voice information, synchronizes it to Bob and Ella, ensuring that Alice, Bob, and Ella all obtain the voice information and execute the matching control instruction together.
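The multicast synchronization step can be sketched, non-limitingly, as follows; the function names and the send() callback are illustrative stand-ins for the link established in S102:

```python
def multicast_sync(voice_text, keywords, records, local_name, send):
    """Synchronize the voice information only to the robots whose device names
    are carried in the utterance. records maps identification code -> device name;
    send() stands in for transmission over the link established in S102."""
    for ident_code, name in records.items():
        if name != local_name and name in keywords:
            send(ident_code, voice_text)

sent = []
multicast_sync("play the song", ["Bob", "Ella", "play"],
               {"PIN-1": "Alice", "PIN-2": "Bob", "PIN-3": "Ella"},
               "Alice", lambda ident, text: sent.append(ident))
print(sorted(sent))  # ['PIN-2', 'PIN-3']
```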
In the embodiments of the present application, in the multicast mode, by recognizing the device names carried in the voice information, obtaining the identification codes corresponding to those device names, and synchronizing the user's voice information to the voice dialogue robots associated with those identification codes, the user can precisely utter voice information carrying different device names and achieve remote synchronized control of designated voice dialogue robots. This avoids having to broadcast the voice information to all connected robots whenever a remote robot is to be notified; effective control of the voice dialogue robots is thus achieved and the transmission of useless information is avoided.
As yet another embodiment of the present application, as shown in FIG. 4, the control method for a voice dialogue robot further includes:
S107: If second voice information synchronized by the voice dialogue robot is received, determine a function type of the second voice information.
The function type of voice information refers to the function achieved by the voice dialogue robot after executing the control instruction matching the voice information. Function types include, but are not limited to, timed reminders, music playback, and question answering.
If the user utters voice information that instructs the voice dialogue robot to issue a reminder when a preset time arrives, the function type of that voice information is the timed reminder.
Upon receiving voice information synchronized from any remote voice dialogue robot to the local robot, the voice information is parsed to determine its function type.
For example, if the voice information is recognized to contain time information together with the word "remind", its function type is determined to be the timed reminder.
S108: If the function type is the timed reminder, detect, when the reminder time corresponding to the second voice information arrives, the current distance between the robot and the user.
The time information contained in the voice information is the reminder time corresponding to that voice information. If the current system time of the local voice dialogue robot equals the reminder time, the local robot detects the real-time distance between itself and the user.
In one example, the distance may be detected as follows: based on a locator carried by the user, obtain the position information reported by the locator in real time to determine the user's geographic position; compute the distance between that position and the position of the local voice dialogue robot; and take the computed distance as the current distance between the local robot and the user.
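By way of non-limiting illustration, the distance computation above can be realized with a great-circle formula over the reported latitude/longitude; the application does not fix a particular formula, so the haversine computation below is an assumption:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points
    (one plausible realization of the distance computation; assumed, not mandated)."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_alert(user_pos, robot_pos, threshold_m):
    """Issue the prompt only when the user is within the preset threshold."""
    return haversine_m(*user_pos, *robot_pos) < threshold_m
```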
S109: If the distance is less than a preset threshold, issue prompt information.
If the distance is less than the preset threshold, the local voice dialogue robot issues prompt information for the user to receive. Prompt information includes, but is not limited to, audio prompts and flashing prompts.
As another implementation example of the present application, the local voice dialogue robot starts its built-in camera and scans for faces within the camera area. In this case, the maximum camera range is taken as the above preset threshold. If a face is detected within the maximum camera range, the distance between the user and the local robot is determined to be less than the preset threshold, and prompt information is issued.
Preferably, in the above example, if a face is detected within the maximum camera range, its facial features are compared with the preset facial features of the user to determine whether the person currently within the camera range is the owner of the voice dialogue robot. If so, the distance between the user and the local robot is determined to be less than the preset threshold and prompt information is issued; if not, the distance is determined to be greater than the preset threshold and no prompt information is issued.
In the embodiments of the present application, after voice information of the timed-reminder type is received, determining in real time at the reminder moment whether the distance between the user and the local voice dialogue robot is less than the preset threshold makes it possible to decide whether the user is in the vicinity of the local robot. If the user is not nearby, the user can hardly receive the prompt issued by the local robot. Issuing the prompt only when the distance is below the preset threshold therefore yields a more effective reminder, while also preventing all the robots that received the voice information from prompting simultaneously, reducing the energy consumption of the voice dialogue robots. In addition, recognizing the facial features of the detected face enables the robot to prompt its owner accurately, improving the accuracy of the reminder.
As an embodiment of the present application, as shown in FIG. 5, the above S106 specifically includes:
S1061: Obtain the local device name.
In the multicast mode, before the local voice dialogue robot synchronizes the user's voice information to the designated remote voice dialogue robot(s), it first obtains its own pre-stored device name, i.e., the local device name.
S1062: Delete, from the first voice information, the voice segment containing the local device name.
The voice information uttered by the user is recognized to locate the voice segment containing the local device name. That segment is cut out and deleted, so that the user's voice information no longer carries the local device name.
S1063: Synchronize the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
According to the identification codes found in S105, the remote voice dialogue robots to be synchronized are determined, and the voice information that no longer carries the local device name is sent to each of them.
For example, if the user says to Alice "Alice, Bob and Ella, play Leehom Wang's songs together", then, since Alice's local device name is "Alice", the voice segment containing "Alice" is deleted from the voice information, yielding "Bob and Ella, play Leehom Wang's songs together"; Alice then synchronizes the voice information "Bob and Ella, play Leehom Wang's songs together" to Bob and Ella.
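For illustration only, the segment deletion can be sketched at the text level (the application describes cutting the corresponding audio segment; operating on the recognized text is an assumed simplification):

```python
import re

def strip_local_name(text, local_name):
    """Remove the segment naming the local robot, plus an adjacent separator.
    A text-level stand-in for cutting the audio segment described above."""
    pattern = rf"\b{re.escape(local_name)}\b[,，]?\s*"
    return re.sub(pattern, "", text, count=1).strip()

print(strip_local_name("Alice, Bob and Ella, play the song together", "Alice"))
# Bob and Ella, play the song together
```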
Since every remote voice dialogue robot that receives synchronized voice information performs the above S101 to S106, i.e., synchronizes the voice information again to the robots corresponding to each device name it carries, the embodiments of the present application delete the voice segment containing the local device name so that a remote robot receiving the synchronized voice information no longer parses out the local device name and therefore will not synchronize the voice information back to its source, improving the efficiency of information synchronization.
On the basis of the above embodiments, as an embodiment of the present application, if voice information synchronized from a remote voice dialogue robot is received, a timing function is started. If, within a preset duration, further voice information synchronized from a remote robot is received, the similarity between these pieces of voice information is computed. If the similarity is greater than a preset threshold, they are determined to be the same voice information actually uttered by the user; in that case, the piece of voice information with the strongest signal strength is selected, and the control instruction matching it is executed.
In the embodiments of the present application, since voice information uttered by the user may be detected by multiple nearby voice dialogue robots at the same time, in the broadcast or multicast mode all of those robots will synchronize the voice information to the remote robots. Any remote robot may therefore receive multiple pieces of voice information with identical content but different signal strengths. In this case, by judging the similarity of the pieces of voice information received within the preset duration and, when the similarity exceeds the threshold, selecting the one with the strongest signal strength, the robot is prevented from executing the same control instruction repeatedly; and because the selected voice information has the strongest signal, the accuracy of recognizing the matching control instruction is improved.
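The duplicate-suppression step above can be sketched non-limitingly as follows. The window length, the similarity threshold, and the use of a text-similarity ratio are illustrative assumptions; the application does not fix a similarity measure:

```python
import difflib
import time

class DedupBuffer:
    """Within a preset window, keep only the strongest-signal copy of
    near-identical synced voice information (illustrative sketch)."""
    def __init__(self, window_s=2.0, sim_threshold=0.9):
        self.window_s = window_s
        self.sim_threshold = sim_threshold
        self.best = None        # (text, signal_strength) of the kept copy
        self.started_at = None  # timer started on the first received copy

    def offer(self, text, strength, now=None):
        now = time.monotonic() if now is None else now
        if self.best is None or now - self.started_at > self.window_s:
            # First copy (or window expired): start the timer anew.
            self.best, self.started_at = (text, strength), now
            return
        sim = difflib.SequenceMatcher(None, self.best[0], text).ratio()
        if sim > self.sim_threshold and strength > self.best[1]:
            self.best = (text, strength)  # same utterance, stronger signal wins
```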
It should be understood that the ordinal numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation of the embodiments of the present application.
Corresponding to the control method described in the above embodiments, FIG. 6 shows a structural block diagram of the control apparatus for a voice dialogue robot provided by an embodiment of the present application; for ease of description, only the parts related to the embodiments of the present application are shown.
Referring to FIG. 6, the apparatus includes: a broadcast unit 61, configured to broadcast a robot search signal and, upon receiving response information based on the robot search signal, extract an identification code of a voice dialogue robot from the response information; a connection unit 62, configured to establish a connection with the voice dialogue robot based on the identification code; an acquisition unit 63, configured to acquire first voice information uttered by a user and determine a control mode of the first voice information; and a first synchronization unit 64, configured to, if the control mode is a broadcast mode, synchronize the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
Optionally, the broadcast unit 61 includes a broadcast subunit configured to extract, from the response information, the identification code and device name of a voice dialogue robot and store them in a pre-established data table; and the acquisition unit 63 includes a parsing subunit configured to parse the first voice information uttered by the user to obtain keywords in the first voice information.
It further includes a determination subunit, configured to determine that the control mode of the first voice information is the broadcast mode if none of the keywords is identical to any device name stored in the data table.
Optionally, the control apparatus for a voice dialogue robot further includes: a lookup unit, configured to, if the control mode is a multicast mode, look up, in a data table storing the correspondence between identification codes and device names, the identification code corresponding to the device name carried in the first voice information, the correspondence being obtained from the response information; and a second synchronization unit, configured to synchronize the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
Optionally, the control apparatus for a voice dialogue robot further includes: a determination unit, configured to, if second voice information synchronized by the voice dialogue robot is received, determine a function type of the second voice information; a detection unit, configured to, if the function type is a timed reminder, detect the current distance from the user when the reminder time corresponding to the second voice information arrives; and a prompt unit, configured to issue prompt information if the distance is less than a preset threshold.
Optionally, the second synchronization unit includes: an acquisition subunit, configured to obtain the local device name; a deletion subunit, configured to delete, from the first voice information, the voice segment containing the local device name; and a synchronization subunit, configured to synchronize the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
Optionally, the control apparatus for a voice dialogue robot further includes: a timing unit, configured to, if third voice information synchronized by the voice dialogue robot is received, start a built-in timer; a computing unit, configured to, if fourth voice information synchronized by the voice dialogue robot is received before the timer reaches a first preset threshold, compute the similarity between the third voice information and the fourth voice information; and an execution unit, configured to, if the similarity is greater than a second preset threshold, select from the third and fourth voice information the one with the stronger signal strength and execute a control instruction matching it.
FIG. 7 is a schematic diagram of the terminal device provided by an embodiment of the present application. As shown in FIG. 7, the terminal device 7 of this embodiment includes a processor 70 and a memory 71 storing computer readable instructions 72 executable on the processor 70, for example a control program for a voice dialogue robot. When executing the computer readable instructions 72, the processor 70 implements the steps in the above embodiments of the control method for a voice dialogue robot, for example steps S101 to S104 shown in FIG. 1; alternatively, when executing the computer readable instructions 72, the processor 70 implements the functions of the modules/units in the above apparatus embodiments, for example the functions of units 61 to 64 shown in FIG. 6.
Exemplarily, the computer readable instructions 72 may be divided into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more modules/units may be a series of computer readable instruction segments capable of accomplishing specific functions, the segments describing the execution process of the computer readable instructions 72 in the terminal device 7.
The terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that FIG. 7 is merely an example of the terminal device 7 and does not constitute a limitation on it; more or fewer components than shown may be included, some components may be combined, or different components may be used; for example, the terminal device may further include input/output devices, network access devices, buses, and the like.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 7. Further, the memory 71 may include both an internal storage unit of the terminal device 7 and an external storage device. The memory 71 is configured to store the computer readable instructions and other programs and data required by the terminal device, and may also be used to temporarily store data that has been output or is about to be output.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and including a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A control method for a voice dialogue robot, comprising:
    broadcasting a robot search signal, and, upon receiving response information based on the robot search signal, extracting an identification code of a voice dialogue robot from the response information;
    establishing a connection with the voice dialogue robot based on the identification code;
    acquiring first voice information uttered by a user, and determining a control mode of the first voice information;
    if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  2. The control method according to claim 1, wherein the extracting an identification code of a voice dialogue robot from the response information comprises:
    extracting, from the response information, the identification code and a device name of the voice dialogue robot, and storing the identification code and the device name in a pre-established data table;
    and the acquiring first voice information uttered by a user and determining a control mode of the first voice information comprises:
    parsing the first voice information uttered by the user to obtain keywords in the first voice information;
    if none of the keywords is identical to any of the device names stored in the data table, determining that the control mode of the first voice information is the broadcast mode.
  3. The control method according to claim 1, further comprising:
    if the control mode is a multicast mode, looking up, in a data table storing a correspondence between identification codes and device names, the identification code corresponding to a device name carried in the first voice information, the correspondence between identification codes and device names being obtained from the response information;
    synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  4. The control method according to claim 1, further comprising:
    if second voice information synchronized by the voice dialogue robot is received, determining a function type of the second voice information;
    if the function type is a timed reminder, detecting, when a reminder time corresponding to the second voice information arrives, a current distance from the user;
    if the distance is less than a preset threshold, issuing prompt information.
  5. The control method according to claim 3, wherein the synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information, comprises:
    obtaining a local device name;
    deleting, from the first voice information, a voice segment containing the local device name;
    synchronizing the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
  6. A control apparatus for a voice dialogue robot, comprising:
    a broadcast unit, configured to broadcast a robot search signal and, upon receiving response information based on the robot search signal, extract an identification code of a voice dialogue robot from the response information;
    a connection unit, configured to establish a connection with the voice dialogue robot based on the identification code;
    an acquisition unit, configured to acquire first voice information uttered by a user and determine a control mode of the first voice information;
    a first synchronization unit, configured to, if the control mode is a broadcast mode, synchronize the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  7. The control apparatus according to claim 6, wherein the broadcast unit comprises:
    a broadcast subunit, configured to extract, from the response information, the identification code and a device name of the voice dialogue robot, and store the identification code and the device name in a pre-established data table;
    and the acquisition unit comprises:
    a parsing subunit, configured to parse the first voice information uttered by the user to obtain keywords in the first voice information;
    a determination subunit, configured to determine that the control mode of the first voice information is the broadcast mode if none of the keywords is identical to any of the device names stored in the data table.
  8. The control apparatus according to claim 6, further comprising:
    a lookup unit, configured to, if the control mode is a multicast mode, look up, in a data table storing a correspondence between identification codes and device names, the identification code corresponding to a device name carried in the first voice information, the correspondence between identification codes and device names being obtained from the response information;
    a second synchronization unit, configured to synchronize the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  9. The control apparatus for a voice dialogue robot according to claim 6, further comprising:
    a determination unit, configured to, if second voice information synchronized by the voice dialogue robot is received, determine a function type of the second voice information;
    a detection unit, configured to, if the function type is a timed reminder, detect, when a reminder time corresponding to the second voice information arrives, a current distance from the user;
    a prompt unit, configured to issue prompt information if the distance is less than a preset threshold.
  10. The control apparatus for a voice dialogue robot according to claim 8, wherein the second synchronization unit comprises:
    an acquisition subunit, configured to obtain a local device name;
    a deletion subunit, configured to delete, from the first voice information, a voice segment containing the local device name;
    a synchronization subunit, configured to synchronize the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
  11. A terminal device, comprising a memory and a processor, wherein the memory stores computer readable instructions executable on the processor, and the processor, when executing the computer readable instructions, implements the following steps:
    broadcasting a robot search signal, and, upon receiving response information based on the robot search signal, extracting an identification code of a voice dialogue robot from the response information;
    establishing a connection with the voice dialogue robot based on the identification code;
    acquiring first voice information uttered by a user, and determining a control mode of the first voice information;
    if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  12. The terminal device according to claim 11, wherein the extracting an identification code of a voice dialogue robot from the response information comprises:
    extracting, from the response information, the identification code and a device name of the voice dialogue robot, and storing the identification code and the device name in a pre-established data table;
    and the acquiring first voice information uttered by a user and determining a control mode of the first voice information comprises:
    parsing the first voice information uttered by the user to obtain keywords in the first voice information;
    if none of the keywords is identical to any of the device names stored in the data table, determining that the control mode of the first voice information is the broadcast mode.
  13. The terminal device according to claim 11, wherein the processor, when executing the computer readable instructions, further implements the following steps:
    if the control mode is a multicast mode, looking up, in a data table storing a correspondence between identification codes and device names, the identification code corresponding to a device name carried in the first voice information, the correspondence between identification codes and device names being obtained from the response information;
    synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  14. The terminal device according to claim 11, wherein the processor, when executing the computer readable instructions, further implements the following steps:
    if second voice information synchronized by the voice dialogue robot is received, determining a function type of the second voice information;
    if the function type is a timed reminder, detecting, when a reminder time corresponding to the second voice information arrives, a current distance from the user;
    if the distance is less than a preset threshold, issuing prompt information.
  15. The terminal device according to claim 13, wherein the synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information, comprises:
    obtaining a local device name;
    deleting, from the first voice information, a voice segment containing the local device name;
    synchronizing the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
  16. A computer readable storage medium storing computer readable instructions, wherein the computer readable instructions, when executed by at least one processor, implement the following steps:
    broadcasting a robot search signal, and, upon receiving response information based on the robot search signal, extracting an identification code of a voice dialogue robot from the response information;
    establishing a connection with the voice dialogue robot based on the identification code;
    acquiring first voice information uttered by a user, and determining a control mode of the first voice information;
    if the control mode is a broadcast mode, synchronizing the first voice information to the voice dialogue robot associated with the identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  17. The computer readable storage medium according to claim 16, wherein the extracting an identification code of a voice dialogue robot from the response information comprises:
    extracting, from the response information, the identification code and a device name of the voice dialogue robot, and storing the identification code and the device name in a pre-established data table;
    and the acquiring first voice information uttered by a user and determining a control mode of the first voice information comprises:
    parsing the first voice information uttered by the user to obtain keywords in the first voice information;
    if none of the keywords is identical to any of the device names stored in the data table, determining that the control mode of the first voice information is the broadcast mode.
  18. The computer readable storage medium according to claim 16, wherein the computer readable instructions, when executed by at least one processor, further implement the following steps:
    if the control mode is a multicast mode, looking up, in a data table storing a correspondence between identification codes and device names, the identification code corresponding to a device name carried in the first voice information, the correspondence between identification codes and device names being obtained from the response information;
    synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information.
  19. The computer readable storage medium according to claim 16, wherein the computer readable instructions, when executed by at least one processor, further implement the following steps:
    if second voice information synchronized by the voice dialogue robot is received, determining a function type of the second voice information;
    if the function type is a timed reminder, detecting, when a reminder time corresponding to the second voice information arrives, a current distance from the user;
    if the distance is less than a preset threshold, issuing prompt information.
  20. The computer readable storage medium according to claim 18, wherein the synchronizing the first voice information to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information, comprises:
    obtaining a local device name;
    deleting, from the first voice information, a voice segment containing the local device name;
    synchronizing the first voice information with the voice segment deleted to the voice dialogue robot associated with the found identification code, so that the voice dialogue robot executes a control instruction matching the first voice information with the voice segment deleted.
PCT/CN2018/077043 2017-09-22 2018-02-23 Control method, apparatus, terminal device, and medium for voice dialogue robot WO2019056700A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710864661.2A CN107756412B (zh) 2017-09-22 2017-09-22 Control method and terminal device for voice dialogue robot
CN201710864661.2 2017-09-22

Publications (1)

Publication Number Publication Date
WO2019056700A1 true WO2019056700A1 (zh) 2019-03-28

Family

ID=61266674

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077043 WO2019056700A1 (zh) 2017-09-22 2018-02-23 Control method, apparatus, terminal device, and medium for voice dialogue robot

Country Status (2)

Country Link
CN (1) CN107756412B (zh)
WO (1) WO2019056700A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490971B (zh) * 2021-12-30 2024-04-05 重庆特斯联智慧科技股份有限公司 Robot control method and system based on human-machine dialogue interaction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004033624A (ja) * 2002-07-05 2004-02-05 Nti:Kk Remote control device using a pet-type robot
JP2006068489A (ja) * 2004-08-02 2006-03-16 Tomy Co Ltd Interactive pet robot
CN106325142A (zh) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Robot system and control method thereof
CN106547249A (zh) * 2016-10-14 2017-03-29 广州励丰文化科技股份有限公司 Mechanical-arm console and method combining voice detection with local media
CN106782502A (zh) * 2016-12-29 2017-05-31 昆山库尔卡人工智能科技有限公司 Voice recognition device for a children's robot

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102902253B (zh) * 2012-10-09 2015-07-15 鸿富锦精密工业(深圳)有限公司 Smart switch with voice control function and smart control system
CN104007678A (zh) * 2014-05-26 2014-08-27 邯郸美的制冷设备有限公司 Method, terminal, and system for voice control of household appliances


Also Published As

Publication number Publication date
CN107756412B (zh) 2019-09-17
CN107756412A (zh) 2018-03-06

Similar Documents

Publication Publication Date Title
US11758328B2 (en) Selection of master device for synchronized audio
US11875820B1 (en) Context driven device arbitration
US20210074291A1 (en) Implicit target selection for multiple audio playback devices in an environment
US10431217B2 (en) Audio playback device that dynamically switches between receiving audio data from a soft access point and receiving audio data from a local access point
US10643609B1 (en) Selecting speech inputs
WO2021159688A1 (zh) 声纹识别方法、装置、存储介质、电子装置
WO2017071182A1 (zh) 一种语音唤醒方法、装置及系统
WO2019029352A1 (zh) 一种智能语音交互方法及系统
US11810593B2 (en) Low power mode for speech capture devices
EP3583509A1 (en) Selection of master device for synchronized audio
WO2014173325A1 (zh) 喉音识别方法及装置
CN111178081A (zh) 语义识别的方法、服务器、电子设备及计算机存储介质
CN108305629B (zh) 一种场景学习内容获取方法、装置、学习设备及存储介质
US11856674B1 (en) Content-based light illumination
WO2019056700A1 (zh) 语音对话机器人的控制方法、装置、终端设备及介质
US11783833B2 (en) Multi-device output management based on speech characteristics
US20220161131A1 (en) Systems and devices for controlling network applications
US11783805B1 (en) Voice user interface notification ordering
CN112185374A (zh) 一种确定语音意图的方法及装置
CN112802465A (zh) 一种语音控制方法及系统
US11694684B1 (en) Generation of computing functionality using devices
US10475450B1 (en) Multi-modality presentation and execution engine
US11990116B1 (en) Dynamically rendered notifications and announcements
US12002469B2 (en) Multi-device output management based on speech characteristics
US11853975B1 (en) Contextual parsing of meeting information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18858160

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18858160

Country of ref document: EP

Kind code of ref document: A1