WO2022078189A1 - Control method, apparatus and storage medium supporting dynamic intents - Google Patents

Control method, apparatus and storage medium supporting dynamic intents

Info

Publication number
WO2022078189A1
WO2022078189A1 · PCT/CN2021/120604 · CN2021120604W
Authority
WO
WIPO (PCT)
Prior art keywords
intent
blueprint
information
node
control
Prior art date
Application number
PCT/CN2021/120604
Other languages
English (en)
French (fr)
Inventor
何博文
曹晓康
马世奎
Original Assignee
达闼机器人有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 达闼机器人有限公司
Publication of WO2022078189A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/015 Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems

Definitions

  • The present disclosure relates to the technical field of artificial intelligence, and in particular to a control method, apparatus, and storage medium supporting dynamic intents.
  • In the development and application of robot devices, when a user interacts with a robot device, the user's input information is processed through natural language understanding to form user intents, so that the robot device can understand and execute them. Because user intents are diverse, robot devices need a technology that supports this diversity of intents.
  • To solve the problems existing in information interaction between existing robot devices and users, the embodiments of the present disclosure provide a control method, apparatus, and storage medium supporting dynamic intents.
  • A control method supporting dynamic intent comprises: acquiring input information; identifying the intent information represented by the input information; determining a corresponding blueprint node according to the intent information; and invoking the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • The intent information is an intent structure including an intent name and intent parameters.
  • Determining the corresponding blueprint node according to the intent information includes: determining the corresponding blueprint node according to the intent name in the intent structure; and controlling the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
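The matching rule above (select the node whose name equals the intent name, then make the node's parameters consistent with the intent parameters) can be sketched as follows; the class and registry names are illustrative assumptions, not identifiers from the patent.

```python
# Minimal sketch of intent-to-blueprint-node matching, assuming a simple
# name-keyed registry. Intent, BlueprintNode, and NODE_REGISTRY are
# hypothetical names for illustration only.
from dataclasses import dataclass, field

@dataclass
class Intent:
    name: str                                    # intent name, e.g. "TakeAction"
    params: dict = field(default_factory=dict)   # intent parameters, may be empty

@dataclass
class BlueprintNode:
    name: str
    params: dict = field(default_factory=dict)

# Registry keyed by intent name: one blueprint node per intent name.
NODE_REGISTRY = {
    "TakeAction": BlueprintNode("TakeAction"),
    "MoveForward": BlueprintNode("MoveForward"),
}

def resolve_node(intent: Intent) -> BlueprintNode:
    """Pick the node whose name matches the intent name, then make the
    node's parameters consistent with the intent parameters."""
    node = NODE_REGISTRY[intent.name]
    if intent.params:            # empty intent parameters: skip the sync step
        node.params = dict(intent.params)
    return node

node = resolve_node(Intent("TakeAction", {"motionName": "fist bump"}))
```

With an empty-parameter intent such as `MoveForward`, `resolve_node` leaves the node's parameters untouched, mirroring the "control operation can be omitted" case described later in the text.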
  • Invoking the blueprint node to execute the blueprint logic includes: when the blueprint node is triggered, executing the next blueprint node corresponding to the blueprint node.
  • Identifying the intent information represented by the input information includes: sending the input information to the cloud; and receiving the intent information obtained after the cloud performs intent recognition on the input information.
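A minimal sketch of this device-to-cloud round trip, with the cloud recognizer stubbed out, since the patent specifies neither the transport nor the recognizer's API:

```python
# Sketch of the device-side cloud round trip: send raw input, receive back an
# intent structure. cloud_nlu_stub stands in for the cloud NLP/NLU service;
# in practice the exchange would go through the robot control unit (RCU).
import json

def cloud_nlu_stub(text: str) -> str:
    """Stand-in for the cloud intent-recognition service (NLP + NLU)."""
    if "fist" in text:
        result = {"name": "TakeAction", "params": {"motionName": "fist bump"}}
    else:
        result = {"name": "MoveForward", "params": {}}
    return json.dumps(result)

def recognize_intent(text: str) -> dict:
    """Device side: send the input information, parse the returned intent."""
    payload = cloud_nlu_stub(text)   # in practice: an RPC through the RCU
    return json.loads(payload)

intent = recognize_intent("please do the fist bump action")
```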
  • A control apparatus supporting dynamic intent comprises: an acquisition module for acquiring input information; an intent recognition module for identifying the intent information represented by the input information; a blueprint module for determining the corresponding blueprint node according to the intent information; and a control execution module for invoking the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • The intent information is an intent structure including an intent name and intent parameters.
  • The blueprint module includes: a blueprint node determination unit, configured to determine the corresponding blueprint node according to the intent name in the intent structure; and a control unit, configured to control the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
  • The control execution module is specifically configured to trigger the blueprint node and, when the blueprint node is triggered, execute the next blueprint node corresponding to the blueprint node.
  • The intent recognition module is specifically configured to send the input information to the cloud and receive the intent information obtained after the cloud performs intent recognition on the input information.
  • A control apparatus supporting dynamic intents comprises: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the above control methods supporting dynamic intent.
  • A computer-readable storage medium comprises a set of computer-executable instructions which, when executed, perform any of the foregoing control methods supporting dynamic intent.
  • With the control method, apparatus, and storage medium supporting dynamic intents of the embodiments of the present disclosure, the robot device first obtains input information from the user; then identifies the intent information represented by the input information; next determines the corresponding blueprint node according to the intent information; and finally invokes the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • In this way, the robot device of the present disclosure uses the blueprint technology of the virtual engine so that the application blueprint triggers the corresponding blueprint node to execute the related logic according to the intent information, thereby controlling the behavior of the robot device and enabling it to support dynamic intents.
  • FIG. 1 shows a first schematic flowchart of the implementation of a control method supporting dynamic intents according to an embodiment of the present disclosure.
  • FIG. 2 shows a second schematic flowchart of the implementation of the control method supporting dynamic intents according to an embodiment of the present disclosure.
  • FIG. 3 shows the blueprint for the response processing of the cloud intent configuration of one application example of the present disclosure.
  • FIG. 4 shows the blueprint for the response processing of the cloud intent configuration of another application example of the present disclosure.
  • FIG. 5 shows the blueprint for the response processing of the cloud intent configuration of yet another application example of the present disclosure.
  • FIG. 6 shows a first schematic structural diagram of a control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
  • FIG. 7 shows a second schematic structural diagram of the control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
  • FIG. 8 shows a schematic structural diagram of a robot device provided by an embodiment of the present disclosure.
  • The terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implying the number of the indicated technical features. Thus, a feature defined with "first" or "second" may expressly or implicitly include at least one such feature.
  • "Plurality" means two or more, unless expressly and specifically defined otherwise.
  • FIG. 1 shows a first schematic flowchart of the implementation of a control method supporting dynamic intents according to an embodiment of the present disclosure.
  • The embodiments of the present disclosure can be applied to a robot device; as shown in FIG. 1, the method includes the following steps:
  • Step 101: acquire input information.
  • Specifically, the robot device acquires input information from the user.
  • The input information may be voice information from the user; for example, the robot device collects the user's voice information through microphone-array hardware. The input information may also be command information automatically generated by the robot device in response to a user trigger; for example, when the user presses a hardware trigger button on the robot device or a software trigger button on its touch interface, command information matching that trigger button is generated in response to the user's trigger. The input information may also be text information entered by the user in the input area of the display interface of the robot device.
  • After receiving the input information in step 101, the robot device may further preprocess the input information, for example by segmenting the input sentences, removing stop words, and removing special characters; afterwards, execution continues with step 102.
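The preprocessing operations mentioned here (sentence segmentation, stop-word removal, special-character removal) might look like the following sketch; the stop-word list and the regular expressions are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of the preprocessing step: drop special characters, split
# into sentences, tokenize, and remove stop words. Rules are illustrative.
import re

STOP_WORDS = {"please", "the", "a"}   # hypothetical stop-word list

def preprocess(text: str) -> list:
    text = re.sub(r"[^\w\s.!?]", "", text)     # drop special characters
    sentences = re.split(r"[.!?]+\s*", text)   # naive sentence segmentation
    tokens = []
    for s in sentences:
        tokens.extend(w for w in s.lower().split() if w not in STOP_WORDS)
    return tokens

tokens = preprocess("Please do the action: fist bump!")
```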
  • Step 102: identify the intent information represented by the input information.
  • The intent information is an intent structure including an intent name and intent parameters.
  • Specifically, the robot device may perform intent recognition on the input information through its own intent recognition system and obtain a recognition result including the intent information; the robot device may also use a cloud-based intent recognition system to perform intent recognition on the input information and thereby obtain a recognition result including the intent information.
  • Taking voice input as an example, the robot device can perform natural language processing (NLP) and natural language understanding (NLU) through the intent recognition system, thereby recognizing the intent information represented by the input information.
  • Step 103: determine the corresponding blueprint node according to the intent information.
  • Specifically, the robot device determines the corresponding blueprint node according to the intent name of the received intent structure, and controls the parameters of the blueprint node to be consistent with the intent parameters of the intent structure.
  • When the intent parameter in the intent structure is empty, the control operation on the blueprint node parameters can be omitted after the corresponding blueprint node is determined.
  • Step 104: invoke the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • Specifically, the robot device triggers the blueprint node; when the blueprint node is triggered, the next blueprint node corresponding to it is automatically executed, so as to control execution of the behavior operation matching the intent information.
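The trigger-then-execute-successor behavior can be sketched as a simple node chain; the engine's actual trigger mechanism is not described in the patent, so the `Node` class below is a hypothetical stand-in, with node names taken from the worked examples later in the text.

```python
# Sketch of blueprint-node chaining: each node holds a reference to its
# successor, and triggering a node walks the chain until the node that
# actually performs the behavior runs. Illustrative, not the engine API.
class Node:
    def __init__(self, name, action=None, next_node=None):
        self.name = name
        self.action = action        # behavior this node performs, if any
        self.next_node = next_node  # successor node in the blueprint

    def trigger(self, log):
        log.append(self.name)
        if self.action:
            self.action(log)
        if self.next_node:          # "execute the next blueprint node"
            self.next_node.trigger(log)

log = []
play_motion = Node("PlayMotion", action=lambda l: l.append("executing: fist bump"))
take_action = Node("TakeAction", next_node=play_motion)
take_action.trigger(log)
```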
  • In this way, the robot device of the present disclosure uses the blueprint technology of the virtual engine so that the application blueprint triggers the corresponding blueprint node to execute the related logic according to the intent information, thereby controlling the behavior of the robot device and enabling it to support dynamic intents.
  • FIG. 2 shows a second schematic flowchart of the implementation of the control method supporting dynamic intents according to an embodiment of the present disclosure; FIG. 3 shows the blueprint for the response processing of the cloud intent configuration of one application example of the present disclosure; FIG. 4 shows the blueprint for another application example; and FIG. 5 shows the blueprint for yet another application example.
  • Referring to FIG. 2, the control method supporting dynamic intents can be applied to a robot device and specifically includes the following steps:
  • Step 201: acquire input information.
  • Specifically, the robot device acquires input information from the user.
  • The input information may be voice information from the user; for example, the robot device collects the user's voice information through microphone-array hardware. The input information may also be command information automatically generated by the robot device in response to a user trigger; for example, when the user presses a hardware trigger button on the robot device or a software trigger button on its touch interface, command information matching that trigger button is generated in response to the user's trigger. The input information may also be text information entered by the user in the input area of the display interface of the robot device.
  • After receiving the input information in step 201, the robot device may further preprocess the input information, for example by segmenting the input sentences, removing stop words, and removing special characters; afterwards, execution continues with step 202.
  • Step 202: send the input information to the cloud.
  • Specifically, the robot device sends the received input information through a robot control unit (RCU) to the cloud, which may also be called the cloud brain.
  • Step 203: receive the intent information obtained after the cloud performs intent recognition on the input information.
  • The intent information is an intent structure including an intent name and intent parameters.
  • Taking voice input as an example, the cloud brain's intent recognition system can perform natural language processing (NLP) and natural language understanding (NLU), thereby recognizing the intent information represented by the input information and feeding the intent information back to the robot device.
  • Step 204: determine the corresponding blueprint node according to the intent information.
  • Specifically, the robot device determines the corresponding blueprint node according to the intent name of the received intent structure, and controls the parameters of the blueprint node to be consistent with the intent parameters of the intent structure.
  • When the intent parameter in the intent structure is empty, the control operation on the blueprint node parameters can be omitted after the corresponding blueprint node is determined.
  • Step 205: invoke the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • Specifically, the robot device triggers the blueprint node; when the blueprint node is triggered, the next blueprint node corresponding to it is automatically executed, so as to control execution of the behavior operation matching the intent information.
  • In one application example, the user interacts with the robot device by voice, and the robot device obtains the input information (voice information) "Please do the action: fist bump".
  • The cloud intent configuration for this input information is: the intent name in the intent structure is "TakeAction", the intent parameter is "motionName" (action name), and the value of the intent parameter motionName is "fist bump".
  • When this intent structure is sent from the cloud to the robot control unit (RCU) on the robot body, the blueprint for the response processing is shown in FIG. 3: the blueprint node name is "TakeAction", corresponding one-to-one with the intent name of the intent structure; the parameter of the blueprint node is "PlayMotionName", corresponding to the intent parameter of the intent structure, with the value "fist bump".
  • When this blueprint node is triggered, it executes the next blueprint node, "PlayMotion", which controls the behavior of the robot device, that is, controls the robot device to perform the "fist bump" action.
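Rendered as data, this worked example might look like the following sketch; the wire format and any field names beyond those quoted in the text (TakeAction, motionName, PlayMotionName, PlayMotion) are assumptions.

```python
# Hypothetical rendering of the cloud intent configuration for the
# "fist bump" utterance, plus the device-side node mapping that consumes it.
intent = {
    "intentName": "TakeAction",
    "intentParams": {"motionName": "fist bump"},
}

# Device-side mapping: intent name -> blueprint node, its parameter slot,
# and the successor node executed on trigger.
blueprint = {
    "TakeAction": {"node": "TakeAction", "param": "PlayMotionName", "next": "PlayMotion"},
}

entry = blueprint[intent["intentName"]]
# The node parameter PlayMotionName is set from the intent parameter motionName.
node_param = {entry["param"]: intent["intentParams"]["motionName"]}
```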
  • In another application example, the user interacts with the robot device by voice, and the robot device obtains the input information (voice information) "a little forward". The cloud intent configuration for this input information is: the intent name in the intent structure is "MoveForward", and there is no intent parameter, that is, the intent parameter is empty.
  • When this intent structure is sent from the cloud to the RCU on the robot body, the blueprint for the response processing is shown in FIG. 4: the blueprint node name is "MoveForward", corresponding one-to-one with the intent name of the intent structure, and the blueprint node has no parameters.
  • When this blueprint node is triggered, it executes the next blueprint node, "Move", which controls the behavior of the robot device, that is, controls the robot device to move forward; the moving distance may be a preset distance.
  • In yet another application example, the user interacts with the robot device by voice, and the robot device obtains the input information (voice information) "go to the table". The cloud intent configuration for this input information is: the intent name in the intent structure is "navigationToPosion" (navigation), the intent parameter is "destination", and the value of the intent parameter destination is "table".
  • When this intent structure is sent from the cloud to the RCU on the robot body, the blueprint for the response processing is shown in FIG. 5: the blueprint node name is "navigationToPosion", corresponding one-to-one with the intent name of the intent structure; the parameter of the blueprint node is "destination", corresponding to the intent parameter of the intent structure, with the value "table".
  • When this blueprint node is triggered, it executes the next blueprint node, "CSNavigate Skill", which controls the robot device to navigate to the coordinates corresponding to the destination "table".
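The last step of this example, resolving the destination name to coordinates, might be sketched as follows; the coordinate table and function name are illustrative assumptions, since the patent only states that the robot navigates to the coordinates corresponding to "table".

```python
# Hypothetical sketch of the navigation skill resolving a destination name
# to map coordinates. LANDMARKS and cs_navigate are illustrative names.
LANDMARKS = {"table": (3.2, 1.5)}   # hypothetical map: name -> (x, y)

def cs_navigate(destination: str) -> str:
    """Navigate to the coordinates registered for the destination name."""
    x, y = LANDMARKS[destination]
    return f"navigating to {destination} at ({x}, {y})"

result = cs_navigate("table")
```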
  • In this way, the robot device of the present disclosure first uses the cloud brain to perform intent recognition on the received input information and obtains the intent information represented by the input information; then, with the help of the blueprint technology of the virtual engine, the application blueprint triggers the corresponding blueprint node to execute the related logic, thereby controlling the behavior of the robot device and enabling it to support dynamic intents.
  • FIG. 6 shows a first schematic structural diagram of a control apparatus supporting dynamic intents according to an embodiment of the present disclosure; FIG. 7 shows a second schematic structural diagram of the control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
  • Referring to FIG. 6, a control apparatus 60 supporting dynamic intents according to an embodiment of the present disclosure includes:
  • an acquisition module 601, configured to acquire input information;
  • an intent recognition module 602, configured to identify the intent information represented by the input information, where the intent information is an intent structure including an intent name and intent parameters;
  • a blueprint module 603, configured to determine the corresponding blueprint node according to the intent information; and
  • a control execution module 604, configured to invoke the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
  • In an implementation, as shown in FIG. 7, the blueprint module 603 includes:
  • a blueprint node determination unit 6031, configured to determine the corresponding blueprint node according to the intent name in the intent structure; and
  • a control unit 6032, configured to control the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
  • In an implementation, the control execution module 604 is specifically configured to trigger the blueprint node and, when the blueprint node is triggered, execute the next blueprint node corresponding to the blueprint node.
  • In an implementation, the intent recognition module 602 is specifically configured to send the input information to the cloud and receive the intent information obtained after the cloud performs intent recognition on the input information.
  • FIG. 8 shows a schematic structural diagram of a robot device provided by an embodiment of the present disclosure.
  • The robot device may be the control apparatus 60 supporting dynamic intents, or a stand-alone device independent of it that can communicate with the control apparatus 60 to receive the collected input signals from it.
  • FIG. 8 illustrates a block diagram of a robotic device according to an embodiment of the present disclosure.
  • the robotic device 11 includes one or more processors 111 and a memory 112 .
  • the processor 111 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the robotic device 11 to perform desired functions.
  • Memory 112 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 111 may run the program instructions to implement the control methods supporting dynamic intents of the various embodiments of the present disclosure described above and/or other desired functions.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the robotic device 11 may also include an input device 113 and an output device 114 interconnected by a bus system and/or other form of connection mechanism (not shown).
  • For example, when the robot device is the control apparatus 60 supporting dynamic intents, the input device 113 may be the above-mentioned microphone or microphone array for capturing the input signal of a sound source. When the electronic device is a stand-alone device, the input device 113 may be a communication network connector for receiving the collected input signal from the control apparatus 60 supporting dynamic intents.
  • In addition, the input device 113 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 114 can output various information to the outside, including the determined distance information, direction information, and the like.
  • the output device 114 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • the robotic device 11 may also include any other suitable components depending on the specific application.
  • In addition to the above methods and devices, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the steps of the training method of the multi-task model according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • The program code for performing the operations of the embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • Embodiments of the present disclosure may also be a computer-readable storage medium having computer program instructions stored thereon which, when run by a processor, cause the processor to perform the steps of the training method of the multi-task model according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • the computer-readable storage medium may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • Each component or each step can be decomposed and/or recombined; such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Neurology (AREA)
  • Neurosurgery (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Dermatology (AREA)
  • Manipulator (AREA)

Abstract

The present disclosure provides a control method, apparatus and storage medium supporting dynamic intents. The method includes: a robot device first acquires input information from a user; then identifies the intent information represented by the input information; next determines a corresponding blueprint node according to the intent information; and finally invokes the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.

Description

Control Method, Apparatus and Storage Medium Supporting Dynamic Intents

Technical Field
The present disclosure relates to the technical field of artificial intelligence, and in particular to a control method, apparatus and storage medium supporting dynamic intents.
Background
In the development and application of robot devices, when a user exchanges information with a robot device, the user's input information is processed through natural language understanding to form user intents, so that the robot device can understand and execute those intents. Because user intents are diverse, robot devices need a technology that supports this diversity of intents.
Summary
To solve the problems existing when current robot devices exchange information with users, embodiments of the present disclosure provide a control method, apparatus and storage medium supporting dynamic intents.
According to a first aspect of the present disclosure, a control method supporting dynamic intents is provided. The method includes: acquiring input information; identifying the intent information represented by the input information; determining a corresponding blueprint node according to the intent information; and invoking the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
According to an embodiment of the present disclosure, the intent information is an intent structure including an intent name and intent parameters.
According to an embodiment of the present disclosure, determining the corresponding blueprint node according to the intent information includes: determining the corresponding blueprint node according to the intent name in the intent structure; and controlling the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
According to an embodiment of the present disclosure, invoking the blueprint node to execute the blueprint logic includes: when the blueprint node is triggered, executing the next blueprint node corresponding to the blueprint node.
According to an embodiment of the present disclosure, identifying the intent information represented by the input information includes: sending the input information to the cloud; and receiving the intent information obtained after the cloud performs intent recognition on the input information.
According to a second aspect of the present disclosure, a control apparatus supporting dynamic intents is further provided. The apparatus includes: an acquisition module for acquiring input information; an intent recognition module for identifying the intent information represented by the input information; a blueprint module for determining the corresponding blueprint node according to the intent information; and a control execution module for invoking the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
According to an embodiment of the present disclosure, the intent information is an intent structure including an intent name and intent parameters.
According to an embodiment of the present disclosure, the blueprint module includes: a blueprint node determination unit for determining the corresponding blueprint node according to the intent name in the intent structure; and a control unit for controlling the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
According to an embodiment of the present disclosure, the control execution module is specifically configured to trigger the blueprint node and, when the blueprint node is triggered, execute the next blueprint node corresponding to the blueprint node.
According to an embodiment of the present disclosure, the intent recognition module is specifically configured to send the input information to the cloud and receive the intent information obtained after the cloud performs intent recognition on the input information.
According to a third aspect of the present disclosure, a control apparatus supporting dynamic intents is further provided, including: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the above control methods supporting dynamic intents.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is further provided. The storage medium includes a set of computer-executable instructions which, when executed, perform any of the above control methods supporting dynamic intents.
With the control method, apparatus and storage medium supporting dynamic intents of the embodiments of the present disclosure, the robot device first acquires input information from the user; then identifies the intent information represented by the input information; next determines the corresponding blueprint node according to the intent information; and finally invokes the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information. In this way, the robot device of the present disclosure uses the blueprint technology of the virtual engine so that the application blueprint triggers the corresponding blueprint node to execute the related logic according to the intent information, thereby controlling the behavior of the robot device and enabling it to support dynamic intents.
It should be understood that the teachings of the present disclosure need not achieve all of the above beneficial effects; rather, a particular technical solution may achieve a particular technical effect, and other embodiments of the present disclosure may achieve beneficial effects not mentioned above.
Brief Description of the Drawings
By reading the following detailed description with reference to the accompanying drawings, the above and other objects, features and advantages of exemplary embodiments of the present disclosure will become easy to understand. In the drawings, several embodiments of the present disclosure are shown by way of example and not limitation, in which:
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
FIG. 1 shows a first schematic flowchart of the implementation of a control method supporting dynamic intents according to an embodiment of the present disclosure.
FIG. 2 shows a second schematic flowchart of the implementation of the control method supporting dynamic intents according to an embodiment of the present disclosure.
FIG. 3 shows the blueprint for the response processing of the cloud intent configuration of one application example of the present disclosure.
FIG. 4 shows the blueprint for the response processing of the cloud intent configuration of another application example of the present disclosure.
FIG. 5 shows the blueprint for the response processing of the cloud intent configuration of yet another application example of the present disclosure.
FIG. 6 shows a first schematic structural diagram of a control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
FIG. 7 shows a second schematic structural diagram of the control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
FIG. 8 shows a schematic structural diagram of a robot device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, features and advantages of the present disclosure more apparent and understandable, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present disclosure, not all of them. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present disclosure.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. Moreover, the described specific features, structures, materials or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, those skilled in the art may combine different embodiments or examples described in this specification and the features thereof.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implying the number of the indicated technical features. Thus, a feature defined with "first" or "second" may expressly or implicitly include at least one such feature. In the description of the present disclosure, "plurality" means two or more, unless expressly and specifically defined otherwise.
FIG. 1 shows a first schematic flowchart of the implementation of the control method supporting dynamic intents according to an embodiment of the present disclosure. The embodiment of the present disclosure can be applied to a robot device and, as shown in FIG. 1, includes the following steps:
Step 101: acquire input information.
Specifically, the robot device acquires input information from the user.
The input information may be voice information from the user; for example, the robot device collects the user's voice information through microphone-array hardware. The input information may also be command information automatically generated by the robot device in response to a user trigger; for example, when the user presses a hardware trigger button on the robot device or a software trigger button on its touch interface, command information matching that trigger button is generated in response to the user's trigger. The input information may also be text information entered by the user in the input area of the display interface of the robot device.
Of course, those skilled in the art should understand that after receiving the input information in step 101, the robot device may further preprocess the input information, for example by segmenting the input sentences, removing stop words, and removing special characters; afterwards, execution continues with step 102.
Step 102: identify the intent information represented by the input information.
The intent information is an intent structure including an intent name and intent parameters.
Specifically, the robot device may perform intent recognition on the input information through its own intent recognition system and obtain a recognition result including the intent information; the robot device may also use a cloud-based intent recognition system to perform intent recognition on the input information and thereby obtain a recognition result including the intent information.
Those skilled in the art should understand that whether the robot device performs intent recognition through its own intent recognition system or through the cloud-based one, the specific implementation of intent recognition is similar.
In one application example, taking voice input as an example, the robot device can perform natural language processing (NLP) and natural language understanding (NLU) through the intent recognition system, thereby recognizing the intent information represented by the input information.
Step 103: determine the corresponding blueprint node according to the intent information.
Specifically, based on the blueprint module of the virtual-engine application, the robot device determines the corresponding blueprint node according to the intent name of the received intent structure and controls the parameters of the blueprint node to be consistent with the intent parameters of the intent structure. Of course, when the intent parameter in the intent structure is empty, the control operation on the blueprint node parameters can be omitted after the corresponding blueprint node is determined.
Step 104: invoke the blueprint node to execute blueprint logic, so as to control execution of the behavior operation matching the intent information.
Specifically, the robot device triggers the blueprint node; when the blueprint node is triggered, the next blueprint node corresponding to it is automatically executed, so as to control execution of the behavior operation matching the intent information.
In this way, the robot device of the present disclosure uses the blueprint technology of the virtual engine so that the application blueprint triggers the corresponding blueprint node to execute the related logic according to the intent information, thereby controlling the behavior of the robot device and enabling it to support dynamic intents.
图2示出了本公开实施例支持动态意图的控制方法的实现流程示意图二;图3示出了本公开一应用实例在云端的意图配置的响应处理的蓝图;图4示出了本公开另一应用实例在云端的意图配置的响应处理的蓝图;图5示出了本公开又一应用实例在云端的意图配置的响应处理的蓝图。
参考图2,本公开实施例支持动态意图的控制方法可应用在机器人设备上,具体包括如下步骤:
步骤201,获取输入信息。
具体地,机器人设备获取来自用户的输入信息。
其中,输入信息可以是来自用户的语音信息,如机器人设备通过麦克风阵列硬件采集用户的语音信息;输入信息也可以是响应于用户触发,机器人设备自动生成的指令信息,如当用户在机器人设备的硬件触发按钮或触控界面上的软件触发按键进行按键触发时,响应于用户触发所生成的与触发按键匹配的指令信息;输入信息还可以是用户在机器人设备的显示界面的输入区域所输入的文本信息。
当然,本领域技术人员应该理解的是,在步骤201接收输入信息之后,机器人设备可以进一步对输入信息进行预处理,比如对输入语句进行分句、去除停用词、去除特殊字符等预处理操作;之后,在继续执行后续步骤202。
步骤202,发送输入信息至云端。
具体地,机器人设备将接收的输入信息通过机器人控制单元(RCU)发送至云端,也可称作云端大脑。
步骤203,接收由云端对输入信息进行意图识别后所得到的意图信息。
其中,意图信息为包括意图名称和意图参数的意图结构体。以输入信息为语音信息为例,机器人设备可以通过云端大脑的意图识别系统进行自然语音处理(NLP)和自然语音理解(NLU),从而识别得到输入信息所表征的意图信息,并反馈所述意图信息至机器人设备。
步骤204,根据意图信息确定对应的蓝图节点。
具体地,机器人设备基于虚拟引擎的应用的蓝图模块,根据所接收到的意图结构体的意图名称确定对应的蓝图节点,并控制蓝图节点的参数与意图结构体的意图参数一致。当然,对于意图结构体中意图参数为空的情况,可以在确定对应的蓝图节点后,省略对蓝图节点参数的控制操作。
步骤205,调用蓝图节点执行蓝图逻辑,以控制执行与意图信息匹配的行为操作。
具体地,机器人设备触发蓝图节点,当蓝图节点被触发时,自动执行蓝图节点对应的下一个蓝图节点,以控制执行与意图信息匹配的行为操作。
在一应用实例中,用户与机器人设备进行语音交互,机器人设备获取输入信息(语音信息)“请做动作碰拳”,该输入信息(语音信息)在云端的意图配置为:意图结构体中的意图名称为“做动作(TakeAction)”,意图参数为“动作名称(motionName)”,意图参数motionName对应的值为“碰拳”。进一步地,该意图结构体从云端发送到机器人设备本体的机器人控制单元(RCU)时,响应处理的蓝图如图3所示,蓝图节点名称为“做动作(TakeAction)”,和意图结构体的意图名称一一对应,蓝图节点的参数为“手游动作名称(PlayMotionName)”,对应于意图结构的意图参数,它的值为 “碰拳”。当该蓝图节点被触发时,它会执行下一个蓝图节点“手游动作(PlayMotion)”,该蓝图节点控制机器人设备的行为,即控制机器人设备执行动作“碰拳”。
在另一应用实例中,用户与机器人设备进行语音交互,机器人设备获取输入信息(语音信息)“向前一点”,该输入信息(语音信息)在云端的意图配置为:意图结构体中的意图名称为“向前移动(MoveForward)”,无意图参数,即意图参数为空。进一步地,该意图结构体从云端发送到机器人设备本体的机器人控制单元(RCU)时,响应处理的蓝图如图4所示,蓝图节点名称为“向前移动(MoveForward)”,和意图结构体的意图名称一一对应,蓝图节点无参数。当该蓝图节点被触发时,它会执行下一个蓝图节点“移动(Move)”,该蓝图节点控制机器人设备的行为,即控制机器人设备执行向前移动,移动的距离可以是预设的移动距离。
在又一应用实例中,用户与机器人设备进行语音交互,机器人设备获取输入信息(语音信息)“走到桌子那”,该输入信息(语音信息)在云端的意图配置为:意图结构体中的意图名称为“导航(navigationToPosion)”,意图参数为“目的地(destination)”,意图参数destination对应的值为“桌子”。进一步地,该意图结构体从云端发送到机器人设备本体的机器人控制单元(RCU)时,响应处理的蓝图如图5所示,蓝图节点名称为“导航(navigationToPosion)”,和意图结构体的意图名称一一对应,蓝图节点的参数为“目的地(destination)”,对应于意图结构的意图参数,它的值为“桌子”。当该蓝图节点被触发时,它会执行下一个蓝图节点“CS导航技能(CSNavigate Skill)”,该蓝图节点控制机器人设备导航到目的地“桌子”对应的坐标。
In this way, the robot device of the present disclosure first relies on the cloud brain to perform intent recognition on the received input information, obtaining the intent information represented by the input; it then relies on the blueprint technique of the virtual engine, letting the application blueprint trigger the corresponding blueprint node to execute the related logic according to the intent information, thereby controlling the behavior of the robot device and realizing the robot device's support for dynamic intents.
FIG. 6 is a first schematic structural diagram of the control apparatus supporting dynamic intents according to an embodiment of the present disclosure; FIG. 7 is a second schematic structural diagram of the control apparatus supporting dynamic intents according to an embodiment of the present disclosure.
Referring to FIG. 6, the control apparatus 60 supporting dynamic intents according to an embodiment of the present disclosure includes:
an acquisition module 601, configured to acquire input information;
an intent recognition module 602, configured to recognize the intent information represented by the input information, where the intent information is an intent structure including an intent name and intent parameters;
a blueprint module 603, configured to determine the corresponding blueprint node according to the intent information; and
a control execution module 604, configured to invoke the blueprint node to execute blueprint logic, so as to control execution of a behavior operation matching the intent information.
In one implementation, as shown in FIG. 7, the blueprint module 603 includes:
a blueprint node determination unit 6031, configured to determine the corresponding blueprint node according to the intent name in the intent structure; and
a control unit 6032, configured to control the parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
In one implementation, the control execution module 604 is specifically configured to trigger the blueprint node and, when the blueprint node is triggered, execute the next blueprint node corresponding to it.
In one implementation, the intent recognition module 602 is specifically configured to send the input information to the cloud and to receive the intent information obtained after the cloud performs intent recognition on the input information.
FIG. 8 is a schematic structural diagram of a robot device provided by an embodiment of the present disclosure.
A robot device according to an embodiment of the present disclosure is described below with reference to FIG. 8. The robot device may be the control apparatus 60 supporting dynamic intents, or a standalone device independent of it; the standalone device can communicate with the control apparatus 60 supporting dynamic intents to receive the collected input signals from it.
FIG. 8 illustrates a block diagram of the robot device according to an embodiment of the present disclosure.
As shown in FIG. 8, the robot device 11 includes one or more processors 111 and a memory 112.
The processor 111 may be a central processing unit (CPU) or another form of processing unit with data processing and/or instruction execution capabilities, and may control other components in the robot device 11 to perform desired functions.
The memory 112 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or nonvolatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The nonvolatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 111 may run the program instructions to implement the control method supporting dynamic intents of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the robot device 11 may further include an input means 113 and an output means 114, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
For example, when the robot device is the control apparatus 60 supporting dynamic intents, the input means 113 may be the above-mentioned microphone or microphone array for capturing the input signal of a sound source. When the electronic device is a standalone device, the input means 113 may be a communication network connector for receiving the collected input signal from the control apparatus 60 supporting dynamic intents.
In addition, the input means 113 may also include, for example, a keyboard, a mouse, and the like.
The output means 114 may output various information to the outside, including determined distance information, direction information, and the like. The output means 114 may include, for example, a display, a speaker, a printer, a communication network and the remote output devices connected to it, and so on.
Of course, for simplicity, FIG. 8 shows only some of the components in the robot device 11 that are relevant to the present disclosure, omitting components such as buses and input/output interfaces. In addition, the robot device 11 may include any other appropriate components depending on the specific application.
In addition to the above methods and devices, an embodiment of the present disclosure may also be a computer program product, which includes computer program instructions that, when run by a processor, cause the processor to perform the steps of the control method supporting dynamic intents according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
The computer program product may write the program code for performing the operations of the embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a standalone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
Furthermore, an embodiment of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; when run by a processor, the instructions cause the processor to perform the steps of the control method supporting dynamic intents according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
The computer-readable storage medium may adopt any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The basic principles of the present disclosure have been described above with reference to specific embodiments. However, it should be pointed out that the merits, advantages, effects, and the like mentioned in the present disclosure are merely examples rather than limitations, and cannot be considered necessary for every embodiment of the present disclosure. In addition, the specific details disclosed above are only for the purposes of illustration and ease of understanding, not limitation; they do not restrict the present disclosure to being implemented with those specific details.
The block diagrams of the devices, apparatuses, equipment, and systems involved in the present disclosure are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these devices, apparatuses, equipment, and systems may be connected, arranged, and configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms meaning "including but not limited to" and may be used interchangeably with that phrase. The words "or" and "and" as used here mean "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The word "such as" as used here means the phrase "such as but not limited to" and may be used interchangeably with it.
It should also be pointed out that in the apparatuses, devices, and methods of the present disclosure, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined here may be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown here, but is to be accorded the widest scope consistent with the principles and novel features disclosed here.
The above description has been given for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present disclosure to the forms disclosed here. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (10)

  1. A control method supporting dynamic intents, characterized in that the method comprises:
    acquiring input information;
    recognizing intent information represented by the input information;
    determining a corresponding blueprint node according to the intent information; and
    invoking the blueprint node to execute blueprint logic, so as to control execution of a behavior operation matching the intent information.
  2. The method according to claim 1, characterized in that the intent information is an intent structure comprising an intent name and intent parameters.
  3. The method according to claim 2, characterized in that determining the corresponding blueprint node according to the intent information comprises:
    determining the corresponding blueprint node according to the intent name in the intent structure; and
    controlling parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
  4. The method according to claim 1, characterized in that invoking the blueprint node to execute blueprint logic comprises:
    triggering the blueprint node; and
    when the blueprint node is triggered, executing a next blueprint node corresponding to the blueprint node.
  5. The method according to any one of claims 1 to 4, characterized in that recognizing the intent information represented by the input information comprises:
    sending the input information to a cloud; and
    receiving the intent information obtained after the cloud performs intent recognition on the input information.
  6. A control apparatus supporting dynamic intents, characterized in that the apparatus comprises:
    an acquisition module, configured to acquire input information;
    an intent recognition module, configured to recognize intent information represented by the input information;
    a blueprint module, configured to determine a corresponding blueprint node according to the intent information; and
    a control execution module, configured to invoke the blueprint node to execute blueprint logic, so as to control execution of a behavior operation matching the intent information.
  7. The apparatus according to claim 6, characterized in that the intent information is an intent structure comprising an intent name and intent parameters.
  8. The apparatus according to claim 7, characterized in that the blueprint module comprises:
    a blueprint node determination unit, configured to determine the corresponding blueprint node according to the intent name in the intent structure; and
    a control unit, configured to control parameters of the blueprint node to be consistent with the intent parameters in the intent structure.
  9. A control apparatus supporting dynamic intents, characterized by comprising: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the control method supporting dynamic intents according to any one of claims 1 to 5.
  10. A computer-readable storage medium, characterized in that the storage medium comprises a set of computer-executable instructions which, when executed, are used to perform the control method supporting dynamic intents according to any one of claims 1 to 5.
PCT/CN2021/120604 2020-10-12 2021-09-26 Control method and apparatus supporting dynamic intents, and storage medium WO2022078189A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011083939.0A CN112306236B (zh) 2020-10-12 2020-10-12 Control method and apparatus supporting dynamic intents, and storage medium
CN202011083939.0 2020-10-12

Publications (1)

Publication Number Publication Date
WO2022078189A1 true WO2022078189A1 (zh) 2022-04-21

Family

ID=74488410

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/120604 WO2022078189A1 (zh) 2020-10-12 2021-09-26 Control method and apparatus supporting dynamic intents, and storage medium

Country Status (2)

Country Link
CN (1) CN112306236B (zh)
WO (1) WO2022078189A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306236B (zh) * 2020-10-12 2022-09-06 达闼机器人股份有限公司 Control method and apparatus supporting dynamic intents, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190384630A1 (en) * 2018-06-19 2019-12-19 Sap Se Service blueprint creation for complex service calls
CN111143523A (zh) * 2019-12-02 2020-05-12 北京声智科技有限公司 Intent confirmation method and apparatus
CN111552238A (zh) * 2020-04-17 2020-08-18 达闼科技(北京)有限公司 Robot control method and apparatus, computing device, and computer storage medium
CN112306236A (zh) * 2020-10-12 2021-02-02 达闼机器人有限公司 Control method and apparatus supporting dynamic intents, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106914018B (zh) * 2017-03-07 2018-01-30 深圳前海小橙网科技有限公司 Implementation method and system of interactive virtual reality based on UE4
CN108579086B (zh) * 2018-03-27 2019-11-08 腾讯科技(深圳)有限公司 Object processing method and apparatus, storage medium, and electronic apparatus
CN111494957B (zh) * 2020-04-17 2023-04-07 网易(杭州)网络有限公司 Data processing method, apparatus, device, and storage medium for game scenes


Also Published As

Publication number Publication date
CN112306236B (zh) 2022-09-06
CN112306236A (zh) 2021-02-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21879235; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21879235; Country of ref document: EP; Kind code of ref document: A1)