CN117400238A - Task execution method of robot, robot and storage medium - Google Patents

Info

Publication number: CN117400238A
Application number: CN202311247246.4A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 温焕宇, 赖有仿, 何婉君, 焦继超, 韦和钧
Original and current assignee: Ubtech Robotics Corp (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Prior art keywords: key value, instruction, standard, robot, key

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00: Programme-controlled manipulators
    • B25J 9/16: Programme controls
    • B25J 9/1602: Programme controls characterised by the control system, structure, architecture


Abstract

The application is applicable to the technical field of robots and provides a task execution method for a robot, a robot, and a storage medium. The method comprises the following steps: acquiring a voice task instruction input by a user, the voice task instruction instructing the robot to execute a task; extracting a first key value from the voice task instruction; searching for a standard key value matching the first key value, the standard key value being pre-stored, accurate semantic information that the robot can understand; determining, according to the matched standard key value, the user intention corresponding to the voice task instruction; and executing the task indicated by the voice task instruction based on that user intention. Because the determined user intention is accurate, the robot can complete the task indicated by the voice task instruction quickly and correctly.

Description

Task execution method of robot, robot and storage medium
Technical Field
The application belongs to the technical field of robots, and in particular relates to a task execution method for a robot, a robot, and a storage medium.
Background
With the development of technology, robots are applied in more and more fields and can execute more and more tasks. At present, when a user controls a robot through a voice instruction, the robot often parses the instruction inaccurately, causing the task to be executed incorrectly or to fail. How to parse voice instructions accurately is therefore a problem that needs to be solved.
Disclosure of Invention
The embodiments of the present application provide a task execution method for a robot, a robot, and a storage medium, which can solve the problems of inaccurate parsing of voice instructions and unsatisfactory task completion by the robot.
In a first aspect, an embodiment of the present application provides a task execution method for a robot, comprising:
acquiring a voice task instruction input by a user, wherein the voice task instruction is an instruction for instructing a robot to execute a task;
extracting a first key value in the voice task instruction;
searching a standard key value matched with the first key value;
determining the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value;
based on the user intent, executing the task indicated by the voice task instruction.
In a second aspect, embodiments of the present application provide a robot, including:
the instruction acquisition module is used for acquiring a voice task instruction input by a user, wherein the voice task instruction is an instruction for instructing the robot to execute a task;
the key value extraction module is used for extracting a first key value in the voice task instruction;
the key value matching module is used for searching a standard key value matched with the first key value;
The intention determining module is used for determining the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value;
and the instruction execution module is used for executing the task indicated by the voice task instruction based on the user intention.
In a third aspect, an embodiment of the present application provides a robot, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the method of any one of the above first aspects when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the first aspects.
In a fifth aspect, embodiments of the present application provide a computer program product for, when run on a terminal device, causing the terminal device to perform the method of any one of the first aspects.
Compared with the prior art, the first aspect of the present application has the following beneficial effects: after the voice task instruction is acquired, the first key value in the instruction is extracted and matched against the standard key values. Because a standard key value is accurate semantic information that the robot can understand, the accurate user intention expressed by the voice task instruction can be determined from the matched standard key value. And because the determined user intention is accurate, the robot can complete the task indicated by the voice task instruction quickly and correctly.
It will be appreciated that the advantages of the second to fifth aspects may be found in the relevant description of the first aspect, and are not described here again.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments or by the description of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of a task execution method of a robot according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for determining a key value in a voice task instruction according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for determining a user intention according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a method for determining key information according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a method for generating standard key values according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a task execution method of a robot according to another embodiment of the present application;
Fig. 7 is a schematic structural diagram of a robot according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a robot according to another embodiment of the present application.
Detailed Description
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used in this specification and the appended claims, the term "if" may be interpreted in context as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted in context as "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise.
At present, an autonomous mobile robot relies on specific coordinates set by a person when navigating. The robot can only navigate to the set coordinates and cannot understand any further information about them, so it cannot exhibit more intelligent behaviour; for example, it cannot navigate autonomously and interact with people according to voice instructions.
To address these shortcomings, this application constructs an intelligent interaction and navigation system for the robot, realising the function of completing tasks through autonomous navigation. Specifically, the robot in this application can, according to a voice task instruction, move in front of a certain object or into a certain room and then perform a corresponding action, for example taking an item or pressing a certain key. In addition, the robot in this application can interact with a person according to the voice instruction, which helps it complete the task better.
The task execution method of a robot provided in the present application is described in detail below, taking the process of a navigation robot executing a navigation task as a specific example.
Fig. 1 shows a schematic flowchart of the task execution method of a robot provided in the present application. With reference to Fig. 1, the method is described in detail as follows:
s101, acquiring a voice task instruction input by a user, wherein the voice task instruction is an instruction for instructing a robot to execute a task.
In this embodiment, the user issues a voice task instruction, and a voice acquisition module on the robot captures it.
By way of example, the voice task instruction may be "go to the nearest table", "go to the front of the refrigerator in office No. 1", "help me fetch a cup of water", or "play a song".
S102, extracting a first key value in the voice task instruction.
In this embodiment, the robot analyses the voice task instruction and screens out its key values as required. In this application, a key value screened from the voice task instruction is denoted the first key value. To some extent a key value can be understood as a keyword: it represents a semantic feature of the voice task instruction.
Specifically, the robot analyses the part of speech of each word in the voice task instruction to determine its verbs, nouns, adjectives, and so on, and analyses the grammatical structure of the instruction to obtain its subject, predicate, and so on. According to the grammatical structure and the part of speech of each word, the words meeting a preset requirement are determined and recorded as first key values; the preset requirement may include, for example, being a verb or being a noun in the predicate position.
The first key values are the main words in the voice task instruction that are useful for resolving the user's intention.
By way of example, the first key values in "go to the nearest table" are "go" and "table". The first key values in "go to the front of the refrigerator in office No. 1" are "go", the serial-number text "No. 1", "office", and "refrigerator". The first key values in "help me fetch a cup of water" are "fetch" and "water".
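The screening of first key values described above can be sketched as follows. This is a toy illustration only: a real system would use a part-of-speech tagger and grammatical analysis, whereas here a small hand-made lexicon (every word list is an assumption) stands in for both.

```python
# Toy sketch of step S102: screen out verbs and task objects as first key values.
# The lexicons below are hypothetical stand-ins for real POS/grammar analysis.
ACTION_WORDS = {"go", "take", "fetch", "play"}                       # verbs
OBJECT_WORDS = {"table", "refrigerator", "office", "water", "song"}  # task objects

def extract_first_key_values(instruction_text: str) -> list[str]:
    """Return the words in the instruction that qualify as first key values."""
    tokens = instruction_text.lower().split()
    return [t for t in tokens if t in ACTION_WORDS or t in OBJECT_WORDS]
```

For the examples above, `extract_first_key_values("go to the nearest table")` yields `["go", "table"]`.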
S103, searching for a standard key value matching the first key value, wherein the standard key value is pre-stored, accurate semantic information that can be understood by the robot.
In this embodiment, the standard key values are stored in the robot in advance. A standard key value may be generated from a semantic map or set manually according to the robot's surroundings, and it reflects the user's intention to some extent.
Each standard key value corresponds to one or more keywords. When searching for the standard key value matching a first key value, if the first key value appears among the keywords corresponding to a standard key value, that standard key value is taken as the match.
By way of example, in "go to the nearest table", the first key value "go" matches the standard key value "navigate", whose intention is navigation, and "table" matches the standard key value "table", whose intention is a table. In "go to the front of the refrigerator in office No. 1", "go" matches "navigate", the serial-number text matches "No. 1", "office" matches "office", and "refrigerator" matches "refrigerator". In "help me fetch a cup of water", "fetch" matches "navigate" and "water" matches "water".
In this embodiment, the standard key values may be stored in a hash table; the first key value is decoded and matched against the standard key values in the hash table to obtain the standard key value that matches it.
S104, determining the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value.
In this embodiment, the user intention is determined based on the intentions represented by the determined standard key values and on the voice task instruction.
By way of example, for a navigation robot, the user intention of "go to the nearest table" is "navigate to the front of the nearest table". The user intention of "go to the front of the refrigerator in office No. 1" is "navigate to office No. 1 and then to the front of the refrigerator". The user intention of "help me fetch a cup of water" is "navigate to the front of the nearest water".
In this embodiment, the user intention may further include the key information required to execute it. This key information is determined based on the standard key values; the key information corresponding to each standard key value is preset. Key information may be spatial information of objects, category information of objects, attribute relationships between objects, and so on.
By way of example, the key information in the intention "go to the front of the nearest table" may include the spatial coordinates of the table. The key information in the intention "go to the front of the refrigerator in office No. 1" may include the spatial coordinates and serial number of the office, the spatial coordinates of the refrigerator, the positional relationship between the office and the refrigerator, and so on. The key information in the intention "help me fetch a cup of water" may include the position coordinates of the water, and so on.
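The preset key information attached to each standard key value might be represented as records like the following; all coordinates, categories, and relations here are invented purely for illustration.

```python
# Hypothetical key-information records, preset per standard key value.
# Coordinates, categories, and room relations below are made-up examples.
KEY_INFORMATION = {
    "table":        {"category": "furniture", "coords": (3.2, 1.5)},
    "refrigerator": {"category": "appliance", "coords": (7.0, 2.4), "room": "office_1"},
    "office_1":     {"category": "room", "coords": (6.0, 2.0)},
    "water":        {"category": "item", "coords": (5.5, 0.8)},
}

def key_info_for(standard_key: str) -> dict:
    """Look up the preset key information for a standard key value."""
    return KEY_INFORMATION.get(standard_key, {})
```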
S105, executing the task indicated by the voice task instruction based on the user intention.
In this embodiment, the robot executes the corresponding task according to the user intention. Specifically, the robot navigates autonomously to complete the task according to the user intention and a pre-stored map, which may be a semantic map or a plane map.
In the embodiments of the present application, after the voice task instruction is acquired, the first key value in the instruction is extracted and matched with a standard key value. Because the standard key value is accurate semantic information that the robot can understand, the accurate user intention expressed by the voice task instruction can be determined from it; and because the determined user intention is accurate, the robot can complete the task indicated by the voice task instruction quickly and correctly.
In one possible implementation, to extract the first key value from the voice task instruction, the instruction needs to be converted into text information, and the text information is then recognised to obtain the first key value.
As shown in Fig. 2, step S102 may specifically include:
S1021, converting the voice task instruction into text information.
Specifically, a voice task instruction is input into a text conversion model to obtain text information.
S1022, searching verbs and task objects in the text information, and taking the verbs and the task objects in the text information as first key values in the voice task instruction.
In this embodiment, the verbs in the text information are extracted. A verb is a word representing an action the robot needs to perform, for example "take", "go", and "fetch"; the extracted verbs are recorded as first key values.
The task objects in the text information are also extracted. A task object is an object on which the robot executes the task, for example "table", "chair", and "office"; the extracted task objects are likewise recorded as first key values.
In this embodiment, the verbs in the voice task instruction represent the actions the robot needs to perform, and the task objects represent the objects on which the task is performed, so both play a key role in the user intention of the instruction. Taking the verbs and task objects as the first key values therefore allows the first key values to reflect the user intention accurately.
As shown in Fig. 3, in one possible implementation, the implementation procedure of step S104 may include:
S1041, replacing the first key value in the voice task instruction with its matched standard key value to obtain a standard instruction corresponding to the voice task instruction.
In this embodiment, each first key value in the voice task instruction is replaced by its corresponding standard key value, and the replaced instruction is recorded as the standard instruction.
For example, in "go to the nearest table", the first key value "go" matches the standard key value "navigate" and "table" matches the standard key value "table"; replacing "go" with "navigate" yields the standard instruction "navigate to the nearest table".
In one implementation, only the standard key values that differ from their first key values are searched for, and each such first key value is replaced by the corresponding differing standard key value.
In this application, the first key values are replaced by standard key values. Because a standard key value is a key value the robot can understand, the resulting standard instruction is one the robot can understand accurately.
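Step S1041 amounts to a token-by-token substitution. A minimal sketch, with the match table assumed to have been produced by the earlier lookup step:

```python
# Sketch of step S1041: replace each first key value in the (tokenised) voice
# task instruction with its matched standard key value.
def to_standard_instruction(tokens: list[str], matches: dict[str, str]) -> list[str]:
    """Tokens without a match are kept unchanged."""
    return [matches.get(token, token) for token in tokens]
```

For example, replacing "go" with "navigate" and "desk" with "table" turns "go to the nearest desk" into the standard instruction "navigate to the nearest table".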
S1042, obtaining the key information corresponding to the standard key value matched with the first key value, wherein the key information includes the spatial position information of the task object in the standard instruction.
In this embodiment, the key information is the information required to execute the intention characterised by a standard key value; each standard key value characterises one intention. The spatial position information may include the position coordinates of the task object, the spatial positional relationships between the task object and other objects, and so on.
The key information corresponding to each standard key value is preset; the standard key values and the corresponding key information may be extracted from a semantic map or set manually. Together, the standard key values and their key information form a visual semantic dictionary, which may be expressed in the form of a hash table. When searching for the standard key value corresponding to a first key value, the standard key values and key information in the hash table are decoded, and the decoded standard key values are matched against the first key value.
S1043, generating the user intention corresponding to the voice task instruction based on the standard instruction and the key information.
In this embodiment, the standard instruction and the key information are spliced together to generate the user intention.
In the embodiments of the present application, the standard instruction is generated from the standard key values matched with the first key values, the key information required to execute the standard instruction is looked up from those same standard key values, and the user intention is finally generated from the standard instruction and the key information. The resulting user intention is therefore more accurate and more comprehensive, which makes it easier for the robot to execute the task correctly.
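The splicing of step S1043 could look like the following; the shape of the intention record is an assumption for illustration.

```python
# Sketch of step S1043: splice the standard instruction and the key information
# of its standard key values into one user-intention record (shape is assumed).
def build_user_intent(standard_tokens: list[str], key_info: dict) -> dict:
    return {
        "instruction": " ".join(standard_tokens),  # the standard instruction
        "key_info": key_info,                      # info needed to execute it
    }
```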
As shown in Fig. 4, in one possible implementation, the implementation procedure of step S1042 may include:
S10421, extracting the action key value and the target object key value from the standard key values matched with the first key values, wherein the target object key value is a key value representing the task object.
In this embodiment, the attributes of the standard key values are the same as those of the first key values. Since the first key values are the verbs and task objects found in the voice task instruction, verbs and task objects are likewise used when the standard key values are set.
The target object key value may include the name and number of the task object. For example, the target object key value may be "table No. 1" or "chair No. 2".
S10422, searching for the instruction execution interface corresponding to the action key value, wherein the instruction execution interface is used to execute the standard instruction, and the key information corresponding to the action key value includes the instruction execution interface.
In this embodiment, the action key value characterises an action the robot needs to execute, and the robot needs a corresponding interface to execute that action. The instruction execution interface needed to execute the action, that is, the interface needed to execute the standard instruction, can therefore be found from the action key value. For example, when the action key value is "navigate", the corresponding instruction execution interface is the navigation interface.
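The action-key-to-interface lookup of step S10422 is naturally a dispatch table. The interface functions below are hypothetical stand-ins that merely return a string describing what a real interface would do:

```python
# Sketch of step S10422: the key information of an action key value includes the
# instruction execution interface. The interfaces below are stand-in functions.
def navigation_interface(target: str) -> str:
    return f"navigating to {target}"

def music_interface(target: str) -> str:
    return f"playing {target}"

INSTRUCTION_INTERFACES = {
    "navigate": navigation_interface,  # e.g. action key value "navigate"
    "play": music_interface,           # e.g. action key value "play"
}

def execute_action(action_key: str, target: str) -> str:
    """Look up the interface for an action key value and invoke it."""
    return INSTRUCTION_INTERFACES[action_key](target)
```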
S10423, searching for the spatial position information of the task object represented by the target object key value, wherein the key information corresponding to the target object key value includes the spatial position information of the task object.
In this embodiment, the target object key value represents the task object, and the task object is the object on which the robot executes the task. If the robot is required to navigate to the task object, it needs to know the position of the task object, the positional relationships between the task object and other objects, and so on. Therefore, if a standard key value is a target object key value, its corresponding key information needs to include the spatial position information of the task object.
As shown in Fig. 5, in one possible implementation, the standard key values and the corresponding key information may be determined based on a semantic map. Specifically, the method for determining them comprises the following steps:
S201, acquiring a semantic map of a task area where the robot is located.
In this embodiment, the semantic map is preset. While executing a voice task instruction, the robot can collect environment information of the task area and use it to optimise the semantic map.
S202, generating a standard key value of an object based on identification information of the object existing in the semantic map.
In this embodiment, the identification information may include the name and number of the object, and it is used as the standard key value of the object; for example, "table No. 3" is a standard key value.
S203, generating key information of the object based on the spatial position information of the object in the semantic map.
In one embodiment, key information for the object is generated based on spatial location information of the object, a category of the object, and the like.
S204, acquiring each instruction execution interface in the robot.
In this embodiment, the instruction execution interface of the robot may include a navigation interface, a music playing interface, and the like.
S205, setting a standard key value corresponding to the instruction execution interface based on the action which can be executed by the instruction execution interface, wherein key information corresponding to the standard key value of the instruction execution interface is the instruction execution interface.
In this embodiment, the standard key value of an instruction execution interface characterises the instructions that the interface can execute; for example, the standard key value of the navigation interface may be set to "navigate".
After the standard key value corresponding to an instruction execution interface is set, keywords for that standard key value may also be set; the standard key value may correspond to one or more keywords. For example, the keywords corresponding to the standard key value "navigate" may include "go", "take", and so on.
In the embodiments of the present application, the standard key values and key information of the objects in the task area are determined from the semantic map, so the determined standard key values and key information are more accurate and better match the environment of the task area. Setting standard key values for the instruction execution interfaces allows the robot to determine, from a standard key value, the interface required to execute a task, so that the task can be completed smoothly.
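Steps S201 to S205 can be sketched as building one dictionary from a semantic map plus the robot's interfaces. The semantic-map record shape and the "name No. number" key scheme below are assumptions:

```python
# Sketch of steps S201-S205: derive object standard key values and key
# information from a semantic map, then add interface entries. The map record
# shape and the "name No. number" key format are hypothetical.
SEMANTIC_MAP = [
    {"name": "table", "number": 3, "coords": (3.2, 1.5)},
    {"name": "refrigerator", "number": 1, "coords": (7.0, 2.4)},
]

def build_visual_semantic_dictionary(semantic_map: list[dict],
                                     interfaces: dict[str, str]) -> dict:
    dictionary = {}
    for obj in semantic_map:                          # S202/S203: objects
        key = f"{obj['name']} No. {obj['number']}"    # identification info as key
        dictionary[key] = {"coords": obj["coords"]}   # spatial key information
    for action_key, interface in interfaces.items():  # S204/S205: interfaces
        dictionary[action_key] = {"interface": interface}
    return dictionary
```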
In one possible implementation, if every first key value finds a matching standard key value, the robot can determine the user intention from the converted voice task instruction, so the implementation of step S104 may further include:
If all the first key values match standard key values, determining the user intention corresponding to the voice task instruction according to the standard key values matched with the first key values.
In one possible implementation, if there is a first key value that matches no standard key value, an accurate user intention cannot be obtained from the standard key values, and the robot may prompt the user to re-input the voice task instruction.
Specifically, after step S103, the above method may further include:
If there is a first key value that matches no standard key value, outputting prompt information, the prompt information being used to prompt the user so that the user intention of the voice task instruction can be determined.
In one embodiment, if none of the first key values matches a standard key value, first prompt information is output. The first prompt information prompts the user that the robot cannot determine the user intention expressed by the voice task instruction, that is, no user intention exists in the instruction, and the user needs to reissue the voice task instruction.
For example, if the voice task instruction is "help me fetch water" and neither of the first key values "fetch" and "water" can be matched to a standard key value, the robot determines that no user intention exists.
In one embodiment, if some first key values match standard key values and others do not, second prompt information is output. The second prompt information prompts the user that the robot cannot accurately determine the user intention expressed by the voice task instruction, that is, a user intention exists but is ambiguous, and the user needs to adjust the voice task instruction to give more detailed information.
For example, if the voice task instruction is "go to the water", the first key value "go" matches a standard key value but "water" does not, that is, there is no water near the robot. The user intention of the instruction is therefore ambiguous, and the robot may output "I found no water nearby, please restate your request" to remind the user to re-input the voice task instruction.
If the voice task instruction is "go to the front of the refrigerator in office No. 1" and the first key values "go" and "office No. 1" match standard key values, but "refrigerator" does not because there are several refrigerators in office No. 1, the user intention of the instruction is ambiguous. The robot may output prompt information asking which refrigerator in office No. 1 it should go to, reminding the user to re-input the voice task instruction.
In the embodiments of the present application, when the robot cannot obtain an accurate user intention, it prompts the user that the task cannot be completed. This achieves interaction between the robot and the user and improves the degree of intelligence of the robot.
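The three matching outcomes described above can be sketched as follows. The function name, return codes, and prompt wording are illustrative assumptions, not the patent's actual implementation:

```python
# Classify the outcome of matching first key values against standard key values:
#   no match      -> no user intention exists (first prompt message)
#   partial match -> intention exists but is ambiguous (second prompt message)
#   full match    -> an accurate user intention can be determined
def classify_match(first_key_values, standard_key_values):
    unmatched = [k for k in first_key_values if k not in standard_key_values]
    if len(unmatched) == len(first_key_values):
        return "no_intent", "I cannot understand the request, please repeat it."
    if unmatched:
        return "ambiguous", "Please give more detail about: " + ", ".join(unmatched)
    return "resolved", None

standard = {"go", "office one"}
assert classify_match(["take", "water"], standard)[0] == "no_intent"
assert classify_match(["go", "water"], standard)[0] == "ambiguous"
assert classify_match(["go", "office one"], standard)[0] == "resolved"
```

The assertions mirror the two examples above: "help me take water" matches nothing, while "go next to the water" matches only the action.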
As shown in fig. 6, in one possible implementation manner, the method may further include:
The semantic map is encoded in advance to obtain a visual semantic dictionary. The visual semantic dictionary includes the standard key values of objects and the key information corresponding to those standard key values, both determined based on the semantic map, as well as the manually set standard key values of the instruction execution interfaces and the key information corresponding to those standard key values.
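A minimal sketch of what such a visual semantic dictionary could look like in code follows; the data layout, labels, and coordinates are illustrative assumptions only:

```python
# Build a toy visual semantic dictionary: object entries come from the semantic
# map and carry spatial positions as key information; action entries are set
# manually and carry instruction execution interfaces as key information.
def build_visual_semantic_dictionary(semantic_map, action_interfaces):
    dictionary = {}
    for obj in semantic_map:
        dictionary[obj["label"]] = {"type": "object", "position": obj["position"]}
    for action, interface in action_interfaces.items():
        dictionary[action] = {"type": "action", "interface": interface}
    return dictionary

semantic_map = [
    {"label": "water dispenser", "position": (3.2, 1.5, 0.0)},
    {"label": "refrigerator", "position": (7.8, 2.1, 0.0)},
]
action_interfaces = {"go to": "navigate_to", "fetch": "grasp_and_deliver"}
dictionary = build_visual_semantic_dictionary(semantic_map, action_interfaces)
```

First key values extracted from an instruction can then be looked up directly in `dictionary` to find their standard key values and key information.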
After receiving a voice task instruction input by a user, the robot converts the voice task instruction into text information and extracts a first key value in the text information.
And matching the first key values in the text information with the standard key values in the visual semantic dictionary to obtain the standard key values matched with the first key values.
Whether the user intention of the voice task instruction can be obtained is then determined according to the standard key values and their corresponding key information. If every first key value is matched with a standard key value, an accurate user intention can be obtained. If none of the first key values is matched with a standard key value, it is determined that no user intention exists in the voice task instruction. If N first key values are matched with standard key values and M first key values are not, where N and M are both greater than 0, it is determined that the user intention in the voice task instruction is ambiguous.
If the accurate user intention is obtained, the robot navigates to the target point according to the user intention and the semantic map.
If no user intention exists in the voice task instruction, the robot prompts the user that the user intention cannot be determined and requests that the voice task instruction be input again.
If the user intention in the voice task instruction is ambiguous, the robot prompts the user that a clear user intention cannot be determined and asks the user to give a more detailed task instruction.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present application in any way.
Corresponding to the task execution method of the robot described in the above embodiments, fig. 7 shows a block diagram of the robot provided in the embodiment of the present application, and for convenience of explanation, only the portions related to the embodiment of the present application are shown.
Referring to fig. 7, the robot 300 may include: an instruction acquisition module 310, a key value extraction module 320, a key value matching module 330, an intention determination module 340, and an instruction execution module 350.
The instruction acquisition module 310 is configured to acquire a voice task instruction input by a user, where the voice task instruction is an instruction instructing the robot to execute a task;
the key value extraction module 320 is configured to extract a first key value in the voice task instruction;
the key value matching module 330 is configured to search for a standard key value matched with the first key value;
the intention determination module 340 is configured to determine the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value;
and the instruction execution module 350 is configured to execute the task indicated by the voice task instruction based on the user intention.
In one possible implementation, the key value extraction module 320 may be specifically configured to:
converting the voice task instruction into text information;
and searching verbs and task objects in the text information, and taking the verbs and the task objects in the text information as first key values in the voice task instruction.
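A toy sketch of extracting the first key values (verb and task object) from the recognized text follows. A real system would use a part-of-speech tagger; this rule-based version, its verb list, and its stop-word list are assumptions for illustration only:

```python
# Hypothetical first-key-value extraction: pick out known verbs, then treat the
# remaining content words as candidate task objects.
KNOWN_VERBS = {"go", "fetch", "take", "bring"}
STOP_WORDS = {"me", "to", "the", "a"}

def extract_first_key_values(text):
    words = text.lower().split()
    verbs = [w for w in words if w in KNOWN_VERBS]
    objects = [w for w in words if w not in KNOWN_VERBS and w not in STOP_WORDS]
    return verbs + objects

print(extract_first_key_values("fetch me the water"))  # -> ['fetch', 'water']
```

The verb and the task object together form the first key values that are subsequently matched against the standard key values.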
In one possible implementation, the intent determination module 340 may be specifically configured to:
replacing the first key value in the voice task instruction by using a standard key value matched with the first key value to obtain a standard instruction corresponding to the voice task instruction;
acquiring key information corresponding to a standard key value matched with the first key value, wherein the key information comprises spatial position information of a task object in the standard instruction;
And generating the user intention corresponding to the voice task instruction based on the standard instruction and the key information.
In one possible implementation, the intent determination module 340 may be specifically configured to:
extracting an action key value and a target object key value in a standard key value matched with the first key value, wherein the target object key value is a key value representing the task object;
searching an instruction execution interface corresponding to the action key value, wherein the instruction execution interface is used for executing the standard instruction, and the key information corresponding to the action key value comprises the instruction execution interface;
and searching the space position information of the task object represented by the target object key value, wherein the key information corresponding to the target object key value comprises the space position information of the task object.
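The two lookups above, action key value to instruction execution interface and target object key value to spatial position, can be sketched as follows; the dictionary layout and all names are illustrative assumptions:

```python
# Obtain key information for matched standard key values: the action key value
# resolves to an instruction execution interface, and the target object key
# value resolves to the task object's spatial position.
def get_key_information(action_key, object_key, dictionary):
    interface = dictionary[action_key]["interface"]  # used to execute the standard instruction
    position = dictionary[object_key]["position"]    # spatial position of the task object
    return {"interface": interface, "target_position": position}

dictionary = {
    "go to": {"type": "action", "interface": "navigate_to"},
    "refrigerator": {"type": "object", "position": (7.8, 2.1, 0.0)},
}
info = get_key_information("go to", "refrigerator", dictionary)
```

The resulting key information, together with the standard instruction, is what the intention determination step combines into the user intention.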
In one possible implementation, the intent determination module 340 may be specifically configured to:
and if all the first key values are matched with the standard key values, determining the user intention corresponding to the voice task instruction according to the standard key values matched with the first key values.
In one possible implementation, the robot 300 may further include, connected to the key value matching module 330:
a prompt output module, configured to output prompt information if a first key value that is not matched with a standard key value exists, where the prompt information is used to prompt the user that the user intention of the voice task instruction cannot be determined.
In one possible implementation, the robot 300 may further include, connected to the key value matching module 330:
the map acquisition module is used for acquiring a semantic map of a task area where the robot is located;
the first standard key value generation module is used for generating a standard key value of an object based on identification information of the object existing in the semantic map;
the key information generation module is used for generating key information of the object based on the spatial position information of the object in the semantic map;
the interface acquisition module is used for acquiring each instruction execution interface in the robot;
and the second standard key value generation module is used for setting a standard key value corresponding to the instruction execution interface based on an action which can be executed by the instruction execution interface, wherein key information corresponding to the standard key value of the instruction execution interface is the instruction execution interface.
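The five processing modules of robot 300 can be chained as in the following toy sketch. The class and method names mirror the module names above but are assumptions, and speech recognition is replaced by plain text for brevity:

```python
# Hedged sketch of the module pipeline of robot 300; intent lookup and the
# stop-word set are simplified stand-ins for the mechanisms described above.
class Robot300:
    def __init__(self, standard_key_values, intents):
        self.standard_key_values = standard_key_values
        self.intents = intents  # hypothetical: tuple of matched keys -> intent
        self.executed = []

    def acquire_instruction(self, audio):   # instruction acquisition module 310
        return audio  # assume speech has already been recognized to text

    def extract_key_values(self, text):     # key value extraction module 320
        return [w for w in text.split() if w not in {"me", "the"}]

    def match_key_values(self, keys):       # key value matching module 330
        return [k for k in keys if k in self.standard_key_values]

    def determine_intent(self, matched):    # intention determination module 340
        return self.intents.get(tuple(matched))

    def execute(self, intent):              # instruction execution module 350
        self.executed.append(intent)

robot = Robot300({"fetch", "water"}, {("fetch", "water"): "deliver water to user"})
text = robot.acquire_instruction("fetch me the water")
matched = robot.match_key_values(robot.extract_key_values(text))
robot.execute(robot.determine_intent(matched))
```

Running the pipeline on "fetch me the water" ends with the intent "deliver water to user" recorded in `robot.executed`.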
It should be noted that, because the information interaction and execution processes between the above devices/units are based on the same concept as the method embodiments of the present application, their specific functions and technical effects may be found in the method embodiment section and are not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The embodiment of the present application also provides a robot. Referring to fig. 8, the robot 400 may include: at least one processor 410, a memory 420, and a computer program stored in the memory 420 and executable on the at least one processor 410. When executing the computer program, the processor 410 implements the steps of any of the method embodiments described above, such as steps S101 to S105 in the embodiment shown in fig. 1. Alternatively, when executing the computer program, the processor 410 may implement the functions of the modules/units of the apparatus embodiments described above, such as the functions of the instruction acquisition module 310 through the instruction execution module 350 of fig. 7.
By way of example, a computer program may be partitioned into one or more modules/units that are stored in memory 420 and executed by processor 410 to complete the present application. The one or more modules/units may be a series of computer program segments capable of performing specific functions for describing the execution of the computer program in the robot 400.
It will be appreciated by those skilled in the art that fig. 8 is merely an example of a robot and does not limit the robot 400, which may include more or fewer components than shown, combine certain components, or use different components, such as input/output devices, network access devices, and buses.
The processor 410 may be a central processing unit (CPU), or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 420 may be an internal storage unit of the robot 400, or may be an external storage device of the robot 400, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card. The memory 420 is used to store the computer program as well as other programs and data required by the robot 400. The memory 420 may also be used to temporarily store data that has been output or is to be output.
The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The task execution method of the robot provided by the embodiments of the present application can be applied to terminal devices such as computers, tablet computers, notebook computers, netbooks, and personal digital assistants (PDAs); the embodiments of the present application do not limit the specific type of the terminal device.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed terminal device, apparatus and method may be implemented in other manners. For example, the above-described embodiments of the terminal device are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and when executed by one or more processors, implements the steps of each of the method embodiments described above.
The present application also provides a computer program product; when the computer program product runs on a terminal device, the terminal device is caused to execute the steps of the various method embodiments described above.
The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only for illustrating the technical solutions of the present application and are not limiting. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A task execution method of a robot, comprising:
acquiring a voice task instruction input by a user, wherein the voice task instruction is an instruction for instructing a robot to execute a task;
extracting a first key value in the voice task instruction;
searching for a standard key value matched with the first key value, wherein the standard key value is pre-stored, accurate semantic information that can be understood by the robot;
determining the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value;
based on the user intent, executing the task indicated by the voice task instruction.
2. The method of claim 1, wherein the extracting the first key value in the voice task instruction comprises:
converting the voice task instruction into text information;
and searching verbs and task objects in the text information, and taking the verbs and the task objects in the text information as first key values in the voice task instruction.
3. The method of claim 1, wherein the determining the user intent corresponding to the voice task instruction based on the standard key value that matches the first key value comprises:
replacing the first key value in the voice task instruction by using a standard key value matched with the first key value to obtain a standard instruction corresponding to the voice task instruction;
acquiring key information corresponding to a standard key value matched with the first key value, wherein the key information comprises spatial position information of a task object in the standard instruction;
and generating the user intention corresponding to the voice task instruction based on the standard instruction and the key information.
4. The method of claim 3, wherein the obtaining key information corresponding to a standard key value that matches the first key value comprises:
extracting an action key value and a target object key value in a standard key value matched with the first key value, wherein the target object key value is a key value representing the task object;
searching an instruction execution interface corresponding to the action key value, wherein the instruction execution interface is used for executing the standard instruction, and the key information corresponding to the action key value comprises the instruction execution interface;
and searching the space position information of the task object represented by the target object key value, wherein the key information corresponding to the target object key value comprises the space position information of the task object.
5. The method of claim 1, wherein the determining the user intent corresponding to the voice task instruction based on the standard key value that matches the first key value comprises:
and if all the first key values are matched with the standard key values, determining the user intention corresponding to the voice task instruction according to the standard key values matched with the first key values.
6. The method of claim 1, wherein after the searching for a standard key value matched with the first key value, the method further comprises:
and if a first key value that is not matched with a standard key value exists, outputting prompt information, wherein the prompt information is used to prompt the user that the user intention of the voice task instruction cannot be determined.
7. The method of any of claims 1 to 6, wherein before the searching for a standard key value matched with the first key value, the method further comprises:
acquiring a semantic map of a task area where the robot is located;
generating a standard key value of an object based on identification information of the object existing in the semantic map;
generating key information of the object based on the spatial position information of the object in the semantic map;
acquiring each instruction execution interface in the robot;
and setting a standard key value corresponding to the instruction execution interface based on the action which can be executed by the instruction execution interface, wherein key information corresponding to the standard key value of the instruction execution interface is the instruction execution interface.
8. A robot, comprising:
the instruction acquisition module is used for acquiring a voice task instruction input by a user, wherein the voice task instruction is an instruction for instructing the robot to execute a task;
the key value extraction module is used for extracting a first key value in the voice task instruction;
the key value matching module is used for searching a standard key value matched with the first key value;
the intention determining module is used for determining the user intention corresponding to the voice task instruction according to the standard key value matched with the first key value;
And the instruction execution module is used for executing the task indicated by the voice task instruction based on the user intention.
9. A robot comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the method according to any one of claims 1 to 7.
CN202311247246.4A 2023-09-25 2023-09-25 Task execution method of robot, robot and storage medium Pending CN117400238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311247246.4A CN117400238A (en) 2023-09-25 2023-09-25 Task execution method of robot, robot and storage medium


Publications (1)

Publication Number Publication Date
CN117400238A true CN117400238A (en) 2024-01-16

Family

ID=89487988




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination