Robot voice instruction recognition method and related robot device
Technical Field
Embodiments of the invention relate to the field of robots, and in particular to a robot voice instruction recognition method and a related robot device.
Background
When a mobile robot capable of mechanical action is working, noise is inevitably generated by the operation of its servos and/or motors, or by its handling or cleaning of other objects (for example, opening a door, washing dishes or sweeping the floor). If a user issues a voice instruction to the robot during this process, the noise inevitably affects the accuracy of the robot's voice recognition. Even if a noise reduction algorithm is adopted, only steady-state noise can be filtered out; the variable and unpredictable noise produced while the robot works is difficult to filter. Another approach is for the robot to stop its current action after detecting that the user is speaking and to resume the action after receiving the voice instruction, but this inevitably reduces working efficiency.
In view of the above, overcoming these drawbacks of the prior art is an urgent problem in the art.
Disclosure of Invention
The technical problem mainly addressed by the embodiments of the invention is to provide a robot voice instruction recognition method and a related robot device that better balance voice instruction recognition accuracy against working efficiency.
To solve the above technical problem, one technical solution adopted by the embodiments of the invention is to provide a robot voice instruction recognition method, comprising: after it is detected that a user has uttered speech, and when the current voice instruction recognition accuracy is lower than a preset accuracy threshold, adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Wherein the method further comprises: acquiring the current noise level; and determining the current voice instruction recognition accuracy from the current noise level according to a correspondence between noise levels and recognition accuracy.
The preset accuracy threshold corresponds to the priority of the current task executed by the robot; the higher the priority of executing a task, the lower the corresponding accuracy threshold.
Wherein adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold comprises: determining an adjustment amplitude for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and adjusting the action of the robot according to the determined adjustment amplitude, so that the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Wherein the method further comprises: when the voice instruction recognition accuracy of the robot after adjustment does not reach the preset accuracy threshold, informing the user that the current environmental noise is high, or moving the robot closer to the user.
Wherein informing the user that the current environmental noise is high, or moving the robot closer to the user, when the voice instruction recognition accuracy of the robot after adjustment does not reach the preset accuracy threshold comprises: when the voice instruction recognition accuracy of the robot after adjustment does not reach the preset accuracy threshold and the priority of the currently executed task is lower than a preset priority, informing the user that the current environmental noise is high, or moving the robot closer to the user.
Wherein adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold comprises: adjusting the action of the robot step by step according to a preset adjustment amplitude until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Wherein the method further comprises: when the robot has stopped acting after its working state is adjusted and the voice instruction recognition accuracy still does not reach the preset accuracy threshold, informing the user that the current environmental noise is high, or moving the robot closer to the user.
Wherein adjusting the working state of the robot comprises one or more of the following: slowing down the action speed, reducing the rotating speed of a motor, and switching off servos in the non-human-voice frequency band.
To solve the above technical problem, another technical solution adopted by the embodiments of the invention is to provide a robot voice instruction recognition device, comprising: an adjusting module, configured to adjust the working state of the robot, after a user has uttered speech and when the current voice instruction recognition accuracy is lower than a preset accuracy threshold, until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Wherein the device further comprises: an acquisition module, configured to acquire the current noise level; and an accuracy determining module, configured to determine the current voice instruction recognition accuracy according to the current noise level and a correspondence between noise levels and recognition accuracy.
Wherein the adjusting module comprises: an amplitude determining unit, configured to determine an adjustment amplitude for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and a first adjusting unit, configured to adjust the action of the robot according to the determined adjustment amplitude, so that the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
In order to solve the above technical problem, another technical solution adopted by the embodiment of the present invention is: provided is a robot device including:
at least one processor; and
a memory coupled to the at least one processor; wherein,
the memory stores a program of instructions executable by the at least one processor to cause the at least one processor to:
after it is detected that a user has uttered speech, and when the current voice instruction recognition accuracy is lower than a preset accuracy threshold, adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Compared with the prior art, the embodiments of the invention have the following beneficial effects:
In the embodiments of the invention, after a voice instruction issued by the user is detected, and when the current voice instruction recognition accuracy is lower than the preset accuracy threshold, the robot device adjusts its own working state, for example by slowing down its action speed, so that the voice instruction recognition accuracy after adjustment reaches the preset accuracy threshold. The robot device can therefore receive the voice instruction more accurately without stopping its action, and voice instruction recognition accuracy and working efficiency are better balanced.
Drawings
FIG. 1 is a flow diagram of one embodiment of a method of robotic voice command recognition of the present invention;
FIG. 2 is a schematic diagram of a robot voice command recognition apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of one embodiment of the robot apparatus of the present invention.
Detailed Description
The embodiment of the invention provides a robot voice instruction recognition method, a robot voice instruction recognition device and a robot device.
Referring to FIG. 1, FIG. 1 is a flowchart of a robot voice instruction recognition method according to an embodiment of the present invention. As shown in FIG. 1, the present embodiment includes:
Step 101: after it is detected that a user has uttered speech, judging whether the current voice instruction recognition accuracy is lower than a preset accuracy threshold.
The execution subject of the present embodiment may specifically be a robot device. The robot device can continuously detect, while executing a task, whether the user is speaking; for example, when the user's voice is captured, the user is considered to be speaking, and the robot device then judges whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold.
Preferably, the preset accuracy threshold corresponds to the priority of the current task executed by the robot; the higher the priority of executing a task, the lower the corresponding accuracy threshold. Thus, the higher priority task is less likely to be disturbed by a voice instruction issued by the user.
For example, the task priorities of the robot are divided into five levels, 1, 2, 3, 4 and 5 (in practice, any number of levels may be used), where level 1 is the highest priority and level 5 is the lowest. The priority of a voice instruction may be defined to lie between two adjacent task priorities, such as 1.5, 2.5, 3.5 or 4.5, to ensure that the voice instruction never has the same priority as any task. The accuracy thresholds preset for the task priorities are x1, x2, x3, x4 and x5, corresponding to task priority levels 1 to 5 respectively, where x1 is the lowest accuracy threshold and x5 is the highest.
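The correspondence between task priority and accuracy threshold might be held in a simple lookup table, as in the minimal sketch below. The numeric thresholds, the chosen voice instruction priority of 2.5 and the names `ACCURACY_THRESHOLDS`, `threshold_for_priority` and `voice_instruction_outranks` are illustrative assumptions; the text specifies only that x1 is the lowest threshold and x5 the highest. Comparing the two priorities to decide whether a voice instruction outranks a task is likewise an assumption, not something the text states.

```python
# Hypothetical threshold values; the text states only that x1 is the lowest
# and x5 is the highest.
ACCURACY_THRESHOLDS = {1: 0.60, 2: 0.70, 3: 0.80, 4: 0.90, 5: 0.95}

# A voice instruction sits between two task levels so it never ties with a
# task. The value 2.5 is an assumed choice among 1.5, 2.5, 3.5 and 4.5.
VOICE_INSTRUCTION_PRIORITY = 2.5


def threshold_for_priority(task_priority: int) -> float:
    """Return the preset accuracy threshold for a task priority (1 = highest)."""
    if task_priority not in ACCURACY_THRESHOLDS:
        raise ValueError(f"unknown task priority: {task_priority}")
    return ACCURACY_THRESHOLDS[task_priority]


def voice_instruction_outranks(task_priority: int) -> bool:
    """Assumed comparison: a smaller number means a higher priority."""
    return VOICE_INSTRUCTION_PRIORITY < task_priority
```

With these assumed values, a level 1 task tolerates a lower recognition accuracy than a level 5 task, so it is adjusted less often on account of speech.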
The current voice instruction recognition accuracy may be determined from the current noise, for example by: acquiring the current noise level; and determining the current voice instruction recognition accuracy according to the current noise level and a locally stored correspondence between noise levels and recognition accuracy. For example, noise level 1 covers 10 dB or more and less than 20 dB, noise level 2 covers 20 dB or more and less than 30 dB, and so on; the higher the noise level, the lower the voice instruction recognition accuracy.
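The noise-level lookup can be sketched as follows. Only the 10 dB banding of levels 1 and 2 comes from the text; the accuracy figures, the extension to levels 0 and 3 to 6, and the names `ACCURACY_BY_NOISE_LEVEL`, `noise_level` and `estimated_accuracy` are assumptions made for illustration.

```python
import math

# Hypothetical locally stored table: noise level -> estimated accuracy.
# The accuracy values and the levels other than 1 and 2 are assumed.
ACCURACY_BY_NOISE_LEVEL = {
    0: 0.99, 1: 0.97, 2: 0.94, 3: 0.90, 4: 0.82, 5: 0.70, 6: 0.55,
}


def noise_level(noise_db: float) -> int:
    """Map a measurement in dB to a level: [10, 20) -> 1, [20, 30) -> 2, ..."""
    level = math.floor(noise_db / 10)
    # Clamp to the range of levels present in the table.
    return max(0, min(level, max(ACCURACY_BY_NOISE_LEVEL)))


def estimated_accuracy(noise_db: float) -> float:
    """Look up the estimated recognition accuracy for a noise measurement."""
    return ACCURACY_BY_NOISE_LEVEL[noise_level(noise_db)]
```

A table lookup keeps the judgement cheap enough to repeat after every adjustment, at the cost of relying on a correspondence calibrated in advance.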
Alternatively, after detecting that the user has issued a voice instruction, the robot device may try once to recognize the content of the voice instruction; if the content cannot be recognized, the current voice instruction recognition accuracy is judged to be lower than the preset accuracy threshold. For example, in step 101, after it is detected that the user has issued a voice instruction and before judging whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold, the method may further include: recognizing the content of the user's voice instruction. In this case, judging whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold may specifically include: when the content of the user's voice instruction cannot be recognized, judging that the current voice instruction recognition accuracy is lower than the preset accuracy threshold.
Step 102: when the current voice instruction recognition accuracy is judged to be lower than the preset accuracy threshold, adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Specifically, adjusting the working state of the robot may include one or more of the following: slowing down the action speed, reducing the rotating speed of a motor, and switching off servos in the non-human-voice frequency band. This reduces the influence of noise from the robot's own operation, or from the objects it handles, on voice instruction recognition accuracy, so that the current recognition accuracy is improved and the robot device can correctly receive the voice instruction.
A correspondence between the difference in voice instruction recognition accuracy and the adjustment amplitude can be preset empirically, so that the working state of the robot is adjusted according to that difference. Therefore, in step 102, adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold may specifically include: determining an adjustment amplitude for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and adjusting the action of the robot according to the determined adjustment amplitude, so that the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
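One way to picture the empirically preset correspondence is a small table keyed by the accuracy difference, as in the sketch below. The table entries, the choice of motion speed as the adjusted quantity, and the names `ADJUSTMENT_TABLE`, `adjustment_amplitude` and `adjusted_speed` are hypothetical; the text does not give concrete values.

```python
# Hypothetical empirical table: (minimum accuracy shortfall, speed reduction).
# Entries are sorted by descending shortfall; all values are assumed.
ADJUSTMENT_TABLE = [
    (0.30, 0.80),
    (0.20, 0.60),
    (0.10, 0.40),
    (0.00, 0.20),
]


def adjustment_amplitude(current_accuracy: float, threshold: float) -> float:
    """Return the fraction by which to reduce motion speed (0.0 = no change)."""
    shortfall = threshold - current_accuracy
    if shortfall <= 0:
        return 0.0  # accuracy already reaches the threshold
    for min_shortfall, reduction in ADJUSTMENT_TABLE:
        if shortfall >= min_shortfall:
            return reduction
    return 0.0


def adjusted_speed(current_speed: float, current_accuracy: float,
                   threshold: float) -> float:
    """Apply the looked-up reduction to the current motion speed."""
    amplitude = adjustment_amplitude(current_accuracy, threshold)
    return current_speed * (1.0 - amplitude)
```

For instance, with these assumed values an accuracy of 0.75 against a threshold of 0.9 selects a 40% speed reduction in a single adjustment.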
Under normal conditions, after the working state of the robot is adjusted according to the difference, the voice instruction recognition accuracy of the robot reaches the preset accuracy threshold. If it still does not, the current noise is judged to come from the external environment rather than from the robot's own movement, and the robot may inform the user that the current environmental noise is high and recognition may be incorrect, or may move closer to the user to receive the voice instruction better. Preferably, the robot informs the user that the environmental noise is high, or moves closer to the user, only when the voice instruction recognition accuracy after adjustment does not reach the preset accuracy threshold and the priority of the currently executed task is lower than a preset priority, so that a task with a higher priority is less likely to be disturbed by a voice instruction issued by the user.
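The preferred condition for informing the user or approaching the user can be written as a single predicate. The sketch below assumes the numbering used earlier in this embodiment, in which level 1 is the highest priority, so a task of lower priority has a numerically larger level; the function and parameter names are hypothetical.

```python
def should_notify_or_approach(adjusted_accuracy: float, threshold: float,
                              task_priority: int, preset_priority: int) -> bool:
    """Decide whether to inform the user of the noise or move closer.

    Priority 1 is the highest, so "lower than the preset priority" means a
    numerically larger level.
    """
    still_below_threshold = adjusted_accuracy < threshold
    task_is_lower_priority = task_priority > preset_priority
    return still_below_threshold and task_is_lower_priority
```

A level 2 task with a preset priority of 3 therefore continues undisturbed even when recognition remains poor.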
Of course, the action of the robot may also be adjusted step by step. That is, adjusting the working state of the robot until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold may include: adjusting the action of the robot step by step according to a preset adjustment amplitude until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold. For example, the action speed, amplitude and the like that affect the human voice frequency band may be reduced in steps of 20%; after each reduction, the current noise level is acquired again and it is judged whether the current voice instruction recognition accuracy is still lower than the preset accuracy threshold. If it is no longer lower than the threshold, the current action speed, amplitude and the like are kept until the voice instruction has been completely received; otherwise, the reduction continues.
If the stepwise reduction continues until the robot stops moving and the voice instruction recognition accuracy is still lower than the preset accuracy threshold, the noise is judged to come from the external environment rather than from the robot's own movement. The robot may then inform the user that the current environmental noise is high and may cause incorrect recognition, or may move closer to the user to receive the voice instruction better.
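The stepwise reduction described above can be sketched as a loop. It is assumed here that each step removes 20% of the nominal speed, so that the robot reaches a standstill after five steps; the text does not say whether the 20% is relative to the nominal or the current speed. The callback `accuracy_at` and the function name are hypothetical stand-ins for applying a speed, re-measuring the noise and estimating the recognition accuracy.

```python
from typing import Callable, Tuple


def step_down_until_recognizable(
    speed: float,
    threshold: float,
    accuracy_at: Callable[[float], float],
    step: float = 0.20,
) -> Tuple[float, bool]:
    """Reduce speed in fixed steps until accuracy reaches the threshold.

    `speed` is a fraction of nominal speed (1.0 = full speed). `accuracy_at`
    is a hypothetical callback that applies the speed, re-measures the noise
    and returns the estimated recognition accuracy. Returns the final speed
    and whether the threshold was reached.
    """
    while True:
        if accuracy_at(speed) >= threshold:
            return speed, True   # hold this speed until the instruction ends
        if speed <= 0.0:
            return 0.0, False    # stopped and still noisy: external noise
        # Rounding avoids accumulating floating-point error across steps.
        speed = max(0.0, round(speed - step, 10))
```

A result of `(0.0, False)` corresponds to the case in which the robot has stopped and the noise is attributed to the external environment, so the caller would inform the user or move closer.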
Step 103: when the current voice instruction recognition accuracy is judged to be not lower than the preset accuracy threshold, continuing to receive the speech uttered by the user.
In this embodiment, after detecting that the user has issued a voice instruction, and when the current voice instruction recognition accuracy is lower than the preset accuracy threshold, the robot device adjusts its own working state, for example by slowing down its action speed, so that the voice instruction recognition accuracy after adjustment reaches the preset accuracy threshold. The robot device can therefore receive the voice instruction more accurately without stopping its action, and voice instruction recognition accuracy and working efficiency are better balanced. If the robot adjusts quickly enough, convergence completes quickly, the task currently being executed is essentially unaffected, and the user experience is good; if the adjustment is slower, convergence may take longer, but the result is still better than if the method were not applied.
It should be noted that, in some embodiments, the judgement of whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold need not be performed after the voice instruction issued by the user is detected. For example, the robot device may make this judgement after starting a new task, store the result locally, and read the stored result after detecting that the user has issued a voice instruction.
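This variant, in which the judgement is made when a task starts and merely read back when speech is detected, might look like the following sketch. The class name, its methods and the `estimate_accuracy` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccuracyJudgementCache:
    """Stores the below-threshold judgement made when a task starts."""
    estimate_accuracy: Callable[[], float]  # hypothetical accuracy estimator
    below_threshold: Optional[bool] = None

    def on_task_start(self, threshold: float) -> None:
        # Judge once per task and keep the result locally.
        self.below_threshold = self.estimate_accuracy() < threshold

    def on_voice_detected(self) -> bool:
        # Read the stored result instead of judging again.
        if self.below_threshold is None:
            raise RuntimeError("no task has been started yet")
        return self.below_threshold
```

Judging in advance saves time when speech is detected, on the assumption that the noise produced by a task stays roughly constant while the task runs.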
Example two
Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a robot voice instruction recognition device according to an embodiment of the present invention. As shown in FIG. 2, the present embodiment includes:
A detection module 201, configured to detect a voice instruction issued by a user.
The detection module may continuously detect whether the user is issuing a voice instruction; for example, the user may be considered to be issuing a voice instruction when the user's voice is captured.
A judging module 202, configured to judge whether the current voice instruction recognition accuracy is lower than a preset accuracy threshold.
Preferably, the preset accuracy threshold corresponds to the priority of the task currently executed by the robot; the higher the priority of the task, the lower the corresponding accuracy threshold. Thus, a task with a higher priority is less likely to be disturbed by a voice instruction issued by the user.
The current voice instruction recognition accuracy may be determined from the current noise. For example, the robot voice instruction recognition device may further include: an acquisition module, configured to acquire the current noise level; and an accuracy determining module, configured to determine the current voice instruction recognition accuracy according to the current noise level and a correspondence between noise levels and recognition accuracy.
In this embodiment, after the detection module 201 detects that the user sends a voice instruction, the determination module 202 is triggered to execute an operation.
The adjusting module 203 is configured to, after the detecting module detects that the user sends the voice instruction, and when the judging module judges that the current voice instruction recognition accuracy is lower than the preset accuracy threshold, adjust the working state of the robot until the adjusted voice instruction recognition accuracy of the robot reaches the preset accuracy threshold.
A correspondence between the difference in voice instruction recognition accuracy and the adjustment amplitude can be preset empirically, so that the working state of the robot is adjusted according to that difference. For example, the adjusting module 203 may include: an amplitude determining unit, configured to determine an adjustment amplitude for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and a first adjusting unit, configured to adjust the action of the robot according to the determined adjustment amplitude, so that the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
The action of the robot can also be adjusted step by step. For example, the adjusting module 203 may include: a second adjusting unit, configured to adjust the action of the robot step by step according to a preset adjustment amplitude until the voice instruction recognition accuracy of the robot after adjustment reaches the preset accuracy threshold.
Specifically, the adjustment performed by the adjusting module 203 may include one or more of the following: slowing down the action speed, reducing the rotating speed of a motor, and switching off servos in the non-human-voice frequency band. This reduces the influence of noise from the robot's own operation, or from the objects it handles, on voice instruction recognition accuracy, so that the current recognition accuracy is improved and the robot device can correctly receive the voice instruction.
In this embodiment, after the detection module detects that the user has issued a voice instruction, and when the judging module judges that the current voice instruction recognition accuracy is lower than the preset accuracy threshold, the adjusting module adjusts the working state of the robot, for example by slowing down its action speed, so that the voice instruction recognition accuracy after adjustment reaches the preset accuracy threshold. The robot device can therefore receive the voice instruction more accurately without stopping its action, and voice instruction recognition accuracy and working efficiency are better balanced.
In some embodiments, the judging module 202 need not perform its operation after the detection module 201 detects that the user has issued a voice instruction. For example, the judging module 202 may judge whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold after the robot starts a new task and store the result locally; after the detection module 201 detects that the user has issued a voice instruction, the robot voice instruction recognition device reads the stored result directly.
In some embodiments, the robot voice instruction recognition device may omit the detection module 201 and the judging module 202; in that case an external device triggers the adjusting module 203 to adjust the working state of the robot after the user has uttered speech and when the current voice instruction recognition accuracy is lower than the preset accuracy threshold.
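The cooperation of the detection module 201, the judging module 202 and the adjusting module 203 in the full configuration of this embodiment can be sketched as a single class. The three callbacks, the class name and the `max_adjustments` bound are assumptions introduced for illustration; they are not part of the described device.

```python
from typing import Callable


class VoiceInstructionRecognitionDevice:
    """Wires detection (201), judgement (202) and adjustment (203) together."""

    def __init__(self,
                 voice_detected: Callable[[], bool],
                 current_accuracy: Callable[[], float],
                 adjust_working_state: Callable[[], None],
                 threshold: float,
                 max_adjustments: int = 5) -> None:
        self.voice_detected = voice_detected              # detection module 201
        self.current_accuracy = current_accuracy          # input to module 202
        self.adjust_working_state = adjust_working_state  # adjusting module 203
        self.threshold = threshold
        self.max_adjustments = max_adjustments            # assumed safety bound

    def below_threshold(self) -> bool:
        """Judging module 202."""
        return self.current_accuracy() < self.threshold

    def tick(self) -> int:
        """Run one control cycle; return the number of adjustments made."""
        if not self.voice_detected():
            return 0
        adjustments = 0
        while self.below_threshold() and adjustments < self.max_adjustments:
            self.adjust_working_state()
            adjustments += 1
        return adjustments
```

Passing the modules in as callables mirrors the variant in which an external device supplies detection and judgement and only the adjusting module belongs to the device.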
Example three
Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a robot apparatus according to an embodiment of the present invention. As shown in FIG. 3, the robot apparatus 300 includes:
at least one processor 310, one processor 310 being exemplified in FIG. 3; and a memory 320 communicatively coupled to the at least one processor 310; the memory stores a program of instructions executable by the at least one processor, and the program of instructions is executed by the at least one processor to enable the at least one processor to perform the method of robot voice instruction recognition.
The processor 310 and the memory 320 may be connected by a bus or other means, such as the bus connection shown in FIG. 3.
The memory 320 is a non-volatile computer-readable storage medium and can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules corresponding to the method for recognizing robot voice instructions in the embodiments of the present application. The processor 310 executes various functional applications and data processing of the robot device, i.e., implements the robot voice instruction recognition method applied to the robot device of the above-described method embodiments, by executing the nonvolatile software program, instructions, and modules stored in the memory 320.
The memory 320 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of the above-described robot voice instruction recognition method, and the like. Further, the memory 320 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 320 may include memory located remotely from the processor 310, which may be connected to the robotic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 320 and, when executed by the one or more processors 310, perform the method of robot voice instruction recognition applied to a robotic device in any of the method embodiments described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above described systems, apparatuses and units may refer to the corresponding processes in the above described method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in practice; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the present invention. All equivalent structures or equivalent process modifications made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of protection of the present invention.