WO2018090252A1

WO2018090252A1 - Voice instruction recognition method for robot, and related robot device

Info

Publication number: WO2018090252A1
Application number: PCT/CN2016/106118
Authority: WO
Inventors: 骆磊
Original assignee: 深圳达闼科技控股有限公司
Priority date: 2016-11-16
Filing date: 2016-11-16
Publication date: 2018-05-24
Also published as: CN106796790A; CN106796790B

Abstract

A voice instruction recognition method for a robot, and a related robot device. The method comprises: when it is detected that a user starts speaking and the current voice instruction recognition accuracy is less than a pre-set accuracy threshold value, adjusting an operating state of a robot until the adjusted voice instruction recognition accuracy of the robot reaches the pre-set accuracy threshold value. The method can enable a robot device to receive a voice instruction more accurately without stopping operation, and better coordinates the voice instruction recognition accuracy and operating efficiency.

Description

Robot voice command recognition method and related robot device

Technical field

Embodiments of the present invention relate to the field of robots, and in particular, to a method for recognizing a voice command of a robot and related robot devices.

Background technique

Currently, robots with mechanical motion capability inevitably generate noise during operation, either from the running of steering gears (servos) and/or motors, or from operating on or cleaning other objects (such as opening doors, washing dishes, or sweeping). If the user issues a voice command to the robot during this process, the noise will inevitably affect the accuracy of the robot's voice recognition. Even with a noise reduction algorithm, only steady-state noise can be filtered; it is difficult to filter out the variable and unpredictable noise produced while the robot is working. Another approach is for the robot to stop its current action after detecting the user's voice and to resume the previous action once the voice command has been received, but this inevitably reduces work efficiency.

In view of this, it is an urgent problem to be solved in the art to overcome the drawbacks of the prior art described above.

Summary of the invention

The technical problem to be solved by the embodiments of the present invention is to provide a method for recognizing a voice command of a robot and a related robot device, which can better coordinate the accuracy and working efficiency of voice command recognition.

To solve the above technical problem, one technical solution adopted by the embodiments of the present invention is to provide a method for recognizing a voice command of a robot, comprising: after detecting that a user has started speaking, and when the current voice command recognition accuracy is lower than a preset accuracy threshold, adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

The method further includes: obtaining a current noise level; and determining the current voice command recognition accuracy from the current noise level, according to a correspondence between noise level and recognition accuracy.

The preset accuracy threshold corresponds to the priority of the task the robot is currently executing; the higher the priority of the task being executed, the lower the corresponding accuracy threshold.

Adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold includes: determining an adjustment range for the working state of the robot according to the difference between the current voice command recognition accuracy and the preset accuracy threshold; and adjusting the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

The method further includes: when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold, informing the user that the current environmental noise is high, or moving closer to the user.

Here, informing the user that the current environmental noise is high, or moving closer to the user, when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold, specifically means: when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold and the priority of the currently executed task is lower than a preset priority, informing the user that the current environmental noise is high, or moving closer to the user.

Alternatively, adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold includes: adjusting the motion of the robot step by step according to a preset adjustment amplitude, until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

The method further includes: when the robot has stopped moving as a result of adjusting its working state and the voice command recognition accuracy still has not reached the preset accuracy threshold, informing the user that the current environmental noise is high, or moving closer to the user.

Adjusting the working state of the robot includes one or more of the following: slowing down the motion speed, reducing the motor speed, and turning off the non-human-voice-frequency-band steering gear.

To solve the above technical problem, another technical solution adopted by the embodiments of the present invention is to provide a robot voice instruction recognition apparatus, including: an adjustment module, configured to, after the user has started speaking and when the current voice instruction recognition accuracy is lower than a preset accuracy threshold, adjust the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

The device further includes: an obtaining module, configured to acquire a current noise level; and an accuracy determining module, configured to determine the current voice command recognition accuracy from the current noise level, according to a correspondence between noise level and recognition accuracy.

The adjustment module includes: an amplitude determining unit, configured to determine an adjustment range for the working state of the robot according to the difference between the current voice command recognition accuracy and the preset accuracy threshold; and a first adjusting unit, configured to adjust the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

In order to solve the above technical problem, another technical solution adopted by the embodiment of the present invention is to provide a robot apparatus, including:

At least one processor;

a memory coupled to at least one processor; wherein

The memory stores a program of instructions executable by the at least one processor, the program of instructions being executed by the at least one processor to cause the at least one processor to:

After detecting the user's voice, and when the current voice command recognition accuracy is lower than the preset accuracy threshold, the working state of the robot is adjusted until the adjusted voice command recognition accuracy of the robot reaches a preset accuracy threshold.

Compared with the prior art, the beneficial effects of the embodiments of the present invention are:

In the embodiments of the present invention, after detecting that the user issues a voice command, and when the current voice command recognition accuracy is lower than a preset accuracy threshold, the robot device adjusts its working state, for example by slowing down its motion, so that the adjusted voice command recognition accuracy reaches the preset accuracy threshold. The robot device can thus receive the voice command more accurately without stopping its action, which better balances voice command recognition accuracy and work efficiency.

Description of the drawings

FIG. 1 is a schematic flow chart of an embodiment of a method for recognizing a voice command of a robot according to the present invention;

FIG. 2 is a schematic structural diagram of an embodiment of a robot voice command recognition apparatus according to the present invention;

FIG. 3 is a schematic structural diagram of an embodiment of the robot apparatus of the present invention.

Detailed description

Embodiments of the present invention provide a method for recognizing a voice command of a robot, a robot voice command recognition device, and a robot device.

Please refer to FIG. 1. FIG. 1 is a schematic flow chart of an embodiment of a method for recognizing a voice command of a robot according to the present invention. As shown in FIG. 1, the embodiment includes:

Step 101: After detecting that the user has started speaking, determine whether the current voice instruction recognition accuracy is lower than a preset accuracy threshold;

This embodiment may be executed by a robot device. While executing a task, the robot device can continuously detect whether the user is speaking. For example, when the user's voice is captured, the user is considered to be speaking, and the robot device then determines whether the current voice command recognition accuracy is lower than the preset accuracy threshold.

Preferably, the preset accuracy threshold corresponds to the priority of the task the robot is currently executing; the higher the priority of the task being executed, the lower the corresponding accuracy threshold. Thus, the higher the priority of the task being executed, the less likely it is to be disturbed by a voice command issued by the user.

For example, the task priority of a robot is divided into five levels, 1 to 5 (in practice any number of levels may be used), with level 1 being the highest priority and level 5 the lowest. The priority of voice commands can be defined between any two task priorities, such as 1.5, 2.5, 3.5, or 4.5, to ensure that a voice command never has the same priority as any task. The accuracy thresholds preset for task priorities 1 to 5 are x1, x2, x3, x4, and x5, respectively; x1 is the lowest accuracy threshold and x5 is the highest.
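The priority scheme above can be sketched as a small lookup. This is only an illustrative sketch: the description names the thresholds x1 to x5 without giving values, so the numbers below are hypothetical, as are the function names and the idea of comparing a voice command priority directly against a task priority.

```python
# Hypothetical threshold values. The description only names them x1..x5 and
# requires x1 (highest-priority task) to be the lowest threshold.
ACCURACY_THRESHOLDS = {1: 0.60, 2: 0.70, 3: 0.80, 4: 0.90, 5: 0.95}


def accuracy_threshold(task_priority: int) -> float:
    """Return the preset accuracy threshold for a task priority (1 = highest)."""
    return ACCURACY_THRESHOLDS[task_priority]


def voice_command_outranks(task_priority: int, voice_priority: float) -> bool:
    """Return True if the voice command has higher priority than the task.

    Smaller numbers mean higher priority. Voice priorities such as 2.5 sit
    between two task levels, so the comparison can never tie.
    """
    return voice_priority < task_priority
```

With these assumed values, a level 1 task tolerates a recognition accuracy as low as 0.60 before the robot changes its working state, while a level 5 task triggers an adjustment below 0.95.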

The current voice command recognition accuracy may be determined from the current noise, for example by acquiring a current noise level and then determining the current voice command recognition accuracy from that noise level, according to a locally stored correspondence between noise level and recognition accuracy. For example, noise level 1 is greater than or equal to 10 dB and less than 20 dB, and noise level 2 is greater than or equal to 20 dB and less than 30 dB. The higher the noise level, the lower the voice command recognition accuracy.
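A minimal sketch of this lookup follows. Only the 10 dB band width and the rule that higher noise means lower accuracy come from the description; the accuracy values in the table, the level 0 band below 10 dB, and the function names are assumptions made for illustration.

```python
import math

# Hypothetical correspondence table: noise level -> expected recognition accuracy.
# Only the 10 dB band width and the decreasing trend come from the description.
ACCURACY_BY_NOISE_LEVEL = {0: 0.99, 1: 0.97, 2: 0.93, 3: 0.85, 4: 0.70, 5: 0.50}


def noise_level(noise_db: float) -> int:
    """Map a noise measurement in dB to a level; level n covers [10n, 10n + 10) dB."""
    return max(0, math.floor(noise_db / 10))


def current_accuracy(noise_db: float) -> float:
    """Look up the expected accuracy, clamping to the noisiest level in the table."""
    level = min(noise_level(noise_db), max(ACCURACY_BY_NOISE_LEVEL))
    return ACCURACY_BY_NOISE_LEVEL[level]
```

A stored table like this keeps the check cheap enough to repeat after every adjustment of the working state.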

Alternatively, after detecting that the user issues a voice instruction, the robot may try to recognize the content of the voice instruction and, if it cannot, determine that the current voice instruction recognition accuracy is lower than the preset accuracy threshold. For example, in step 101, after detecting that the user issues a voice command and before determining whether the current voice command recognition accuracy is lower than the preset accuracy threshold, the method may further include: recognizing the content of the user's voice command. In this case, determining whether the current voice command recognition accuracy is lower than the preset accuracy threshold may include: determining that the current voice command recognition accuracy is lower than the preset accuracy threshold when the content of the user's voice command cannot be recognized.

Step 102: When it is determined that the current voice instruction recognition accuracy is lower than the preset accuracy rate threshold, adjust the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches a preset accuracy threshold.

Specifically, adjusting the working state of the robot may include one or more of the following: slowing down the motion speed, reducing the motor speed, and turning off the non-human-voice-frequency-band steering gear. All of these help reduce the effect that the operating noise of the robot itself, or of the object it is operating on, has on voice command recognition accuracy. They can therefore improve the current voice command recognition accuracy and help the robot device receive the voice command correctly.

The correspondence between the voice command recognition accuracy difference and the adjustment range can be preset empirically, so that the working state of the robot can be adjusted according to the difference. Therefore, in step 102, adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold may specifically include: determining an adjustment range for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and adjusting the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
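The difference-based adjustment can be sketched as a table lookup on the accuracy gap. The description says only that the correspondence is preset empirically, so the table values, the choice of motion speed as the adjusted quantity, and all names below are hypothetical.

```python
# Hypothetical empirical table: (minimum accuracy gap, fraction by which to
# reduce motion speed). Rows are checked from the largest gap down.
ADJUSTMENT_BY_GAP = [(0.30, 0.80), (0.20, 0.60), (0.10, 0.40), (0.0, 0.20)]


def adjustment_range(current_accuracy: float, threshold: float) -> float:
    """Return the fraction by which to reduce speed, or 0.0 if no change is needed."""
    gap = threshold - current_accuracy
    if gap <= 0:
        return 0.0  # accuracy already reaches the threshold
    for min_gap, reduction in ADJUSTMENT_BY_GAP:
        if gap >= min_gap:
            return reduction
    return 0.0


def adjusted_speed(speed: float, current_accuracy: float, threshold: float) -> float:
    """Apply the looked-up reduction to the current motion speed."""
    return speed * (1.0 - adjustment_range(current_accuracy, threshold))
```

Unlike a step-by-step search, this makes a single jump sized to the gap, which is why the description treats a remaining shortfall after the jump as a sign of external noise.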

Under normal circumstances, after the working state of the robot has been adjusted according to the above difference, the voice command recognition accuracy of the robot should be able to reach the preset accuracy threshold. If it still cannot, the current noise is determined to come from the external environment rather than from the robot's own motion. The robot can then inform the user that the current environmental noise is high and may cause recognition errors, or it can approach the user to receive the voice command better. Preferably, the user is informed that the current environmental noise is high, or the robot approaches the user, only when the adjusted voice command recognition accuracy does not reach the preset accuracy threshold and the priority of the currently executed task is lower than a preset priority. Thus, the higher the priority of the task being executed, the less likely it is to be disturbed by voice commands issued by the user.
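The fallback decision described above can be sketched as follows. The description does not say what the robot does when the task priority is not lower than the preset priority, so the "continue_task" branch is an assumption, as are the function name and the string labels.

```python
def fallback_action(
    adjusted_accuracy: float,
    threshold: float,
    task_priority: int,
    preset_priority: int,
) -> str:
    """Decide what to do after the working state has been adjusted.

    Priorities follow the numbering used earlier: 1 is the highest priority,
    so a numerically larger value means a lower priority.
    """
    if adjusted_accuracy >= threshold:
        return "listen"  # threshold reached, keep receiving the command
    if task_priority > preset_priority:
        # Task priority is lower than the preset priority: the remaining noise
        # is treated as external, so tell the user or move closer.
        return "notify_or_approach"
    # Assumed behavior, not stated in the description: a high-priority task
    # simply carries on undisturbed.
    return "continue_task"
```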

Of course, the motion of the robot can also be adjusted step by step. That is, adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold includes: adjusting the motion of the robot step by step according to a preset adjustment amplitude, until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold. For example, the speed and amplitude of the actions that may affect the human voice frequency band are reduced in steps of 20%. After each reduction, the current noise level is obtained again and it is determined whether the current voice command recognition accuracy is still lower than the preset accuracy threshold. Once it is no longer lower than the preset accuracy threshold, the current motion speed and amplitude are maintained until the voice command has been received.

If the stepwise reduction continues until the robot stops moving and the voice command recognition accuracy is still lower than the preset accuracy threshold, the noise is determined to come from the external environment rather than from the robot's own motion. The robot may then inform the user that the current environmental noise is high and may cause recognition errors, or it may approach the user to receive the voice command better.
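The stepwise variant can be sketched as a loop over decreasing speed settings. This is an illustrative sketch only: `measure_accuracy` is a hypothetical callback standing in for re-measuring the noise level and looking up the accuracy, and expressing the working state as a single speed percentage is a simplification.

```python
from typing import Callable, Tuple


def stepwise_adjust(
    measure_accuracy: Callable[[int], float],
    threshold: float,
    step_percent: int = 20,
) -> Tuple[int, bool]:
    """Lower the speed step by step until accuracy reaches the threshold.

    measure_accuracy(speed_percent) is a hypothetical callback that returns
    the recognition accuracy observed at the given speed setting.
    Returns (final speed percent, whether the threshold was reached).
    """
    # 100, 80, 60, ... and finally 0 (the robot has stopped moving).
    speeds = list(range(100, 0, -step_percent)) + [0]
    for speed_percent in speeds:
        if measure_accuracy(speed_percent) >= threshold:
            # Hold this speed until the voice command has been received.
            return speed_percent, True
    # Stopped and still below the threshold: the noise is external.
    return 0, False
```

A caller would treat a `False` result as the cue to inform the user about environmental noise or to move closer, as described above.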

Step 103: Continue to receive the voice sent by the user when determining that the current voice instruction recognition accuracy is not lower than the preset accuracy rate threshold.

In this embodiment, after detecting that the user issues a voice command, and when the current voice command recognition accuracy is lower than the preset accuracy threshold, the robot device adjusts its working state, for example by slowing down its motion, so that the adjusted voice command recognition accuracy reaches the preset accuracy threshold. The robot device can thus receive the voice command more accurately without stopping its action, which better balances voice command recognition accuracy and work efficiency. If the robot adjusts quickly enough, convergence completes quickly, the task currently being performed is hardly affected, and the user experience is good. If the adjustment is slow, convergence may take longer, but the result is still better than not applying the method.

It should be noted that, in some embodiments, the determination of whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold need not be performed after detecting that the user issues a voice instruction. For example, the robot device may make this determination after starting a new task, save the result locally, and read the saved result after detecting that the user issues a voice command.
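This variant amounts to caching the judgment at task start. A minimal sketch, assuming a hypothetical `is_below_threshold` callback that performs the actual accuracy check; the class and method names are illustrative, and falling back to a fresh check when nothing is cached is an added assumption.

```python
from typing import Callable, Optional


class CachedAccuracyJudgment:
    """Store the below-threshold judgment made when a task starts."""

    def __init__(self, is_below_threshold: Callable[[], bool]) -> None:
        self._is_below_threshold = is_below_threshold
        self._cached: Optional[bool] = None

    def on_task_started(self) -> None:
        # Judge once when the new task begins and save the result locally.
        self._cached = self._is_below_threshold()

    def on_voice_detected(self) -> bool:
        # Read the saved result; judge now only if nothing was saved yet
        # (this fallback is an assumption, not part of the description).
        if self._cached is None:
            self._cached = self._is_below_threshold()
        return self._cached
```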

Embodiment 2

Please refer to FIG. 2. FIG. 2 is a schematic structural diagram of an embodiment of a robot voice command recognition apparatus according to the present invention. As shown in FIG. 2, this embodiment includes:

The detecting module 201 is configured to detect a voice command sent by the user;

The detection module can continually detect whether the user issues a voice command, for example, when the user's voice is captured, the user is considered to be making a voice command.

The determining module 202 is configured to determine whether the current voice instruction recognition accuracy is lower than a preset accuracy threshold;

The current voice instruction recognition accuracy can be determined based on the current noise. For example, the robot voice instruction recognition apparatus may further include: an obtaining module, configured to acquire a current noise level; and an accuracy determining module, configured to determine the current voice instruction recognition accuracy from the current noise level, according to a correspondence between noise level and recognition accuracy.

In this embodiment, after the detecting module 201 detects that the user issues a voice instruction, the determining module 202 is triggered to perform its operation.

The adjusting module 203 is configured to: after the detecting module detects that the user issues a voice command, and when the determining module determines that the current voice command recognition accuracy is lower than a preset accuracy threshold, adjust the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

The correspondence between the voice command recognition accuracy difference and the adjustment range can be preset empirically, so that the working state of the robot can be adjusted according to the difference. For example, the adjustment module 203 may include: an amplitude determining unit, configured to determine an adjustment range for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and a first adjusting unit, configured to adjust the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

It is also possible to adjust the motion of the robot step by step. For example, the adjustment module 203 may include: a second adjustment unit, configured to adjust the motion of the robot step by step according to a preset adjustment amplitude, until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.

Specifically, the adjustment of the robot's working state by the adjusting module 203 may include one or more of the following: slowing down the motion speed, reducing the motor speed, and turning off the non-human-voice-frequency-band steering gear. All of these help reduce the effect that the operating noise of the robot itself, or of the object it is operating on, has on voice command recognition accuracy. They can therefore improve the current voice command recognition accuracy and help the robot device receive the voice command correctly.

In this embodiment, after the detecting module detects that the user issues a voice command, and when the determining module determines that the current voice command recognition accuracy is lower than the preset accuracy threshold, the adjusting module adjusts the working state of the robot, for example by slowing down its motion, so that the adjusted voice command recognition accuracy reaches the preset accuracy threshold. The robot device can thus receive the voice command more accurately without stopping its action, which better balances voice command recognition accuracy and work efficiency.

In some embodiments, the determining module 202 need not perform its operation after the detecting module 201 detects that the user issues a voice command. For example, the determining module 202 may determine whether the current voice instruction recognition accuracy is lower than the preset accuracy threshold after the robot starts a new task and save the result locally; when the detecting module 201 detects that the user issues a voice instruction, the robot voice instruction recognition apparatus can read the saved result directly.

In some embodiments, the robot voice instruction recognition apparatus may not include the detecting module 201 and the determining module 202. Instead, an external device detects that the user has started speaking and that the current voice instruction recognition accuracy is lower than the preset accuracy threshold, and then triggers the adjusting module 203 to adjust the working state of the robot.
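The cooperation of modules 201 to 203 in the main variant of this embodiment can be sketched as a small class. Each module is represented by a plain callable, which is a simplification; the class name, field names, and the returned status strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceCommandRecognitionDevice:
    """Wire together modules 201-203; the callables are hypothetical stand-ins."""

    voice_detected: Callable[[], bool]            # detecting module 201
    accuracy_below_threshold: Callable[[], bool]  # determining module 202
    adjust_working_state: Callable[[], None]      # adjusting module 203

    def poll(self) -> str:
        """Run one detection cycle and report what happened."""
        if not self.voice_detected():
            return "idle"
        # Module 202 is triggered only after module 201 detects a voice.
        if self.accuracy_below_threshold():
            self.adjust_working_state()
            return "adjusted"
        return "listening"
```

Because each module is injected, the variant in which an external device does the detection and judgment corresponds to calling `adjust_working_state` directly.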

Embodiment 3

Please refer to FIG. 3. FIG. 3 is a schematic structural view of an embodiment of a robot apparatus according to the present invention. As shown in FIG. 3, the robot apparatus 300 includes:

At least one processor 310 (one processor 310 is taken as an example in FIG. 3); and a memory 320 communicatively coupled to the at least one processor 310; wherein the memory stores a program of instructions executable by the at least one processor, the program of instructions being executed by the at least one processor to enable the at least one processor to perform the above-described method of robot voice instruction recognition.

The processor 310 and the memory 320 may be connected by a bus or by other means; a bus connection is taken as an example in FIG. 3.

The memory 320, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for robot voice instruction recognition in the embodiments of the present application. The processor 310 performs the various functional applications and data processing of the robot apparatus by executing the non-volatile software programs, instructions, and modules stored in the memory 320; that is, it implements the method for robot voice instruction recognition applied to the robot apparatus in the above method embodiment.

The memory 320 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application required for at least one function, and the data storage area may store data created by use of the above-described method of robot voice instruction recognition, and the like. Moreover, memory 320 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 320 can include memory remotely located relative to processor 310, which can be connected to the robotic device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in memory 320, and when executed by one or more processors 310, perform the method of robotic voice instruction recognition applied to the robotic device in any of the method embodiments described above.

A person skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the system, the device and the unit described above can refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.

In the several embodiments provided by the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be in an electrical, mechanical, or other form.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The above are merely embodiments of the present invention and are not intended to limit its patent scope. Any equivalent structure or equivalent process transformation made using the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

A method for recognizing a robot voice instruction, comprising:

after detecting that a user has started speaking, and when the current voice command recognition accuracy is lower than a preset accuracy threshold, adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
The method of claim 1, further comprising:

obtaining a current noise level; and

determining the current voice instruction recognition accuracy from the current noise level, according to a correspondence between noise level and recognition accuracy.
The method according to claim 1, wherein the preset accuracy rate threshold corresponds to a priority of the currently executing task of the robot; and the higher the priority of executing the task, the lower the corresponding accuracy rate threshold.
The method according to claim 1, wherein adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold comprises:

determining an adjustment range for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and

adjusting the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
The method of claim 4, wherein the method further comprises:

when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold, informing the user that the current environmental noise is high, or approaching the user.
The method according to claim 5, wherein informing the user that the current environmental noise is high, or approaching the user, when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold is specifically:

when the adjusted voice command recognition accuracy of the robot does not reach the preset accuracy threshold and the priority of the currently executed task is lower than a preset priority, informing the user that the current environmental noise is high, or approaching the user.
The method according to claim 1, wherein adjusting the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold comprises:

adjusting the motion of the robot step by step according to a preset adjustment amplitude, until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
The method of claim 7, wherein the method further comprises:

when the robot has stopped moving as a result of adjusting its working state and the voice command recognition accuracy still has not reached the preset accuracy threshold, informing the user that the current environmental noise is high, or approaching the user.
The method according to any one of claims 1-8, wherein adjusting the working state of the robot comprises one or more of the following: slowing down the motion speed, reducing the motor speed, and turning off the non-human-voice-frequency-band steering gear.
A robot voice instruction recognition device, comprising:

an adjustment module, configured to, after a user has started speaking and when the current voice command recognition accuracy is lower than a preset accuracy threshold, adjust the working state of the robot until the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
The device according to claim 10, wherein the device further comprises:

An acquisition module for obtaining a current noise level;

The accuracy determination module is configured to determine, according to the current noise level, a current voice instruction recognition accuracy rate according to a correspondence between the noise level and the recognition accuracy rate.
The apparatus according to claim 10, wherein the preset accuracy rate threshold corresponds to a priority of the currently executing task of the robot; and the higher the priority of executing the task, the lower the corresponding accuracy rate threshold.
The apparatus according to claim 10, wherein the adjustment module comprises:

an amplitude determining unit, configured to determine an adjustment range for the working state of the robot according to the difference between the current voice instruction recognition accuracy and the preset accuracy threshold; and

a first adjusting unit, configured to adjust the motion of the robot according to the determined adjustment range, so that the adjusted voice command recognition accuracy of the robot reaches the preset accuracy threshold.
The apparatus according to any one of claims 10-13, wherein adjusting the working state of the robot includes one or more of the following: slowing down the motion speed, reducing the motor speed, and turning off the non-human-voice-frequency-band steering gear.
A robot apparatus, comprising:

At least one processor;

a memory coupled to the at least one processor; wherein

The memory stores an instruction program executable by the at least one processor, the instruction program being executed by the at least one processor to cause the at least one processor to:

After detecting the user's voice, and when the current voice command recognition accuracy is lower than the preset accuracy threshold, the working state of the robot is adjusted until the adjusted voice command recognition accuracy of the robot reaches a preset accuracy threshold.