WO2017000786A1 - System and method for training robot via voice

Info

Publication number: WO2017000786A1
Application number: PCT/CN2016/085911
Authority: WO
Inventors: 蔡明峻
Original assignee: 芋头科技（杭州）有限公司
Priority date: 2015-06-30
Filing date: 2016-06-15
Publication date: 2017-01-05
Also published as: CN106326208B; CN106326208A; TWI594136B; TW201719452A; HK1231592A1

Abstract

Disclosed are a system and method for training a robot via voice. The system for training a robot via voice comprises: a receiver unit used for receiving a voice signal; a parsing unit connected to the receiver unit and used for parsing the voice signal, matching the voice signal with a preset statement, and acquiring a conditional statement matching the preset statement and corresponding to the voice signal and an execution statement corresponding to the voice signal; a processing unit connected to the parsing unit and used for combining the conditional statement with the execution statement to produce a target entry; and a storage unit connected to the processing unit and used for storing a preset entry and training a robot on the basis of the preset entry. The processing unit performs a weighted calculation on the basis of the target entry and processes correspondingly on the basis of the result of the weighted calculation.

Description

System and method for training robot by voice

Technical field

The present invention relates to the field of robots, and more particularly to a system and method for training a robot by voice.

Background art

At present, robot behavior can be trained only through programming: a developer modifies the robot's program logic so that the robot performs certain actions under certain conditions. This training method is necessary for the low-level development of the robot, but for upper-level logic development it suffers from low development efficiency and a high error rate. It is also unsuitable for ordinary users who lack professional programming skills: an ordinary user who wants to make even a small change to the robot's behavior must spend a great deal of time learning.

In summary, the above training method has a narrow application range, low efficiency, and high error rate.

Summary of the invention

In view of the above problems in existing methods of training robots, a system and method are now provided that allow a robot to be trained by voice, without relying on programming.

The specific technical solutions are as follows:

A system for training robots by voice, including:

a receiving unit, configured to receive a voice signal;

a parsing unit, connected to the receiving unit, configured to parse the voice signal, match the voice signal with a preset statement, and acquire a conditional statement that matches the preset statement and corresponds to the voice signal, and an execution statement corresponding to the voice signal;

a processing unit, coupled to the parsing unit, configured to combine the conditional statement with the execution statement to generate a target entry;

a storage unit, connected to the processing unit, for storing preset entries and training the robot according to the preset entries;

wherein the processing unit performs a weight calculation according to the target entry, and performs corresponding processing according to the weight calculation result.

Preferably, the parsing unit comprises:

a first conversion module, configured to convert the voice signal into text information;

a semantic analysis module, connected to the first conversion module, configured to parse the text information, match the text information with the preset statement, acquire a conditional statement that matches the preset statement and corresponds to the text information, and identify whether the conditional statement is a standard conditional statement or a feedback conditional statement;

if the conditional statement is a standard conditional statement, an execution statement corresponding to the text information is acquired;

if the conditional statement is a feedback conditional statement, a weighting operation is performed to cause the robot to perform the operation of the previous task.

Preferably, the parsing unit further includes:

a second conversion module, connected to the semantic analysis module, configured to convert the execution statement into a corresponding audio signal and output it.

Preferably, each of the preset entries includes a preset conditional statement and a preset execution statement.

Preferably, the processing unit traverses the preset conditional statements in all the preset entries in the storage unit according to the conditional statement in the target entry, to determine whether the conditional statement duplicates a preset conditional statement; if not, the weighting operation is performed, the target entry is stored in the storage unit to form a new preset entry, and the robot is trained according to the preset entries; if so, the weighting operation is performed, and corresponding processing is performed according to the weight calculation result.

A method of training a robot by voice, comprising the following steps:

S1. collecting a voice signal;

S2. Parsing the voice signal, matching the voice signal with a preset statement, and acquiring a conditional statement that matches the preset statement and corresponds to the voice signal, and an execution statement corresponding to the voice signal;

S3. Combining the conditional statement with the execution statement to generate a target entry;

S4. Performing a weight calculation according to the target entry, and performing corresponding processing according to the weight calculation result.

Preferably, the step S2 specifically includes:

S21. Converting the voice signal into text information;

S22. Parsing the text information, matching the text information with the preset statement, acquiring a conditional statement that matches the preset statement and corresponds to the text information, and identifying whether the conditional statement is a standard conditional statement or a feedback conditional statement.

Preferably, the step S2 further includes:

S23. Converting the execution statement into a corresponding audio signal and outputting it.

Preferably, the step S3 specifically includes:

S31. Traverse the preset conditional statement in all the preset entries in the storage unit according to the conditional statement in the target entry;

S32. Acquire the traversal result, and determine whether the conditional statement duplicates a preset conditional statement;

If the conditional statement does not duplicate any preset conditional statement, step S33 is performed;

If the conditional statement duplicates a preset conditional statement, step S34 is performed;

S33. Perform the weighting operation, and store the target entry in the storage unit to form a new preset entry, and train the robot according to the preset entry;

S34. Perform the weighting operation, and perform corresponding processing according to the weighting calculation result.

The beneficial effects of the above technical solutions:

In the system for training a robot by voice, the parsing unit parses the voice signal to obtain a corresponding conditional statement and execution statement, and the processing unit combines the conditional statement and the execution statement to generate an entry, so that the robot is trained according to the entry with high efficiency and a low error rate. In the method for training a robot by voice, the robot can be trained simply by inputting a voice signal; the operation is simple, the scope of application is wide, and the efficiency is high.

Brief description of the drawings

FIG. 1 is a block diagram of an embodiment of a system for training a robot by voice according to the present invention;

FIG. 2 is a flow chart of an embodiment of a method for training a robot by voice according to the present invention;

FIG. 3 is a flow chart of a method for parsing a voice signal;

FIG. 4 is a flow chart of a method for performing corresponding processing on the target entry according to a traversal result.

Detailed description

The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

It should be noted that the embodiments in the present invention and the features in the embodiments may be combined with each other without conflict.

The invention is further illustrated by the following figures and specific examples, but is not to be construed as limiting.

As shown in FIG. 1, a system for training a robot by voice, comprising:

a receiving unit 1 for receiving a voice signal;

a parsing unit 2, connected to the receiving unit 1, for parsing the voice signal, matching the voice signal with a preset statement, and acquiring a conditional statement that matches the preset statement and corresponds to the voice signal, and an execution statement corresponding to the voice signal;

a processing unit 3, connected to the parsing unit 2, for combining the conditional statement with the execution statement to generate a target entry;

a storage unit 4, connected to the processing unit 3, for storing preset entries and training the robot according to the preset entries;

The processing unit 3 performs weight calculation according to the target entry, and performs corresponding processing according to the weight calculation result.
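The data flow between the four units can be sketched as follows. This is only an illustrative outline under stated assumptions, not the implementation of the invention: the class names, the use of an already transcribed string in place of a voice signal, the single "If A, then B" pattern, and the "add 1.0" weight update are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """A target or preset entry: a conditional statement plus an execution statement."""
    condition: str
    execution: str


class ReceivingUnit:
    """Stand-in for receiving unit 1; the 'voice signal' is assumed to be a transcript."""

    def receive(self, signal: str) -> str:
        return signal.strip()


class ParsingUnit:
    """Stand-in for parsing unit 2; recognizes only the pattern 'If A, then B'."""

    def parse(self, text: str) -> Optional[Tuple[str, str]]:
        lowered = text.lower()
        marker = ", then "
        if lowered.startswith("if ") and marker in lowered:
            cut = lowered.index(marker)
            condition = text[3:cut].strip()
            execution = text[cut + len(marker):].strip().rstrip(".;")
            return condition, execution
        return None


class StorageUnit:
    """Stand-in for storage unit 4; maps each stored entry to its weight."""

    def __init__(self) -> None:
        self.weights: Dict[Entry, float] = {}


class ProcessingUnit:
    """Stand-in for processing unit 3; combines the statements and updates the weight."""

    def __init__(self, storage: StorageUnit) -> None:
        self.storage = storage

    def process(self, condition: str, execution: str) -> Entry:
        entry = Entry(condition, execution)
        # Placeholder weight calculation (assumption): each training adds 1.0.
        self.storage.weights[entry] = self.storage.weights.get(entry, 0.0) + 1.0
        return entry


def train_once(signal: str, receiver: ReceivingUnit, parser: ParsingUnit,
               processor: ProcessingUnit) -> Optional[Entry]:
    """Run one utterance through receive -> parse -> combine and store."""
    parsed = parser.parse(receiver.receive(signal))
    if parsed is None:
        return None
    return processor.process(*parsed)
```

Keeping the weights in the storage stand-in rather than on the entry itself is a design choice of this sketch; it lets the same condition and execution pair be reinforced by repeated training without creating duplicate entries.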

In the present embodiment, the system for training robots by voice can be applied to children's toys. Although children do not have professional programming development skills, children can communicate with robots through natural language and train robots to perform corresponding actions.

In this embodiment, to streamline the development of robot behavior logic, a method suited to interaction between an ordinary user and the robot is chosen, so that during training the user concentrates on the training logic itself rather than on a development language, which increases productivity and reduces the error rate. The parsing unit 2 parses the voice signal to obtain a corresponding conditional statement and execution statement, and the processing unit 3 combines the conditional statement and the execution statement to generate an entry, so that the robot is trained according to the entry with high efficiency and a low error rate.

In a preferred embodiment, the parsing unit 2 comprises:

a first conversion module 21, configured to convert the voice signal into text information;

a semantic analysis module 22, connected to the first conversion module 21, for parsing the text information, matching the text information with the preset statement, acquiring a conditional statement that matches the preset statement and corresponds to the text information, and identifying whether the conditional statement is a standard conditional statement or a feedback conditional statement;

If the conditional statement is a standard conditional statement, an execution statement corresponding to the text information is obtained;

If the conditional statement is a feedback conditional statement, a weighting operation is performed to cause the robot to perform the operation of the previous task.

In this embodiment, the sentence pattern corresponding to the target entry may be:

When A, then B;

If A, then B;

When A, don't do B;

This time, B should be done;

This is wrong;

This is not the right thing to do.

Here, "when A", "if A", "when A, don't" and "this time" are standard conditional statements, while "this is wrong" and "this is not the right thing to do" are feedback conditional statements.
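A pattern-based classifier for these sentence patterns might look like the sketch below. The regular expressions, the English wording they match, and the `Parsed` record are assumptions made for illustration; the original patterns are translated, and the text states that the semantic analysis module relies on NLP technology rather than fixed regular expressions.

```python
import re
from typing import NamedTuple, Optional


class Parsed(NamedTuple):
    kind: str                  # "standard" or "feedback"
    condition: str             # part A, or the feedback phrase itself
    execution: Optional[str]   # part B; None for feedback statements
    negated: bool              # True for "don't do B" patterns


# Hypothetical English renderings of the sentence patterns listed above.
_STANDARD = [
    (re.compile(r"^(?:when|if)\s+(?P<a>.+?),\s*then\s+(?P<b>.+)$", re.I), False),
    (re.compile(r"^(?:when|if)\s+(?P<a>.+?),\s*don't\s+(?P<b>.+)$", re.I), True),
]
_THIS_TIME = re.compile(r"^this time,?\s+(?:you\s+)?should\s+(?P<b>.+)$", re.I)
_FEEDBACK = {"this is wrong", "this is not the right thing to do"}


def classify(utterance: str) -> Optional[Parsed]:
    """Return the statement type and its A/B parts, or None for ordinary speech."""
    text = utterance.strip().rstrip(".;!").strip()
    if text.lower() in _FEEDBACK:
        return Parsed("feedback", text.lower(), None, False)
    match = _THIS_TIME.match(text)
    if match:
        return Parsed("standard", "this time", match.group("b"), False)
    for pattern, negated in _STANDARD:
        match = pattern.match(text)
        if match:
            return Parsed("standard", match.group("a"), match.group("b"), negated)
    return None
```

An utterance that matches none of the patterns returns `None`, which a caller could treat as ordinary conversation rather than training input.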

The overall training process of the system is as follows. When a training key sentence pattern is recognized, the robot enters the training mode, and the user can talk to the robot using sentence patterns similar to those above. The semantic analysis module 22 of the parsing unit 2 divides the words spoken by the user into part A and part B. After semantic conversion, part A is converted into a conditional development statement and part B into an execution-action development statement; the two parts are combined to form a new entry, and the relationship between part A and part B is added to the local training knowledge base (storage unit 4). If part A is the same as a conditional development statement already in the training knowledge base but part B differs from the corresponding execution-action development statement, the two knowledge items share the same condition but perform different actions, and a weight operation is required. The weight operation takes into account the user's positive and negative feedback as well as the time at which each item was added; the new knowledge item is then appended to the local knowledge base and the training knowledge base is updated. When ordinary natural-language communication is recognized, the training mode ends: the robot stops training and returns to the polling judgment mode, in which all the entries in the knowledge base are polled, and when a knowledge item is hit, the execution-action development statement contained in that item is executed.
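The switch between training mode and polling judgment mode can be illustrated with a small state machine. Everything here is a simplified assumption: the "If A, then B" trigger, treating the condition as a literal utterance, the logical clock used for recency, and the rule that the highest weight wins with ties going to the most recently trained item. The text does not specify how the weight operation combines feedback and time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KnowledgeItem:
    condition: str
    action: str
    weight: float = 1.0
    added_at: int = 0  # logical clock; a larger value means more recently trained


class Robot:
    def __init__(self) -> None:
        self.mode = "polling"
        self.knowledge: List[KnowledgeItem] = []
        self._clock = 0

    def hear(self, utterance: str) -> Optional[str]:
        """Train on 'If A, then B'; otherwise poll the knowledge base for a hit."""
        lowered = utterance.strip().rstrip(".").lower()
        marker = ", then "
        if lowered.startswith("if ") and marker in lowered:
            self.mode = "training"  # training key sentence pattern recognized
            cut = lowered.index(marker)
            self._learn(lowered[3:cut].strip(), lowered[cut + len(marker):].strip())
            return None
        self.mode = "polling"       # ordinary speech ends the training mode
        return self._poll(lowered)

    def _learn(self, condition: str, action: str) -> None:
        self._clock += 1
        for item in self.knowledge:
            if item.condition == condition and item.action == action:
                item.weight += 1.0  # repeated training reinforces the item
                item.added_at = self._clock
                return
        self.knowledge.append(KnowledgeItem(condition, action, 1.0, self._clock))

    def _poll(self, condition: str) -> Optional[str]:
        hits = [item for item in self.knowledge if item.condition == condition]
        if not hits:
            return None
        # Same condition, different actions: highest weight wins, then recency.
        best = max(hits, key=lambda item: (item.weight, item.added_at))
        return best.action
```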

In this embodiment, the first conversion module 21 can adopt Automatic Speech Recognition (ASR) technology, which converts the vocabulary content of human speech into computer-readable input so that the user can interact with a computer.

The semantic analysis module 22 uses artificial intelligence natural language processing (NLP) technology to acquire conditional statements and execution statements in the text information through the NLP technology.

In a preferred embodiment, the parsing unit 2 further includes:

A second conversion module 23 is coupled to the semantic analysis module 22 for converting the execution statement into a corresponding audio signal and outputting it.

In this embodiment, the second conversion module 23 uses text-to-speech (TTS) technology to convert text into speech; TTS is part of the human-machine dialogue and enables the robot to speak.

In a preferred embodiment, each of the preset entries includes a preset conditional statement and a preset execution statement.

In a preferred embodiment, the processing unit 3 traverses the preset conditional statements in all the preset entries in the storage unit 4 according to the conditional statement in the target entry, to determine whether the conditional statement duplicates a preset conditional statement. If not, a weight operation is performed, the target entry is stored in the storage unit 4 to form a new preset entry, and the robot is trained according to the preset entries; if so, the weight operation is performed, and corresponding processing is performed according to the weight calculation result.

In this embodiment, after a new knowledge item is added or an existing knowledge item is updated, or when positive or negative feedback is received from the user, a weight calculation is performed and the entire training knowledge base is sorted, compressed and so on, to ensure the efficiency of the robot's condition polling judgment.
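One way such sorting and compression could work is sketched below. The text does not say how the knowledge base is arranged or compressed, so the merge rule (summing the weights of identical items), the pruning threshold `min_weight`, and the descending sort are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str, float]  # (condition, action, weight)


def compact(items: List[Item], min_weight: float = 0.0) -> List[Item]:
    """Merge identical items, drop those at or below min_weight, sort by weight."""
    merged: Dict[Tuple[str, str], float] = {}
    for condition, action, weight in items:
        key = (condition, action)
        merged[key] = merged.get(key, 0.0) + weight
    kept = [(c, a, w) for (c, a), w in merged.items() if w > min_weight]
    # Highest weight first, so polling reaches the preferred action earliest.
    kept.sort(key=lambda item: item[2], reverse=True)
    return kept
```

Placing the heaviest items first means a linear poll over the list meets the preferred action for a condition before its lower-weighted competitors.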

As shown in FIG. 2, a method for training a robot by voice includes the following steps:

S1. collecting a voice signal;

S2. parsing the voice signal, matching the voice signal with the preset statement, acquiring a conditional statement matching the preset statement and corresponding to the voice signal, and an execution statement corresponding to the voice signal;

In this embodiment, the robot can be trained only by inputting a voice signal, and the operation is simple, the scope of application is wide, and the efficiency is high.

As shown in FIG. 3, in a preferred embodiment, step S2 specifically includes:

S21. Converting the voice signal into text information;

S22. Parsing the text information, matching the text information with the preset statement, obtaining a conditional statement matching the preset sentence and corresponding to the text information, and identifying that the conditional statement is a standard conditional statement or a feedback conditional statement;

In this embodiment, the voice signal is converted into text information by using Automatic Speech Recognition (ASR) technology, which converts the vocabulary content of human speech into computer-readable input so that the user can interact with a computer.

The text information can be parsed by artificial intelligence natural language processing (NLP) technology, through which the conditional statements and execution statements in the text information are obtained.

In a preferred embodiment, step S2 further includes:

S23. Convert the execution statement into a corresponding audio signal and output it.

In this embodiment, the TTS (Text To Speech) technique is used to convert the execution statement into a corresponding audio signal, which is part of the human-machine dialogue, enabling the robot to speak through the TTS.

As shown in FIG. 4, in a preferred embodiment, step S3 specifically includes:

S31. traversing a preset conditional statement in all preset entries in the storage unit according to the conditional statement in the target entry;

S32. Acquire the traversal result, and determine whether the conditional statement duplicates a preset conditional statement;

S33. Perform a weight operation, and store the target item in the storage unit to form a new preset item, and train the robot according to the preset item;

S34. Perform a weighting operation and perform corresponding processing according to the weight calculation result.
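Steps S31 to S34 can be sketched as a single function. The names `PresetEntry`, `weight_operation` and `process_target` are hypothetical, the weight operation is a placeholder that adds a fixed amount, and the "corresponding processing" in S34 is assumed here to mean reinforcing a matching entry or adding a competing one; the text leaves both unspecified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PresetEntry:
    condition: str
    execution: str
    weight: float = 0.0


def weight_operation(entry: PresetEntry, delta: float = 1.0) -> None:
    """Placeholder weight operation (assumption): add a fixed amount."""
    entry.weight += delta


def process_target(store: List[PresetEntry], condition: str, execution: str) -> str:
    # S31: traverse the preset conditional statements of all preset entries.
    matches = [entry for entry in store if entry.condition == condition]
    # S32: decide whether the target condition duplicates a preset condition.
    if not matches:
        # S33: weight the target entry and store it as a new preset entry.
        new_entry = PresetEntry(condition, execution)
        weight_operation(new_entry)
        store.append(new_entry)
        return "stored"
    # S34: weight operation, then processing that depends on the result.
    same_action = next((e for e in matches if e.execution == execution), None)
    if same_action is None:
        # Same condition, different action: keep it as a competing entry.
        same_action = PresetEntry(condition, execution)
        store.append(same_action)
    weight_operation(same_action)
    return "reweighted"
```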

In this embodiment, suppose the robot is to be trained so that, when it hears the user say "hello" in the afternoon, it replies "XXX (person's name), good afternoon". The training steps are as follows:

A1. The user says "hello" to the robot, and then says "This time you should say: XXX, good afternoon";

A2. Semantic analysis is performed on the content spoken by the user, and the execution statement, namely "say XXX, good afternoon", is separated out: "say" corresponds to the robot's TTS service, "XXX" hits the name of the user currently interacting with the robot, "afternoon" hits the current time, and "XXX, good afternoon" is the content passed to the TTS service;

A3. A new knowledge base entry is generated from the semantic analysis result, and is added to the local knowledge base after its weight has been determined;

A4. The robot executes the new knowledge base entry, and the training ends.

After this interactive training is completed, when the user says "hello" to the robot, the robot will answer "XXX, good afternoon", achieving the expected training objective.
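The knowledge base entry produced by this example could be represented and applied as in the sketch below. The dictionary layout, the `period_of_day` boundaries (noon and 18:00), and the `respond` function are hypothetical illustrations; the text only states that "XXX" hits the current user's name and "afternoon" hits the current time.

```python
from datetime import time
from typing import Dict, Optional


def period_of_day(now: time) -> str:
    """Map a clock time to a coarse period; the boundaries are assumptions."""
    if now < time(12, 0):
        return "morning"
    if now < time(18, 0):
        return "afternoon"
    return "evening"


# Hypothetical entry generated in step A3 of the example.
GREETING_ENTRY: Dict[str, str] = {
    "utterance": "hello",                   # condition on what the user says
    "period": "afternoon",                  # condition on the current time
    "service": "tts",                       # "say" maps to the TTS service
    "template": "{name}, good afternoon",   # content handed to the TTS service
}


def respond(utterance: str, user_name: str, now: time,
            entry: Dict[str, str] = GREETING_ENTRY) -> Optional[str]:
    """Return the text to speak if both conditions of the entry are hit."""
    if utterance.strip().lower() != entry["utterance"]:
        return None
    if period_of_day(now) != entry["period"]:
        return None
    return entry["template"].format(name=user_name)
```

Storing the reply as a template with a `{name}` slot, rather than a fixed string, is what allows "XXX" to resolve to whichever user is currently interacting with the robot.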

The invention frees the user's hands when training the robot and allows the robot's behavior to be modified without writing any code, so that during training the user concentrates on the training content itself rather than on the basic problem of how to write code.

The above is only a preferred embodiment of the present invention and is not intended to limit the embodiments or the scope of protection of the present invention. Those skilled in the art should appreciate that equivalent substitutions and obvious variations are intended to be included within the scope of the invention.

Claims

A system for training a robot by voice, comprising:

a receiving unit, configured to receive a voice signal;

a parsing unit, connected to the receiving unit, configured to parse the voice signal, match the voice signal with a preset statement, and acquire a conditional statement that matches the preset statement and corresponds to the voice signal, and an execution statement corresponding to the voice signal;

a processing unit, coupled to the parsing unit, configured to combine the conditional statement with the execution statement to generate a target entry;

a storage unit, connected to the processing unit, for storing preset entries and training the robot according to the preset entries;

wherein the processing unit performs a weight calculation according to the target entry, and performs corresponding processing according to the weight calculation result.
The system for training a robot by voice according to claim 1, wherein the parsing unit comprises:

a first conversion module, configured to convert the voice signal into text information;

a semantic analysis module, connected to the first conversion module, configured to parse the text information, match the text information with the preset statement, acquire a conditional statement that matches the preset statement and corresponds to the text information, and identify whether the conditional statement is a standard conditional statement or a feedback conditional statement;

if the conditional statement is a standard conditional statement, an execution statement corresponding to the text information is acquired;

if the conditional statement is a feedback conditional statement, a weighting operation is performed to cause the robot to perform the operation of the previous task.
The system for training a robot by voice according to claim 2, wherein the parsing unit further comprises:

a second conversion module, connected to the semantic analysis module, configured to convert the execution statement into a corresponding audio signal and output it.
The system for training a robot by voice according to claim 1, wherein each of said preset entries comprises a preset conditional statement and a preset execution statement.
The system for training a robot by voice according to claim 4, wherein said processing unit traverses said preset conditional statements in all of said preset entries in said storage unit according to said conditional statement in said target entry, to determine whether the conditional statement duplicates a preset conditional statement; if not, the weighting operation is performed, the target entry is stored in the storage unit to form a new preset entry, and the robot is trained according to the preset entries; if so, the weighting operation is performed, and corresponding processing is performed according to the weight calculation result.
A method for training a robot by voice, characterized in that it comprises the following steps:

S1. collecting a voice signal;

S2. Parsing the voice signal, matching the voice signal with a preset statement, and acquiring a conditional statement that matches the preset statement and corresponds to the voice signal, and an execution statement corresponding to the voice signal;

S3. Combining the conditional statement with the execution statement to generate a target entry;

S4. Performing a weight calculation according to the target entry, and performing corresponding processing according to the weight calculation result.
The method of training a robot by voice according to claim 6, wherein the step S2 specifically includes:

S21. Converting the voice signal into text information;

S22. Parsing the text information, matching the text information with the preset statement, acquiring a conditional statement that matches the preset statement and corresponds to the text information, and identifying whether the conditional statement is a standard conditional statement or a feedback conditional statement;

If the conditional statement is a standard conditional statement, an execution statement corresponding to the text information is obtained;

If the conditional statement is a feedback conditional statement, a weighting operation is performed to cause the robot to perform the operation of the previous task.
The method of training a robot by voice according to claim 7, wherein the step S2 further comprises:

S23. Converting the execution statement into a corresponding audio signal and outputting it.
A method for training a robot by voice according to claim 6, wherein each of said preset entries comprises a preset conditional statement and a preset execution statement.
The method of training a robot by voice according to claim 9, wherein the step S3 specifically includes:

S31. Traverse the preset conditional statement in all the preset entries in the storage unit according to the conditional statement in the target entry;

S32. Acquire the traversal result, and determine whether the conditional statement duplicates a preset conditional statement;

If the conditional statement does not duplicate any preset conditional statement, step S33 is performed;

If the conditional statement duplicates a preset conditional statement, step S34 is performed;

S33. Perform the weighting operation, and store the target entry in the storage unit to form a new preset entry, and train the robot according to the preset entry;

S34. Perform the weighting operation, and perform corresponding processing according to the weighting calculation result.