CN111091823A - Robot control system and method based on voice and human face actions and electronic equipment - Google Patents
- Publication number
- CN111091823A (Application CN201911188246.5A)
- Authority
- CN
- China
- Prior art keywords
- voice
- image
- action
- information
- collector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/24—Speech recognition using non-acoustical features
- G10L15/25—Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Manipulator (AREA)
Abstract
The invention discloses a robot control system based on voice and human face actions. The system comprises a voice collector, a video collector, an algorithm analysis processor, a memory, a controller and a driving module; the voice collector and the video collector are connected to the algorithm analysis processor, the memory and the controller are connected to the algorithm analysis processor, and the driving module is connected to the controller. The algorithm analysis processor recognizes, analyzes and processes the voice information from the voice collector and the image information from the video collector to obtain action control signals, which it sends to the controller so that the controller drives the robot, through the driving module, to execute the corresponding actions. By recognizing and processing both facial actions and voice, the invention tracks the trend of a person's movements over time, which not only improves recognition accuracy but also reduces the computational load of the algorithm.
Description
Technical Field
The invention relates to the technical field of voice, video and image recognition, and in particular to a robot control system and method based on voice and human face actions.
Background
At present, domestic robot-interaction methods acquire the information needed for a robot's next action either through voice recognition alone or by supplementing it with recognition of conventional lip movements.
In the prior art, however, controlling a robot by voice recognition alone gives poor accuracy. Even when lip-reading is added, recognition is limited to simple, conventional lip movements; unusual or complex lip actions still cannot be recognized, so accuracy remains low. Moreover, the algorithms used to match lip movements are overly complex and slow to process.
Disclosure of Invention
To overcome the defects of the prior art, a first objective of the present invention is to provide a robot control system based on voice and human face actions that addresses the prior art's problems of complex algorithms, slow processing and low accuracy.
A second objective of the present invention is to provide a robot control method based on voice and human face actions that addresses the same problems of complex algorithms, slow processing and low accuracy.
A third objective of the present invention is to provide an electronic device that addresses the same problems of complex algorithms, slow processing and low accuracy.
The first objective of the invention is achieved by the following technical solution:
the robot control system based on voice and human face actions comprises a voice collector, a video collector, an algorithm analysis processor, a memory, a controller and a driving module, wherein the voice collector and the video collector are connected with the algorithm analysis processor, the memory and the controller are connected with the algorithm analysis processor, and the driving module is connected with the controller;
the algorithm analysis processor is used for identifying, analyzing and processing the voice information from the voice collector and the image information from the video collector to obtain action control signals and sending the action control signals to the controller, so that the controller controls the driving module to drive the robot to execute corresponding actions according to the action control signals.
Preferably, the system further comprises a voice player, and the voice player is connected with the algorithm analysis processor.
Preferably, the system further comprises a human-computer interaction device connected to the algorithm analysis processor, wherein the algorithm analysis processor displays the action control signal on the human-computer interaction device and receives a confirmation signal from it before storing the action control signal in the memory.
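The component wiring described above can be sketched in code. This is a minimal, hedged illustration with invented class and method names (the patent specifies components, not an API): the processor cross-checks voice content against lip-read text, looks up the bound action in memory, and hands it to the controller, which drives the robot.

```python
class DriveModule:
    """Stand-in for the driving module; records executed actions."""
    def __init__(self):
        self.executed = []

    def execute(self, action):
        self.executed.append(action)


class Controller:
    """Forwards action control signals to the driving module."""
    def __init__(self, drive):
        self.drive = drive

    def on_control_signal(self, signal):
        self.drive.execute(signal)


class AlgorithmAnalysisProcessor:
    """Cross-checks voice content against lip-read text, then looks up
    the bound control action and hands it to the controller."""
    def __init__(self, memory, controller):
        self.memory = memory          # instruction -> control-action bindings
        self.controller = controller

    def process(self, voice_text, lip_text):
        # Agreement between the two channels yields instruction information
        # (simplified here to exact string equality).
        if voice_text != lip_text:
            return None
        action = self.memory.get(voice_text)
        if action is not None:
            self.controller.on_control_signal(action)
        return action


drive = DriveModule()
processor = AlgorithmAnalysisProcessor(
    {"move forward": "MOVE_FORWARD"}, Controller(drive))
processor.process("move forward", "move forward")
```

In this sketch a mismatch between the two channels yields no action at all; the patent itself does not say how disagreements are resolved.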
The second objective of the invention is achieved by the following technical solution:
the robot control method based on voice and human face actions comprises the following steps:
receiving a voice signal from a voice collector and receiving an image signal from a video collector;
recognizing and analyzing the voice signal and processing it into voice content; recognizing and analyzing the image information and processing it into text content;
processing and comparing the voice content and the text content to obtain instruction information;
receiving confirmation information input by the user and a control action signal selected by the user, binding the instruction information to the control action signal to form action information, and storing the action information;
matching newly received instruction information against the stored action information to obtain the corresponding control action signal;
and sending the control action signal to the controller, so that the controller controls the driving module to drive the robot to execute the corresponding action.
Preferably, the step of "recognizing and analyzing the image information and processing it into text content" specifically comprises:
performing grayscale processing on the image information to obtain a lip image of the face;
acquiring, as a function of time, the image of any single line of the lip image taken along the lip width direction, recorded as X(t); the collection of these line images across the width direction is recorded as H(t), i.e. H(t) is the set of the X(t);
acquiring, as a function of time, the image of any single line of the lip image taken along the lip height direction, recorded as Y(t); the collection of these line images across the height direction is recorded as V(t), i.e. V(t) is the set of the Y(t);
and comparing how the lips in the lip image change over time in the height and width directions, and analyzing these trends to obtain the current text content.
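The lip-line sampling just described can be sketched as follows. This is a rough illustration, assuming each frame is a 2-D grayscale array and approximating the lip extent along a fixed row (width direction) and column (height direction) by counting dark pixels; the threshold and all names are assumptions, not the patent's actual algorithm.

```python
# Approximate the H(t)/V(t) trend collections: for each frame, measure the
# dark-pixel extent along one row (lip width) and one column (lip height).
def lip_extents(frames, row, col, threshold=128):
    """Return (H, V): lip extent trends along width and height over time."""
    H, V = [], []
    for frame in frames:
        x_t = frame[row]                    # line along the lip width direction
        y_t = [r[col] for r in frame]       # line along the lip height direction
        H.append(sum(1 for p in x_t if p < threshold))
        V.append(sum(1 for p in y_t if p < threshold))
    return H, V


# Two tiny 3x3 "frames": the dark lip region widens between frame 1 and 2.
f1 = [[255, 255, 255],
      [255,   0, 255],
      [255, 255, 255]]
f2 = [[0, 0, 0],
      [0, 0, 0],
      [0, 0, 0]]
H, V = lip_extents([f1, f2], row=1, col=1)
print(H, V)  # prints [1, 3] [1, 3]
```

The rising values in H and V are the kind of width/height change trend the method then maps to text content; how that mapping is performed is not detailed in the patent.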
The third objective of the invention is achieved by the following technical solution:
an electronic device having a memory, a processor, and a computer readable program stored in the memory and executable by the processor, wherein the computer readable program when executed by the processor implements a robot control method according to a second aspect of the present invention.
Compared with the prior art, the invention has the beneficial effects that:
the invention recognizes and processes the face action and the voice to know the action change trend of the person, thereby not only improving the recognition accuracy, but also simplifying the calculated amount of the algorithm.
Drawings
Fig. 1 is a flowchart of a robot control method based on voice and human face actions according to the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and the detailed description below:
the invention provides a robot control system based on voice and human face actions, which comprises a voice collector, a video collector, an algorithm analysis processor, a memory, a controller and a driving module, wherein the voice collector and the video collector are connected with the algorithm analysis processor, the memory and the controller are connected with the algorithm analysis processor, and the driving module is connected with the controller.
The algorithm analysis processor is used for identifying, analyzing and processing the voice information from the voice collector and the image information from the video collector to obtain action control signals and sending the action control signals to the controller, so that the controller controls the driving module to drive the robot to execute corresponding actions according to the action control signals.
In the present invention, the voice collector includes, but is not limited to, a recording device, a microphone or a sound pickup; any commercially available model may be used. The video collector can be a camera that captures both still images and dynamic video and collects facial image changes in real time.
This embodiment further comprises a voice player connected to the algorithm analysis processor, and a human-computer interaction device also connected to the algorithm analysis processor; the algorithm analysis processor displays the action control signal on the human-computer interaction device and, on receiving a confirmation signal from it, stores the action control signal in the memory.
The invention operates in two modes: a learning mode and a control mode. In the learning mode, the robot enters a learning state: voice information is captured by the voice collector while the video collector records the face, chiefly the dynamics of the mouth. The video stream is converted to grayscale and processed frame by frame to obtain the multi-frame trend of the video, i.e. how the lips change over time in the width and height directions (the spatial positions of the lips at different times). From these trends the current text content is derived; the voice content is obtained from the voice information; the two are compared and matched, and finally processed into instruction information. The user then inputs, through the human-computer interaction device, the control action corresponding to the instruction information, and the algorithm analysis processor binds the control action to the instruction information and stores the pair in the storage module.
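The learning-mode binding step can be sketched minimally. The function and all names below are illustrative assumptions, not the patent's implementation: the matched voice and lip-read text form the instruction, and a user-selected control action is stored against it only after confirmation.

```python
# Learning mode: bind instruction information to a user-chosen control
# action and persist the binding in an in-memory store (a dict here).
def learn(store, voice_text, lip_text, user_action, confirmed=True):
    """Bind instruction information to a control action on confirmation."""
    if voice_text != lip_text:   # the two contents must agree to form an instruction
        return False
    if not confirmed:            # user declined via the interaction device
        return False
    store[voice_text] = user_action
    return True


bindings = {}
learn(bindings, "wave", "wave", "ACTION_WAVE")
```

A rejected or mismatched attempt leaves the store untouched, mirroring the confirmation step the human-computer interaction device provides.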
Through the learning mode, every control action the robot may need is bound to its corresponding instruction information. In the control mode, the voice collector and video collector work exactly as in the learning mode, and the algorithm analysis processor handles the voice and video information by the same principle. The difference is that in the control mode, once the processor has derived instruction information from the text and voice content, it directly retrieves the matching control action from the storage module and drives the robot, through the controller and the driving module, to execute the corresponding action. The voice player, for example a speaker, can announce the instruction information, voice content and text content. Matching instruction information to a stored control action can use existing techniques, for example encoding: the instruction information of each control action is encoded, and the control action bound to the instruction information with the identical code is the match.
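The matching-by-encoding idea mentioned above might look like the following. The choice of a hash as the "code" is our assumption, not the patent's scheme: any encoding that is identical for identical instruction information would serve.

```python
import hashlib


def encode(instruction: str) -> str:
    """Encode instruction information as a reproducible code."""
    return hashlib.sha256(instruction.encode("utf-8")).hexdigest()


def match_action(code_table, instruction):
    """Look up the control action stored under the identical code."""
    return code_table.get(encode(instruction))


# The learning stage stores actions keyed by their instruction's code;
# the control stage matches a new instruction by recomputing the code.
table = {encode("turn left"): "ACTION_TURN_LEFT"}
```

Because identical instruction information always produces the identical code, the lookup reduces matching to a single dictionary access rather than a search.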
The invention also provides a robot control method based on voice and human face actions, as shown in fig. 1, comprising the following steps:
S1: receiving a voice signal from the voice collector and an image signal from the video collector;
S2: recognizing and analyzing the voice signal and processing it into voice content; recognizing and analyzing the image information and processing it into text content;
in this step, the steps of identifying and analyzing the image information and processing the image information into text content specifically include the following steps:
carrying out gray level processing on the image information to obtain a lip image of the face;
acquiring an image of any line in the lip width direction in a lip image along with time change, and recording the image as X (t), and collecting an image of any line in the lip width direction in the lip image along with time change, and recording the image as H (t), wherein H (t) is a set of X (t);
acquiring an image of any line in the height direction of the lips in the lip image, recording the image as Y (t), collecting an image of any line in the height direction of the lips in the lip image, recording the image as V (t), wherein V (t) is a set of Y (t);
and comparing the change trends of the lips in the lip image along with time in the height direction and the width direction, and analyzing to obtain the current text content.
S3: processing and comparing the voice content and the text content to obtain instruction information;
S4: receiving confirmation information input by the user and a control action signal selected by the user, binding the instruction information to the control action signal to form action information, and storing the action information;
S5: matching newly received instruction information against the stored action information to obtain the corresponding control action signal;
S6: sending the control action signal to the controller, so that the controller controls the driving module to drive the robot to execute the corresponding action.
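Steps S1 to S6 can be sketched end to end. This is a hedged illustration with placeholder recognizers (real speech and lip recognition are far more involved); the learning step S4 is modeled as a dictionary insert and the matching step S5 as a lookup.

```python
def recognize_speech(voice_signal):
    return voice_signal                  # stand-in for real speech recognition


def recognize_lips(image_signal):
    return image_signal                  # stand-in for lip-reading analysis


def derive_instruction(voice_content, text_content):
    # S3: compare the two contents; agreement yields instruction information
    return voice_content if voice_content == text_content else None


def run_pipeline(memory, voice_signal, image_signal, learn_action=None):
    voice_content = recognize_speech(voice_signal)   # S2, speech branch
    text_content = recognize_lips(image_signal)      # S2, lip branch
    instruction = derive_instruction(voice_content, text_content)
    if instruction is None:
        return None
    if learn_action is not None:                     # S4: learning mode
        memory[instruction] = learn_action
        return learn_action
    return memory.get(instruction)                   # S5: control-mode lookup


mem = {}
run_pipeline(mem, "stop", "stop", learn_action="ACTION_STOP")  # learning mode
run_pipeline(mem, "stop", "stop")                              # control mode
```

Running the pipeline once in learning mode and once in control mode returns the same bound action, which in the full system would then be sent to the controller (S6).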
The present embodiment also includes a learning mode and a control mode, and the flow and principle executed in the learning mode and the control mode are the same as those of the robot control system provided by the present invention.
The present invention also provides an electronic device having a memory, a processor, and a computer readable program stored in the memory and executable by the processor, wherein the computer readable program, when executed by the processor, implements the robot control method according to the present invention.
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.
Claims (6)
1. The robot control system based on voice and human face actions is characterized by comprising a voice collector, a video collector, an algorithm analysis processor, a memory, a controller and a driving module, wherein the voice collector and the video collector are connected with the algorithm analysis processor, the memory and the controller are connected with the algorithm analysis processor, and the driving module is connected with the controller;
the algorithm analysis processor is used for identifying, analyzing and processing the voice information from the voice collector and the image information from the video collector to obtain action control signals and sending the action control signals to the controller, so that the controller controls the driving module to drive the robot to execute corresponding actions according to the action control signals.
2. The robotic control system according to claim 1, further comprising a voice player coupled to the algorithmic analysis processor.
3. The robot control system of claim 1, further comprising a human-machine interaction device coupled to the algorithm analysis processor, the algorithm analysis processor being configured to display the motion control signal on the human-machine interaction device and to receive a confirmation signal from the human-machine interaction device so as to store the motion control signal in the memory.
4. The robot control method based on voice and human face actions is characterized by comprising the following steps:
receiving a voice signal from a voice collector and receiving an image signal from a video collector;
identifying and analyzing the voice signal, and processing the voice signal into voice content; identifying and analyzing the image information, and processing the image information into text content;
processing and comparing the voice content and the text content to obtain instruction information;
receiving confirmation information input by a user and a control action signal selected by the user, binding the instruction information and the control action signal to form action information, and storing the action information;
matching the newly received instruction information with the stored action information or corresponding control action signals;
and sending the control action to the controller so that the controller controls the driving module to drive the robot to execute the corresponding action according to the control action signal.
5. The robot control method according to claim 4, wherein the step of recognizing, analyzing, and processing the image information into text specifically comprises the steps of:
carrying out gray level processing on the image information to obtain a lip image of the face;
acquiring, as a function of time, the image of any single line of the lip image taken along the lip width direction, recorded as X(t), the collection of these line images being recorded as H(t), wherein H(t) is the set of the X(t);
acquiring, as a function of time, the image of any single line of the lip image taken along the lip height direction, recorded as Y(t), the collection of these line images being recorded as V(t), wherein V(t) is the set of the Y(t);
and comparing the change trends of the lips in the lip image along with time in the height direction and the width direction, and analyzing to obtain the current text content.
6. An electronic device having a memory, a processor, and a computer readable program stored in the memory and executable by the processor, wherein the computer readable program, when executed by the processor, implements the robot control method of claim 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911188246.5A CN111091823A (en) | 2019-11-28 | 2019-11-28 | Robot control system and method based on voice and human face actions and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111091823A true CN111091823A (en) | 2020-05-01 |
Family
ID=70393250
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911188246.5A Pending CN111091823A (en) | 2019-11-28 | 2019-11-28 | Robot control system and method based on voice and human face actions and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111091823A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101101752A (en) * | 2007-07-19 | 2008-01-09 | 华中科技大学 | Monosyllabic language lip-reading recognition system based on vision character |
CN104951730A (en) * | 2014-03-26 | 2015-09-30 | 联想(北京)有限公司 | Lip movement detection method, lip movement detection device and electronic equipment |
CN105159111A (en) * | 2015-08-24 | 2015-12-16 | 百度在线网络技术(北京)有限公司 | Artificial intelligence-based control method and control system for intelligent interaction equipment |
CN106023993A (en) * | 2016-07-29 | 2016-10-12 | 西安旭天电子科技有限公司 | Robot control system based on natural language and control method thereof |
CN107122646A (en) * | 2017-04-26 | 2017-09-01 | 大连理工大学 | A kind of method for realizing lip reading unblock |
CN108073875A (en) * | 2016-11-14 | 2018-05-25 | 广东技术师范学院 | A kind of band noisy speech identifying system and method based on monocular cam |
CN109741815A (en) * | 2018-12-25 | 2019-05-10 | 广州天高软件科技有限公司 | A kind of medical guide robot and its implementation |
CN110276259A (en) * | 2019-05-21 | 2019-09-24 | 平安科技(深圳)有限公司 | Lip reading recognition methods, device, computer equipment and storage medium |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
2020-05-01 | PB01 | Publication | Application publication date: 20200501 |
| SE01 | Entry into force of request for substantive examination | |
| CB02 | Change of applicant information | Address after: 510000 201, building a, No.19 nanxiangsan Road, Huangpu District, Guangzhou City, Guangdong Province; Applicant after: GUANGZHOU SAITE INTELLIGENT TECHNOLOGY Co.,Ltd. Address before: 510000 Room 303, 36 Kaitai Avenue, Huangpu District, Guangzhou City, Guangdong Province; Applicant before: GUANGZHOU SAITE INTELLIGENT TECHNOLOGY Co.,Ltd. |
| RJ01 | Rejection of invention patent application after publication | |