CN108406848A - A kind of intelligent robot and its motion control method based on scene analysis - Google Patents
- Publication number
- CN108406848A (application CN201810210328.4A)
- Authority
- CN
- China
- Prior art keywords
- module
- information
- detection
- scene
- robot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
- B25J19/02—Sensing devices
- B25J19/04—Viewing devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1679—Programme controls characterised by the tasks executed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Abstract
The invention discloses an intelligent robot based on scene analysis and a motion control method for it. Vision, hearing, and touch modules acquire information about the current scene, such as human body information, object information, and user speech. The data from each module are aggregated in a scene analysis module, which arranges the information by priority and fuses the scene information from the several sources. After analysis, the scene analysis module completes a model of the current environment and generates the corresponding motion instructions, which are passed to a motion output module that sends commands to each actuator to execute the corresponding motion. The invention recognizes the scene through multi-sensor fusion rather than through vision or speech recognition alone. Scene recognition covers the recognition of people, of objects, and of the environment, and the scene information is synthesized uniformly, which enhances the adaptability of recognition and the autonomy and interactivity of the robot.
Description
Technical field
The present invention relates to the field of robotics, and in particular to an intelligent robot based on scene analysis and a motion control method for it.
Background art
As tools without any autonomy, early robots could only be used for highly standardized work; jobs consisting of a single repeated action are the easiest to automate, and the first applications were in automobile manufacturing. After decades of development, robot technology has steadily improved, the degree of intelligence has grown, and applications have extended from manufacturing to the consumer market.
Consumer robots currently take two mainstream forms: one emphasizes motion performance and is controlled with a phone or a controller; the other is built around an intelligent system but is usually mounted on an ordinary cart or a fixed structure. There is not yet a consumer robot on the market that can interact intelligently with people while also offering good autonomous motion performance.
Summary of the invention
In view of the above drawbacks of the prior art, the technical problem to be solved by the invention is to provide an intelligent robot based on scene analysis and a motion control method for it, capable of autonomous work, manual operation, and human-computer interaction, with a high degree of intelligence and the ability to react appropriately to changes in the surrounding environment.
To achieve the above object, the invention provides an intelligent robot based on scene analysis, comprising a vision module, a hearing module, a touch module, a scene analysis module, and a motion output module. The vision module, hearing module, and touch module are connected to the scene analysis module, and the output of the scene analysis module is connected to the motion output module, wherein: the vision module comprises a sequentially connected image reading submodule, image processing submodule, and object detection submodule; the hearing module comprises a sequentially connected environmental noise processing submodule, sound feature detection submodule, and auditory information summarizing submodule; the touch module comprises a sequentially connected tactile sensing submodule and tactile information summarizing submodule; and the scene analysis module comprises a sequentially connected sensor information fusion submodule, priority screening submodule, and action generation submodule.
Further, the object detection submodule of the vision module comprises a human detection module, an object detection module, an obstacle detection module, and a visual information summarizing module connected to the human detection module, the object detection module, and the obstacle detection module.
Further, the tactile sensing submodule of the touch module comprises pressure sensors and touch sensors; the pressure sensors are arranged on the robot's end effector, and the touch sensors are arranged on the robot's body surface.
Further, the motion output module comprises motors and position sensors that detect the motor positions.
A motion control method for an intelligent robot based on scene analysis, characterized in that:
The vision module acquires image information, including: detecting information related to people, detecting information about other objects besides the human body, and detecting whether there are obstacles in front of the robot;
The hearing module acquires auditory information, including: filtering and preprocessing the detected environmental sound signal, classifying the sounds, and extracting semantic information from the sounds of each class;
The touch module acquires tactile information, including: information about objects the robot grasps, and whether the robot is touched;
The scene analysis module synthesizes the information from the vision, hearing, and touch modules, fuses it, arranges it by priority, generates scene semantic information with a deep neural network model, and generates the corresponding robot action from the scene semantics;
The motion output module distributes the robot action to the movements of the individual motors.
Further, the vision module acquires image information as follows:
The camera captures images at a rate of 30 fps and continuously updates the image data in a buffer; the subsequent modules then run as multiple threads, each simultaneously reading the same frame from the buffer;
The human detection module mainly detects information related to people. It first detects whether there is a person in the picture; if not, it returns the result directly. If there is a person, it detects the posture and gesture of the human body, and at the same time detects whether there is a face. If there is a face, it detects information including but not limited to the facial expression, whether the face is recognized, and the gender and age of the face. The human detection module then transmits all of its information to the visual information summarizing module;
The object detection module mainly detects information about objects other than the human body. The system stores the features of common objects in advance and marks them; the object detection module scans the captured image for marked objects and, if a marked object is detected, transmits the object's identifier and its location in the picture to the visual information summarizing module;
The obstacle detection module mainly detects whether there are obstacles in front of the robot. Using image continuity information, it detects whether a suspected obstacle appears in the image and returns its position and size in the picture to the visual information summarizing module;
The visual information summarizing module receives the information input by the human detection module, the object detection module, and the obstacle detection module, sorts it into a fixed format, and transmits it to the scene analysis module for analysis.
Further, the hearing module acquires auditory information in the following steps:
After the microphone captures the sound signal of the environment, the environmental noise processing submodule first filters and preprocesses it to remove background noise;
The sound feature detection submodule then detects speech and other stored sounds;
For stored specific sounds, the corresponding semantic information is already stored in the system and is sent directly to the auditory information summarizing submodule; for speech, the keywords contained in the speech are extracted according to the feature model of the sound feature detection submodule, and the text information is sent to the auditory information summarizing submodule;
The auditory information summarizing submodule collects all the acoustic information and sends it in a fixed format to the scene analysis module for analysis.
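As a minimal sketch of the dispatch logic above (stored sounds map straight to stored semantics, speech goes through keyword extraction), assuming hypothetical sound labels and a toy keyword list in place of the real feature model:

```python
# Hypothetical lookup tables; the patent's feature model would replace these.
STORED_SOUNDS = {"meow": "cat nearby", "bark": "dog nearby"}
KEYWORDS = {"hello", "stop", "come"}

def process_audio(events):
    """events: list of (kind, payload) pairs after noise filtering.
    A kind matching a stored sound yields its stored semantics directly;
    kind 'speech' carries text from which keywords are extracted."""
    summary = []
    for kind, payload in events:
        if kind in STORED_SOUNDS:
            summary.append(STORED_SOUNDS[kind])      # stored semantic info
        elif kind == "speech":
            words = payload.lower().split()
            summary.extend(w for w in words if w in KEYWORDS)
    return summary  # what the summarizing submodule would forward

out = process_audio([("meow", None), ("speech", "please come here")])
```

In the real system the speech branch would be a full keyword-spotting model rather than a word-set lookup; the point is only the two routes into the summarizing submodule.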
Further, the touch module acquires tactile information as follows:
Pressure sensors are arranged on the robot's end effector; from the voltage information the robot judges whether the gripper has grasped an object and estimates the object's weight;
Touch sensors are arranged on the robot's body surface; by judging the change in the voltage pulse at each position, the robot determines whether it has been touched and where.
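The two voltage-based judgments can be sketched as simple threshold checks. The thresholds and the volts-per-newton scale below are invented for illustration; real values would come from sensor calibration:

```python
GRASP_THRESHOLD_V = 0.5   # hypothetical gripper-pressure threshold (volts)
TOUCH_PULSE_DELTA = 0.2   # hypothetical pulse-change threshold (volts)

def read_gripper(voltage, volts_per_newton=0.01):
    """Pressure sensor on the end effector: grasp detection plus a rough
    weight estimate (in newtons) derived from the measured voltage."""
    grasping = voltage > GRASP_THRESHOLD_V
    weight_n = voltage / volts_per_newton if grasping else 0.0
    return grasping, weight_n

def read_touch(pulse_deltas):
    """Touch sensors over the body surface: pulse_deltas maps a body
    position to its voltage-pulse change; return the touched positions."""
    return [pos for pos, d in pulse_deltas.items() if abs(d) > TOUCH_PULSE_DELTA]

grasping, weight = read_gripper(1.0)
touched = read_touch({"arm": 0.5, "back": 0.05})
```

The touched-position list and the grasp/weight pair are what the tactile information summarizing module would package for the scene analysis module.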
Further, the scene analysis module synthesizes, fuses, and prioritizes the information from the vision, hearing, and touch modules, generates scene semantic information with a deep neural network model, and generates the corresponding robot action, specifically:
Before processing data, the vision, hearing, and touch modules stamp the pending data with a timestamp; after the scene analysis module obtains the information, it first verifies the data by timestamp and synchronizes all data to the newest time;
After time synchronization, information with correlations among the outputs of the vision, hearing, and touch modules is fused;
The fused information is then arranged by priority and screened, in the order, from high to low: people, animals, obstacles, other objects;
After the fused information has been screened in priority order, a pre-trained deep neural network model generates the semantic information of the current scene, which is converted according to fixed rules into instructions for the robot to execute and sent to the action generation submodule.
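The timestamp-synchronization and priority-screening steps above can be sketched as one small function. The category names, the one-unit synchronization window, and the `keep` limit are assumptions for illustration only:

```python
# Priority order from the description: people > animals > obstacles > others.
PRIORITY = {"person": 0, "animal": 1, "obstacle": 2, "object": 3}

def screen(observations, keep=3):
    """observations: list of (timestamp, category, data) tuples from the
    vision, hearing, and touch modules. Synchronize to the newest
    timestamp, then order by category priority and keep the top entries."""
    if not observations:
        return []
    newest = max(t for t, _, _ in observations)
    # Drop stale data: keep only observations within one time unit of newest.
    synced = [o for o in observations if newest - o[0] <= 1]
    synced.sort(key=lambda o: PRIORITY[o[1]])
    return synced[:keep]

obs = [(10, "object", "cup"), (10, "person", "smiling"),
       (3, "animal", "cat"), (9, "obstacle", "wall")]
top = screen(obs)
```

Here the stale animal observation (timestamp 3) is discarded by synchronization, and the remainder is ordered person, obstacle, object before being handed to the fusion and semantics stages.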
The beneficial effects of the invention are:
1. The scene is recognized through multi-sensor fusion rather than through vision or speech recognition alone.
2. After obtaining the scene information, the robot can react with body movements rather than only with speech or an on-screen response.
3. Scene recognition covers the recognition of people, of objects, and of the environment, and the scene information is synthesized uniformly, enhancing the adaptability of recognition.
4. The autonomy and interactivity of the robot are enhanced.
The design, concrete structure, and technical effects of the invention are further described below with reference to the accompanying drawings, so that its purpose, features, and effects can be fully understood.
Description of the drawings
Fig. 1 is the overall structure block diagram of the present invention.
Fig. 2 is the vision module work flow diagram of the present invention.
Fig. 3 is the sense of hearing module work flow diagram of the present invention.
Fig. 4 is the tactile module work flow diagram of the present invention.
Fig. 5 is the scene analysis module work flow diagram of the present invention.
Fig. 6 is the movement output module work flow diagram of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, an intelligent robot based on scene analysis comprises a vision module, a hearing module, a touch module, a scene analysis module, and a motion output module. The vision module, hearing module, and touch module are connected to the scene analysis module, and the output of the scene analysis module is connected to the motion output module, wherein: the vision module comprises a sequentially connected image reading submodule, image processing submodule, and object detection submodule; the hearing module comprises a sequentially connected environmental noise processing submodule, sound feature detection submodule, and auditory information summarizing submodule; the touch module comprises a sequentially connected tactile sensing submodule and tactile information summarizing submodule; and the scene analysis module comprises a sequentially connected sensor data fusion submodule, priority screening submodule, and action generation submodule.
In the present embodiment, the object detection submodule of the vision module comprises a human detection module, an object detection module, an obstacle detection module, and a visual information summarizing module connected to the human detection module, the object detection module, and the obstacle detection module.
In the present embodiment, the tactile sensing submodule of the touch module comprises pressure sensors and touch sensors; the pressure sensors are arranged on the robot's end effector, and the touch sensors are arranged on the robot's body surface.
In the present embodiment, the motion output module comprises motors and position sensors that detect the motor positions.
As shown in Figs. 2-6, the motion control method for an intelligent robot based on scene analysis is as follows:
The vision module acquires image information, including: detecting information related to people, detecting information about other objects besides the human body, and detecting whether there are obstacles in front of the robot;
The hearing module acquires auditory information, including: filtering and preprocessing the detected environmental sound signal, classifying the sounds, and extracting semantic information from the sounds of each class;
The touch module acquires tactile information, including: information about objects the robot grasps, and whether the robot is touched;
The scene analysis module synthesizes the information from the vision, hearing, and touch modules, fuses it, arranges it by priority, generates scene semantic information with a deep neural network model, and generates the corresponding robot action from the scene semantics;
The motion output module distributes the robot action to the movements of the individual motors.
In the present embodiment, the vision module acquires image information as follows:
The camera captures images at a rate of 30 fps and continuously updates the image data in a buffer; the subsequent modules then run as multiple threads, each simultaneously reading the same frame from the buffer;
The human detection module mainly detects information related to people. It first detects whether there is a person in the picture; if not, it returns the result directly. If there is a person, it detects the posture and gesture of the human body, and at the same time detects whether there is a face. If there is a face, it detects information including but not limited to the facial expression, whether the face is recognized, and the gender and age of the face. The human detection module then transmits all of its information to the visual information summarizing module;
The object detection module mainly detects information about objects other than the human body. The system stores the features of common objects in advance and marks them; the object detection module scans the captured image for marked objects and, if a marked object is detected, transmits the object's identifier and its location in the picture to the visual information summarizing module;
The obstacle detection module mainly detects whether there are obstacles in front of the robot. Using image continuity information, it detects whether a suspected obstacle appears in the image and returns its position and size in the picture to the visual information summarizing module;
The visual information summarizing module receives the information input by the human detection module, the object detection module, and the obstacle detection module, sorts it into a fixed format, and transmits it to the scene analysis module for analysis.
In the present embodiment, the hearing module acquires auditory information in the following steps:
After the microphone captures the sound signal of the environment, the environmental noise processing submodule first filters and preprocesses it to remove background noise;
The sound feature detection submodule then detects speech and other stored sounds;
For stored specific sounds, the corresponding semantic information is already stored in the system and is sent directly to the auditory information summarizing submodule; for speech, the keywords contained in the speech are extracted according to the feature model of the sound feature detection submodule, and the text information is sent to the auditory information summarizing submodule;
The auditory information summarizing submodule collects all the acoustic information and sends it in a fixed format to the scene analysis module for analysis.
In the present embodiment, the touch module acquires tactile information as follows:
Pressure sensors are arranged on the robot's end effector; from the voltage information the robot judges whether the gripper has grasped an object and estimates the object's weight;
Touch sensors are arranged on the robot's body surface; by judging the change in the voltage pulse at each position, the robot determines whether it has been touched and where.
In the present embodiment, the scene analysis module synthesizes, fuses, and prioritizes the information from the vision, hearing, and touch modules, generates scene semantic information with a deep neural network model, and generates the corresponding robot action, specifically:
Before processing data, the vision, hearing, and touch modules stamp the pending data with a timestamp; after the scene analysis module obtains the information, it first verifies the data by timestamp and synchronizes all data to the newest time;
After time synchronization, information with correlations among the outputs of the vision, hearing, and touch modules is fused;
The fused information is then arranged by priority and screened, in the order, from high to low: people, animals, obstacles, other objects;
After the fused information has been screened in priority order, a pre-trained deep neural network model generates the semantic information of the current scene, which is converted according to fixed rules into instructions for the robot to execute and sent to the action generation submodule.
The principle of the invention is described in detail below:
As shown in Fig. 2, the information obtained by the vision module includes: whether there is a person in the picture, and whether the picture contains an object the robot has memorized; if there is a person, whether there is a face; if there is a face, whether it is recognized and whether it shows an emotion; and the postures of people and target objects relative to the robot.
Visual processing mainly refers to the process from capturing an image with the camera to obtaining the semantic information in the image. First, the camera captures images at 30 fps and continuously updates the image data in a buffer. The subsequent modules then run as multiple threads, each simultaneously reading the same frame from the buffer.
The human detection module mainly detects information related to people. It first detects whether there is a person in the picture; if not, it returns the result directly. If there is a person, it detects the posture and gesture of the human body, and at the same time detects whether there is a face. If there is a face, it detects information such as the facial expression, whether the face is recognized, and the gender and age of the face. It then transmits all human detection information to the visual information summarizing module.
The object detection module mainly detects information about objects other than the human body. The system stores the features of common objects in advance; the object detection module scans the image for marked objects such as pets, flowerpots, desks, or dustbins, and then transmits the identifiers of the detected objects and their locations in the picture to the visual information summarizing module.
The obstacle detection module mainly detects whether there are obstacles in front of the robot. Using image continuity information, it detects whether a suspected obstacle appears in the image and returns its position and size in the picture to the visual information summarizing module.
The visual information summarizing module receives the information input by each module, sorts it into a fixed format, and transmits it to the scene analysis module for analysis.
As shown in Fig. 3, the information obtained by the hearing module includes: whether the environment contains a sound signal the robot has memorized, whether it contains a human voice, and whether the voice carries semantic information. Auditory processing mainly refers to the process from collecting sound data with the microphone to obtaining the semantics and text in the sound.
After the microphone captures the sound signal of the environment, preprocessing operations such as filtering first remove the background noise; speech and other stored sounds (such as a cat's meow or a dog's bark) are then detected by their specific frequencies. For stored specific sounds, the corresponding semantic information is already in the system and is sent directly to the auditory information summarizing module. For speech, the keywords contained in the speech are extracted according to a feature model, and the text information is sent to the auditory information summarizing module. The auditory information summarizing module then collects all the information and sends it in a fixed format to the scene analysis module for analysis.
As shown in Fig. 4, the information obtained by the touch module includes whether the robot has been touched, and the number and times of touches. Tactile processing refers to deriving, from the data of the pressure sensors and touch sensors, information about the objects in direct contact with the robot.
Pressure sensors are arranged on the robot's end effector, i.e. the robot hand, and are used to judge from the voltage information whether the gripper has grasped an object and to estimate the object's weight.
Touch sensors are arranged on the robot's body surface; by judging the change in the voltage pulse at each position, the robot determines whether it has been touched and where.
The tactile information summarizing module collects the grasp information and touch information and sends them in a fixed format to the scene analysis module for analysis.
As shown in Fig. 5, the role of the scene analysis module is to integrate the information of each module, prioritize and fuse it, judge the current scene with an intelligent algorithm, and generate the corresponding action according to the scene.
Scene analysis mainly refers to the process of summarizing the information obtained by vision, hearing, and touch, fusing and screening it, and finally producing the semantic description that best fits the current scene.
Each module processes data at a different speed, so the visual, auditory, and tactile information arriving at the scene analysis module is not synchronized. To solve this problem, each module stamps the pending data with a timestamp before processing it. After the scene analysis module obtains the information, it first verifies the data by timestamp and synchronizes all data to the newest time.
Since the information from the various modules can be correlated, the information must be fused. For example, if a happy expression is detected in the picture while laughter is heard at the same time, the two pieces of information are fused into a single happy mood.
Meanwhile, the modules produce far more information than can be processed in real time, so the information must be prioritized. After the module information is synthesized, it can be divided into four classes: people, animals, obstacles, and other objects; information is then selected according to the priority people > animals > obstacles > other objects.
After the information has been screened by priority, a pre-trained deep neural network model generates a piece of information in subject + predicate (+ object) form, such as "a person touches the robot", "a person laughs", or "an obstacle appears". The semantic information of the current scene is then converted according to fixed rules into the instructions the robot needs to execute and sent to the execution module.
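The rule-based conversion from subject + predicate (+ object) semantics to an executable instruction can be sketched as a lookup table. The rules and action names below are invented for illustration; the patent only specifies that the mapping is fixed:

```python
# Hypothetical rule table: scene semantics -> action instruction.
RULES = {
    ("person", "laughs"): "wave_arm",
    ("person", "touches", "robot"): "turn_head",
    ("obstacle", "appears"): "stop_and_replan",
}

def semantics_to_instruction(semantic):
    """semantic: subject+predicate(+object) sequence produced by the scene
    model; fall back to an idle instruction when no rule matches."""
    return RULES.get(tuple(semantic), "idle")

cmd = semantics_to_instruction(["person", "laughs"])
```

A plain dictionary keeps the semantics-to-action mapping inspectable and easy to extend, which fits the "fixed rules" the description calls for.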
As shown in Fig. 6, the motion output mainly transmits the robot action instructions generated by the other modules to the motion controller; the motion controller generates the electrical parameters needed by the motors and passes them to the actuating motors, which drive the actuators to complete the corresponding robot action.
To let the robot move precisely, each motor carries a position detection sensor that collects the robot's real-time posture information. Since the different parts of the robot are coupled, to ensure that the robot's motion matches expectations, the motion controller must also output the order and timing in which each actuating motor works.
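The distribution of one action instruction into ordered per-motor commands can be sketched as below. The action schedules, motor names, and target values are invented for illustration; only the need for an execution order comes from the description:

```python
# Hypothetical per-action schedules: (motor, target_position_deg, order).
ACTIONS = {
    "wave_arm": [("elbow", 45, 1), ("shoulder", 30, 0)],
    "stop_and_replan": [("left_wheel", 0, 0), ("right_wheel", 0, 0)],
}

def dispatch(action):
    """Return the motor commands for an action sorted by execution order,
    so coupled joints move in the sequence the controller prescribes."""
    return sorted(ACTIONS[action], key=lambda cmd: cmd[2])

plan = dispatch("wave_arm")
```

In a real controller each command would also carry the electrical parameters and be checked against the position sensors' feedback; the sketch keeps only the ordering logic.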
In summary, the present invention has the following advantages:
1. The scene is recognized through multi-sensor fusion rather than through vision or speech recognition alone.
2. After obtaining the scene information, the robot can react with body movements rather than only with speech or an on-screen response.
3. Scene recognition covers the recognition of people, of objects, and of the environment, and the scene information is synthesized uniformly, enhancing the adaptability of recognition.
4. The autonomy and interactivity of the robot are enhanced.
The preferred embodiments of the present invention have been described in detail above. It should be appreciated that those skilled in the art can make many modifications and variations according to the concept of the invention without creative work. Therefore, any technical solution that those skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of the invention shall fall within the scope of protection defined by the claims.
Claims (9)
1. a kind of intelligent robot based on scene analysis, which is characterized in that including robot vision module, sense of hearing module, touch
Feel module, scene analysis module, movement output module, the vision module, sense of hearing module, tactile module with scene analysis mould
Block connects, and the scene analysis module output end is connect with movement output module, wherein:Vision module includes sequentially connected figure
As reading submodule, image procossing submodule, object detection sub-module;Sense of hearing module includes at sequentially connected environmental noise
Manage submodule, current signature detection submodule, auditory information collects submodule;Tactile module includes sequentially connected tactile sensing
Submodule, tactile data summarizing module;Scene analysis module includes sequentially connected sensor data fusion submodule, priority
Screen submodule, action generates submodule.
2. a kind of intelligent robot based on scene analysis as described in claim 1, it is characterised in that:The vision module
Object detection sub-module include human detection module, object detection module, detection of obstacles module and with human testing mould
Block, object detection module, the visual information summarizing module of detection of obstacles module connection.
3. The intelligent robot based on scene analysis according to claim 1, wherein the tactile-sensing submodule of the tactile module comprises pressure sensors and touch sensors, the pressure sensors being arranged on the end effector of the robot and the touch sensors being arranged on the surface of the robot body.
4. The intelligent robot based on scene analysis according to claim 1, wherein the motion output module comprises motors and position sensors that detect the motor positions.
5. A motion control method for an intelligent robot based on scene analysis, characterized by comprising:
a vision module acquires image information, including: detecting information related to people, detecting information on objects other than the human body, and detecting whether there is an obstacle in front of the robot;
an auditory module acquires auditory information, including: filtering and preprocessing the detected ambient sound signal, detecting the sound category, and extracting semantic information from sounds of different categories;
a tactile module acquires tactile information, including: information on objects grasped by the robot and information on whether the robot has been touched;
a scene analysis module integrates the information from the vision module, the auditory module, and the tactile module, fuses it, arranges it by priority, generates scene semantic information through a deep neural network model, and generates corresponding robot actions according to the scene semantic information;
a motion output module distributes the robot action to the motion of each motor.
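As a hedged illustration of the claim-5 control flow (not the claimed implementation), the five stages could be stubbed end to end as below; every function, return value, and rule is an illustrative stand-in, and the deep-network stage is replaced by a trivial rule.

```python
# Hypothetical end-to-end control loop mirroring the claim-5 method.
def vision_step():
    # Detect people, other objects, and frontal obstacles (stubbed here).
    return {"person": True, "objects": ["cup"], "obstacle": False}

def auditory_step():
    # Filter the ambient signal, classify the sound, extract semantics (stubbed).
    return {"class": "speech", "keywords": ["come", "here"]}

def tactile_step():
    # Report grasp state and touch events (stubbed).
    return {"grasping": False, "touched": False}

def scene_analysis(vision, auditory, tactile):
    # Fuse, prioritize, and map to an action; a deep network would sit here.
    if vision["obstacle"]:
        return "stop"
    if "come" in auditory.get("keywords", []):
        return "approach_person"
    return "idle"

def motion_output(action):
    # Distribute the chosen action to per-motor commands.
    if action == "approach_person":
        return {"left_motor": 1.0, "right_motor": 1.0}
    return {"left_motor": 0.0, "right_motor": 0.0}

action = scene_analysis(vision_step(), auditory_step(), tactile_step())
motor_cmds = motion_output(action)
```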
6. The motion control method for an intelligent robot based on scene analysis according to claim 5, wherein the vision module acquires image information specifically by:
the camera captures images at a rate of 30 fps and continuously updates the image data in a buffer; the subsequent modules, running as multiple threads, then access the same frame of the buffer concurrently;
the human-detection module mainly detects information related to people: it first detects whether there is a person in the picture; if not, it returns the result directly; if there is, it detects the posture and gesture information of the human body; it simultaneously detects whether a face is present and, if so, detects information including but not limited to the facial expression, whether the face is recognized, and the gender and age of the face; all information from the human-detection module is then transmitted to the visual-information-summarizing module;
the object-detection module mainly detects information on objects other than the human body; the system pre-stores and labels the features of common objects, and the object-detection module scans the captured image for labeled objects; if a labeled object is detected, the number of the detected object and its position in the picture are transmitted to the visual-information-summarizing module;
the obstacle-detection module mainly detects whether there is an obstacle in front of the robot; based on image-continuity information, it detects whether there is a suspected obstacle in the image and returns the obstacle's position and size in the image to the visual-information-summarizing module;
the visual-information-summarizing module receives the information input by the human-detection module, the object-detection module, and the obstacle-detection module, sorts it into a fixed format, and transmits it to the scene analysis module for analysis.
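For illustration only, the shared frame buffer of claim 6 (one capture thread overwriting the latest frame at roughly 30 fps while several detection threads read the same frame) could be sketched as follows; the `FrameBuffer` class, frame format, and detector names are all hypothetical.

```python
# Hypothetical sketch of the claim-6 shared frame buffer with concurrent readers.
import threading
import time

class FrameBuffer:
    """Holds only the most recent frame; readers always see the latest one."""
    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def update(self, frame):
        with self._lock:
            self._frame = frame

    def latest(self):
        with self._lock:
            return self._frame

buf = FrameBuffer()

def capture(n_frames):
    # Stand-in for a 30 fps camera loop overwriting the buffer.
    for i in range(n_frames):
        buf.update({"id": i})
        time.sleep(1 / 30)

results = []
def detector(name):
    # Each detection module reads the same latest frame from the buffer.
    frame = buf.latest()
    results.append((name, frame["id"]))

cap = threading.Thread(target=capture, args=(3,))
cap.start()
cap.join()
threads = [threading.Thread(target=detector, args=(n,))
           for n in ("human", "object", "obstacle")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```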
7. The motion control method for an intelligent robot based on scene analysis according to claim 5, wherein the auditory module acquires auditory information by the following steps:
after the microphone captures the ambient sound signal, the environmental-noise-processing submodule first performs a filtering preprocessing operation to filter out background noise;
the sound-feature-detection submodule then detects speech and other stored sounds;
for stored specific sounds, the corresponding semantic information is stored in the system and sent directly to the auditory-information-summarizing submodule; for speech, the keywords contained in the speech are extracted according to the feature model of the sound-feature-detection submodule, and the text information is sent to the auditory-information-summarizing submodule;
the auditory-information-summarizing submodule aggregates all acoustic information and sends it to the scene analysis module in a fixed format for analysis.
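As a hedged sketch of the claim-7 auditory pipeline (filter, classify, then extract stored semantics or keywords), one might write the following; the amplitude-threshold filter, the stored-sound table, and the keyword set are all illustrative assumptions, not the claimed feature model.

```python
# Hypothetical sketch of the claim-7 auditory pipeline.
KNOWN_SOUNDS = {"doorbell": "someone at the door"}   # pre-stored semantics
KEYWORDS = {"hello", "stop", "come"}                 # assumed keyword set

def noise_filter(events, threshold=0.1):
    # Crude preprocessing stand-in: drop low-amplitude (noise-like) events.
    return [e for e in events if abs(e[1]) >= threshold]

def summarize(events):
    # Route each surviving event to stored semantics or keyword extraction.
    out = []
    for label, _amp, text in noise_filter(events):
        if label in KNOWN_SOUNDS:
            out.append(("semantic", KNOWN_SOUNDS[label]))     # stored sound
        elif label == "speech":
            words = [w for w in text.split() if w in KEYWORDS]
            out.append(("text", words))                       # extracted keywords
    return out

info = summarize([
    ("hiss", 0.05, ""),                    # filtered out as background noise
    ("doorbell", 0.9, ""),                 # maps to stored semantic info
    ("speech", 0.8, "please come here"),   # keywords extracted from speech
])
```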
8. The motion control method for an intelligent robot based on scene analysis according to claim 5, wherein the tactile module acquires tactile information specifically by:
pressure sensors arranged on the end effector of the robot determine, from voltage information, whether the gripper has grasped an object and the weight of the object;
touch sensors arranged on the surface of the robot body determine, from changes in the voltage pulses at each location, whether the robot has been touched and the position of the touch.
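For illustration only, the voltage-to-tactile-information mapping of claim 8 could look like the sketch below; the thresholds and the linear pressure calibration are invented constants, not values from the patent.

```python
# Hypothetical mapping from sensor voltages to tactile information (claim 8).
GRASP_THRESHOLD_V = 0.2   # assumed: below this voltage the gripper holds nothing
VOLTS_PER_KG = 0.5        # assumed linear calibration of the pressure sensor

def gripper_state(pressure_v):
    # Infer grasp state and object weight from the pressure-sensor voltage.
    grasping = pressure_v > GRASP_THRESHOLD_V
    weight_kg = (pressure_v - GRASP_THRESHOLD_V) / VOLTS_PER_KG if grasping else 0.0
    return {"grasping": grasping, "weight_kg": round(weight_kg, 3)}

def touch_events(pulse_deltas, min_delta=0.05):
    # pulse_deltas maps a body location to its change in voltage pulse;
    # locations whose change exceeds the threshold count as touched.
    return [loc for loc, dv in pulse_deltas.items() if abs(dv) >= min_delta]

grip = gripper_state(0.7)
touched = touch_events({"left_arm": 0.08, "torso": 0.01})
```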
9. The motion control method for an intelligent robot based on scene analysis according to claim 5, wherein the scene analysis module integrates the information from the vision module, the auditory module, and the tactile module, fuses it, arranges it by priority, generates scene semantic information through a deep neural network model, and generates corresponding robot actions according to the scene semantic information, specifically:
before processing data, the vision module, the auditory module, and the tactile module timestamp the pending data; after obtaining the information, the scene analysis module first verifies the data by timestamp and synchronizes all data to the latest time;
after time synchronization, correlated information detected and processed by the vision module, the auditory module, and the tactile module is fused;
the fused information is arranged by priority and screened in priority order from high to low: people, animals, obstacles, other objects;
after the fused information has been screened in priority order, a pre-trained deep neural network model generates the current scene semantic information, which is then converted into robot action-execution instructions according to fixed rules and sent to the action-generation submodule.
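As a hedged sketch of claim 9's timestamp synchronization and priority screening (not the claimed model), the two steps could be written as below; the tolerance window, the priority table, and the rule that stands in for the deep network are all illustrative assumptions.

```python
# Hypothetical sketch of claim-9 timestamp synchronization and priority screening.
PRIORITY = {"person": 0, "animal": 1, "obstacle": 2, "object": 3}

def synchronize(readings, tolerance=0.1):
    # Keep only readings within `tolerance` seconds of the newest timestamp,
    # i.e. align all data to the latest time and drop stale entries.
    latest = max(t for t, _ in readings)
    return [item for t, item in readings if latest - t <= tolerance]

def screen(items):
    # Order fused detections by the fixed priority:
    # person > animal > obstacle > other objects.
    return sorted(items, key=lambda it: PRIORITY.get(it["kind"], 99))

readings = [
    (10.00, {"kind": "object", "name": "cup"}),
    (10.05, {"kind": "person", "name": "user"}),
    (9.50,  {"kind": "obstacle", "name": "stale"}),   # too old, dropped
]

current = screen(synchronize(readings))
# A pre-trained deep model would map `current` to scene semantics; a rule stub:
action = "greet" if current and current[0]["kind"] == "person" else "idle"
```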
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810210328.4A CN108406848A (en) | 2018-03-14 | 2018-03-14 | A kind of intelligent robot and its motion control method based on scene analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108406848A true CN108406848A (en) | 2018-08-17 |
Family
ID=63131508
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810210328.4A Pending CN108406848A (en) | 2018-03-14 | 2018-03-14 | A kind of intelligent robot and its motion control method based on scene analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108406848A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060126918A1 (en) * | 2004-12-14 | 2006-06-15 | Honda Motor Co., Ltd. | Target object detection apparatus and robot provided with the same |
CN105118518A (en) * | 2015-07-15 | 2015-12-02 | 百度在线网络技术(北京)有限公司 | Sound semantic analysis method and device |
CN105598972A (en) * | 2016-02-04 | 2016-05-25 | 北京光年无限科技有限公司 | Robot system and interactive method |
CN105700438A (en) * | 2016-03-18 | 2016-06-22 | 北京光年无限科技有限公司 | Electronic control system for multi-joint small robot |
CN105912725A (en) * | 2016-05-12 | 2016-08-31 | 上海劲牛信息技术有限公司 | System for calling vast intelligence applications through natural language interaction |
CN106529375A (en) * | 2015-09-11 | 2017-03-22 | 上海乐今通信技术有限公司 | Mobile terminal and object feature identification method for image of mobile terminal |
CN106570491A (en) * | 2016-11-11 | 2017-04-19 | 华南智能机器人创新研究院 | Robot intelligent interaction method and intelligent robot |
CN106821694A (en) * | 2017-01-18 | 2017-06-13 | 西南大学 | A kind of mobile blind guiding system based on smart mobile phone |
CN106997236A (en) * | 2016-01-25 | 2017-08-01 | 亮风台(上海)信息科技有限公司 | Based on the multi-modal method and apparatus for inputting and interacting |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109605403A (en) * | 2019-01-25 | 2019-04-12 | 北京番茄时代科技有限公司 | Robot and its operating system, control device, control method and storage medium |
CN109605403B (en) * | 2019-01-25 | 2020-12-11 | 北京妙趣伙伴科技有限公司 | Robot, robot operating system, robot control device, robot control method, and storage medium |
CN111357011A (en) * | 2019-01-31 | 2020-06-30 | 深圳市大疆创新科技有限公司 | Environment sensing method and device, control method and device and vehicle |
CN111357011B (en) * | 2019-01-31 | 2024-04-30 | 深圳市大疆创新科技有限公司 | Environment sensing method and device, control method and device and vehicle |
CN109741747A (en) * | 2019-02-19 | 2019-05-10 | 珠海格力电器股份有限公司 | Voice scene recognition method and device, sound control method and equipment, air-conditioning |
CN111723598A (en) * | 2019-03-18 | 2020-09-29 | 北京邦天信息技术有限公司 | Machine vision system and implementation method thereof |
CN112825014A (en) * | 2019-11-21 | 2021-05-21 | 王炼 | Artificial intelligence brain |
CN110861853A (en) * | 2019-11-29 | 2020-03-06 | 三峡大学 | Intelligent garbage classification method combining vision and touch |
CN111098307A (en) * | 2019-12-31 | 2020-05-05 | 航天信息股份有限公司 | Intelligent patrol robot |
CN111604899A (en) * | 2020-05-15 | 2020-09-01 | 深圳国信泰富科技有限公司 | Data transmission system of high intelligent robot |
CN111618856B (en) * | 2020-05-27 | 2021-11-05 | 山东交通学院 | Robot control method and system based on visual excitation points and robot |
CN111618856A (en) * | 2020-05-27 | 2020-09-04 | 山东交通学院 | Robot control method and system based on visual excitation points and robot |
WO2022111443A1 (en) * | 2020-11-26 | 2022-06-02 | 苏州中科先进技术研究院有限公司 | Intelligent blind assisting system and method, computer device, and storage medium |
CN112578909A (en) * | 2020-12-15 | 2021-03-30 | 北京百度网讯科技有限公司 | Equipment interaction method and device |
CN112578909B (en) * | 2020-12-15 | 2024-05-31 | 北京百度网讯科技有限公司 | Method and device for equipment interaction |
CN113082268A (en) * | 2021-03-12 | 2021-07-09 | 浙江创力电子股份有限公司 | Handheld sterilizer of networking based on 4G |
CN113867163A (en) * | 2021-10-09 | 2021-12-31 | 深圳康佳电子科技有限公司 | Intelligent household scene switching method and device, intelligent terminal and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108406848A (en) | A kind of intelligent robot and its motion control method based on scene analysis | |
EP2877909B1 (en) | Multimodal interaction with near-to-eye display | |
CN105739688A (en) | Man-machine interaction method and device based on emotion system, and man-machine interaction system | |
CN104956292A (en) | Interaction of multiple perceptual sensing inputs | |
CN101825947A (en) | Method and device for intelligently controlling mobile terminal and mobile terminal thereof | |
US11825278B2 (en) | Device and method for auto audio and video focusing | |
CN107944434A (en) | A kind of alarm method and terminal based on rotating camera | |
Tsai et al. | Spatial temporal variation graph convolutional networks (STV-GCN) for skeleton-based emotional action recognition | |
Wachs et al. | Real-time hand gesture telerobotic system using fuzzy c-means clustering | |
KR20160072621A (en) | Artificial intelligence robot service system | |
CN114821753B (en) | Eye movement interaction system based on visual image information | |
CN113377193A (en) | Vending machine interaction method and system based on reliable gesture recognition | |
CN105867595A (en) | Human-machine interaction mode combing voice information with gesture information and implementation device thereof | |
CN114255508A (en) | OpenPose-based student posture detection analysis and efficiency evaluation method | |
Joslin et al. | Dynamic gesture recognition | |
Kheratkar et al. | Gesture controlled home automation using CNN | |
Sisodia et al. | Image pixel intensity and artificial neural network based method for pattern recognition | |
Dhamanskar et al. | Human computer interaction using hand gestures and voice | |
Srisuphab et al. | Artificial neural networks for gesture classification with inertial motion sensing armbands | |
Simão et al. | Unsupervised gesture segmentation of a real-time data stream in MATLAB | |
Sawaragi et al. | Self-reflective segmentation of human bodily motions using recurrent neural networks | |
CN113894779A (en) | Multi-mode data processing method applied to robot interaction | |
CN113807280A (en) | Kinect-based virtual ship cabin system and method | |
Chanhan et al. | Gestures based wireless robotic control using image processing | |
CN112308041A (en) | Unmanned platform gesture control method based on vision |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180817 |