CN108491767A - Autonomous rolling response method and system based on online video perception, and manipulator - Google Patents

Autonomous rolling response method and system based on online video perception, and manipulator

Info

Publication number
CN108491767A
CN108491767A (application CN201810182168.7A)
Authority
CN
China
Prior art keywords
manipulator
action
video
sequence
anticipation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810182168.7A
Other languages
Chinese (zh)
Other versions
CN108491767B (en)
Inventor
蔡颖鹏
陈希
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Time Robot Technology Co Ltd
Original Assignee
Beijing Time Robot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Time Robot Technology Co Ltd
Priority to CN201810182168.7A
Publication of CN108491767A
Application granted
Publication of CN108491767B
Active legal status (Current)
Anticipated expiration legal status

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1628: Programme controls characterised by the control loop
    • B25J9/163: Programme controls characterised by the control loop: learning, adaptive, model based, rule based expert control
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107: Static hand or arm

Abstract

The present invention provides an autonomous rolling response method and system based on online video perception. The method and system are applied to a manipulator and specifically: obtain a video action sequence of a local part of the user; input the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator reach the target posture responding to the predicted action. This scheme focuses on fine-grained recognition of the action of the user's local part during the interaction process, performing dynamic rolling prediction while the user's local-part action unfolds from start to finish, and accordingly making targeted dynamic adjustments to the manipulator's action, ultimately producing a fast, smooth manipulator response and thereby improving the interaction success rate of the whole interaction process.

Description

Autonomous rolling response method and system based on online video perception, and manipulator
Technical field
The present invention relates to the field of robotics, and in particular to an autonomous rolling response method and system based on online video perception, and a manipulator.
Background technology
A manipulator has high degrees of freedom and a high action-execution speed, can effectively perform various bionic gestures when interacting with people, and can execute predetermined actions; it is an effective carrier for performing various actions. With the continuous reduction of manipulator production cost and the continuous improvement of execution efficiency, the depth and range of manipulator applications keep expanding.
In human-manipulator interaction, one important and common scenario is that the manipulator performs targeted gesture changes according to different human hand actions, forming interesting human-computer interaction and even completing specific interactive tasks, for example a finger-guessing (rock-paper-scissors) match between a person and a manipulator, in which the manipulator dynamically adjusts its own posture while perceiving the human gesture, forming a fast, natural response action. Such scenarios can be deployed and popularized in places such as amusement parks, shopping malls and science museums, enriching people's cultural life and driving the development and transformation of the entertainment and popular-science-education industries.
The above human-computer interaction places high requirements on the autonomous perception capability and rapid-response capability of the manipulator system.
The inventors of the present application found that, for a manipulator following up on externally perceived gesture actions, the current method performs one-off recognition of the human hand action and executes the corresponding action according to the recognition result. The whole process is static and one-shot, with no opportunity for dynamic adjustment once an error occurs, so the interaction success rate between the manipulator and the user is low.
Summary of the invention
In view of this, the present invention provides an autonomous rolling response method, system and manipulator based on online video perception, for improving the interaction success rate between the manipulator and the user.
An autonomous rolling response method based on online video perception, applied to a manipulator, the autonomous rolling response method including the steps of:
during the user's entire action process, obtaining a video action sequence of the user's local part;
inputting the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator perform rolling adjustment of its own action so as to reach the target posture responding to the predicted action.
Optionally, obtaining the video action sequence of the user's local part includes:
obtaining the video action sequence of the local part using a two-dimensional color camera or a grayscale camera.
Optionally, the method further includes the steps of:
adjusting the current posture of the manipulator according to the prediction result;
when the confidence of the prediction result reaches a preset confidence threshold, adjusting the current posture to the target posture.
Optionally, the prediction model is obtained through the following steps:
collecting video sequences of a variety of local parts;
processing the video sequences by a preset method to obtain multiple pieces of training data;
training a preset function by a supervised learning method using the multiple pieces of training data, to obtain the prediction model.
An autonomous rolling response system based on online video perception, applied to a manipulator, the autonomous rolling response system including:
a data acquisition module, configured to obtain, during the user's entire action process, the video action sequence of the user's local part;
an action prediction module, configured to input the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator perform rolling adjustment of its own action so as to reach the target posture responding to the predicted action.
Optionally, the data acquisition module specifically obtains the video action sequence of the local part using a two-dimensional color camera or a grayscale camera.
Optionally, the system further includes:
a first adjustment module, configured to adjust the current posture of the manipulator according to the prediction result;
a second adjustment module, configured to adjust the current posture to the target posture when the confidence of the prediction result reaches a preset confidence threshold.
Optionally, the system further includes a model training module for training the prediction model, the model training module including:
a sequence collection unit, configured to collect video sequences of a variety of local parts;
a data processing unit, configured to process the video sequences by a preset method to obtain multiple pieces of training data;
a function training unit, configured to train a preset function by a supervised learning method using the multiple pieces of training data, to obtain the prediction model.
A manipulator, characterized by including the autonomous rolling response system described above.
As can be seen from the above technical solution, the present invention provides an autonomous rolling response method and system based on online video perception. The method and system are applied to a manipulator and specifically: obtain the video action sequence of the user's local part; input the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator reach the target posture responding to the predicted action. This scheme focuses on fine-grained recognition of the action of the user's local part during human-computer interaction, performing dynamic rolling prediction while the user's local-part action unfolds from start to finish, and accordingly making targeted dynamic adjustments to the manipulator's action, ultimately producing a fast, smooth manipulator response and thereby improving the interaction success rate of the whole interaction process.
Description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a step flowchart of an autonomous rolling response method based on online video perception provided by an embodiment of the present application;
Fig. 2 is a step flowchart of another autonomous rolling response method based on online video perception provided by an embodiment of the present application;
Fig. 3 is a step flowchart of a training method for the prediction model provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of an autonomous rolling response system based on online video perception provided by an embodiment of the present application;
Fig. 5 is a structural block diagram of another autonomous rolling response system based on online video perception provided by an embodiment of the present application;
Fig. 6 is a structural block diagram of yet another autonomous rolling response system based on online video perception provided by an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Embodiment one
Fig. 1 is a step flowchart of an autonomous rolling response method based on online video perception provided by an embodiment of the present application.
The autonomous rolling response method based on online video perception provided in this embodiment is applied to a manipulator, specifically to the control device of the manipulator, so that the control device, in the course of controlling the manipulator, achieves the purpose of improving the interaction success rate between the manipulator and the user. As shown in Fig. 1, the autonomous rolling response method of this embodiment specifically includes the following steps:
S101: Obtain the video action sequence of the user's local part.
Since the purpose of this scheme is to make the manipulator act in response to the user's local part, especially the hand, the action sequence of the user's local part, especially the hand, is obtained first; specifically, the video action sequence of the hand is obtained.
In this embodiment, the video action sequence of the user's local part is obtained using a two-dimensional color camera or a grayscale camera. Compared with prior-art schemes that capture the three-dimensional posture of the human hand with a depth camera, this reduces cost and avoids the problem that a depth camera, because of its relatively low image acquisition rate, cannot quickly capture fast gestures.
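As an illustrative sketch of how this per-frame acquisition might maintain a growing video action sequence (all names are hypothetical and not from the patent; in practice frames would come from a camera, e.g. `cv2.VideoCapture`, rather than synthetic arrays):

```python
from collections import deque
import numpy as np

class RollingFrameBuffer:
    """Keeps the current frame plus a bounded history of preceding
    grayscale frames, forming the video action sequence that is fed
    to the prediction model each time a new frame arrives."""

    def __init__(self, max_frames=30):
        self.frames = deque(maxlen=max_frames)

    def push(self, frame):
        # Any HxW uint8 array stands in for one grayscale camera frame.
        self.frames.append(np.asarray(frame, dtype=np.uint8))

    def sequence(self):
        # Oldest-to-newest stack of shape (T, H, W); current frame is last.
        return np.stack(self.frames, axis=0)

buf = RollingFrameBuffer(max_frames=4)
for t in range(6):  # simulate 6 incoming camera frames
    buf.push(np.full((8, 8), t))
seq = buf.sequence()
print(seq.shape, seq[-1, 0, 0])  # (4, 8, 8) 5
```

The bounded deque reflects the rolling nature of the scheme: the model is re-queried on every new frame with the most recent window of observations.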
S102: Predict the action of the local part according to the video action sequence.
After the above video action sequence is obtained, a pre-trained prediction model is used to predict the action of the user's local part, specifically the user's hand: the video action sequence is input into the prediction model, and the prediction model computes on it to obtain the corresponding prediction result.
Specifically, in this embodiment, because the user's hand is continuously changing, the overall hand action can be predicted even when only a partial hand action has been observed so far; that is, the final posture the hand will assume after completing the action is predicted, e.g. whether it is "scissors", "rock" or "cloth".
The prediction result, i.e. the final posture of the hand action, is used to enable the manipulator to reach the target posture responding to that prediction result; the target posture is the posture the manipulator executes in response to the final posture of the user's hand action.
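The idea of predicting the complete gesture from a partial observation can be sketched with a toy softmax scorer; this stands in for the trained model, and the evidence values below are invented purely for illustration:

```python
import numpy as np

GESTURES = ("rock", "scissors", "cloth")

def rolling_predict(partial_scores):
    """Toy stand-in for the pre-trained prediction model: given the
    accumulated per-class evidence from the frames observed so far,
    return the current best guess and softmax probabilities."""
    z = np.asarray(partial_scores, dtype=float)
    p = np.exp(z - z.max())  # numerically stable softmax
    p /= p.sum()
    best = GESTURES[int(p.argmax())]
    return best, p

# Evidence for "scissors" grows as more of the hand action is observed.
for evidence in ([0.1, 0.2, 0.1], [0.1, 0.9, 0.2], [0.1, 2.5, 0.3]):
    best, p = rolling_predict(evidence)
    print(best, round(float(p.max()), 2))
```

The point of the sketch is the behaviour, not the numbers: the same query is repeated as frames accumulate, and confidence in the final gesture sharpens over time.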
As can be seen from the above technical solution, this embodiment provides an autonomous rolling response method based on online video perception. The method is applied to a manipulator and specifically: obtains the video action sequence of the user's local part; inputs the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator reach the target posture responding to the predicted action. This scheme focuses on fine-grained recognition of the user's local-part action during the interaction process, performing dynamic rolling prediction while the user's local-part action unfolds from start to finish, and accordingly making targeted dynamic adjustments to the manipulator's action, ultimately producing a fast, smooth manipulator response and improving the interaction success rate of the whole interaction process.
In this embodiment, the method further includes the following steps, as shown in Fig. 2:
S103: Adjust the current posture of the manipulator according to the prediction result.
After the prediction result is obtained, based on the current prediction result and on the confidence of the prediction for each possible gesture option, the target posture of the manipulator is optimally decided, and the current manipulator posture is correspondingly adjusted to converge toward that target posture; that is, the current posture is adjusted according to the prediction result so as to approach the target posture.
It should be noted that the current target posture is usually an intermediate posture in the process of the manipulator's complete standard action (for example, slightly "bending" two fingers is an intermediate posture of showing "scissors"). This approach both improves the manipulator's response speed to the human hand action, since it responds immediately while the human is midway through the complete action, and ensures that the manipulator does not rashly execute its entire planned action in one go, avoiding wrong response actions caused by observing an incomplete human action.
In this embodiment, the determination mechanism for the above target posture is realized based on a pre-built rule table mapping the maximum-confidence action prediction result to the response posture, as shown in Table 1.

Actual online posture of human hand | Maximum-confidence prediction result | Manipulator response target posture
Clenched fist                       | Rock                                 | Five fingers slightly open
Five fingers slightly open          | Cloth                                | Two fingers slightly open
Two fingers slightly open           | Scissors                             | Clenched fist

Table 1
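The mapping in Table 1 can be sketched as a simple lookup; this is an illustrative dictionary, not the patent's implementation, with gesture names taken from the table. Note that each target is an intermediate posture on the way to the winning counter-gesture, not the finished gesture itself:

```python
# Maximum-confidence prediction -> manipulator response target posture,
# mirroring Table 1. Each target is partway toward the counter-gesture
# that beats the predicted gesture.
RESPONSE_TABLE = {
    "rock": "five fingers slightly open",   # moving toward "cloth"
    "cloth": "two fingers slightly open",   # moving toward "scissors"
    "scissors": "clenched fist",            # moving toward "rock"
}

def response_target(predicted_gesture):
    """Return the intermediate target posture for a predicted gesture."""
    return RESPONSE_TABLE[predicted_gesture]

print(response_target("cloth"))  # two fingers slightly open
```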
S104: When the prediction result reaches the confidence threshold, adjust the current posture to the target posture.
As observed video frames accumulate, the prediction result gradually converges to the final action. At the moment when the confidence of the prediction for a human hand action becomes sufficiently high, or when a complete human hand action is about to finish, the final corresponding action is determined and executed completely.
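A minimal sketch of this threshold-commit behaviour, with invented confidence values; a real controller would also command the intermediate posture adjustments of S103 on each below-threshold step:

```python
def rolling_response(predictions, threshold=0.9):
    """Given a stream of (gesture, confidence) predictions that sharpen
    as more frames are observed, commit to the full response gesture
    once confidence reaches the threshold. Returns the committed
    gesture and the step index at which commitment happened."""
    for step, (gesture, confidence) in enumerate(predictions):
        if confidence >= threshold:
            return gesture, step  # execute the complete response action
        # below threshold: only nudge the posture toward the target
    return None, len(predictions)  # stream ended without commitment

stream = [("scissors", 0.4), ("scissors", 0.7), ("scissors", 0.93)]
print(rolling_response(stream))  # ('scissors', 2)
```

The design choice mirrored here is that commitment is irreversible once the threshold is crossed, while everything before that point remains a revisable intermediate posture.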
The process of online rolling prediction of the above human hand action, and of dynamically adjusting the manipulator posture according to the prediction result, is shown in Table 2.
Table 2
The prediction model in this embodiment is trained through the following steps, as shown in Fig. 3:
S5001: Collect video sequences of a variety of actions.
Specifically, video sequences of multiple classes of hand instruction actions are collected; for each action class, multiple copies of video sequence data are collected and saved under different external shooting environments and with different action performers. In this embodiment, the action classes include three action sequences: "rock", "scissors" and "cloth". Each action sequence contains intermediate partial sub-actions, from the initial clenched-fist posture, to arm swinging, to slightly spreading the fingers, to certain fingers opening, ultimately forming a complete action sequence.
S5002: Obtain multiple pieces of training data by processing the video sequences.
The training video sequences collected in the above step are further augmented by means such as affine transformation, contrast stretching and hand skin-color transformation, expanding the training data to obtain multiple pieces of training data and form the corresponding training data set, thereby improving the generality of the training set.
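A sketch, under stated assumptions, of the kind of augmentation step described above: a percentile contrast stretch and a horizontal flip stand in for the full affine/skin-color pipeline (which in practice might use something like OpenCV's `warpAffine`; everything here is illustrative):

```python
import numpy as np

def contrast_stretch(frame, lo=2, hi=98):
    """Percentile-based contrast stretching of one grayscale frame,
    one of the augmentations used to expand the training set."""
    a, b = np.percentile(frame, [lo, hi])
    out = (frame.astype(float) - a) / max(b - a, 1e-6)
    return np.clip(out, 0.0, 1.0)

def augment_sequence(seq, rng):
    """Produce one augmented copy of a (T, H, W) training sequence.
    A fuller implementation would also apply affine warps and
    skin-color shifts; a random flip stands in for those steps."""
    out = np.stack([contrast_stretch(f) for f in seq])
    if rng.random() < 0.5:
        out = out[:, :, ::-1]  # horizontal flip of every frame
    return out

rng = np.random.default_rng(0)
seq = rng.integers(0, 256, size=(5, 8, 8)).astype(np.uint8)
aug = augment_sequence(seq, rng)
print(aug.shape)  # (5, 8, 8)
```

The augmentation operates on whole sequences rather than single frames, so that every frame of one training sample receives the same transform, preserving temporal consistency.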
S5003: Train with the multiple pieces of training data to obtain the prediction model.
After the corresponding multiple pieces of training data are obtained, the sequences are input into the classifier frame by frame in temporal order, with the corresponding action class labels given at the same time, so that when the classifier model receives input sequences of different lengths, i.e. actions at different degrees of completeness, it can predict and recognize the corresponding complete action. Ideally, as the number of sequence frames observed by the model increases and the observation becomes more complete, the action prediction accuracy rises.
In this embodiment, a deep recurrent neural network is used as the predictor implementation, receiving the dynamically growing video sequence as input together with its corresponding complete-action class label. By introducing directed loop structures, the network can model and predict sequences with temporal dependencies. Here, the network parameters are trained by supervised learning, with the class labels as supervision information, so that the network can predict the action given only a partially observed video sequence, which suits the problem scenario of this embodiment.
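To make the "prediction after every frame" property concrete, here is a minimal untrained recurrent classifier in plain NumPy. The patent does not specify the network architecture, dimensions or training procedure at this level of detail, so everything below is an illustrative sketch, not the embodiment's actual model:

```python
import numpy as np

class TinyRNNClassifier:
    """Minimal recurrent classifier: consumes a video sequence frame
    by frame and emits class probabilities after every frame, which
    is what lets the system predict before the action finishes.
    Weights are random here; the embodiment trains them with
    supervised learning on labeled, incrementally growing sequences."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0, 0.1, (n_hidden, n_features))
        self.Wh = rng.normal(0, 0.1, (n_hidden, n_hidden))
        self.Wo = rng.normal(0, 0.1, (n_classes, n_hidden))

    def forward(self, seq):
        h = np.zeros(self.Wh.shape[0])
        probs_per_frame = []
        for frame in seq:               # frames arrive oldest to newest
            x = frame.ravel() / 255.0   # flatten pixels into features
            h = np.tanh(self.Wx @ x + self.Wh @ h)
            z = self.Wo @ h
            p = np.exp(z - z.max())     # softmax over gesture classes
            probs_per_frame.append(p / p.sum())
        return np.stack(probs_per_frame)  # shape (T, n_classes)

model = TinyRNNClassifier(n_features=64, n_hidden=16, n_classes=3)
seq = np.zeros((5, 8, 8), dtype=np.uint8)  # 5 frames of an 8x8 video
probs = model.forward(seq)
print(probs.shape)  # (5, 3)
```

The key structural point matches the description: the hidden state carries information across frames, and a class distribution is available at every time step, so the controller can act on partial observations.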
Embodiment two
Fig. 4 is a structural block diagram of an autonomous rolling response system based on online video perception provided by an embodiment of the present application.
The autonomous rolling response system based on online video perception provided in this embodiment is applied to a manipulator, specifically to the control device of the manipulator, so that the control device, in the course of controlling the manipulator, achieves the purpose of improving the interaction success rate between the manipulator and the user. As shown in Fig. 4, the autonomous rolling response system of this embodiment specifically includes a data acquisition module 10 and an action prediction module 20.
The data acquisition module is used to obtain the video action sequence of the user's local part.
Since the purpose of this scheme is to make the manipulator act in response to the user's local part, especially the hand, the action sequence of the user's local part, especially the hand, is obtained first; specifically, the video action sequence of the hand is obtained.
In this embodiment, the module specifically obtains the video action sequence of the user's local part using a two-dimensional color camera or a grayscale camera. Compared with prior-art schemes that capture the three-dimensional posture of the human hand with a depth camera, this reduces cost and avoids the problem that a depth camera, because of its relatively low image acquisition rate, cannot quickly capture fast gestures.
The action prediction module is used to predict the action of the local part according to the video action sequence.
After the above video action sequence is obtained, a pre-trained prediction model is used to predict the action of the user's local part, specifically the user's hand: the video action sequence is input into the prediction model, and the prediction model computes on it to obtain the corresponding prediction result.
Specifically, in this embodiment, because the user's hand is continuously changing, the overall hand action can be predicted even when only a partial hand action has been observed so far; that is, the final posture the hand will assume after completing the action is predicted, e.g. whether it is "scissors", "rock" or "cloth".
The prediction result, i.e. the final posture of the hand action, is used to enable the manipulator to reach the target posture responding to that prediction result; the target posture is the posture the manipulator executes in response to the final posture of the user's hand action.
As can be seen from the above technical solution, this embodiment provides an autonomous rolling response system based on online video perception. The system is applied to a manipulator and specifically: obtains the video action sequence of the user's local part; inputs the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator reach the target posture responding to the predicted action. This scheme focuses on fine-grained recognition of the user's local-part action during the interaction process, performing dynamic rolling prediction while the user's local-part action unfolds from start to finish, and accordingly making targeted dynamic adjustments to the manipulator's action, ultimately producing a fast, smooth manipulator response and improving the interaction success rate of the whole interaction process.
The system in this embodiment further includes a first adjustment module 30 and a second adjustment module 40, as shown in Fig. 5:
The first adjustment module is used to adjust the current posture of the manipulator according to the prediction result.
After the prediction result is obtained, based on the current prediction result and on the confidence of the prediction for each possible gesture option, the target posture of the manipulator is optimally decided, and the current manipulator posture is correspondingly adjusted to converge toward that target posture; that is, the current posture is adjusted according to the prediction result so as to approach the target posture.
It should be noted that the current target posture is usually an intermediate posture in the process of the manipulator's complete standard action (for example, slightly "bending" two fingers is an intermediate posture of showing "scissors"). This approach both improves the manipulator's response speed to the human hand action, since it responds immediately while the human is midway through the complete action, and ensures that the manipulator does not rashly execute its entire planned action in one go, avoiding wrong response actions caused by observing an incomplete human action.
In this embodiment, the determination mechanism for the above target posture is realized based on a pre-built rule table mapping the maximum-confidence action prediction result to the response posture, as shown in Table 1.

Actual online posture of human hand | Maximum-confidence prediction result | Manipulator response target posture
Clenched fist                       | Rock                                 | Five fingers slightly open
Five fingers slightly open          | Cloth                                | Two fingers slightly open
Two fingers slightly open           | Scissors                             | Clenched fist

Table 1
The second adjustment module is used to adjust the current posture to the target posture when the prediction result reaches the confidence threshold.
As observed video frames accumulate, the prediction result gradually converges to the final action. At the moment when the confidence of the prediction for a human hand action becomes sufficiently high, or when a complete human hand action is about to finish, the final corresponding action is determined and executed completely.
The process of online rolling prediction of the above human hand action, and of dynamically adjusting the manipulator posture according to the prediction result, is shown in Table 2.
Table 2
The system in this embodiment further includes a model training module 50 for training the prediction model; the model training module specifically includes a sequence collection unit 51, a data processing unit 52 and a function training unit 53, as shown in Fig. 6:
The sequence collection unit is used to collect video sequences of a variety of actions.
Specifically, video sequences of multiple classes of hand instruction actions are collected; for each action class, multiple copies of video sequence data are collected and saved under different external shooting environments and with different action performers. In this embodiment, the action classes include three action sequences: "rock", "scissors" and "cloth". Each action sequence contains intermediate partial sub-actions, from the initial clenched-fist posture, to arm swinging, to slightly spreading the fingers, to certain fingers opening, ultimately forming a complete action sequence.
The data processing unit is used to obtain multiple pieces of training data by processing the video sequences.
The training video sequences collected above are further augmented by means such as affine transformation, contrast stretching and hand skin-color transformation, expanding the training data to obtain multiple pieces of training data and form the corresponding training data set, thereby improving the generality of the training set.
The function training unit is used to train with the multiple pieces of training data to obtain the prediction model.
After the corresponding multiple pieces of training data are obtained, the sequences are input into the classifier frame by frame in temporal order, with the corresponding action class labels given at the same time, so that when the classifier model receives input sequences of different lengths, i.e. actions at different degrees of completeness, it can predict and recognize the corresponding complete action. Ideally, as the number of sequence frames observed by the model increases and the observation becomes more complete, the action prediction accuracy rises.
In this embodiment, a deep recurrent neural network is used as the predictor implementation, receiving the dynamically growing video sequence as input together with its corresponding complete-action class label. By introducing directed loop structures, the network can model and predict sequences with temporal dependencies. Here, the network parameters are trained by supervised learning, with the class labels as supervision information, so that the network can predict the action given only a partially observed video sequence, which suits the problem scenario of this embodiment.
Embodiment three
This embodiment provides a manipulator. The manipulator is a complete system, minimally including the corresponding motion components and a control device, and the control device is provided with the autonomous rolling response system of the embodiments above. The autonomous rolling response system is specifically used to: obtain the video action sequence of the user's local part; input the current frame and preceding frame information of the video action sequence into a pre-trained prediction model to obtain a prediction result for the local part; the prediction result is used to make the manipulator reach the target posture responding to the predicted action. This scheme focuses on fine-grained recognition of the user's local-part action during the interaction process, performing dynamic rolling prediction while the user's local-part action unfolds from start to finish, and accordingly making targeted dynamic adjustments to the manipulator's action, ultimately producing a fast, smooth manipulator response and thereby improving the interaction success rate of the whole interaction process.
Each embodiment in this specification is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments can be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present invention may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical memory) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal equipment produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal equipment, such that a series of operational steps are performed on the computer or other programmable terminal equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present invention have been described, those skilled in the art, once apprised of the basic inventive concept, may make additional changes and modifications to these embodiments. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between such entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. In the absence of further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises the element.
The technical solution provided by the present invention has been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to aid in understanding the method of the present invention and its core ideas. Meanwhile, a person of ordinary skill in the art may, in accordance with the ideas of the present invention, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (9)

1. An autonomous rolling response method based on online video perception, applied to a manipulator, characterized in that the autonomous rolling response method comprises the steps of:
obtaining, throughout the entire course of a user's action, a video action sequence of a local part of the user;
inputting the current frame and the preceding frame information of the video action sequence into a pre-trained prediction model to obtain an anticipation result for the local part; the anticipation result is used to make the manipulator perform rolling adjustment of its own action, so as to achieve the target posture corresponding to the anticipated action.
2. The autonomous rolling response method according to claim 1, characterized in that obtaining the video action sequence of the user's local part comprises:
obtaining the video action sequence of the local part with a two-dimensional color camera or a grayscale camera.
3. The autonomous rolling response method according to claim 1, characterized by further comprising the steps of:
adjusting the current posture of the manipulator according to the anticipation result;
when the confidence of the anticipation result reaches a preset confidence threshold, adjusting the current posture to the target posture.
4. The autonomous rolling response method according to any one of claims 1 to 3, characterized in that the prediction model is obtained through the following steps:
collecting video sequences of a variety of local parts;
processing the video sequences by a preset method to obtain a plurality of training data;
training a preset function by a supervised learning method, using the plurality of training data, to obtain the prediction model.
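The three training steps of claim 4 (collect video sequences of various local parts, process them by a preset method into training data, and fit a preset function by supervised learning) can be made concrete with a minimal sketch. The patent does not name the preset processing method, the model family, or the data format; the synthetic frame features, mean-pooling step, and nearest-centroid classifier below are stand-ins chosen only to illustrate the pipeline shape:

```python
import random

random.seed(0)

# Step 1 (stand-in for camera capture): "collect" labelled video sequences
# for several local-part gestures; each frame is a small feature vector.
def collect_sequences(n_per_class=40, n_frames=8, n_feat=4, n_classes=3):
    data = []
    for c in range(n_classes):
        for _ in range(n_per_class):
            seq = [[c + random.gauss(0, 0.3) for _ in range(n_feat)]
                   for _ in range(n_frames)]
            data.append((seq, c))
    return data

# Step 2: the "preset method" turns a raw sequence into one training
# vector -- here, simply mean-pooling the frames.
def preprocess(seq):
    n = len(seq)
    return [sum(frame[i] for frame in seq) / n for i in range(len(seq[0]))]

# Step 3: supervised learning of the "preset function" -- here, a
# nearest-centroid classifier fitted from the labelled training data.
def train(data):
    sums, counts = {}, {}
    for seq, label in data:
        v = preprocess(seq)
        s = sums.setdefault(label, [0.0] * len(v))
        for i, x in enumerate(v):
            s[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {c: [x / counts[c] for x in s] for c, s in sums.items()}

def predict(model, seq):
    v = preprocess(seq)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(v, model[c]))
    return min(model, key=dist)

data = collect_sequences()
model = train(data)
accuracy = sum(predict(model, seq) == label for seq, label in data) / len(data)
```

In a real system, the synthetic sequences would be replaced by footage of the user's local part, and the preset function would typically be a sequence model capable of classifying partially completed actions so that anticipation can begin before the action ends.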
5. An autonomous rolling response system based on online video perception, applied to a manipulator, characterized in that the autonomous rolling response system comprises:
a sequence obtaining module, configured to obtain, throughout the entire course of a user's action, a video action sequence of a local part of the user;
an action anticipation module, configured to input the current frame and the preceding frame information of the video action sequence into a pre-trained prediction model to obtain an anticipation result for the local part; the anticipation result is used to make the manipulator perform rolling adjustment of its own action, so as to achieve the target posture corresponding to the anticipated action.
6. The autonomous rolling response system according to claim 5, characterized in that the sequence obtaining module specifically obtains the video action sequence of the local part with a two-dimensional color camera or a grayscale camera.
7. The autonomous rolling response system according to claim 5, characterized by further comprising:
a first adjustment module, configured to adjust the current posture of the manipulator according to the anticipation result;
a second adjustment module, configured to adjust the current posture to the target posture when the confidence of the anticipation result reaches a preset confidence threshold.
8. The autonomous rolling response system according to any one of claims 5 to 7, characterized by further comprising a model training module for training the prediction model, the model training module comprising:
a sequence collection unit, configured to collect video sequences of a variety of local parts;
a data processing unit, configured to process the video sequences by a preset method to obtain a plurality of training data;
a function training unit, configured to train a preset function by a supervised learning method, using the plurality of training data, to obtain the prediction model.
9. A manipulator, characterized by comprising the autonomous rolling response system according to any one of claims 5 to 8.
CN201810182168.7A 2018-03-06 2018-03-06 Autonomous rolling response method and system based on online video perception and manipulator Active CN108491767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810182168.7A CN108491767B (en) 2018-03-06 2018-03-06 Autonomous rolling response method and system based on online video perception and manipulator


Publications (2)

Publication Number Publication Date
CN108491767A true CN108491767A (en) 2018-09-04
CN108491767B CN108491767B (en) 2022-08-09

Family

ID=63341354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810182168.7A Active CN108491767B (en) 2018-03-06 2018-03-06 Autonomous rolling response method and system based on online video perception and manipulator

Country Status (1)

Country Link
CN (1) CN108491767B (en)

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009045714A (en) * 2007-08-22 2009-03-05 Ntn Corp Method and device for teaching attitude of robot arm
US20120130355A1 (en) * 2009-07-31 2012-05-24 Dexterite Surgical Manipulator with manual hold and comfortable articulation
CN102831439A (en) * 2012-08-15 2012-12-19 深圳先进技术研究院 Gesture tracking method and gesture tracking system
CN103240746A (en) * 2013-04-18 2013-08-14 塔米智能科技(北京)有限公司 Finger-guessing robot with image recognition system, and gesture recognition method
CN103419204A (en) * 2012-05-22 2013-12-04 林其禹 Finger guessing game robot system
JP2014104527A (en) * 2012-11-27 2014-06-09 Seiko Epson Corp Robot system, program, production system, and robot
US20140161310A1 (en) * 2012-12-07 2014-06-12 Pixart Imaging Inc. Device and Method for Determining Gesture and Operation Method of Gesture Determining Device
EP2803202A1 (en) * 2012-12-11 2014-11-19 Unify GmbH & Co. KG Method of processing video data, device, computer program product, and data construct
CN104915009A (en) * 2015-06-25 2015-09-16 深圳先进技术研究院 Gesture prediction method and system
CN104992171A (en) * 2015-08-04 2015-10-21 易视腾科技有限公司 Method and system for gesture recognition and man-machine interaction based on 2D video sequence
WO2015182561A1 (en) * 2014-05-27 2015-12-03 並木精密宝石株式会社 Manipulator
US9381426B1 (en) * 2013-03-15 2016-07-05 University Of Central Florida Research Foundation, Inc. Semi-automated digital puppetry control
CN105818129A (en) * 2016-04-12 2016-08-03 华南理工大学 Humanoid hand control system based on data glove
CN105930784A (en) * 2016-04-15 2016-09-07 济南大学 Gesture recognition method
CN106022294A (en) * 2016-06-01 2016-10-12 北京光年无限科技有限公司 Intelligent robot-oriented man-machine interaction method and intelligent robot-oriented man-machine interaction device
CN106272409A * 2016-08-03 2017-01-04 北京航空航天大学 Mechanical arm control method and system based on gesture recognition
US20170064212A1 (en) * 2015-08-31 2017-03-02 Orcam Technologies Ltd. Systems and methods for identifying exposure to a recognizable item
WO2017156835A1 (en) * 2016-03-18 2017-09-21 深圳大学 Smart method and system for body building posture identification, assessment, warning and intensity estimation
CN107291232A * 2017-06-20 2017-10-24 深圳市泽科科技有限公司 Somatosensory game interaction method and system based on deep learning and big data
US20170334066A1 (en) * 2016-05-20 2017-11-23 Google Inc. Machine learning methods and apparatus related to predicting motion(s) of object(s) in a robot's environment based on image(s) capturing the object(s) and based on parameter(s) for future robot movement in the environment
CN107688391A * 2017-09-01 2018-02-13 广州大学 Gesture recognition method and device based on monocular vision


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
YEN-TING CHEN et al.: "Multiple-angle Hand Gesture Recognition by Fusing SVM Classifiers", 2007 IEEE International Conference on Automation Science and Engineering *
HE Ruiqing: "Research on Gesture Interaction Control of Mobile Robots for Assisting the Elderly and the Disabled", China Master's Theses Full-text Database, Information Science and Technology Series *
CHENG Kun: "Design of an Intelligent Mobile Robot System Based on an Embedded Platform", China Master's Theses Full-text Database, Information Science and Technology Series *
QI Jing et al.: "Research Progress on Vision-Based Gesture Interaction Techniques for Robots", Robot *

Also Published As

Publication number Publication date
CN108491767B (en) 2022-08-09

Similar Documents

Publication Publication Date Title
Li et al. Human-centered reinforcement learning: A survey
KR102425578B1 (en) Method and apparatus for recognizing an object
CN106139564B (en) Image processing method and device
CN108983979A Gesture tracking and recognition method, device, and smart device
CN111191599B (en) Gesture recognition method, device, equipment and storage medium
Cruz et al. Multi-modal integration of dynamic audiovisual patterns for an interactive reinforcement learning scenario
CN106445156A (en) Method, device and terminal for intelligent home device control based on virtual reality
JP2020102194A (en) System, method and program for context based deep knowledge tracking
Koenig et al. Robot life-long task learning from human demonstrations: a Bayesian approach
CN109685037B (en) Real-time action recognition method and device and electronic equipment
CN105042789B (en) The control method and system of a kind of intelligent air condition
CN106022305A (en) Intelligent robot movement comparing method and robot
CN103760970A (en) Wearable input system and method
US10326928B2 (en) Image processing apparatus for determining whether section of target area matches section of person area and control method thereof
CN111027403A (en) Gesture estimation method, device, equipment and computer readable storage medium
CN108985262A (en) Limb motion guidance method, device, server and storage medium
CN112720453A (en) Method and apparatus for training manipulation skills of a robotic system
CN108115678B (en) Robot and motion control method and device thereof
Kim et al. Using human gaze to improve robustness against irrelevant objects in robot manipulation tasks
Meng et al. Robots learn to dance through interaction with humans
CN109471954A (en) Content recommendation method, device, equipment and storage medium based on mobile unit
WO2018150654A1 (en) Information processing device, information processing method, and program
US20220277833A1 (en) Method and apparatus for determining psychological counseling training scheme
CN108491767A (en) Autonomous roll response method, system and manipulator based on Online Video perception
CN112507166B (en) Intelligent adjustment method and related device for motion course

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant