CN107443384A

CN107443384A - A visuomotor control method for transferring simulation results to the real world

Info

Publication number: CN107443384A
Application number: CN201710835081.0A
Authority: CN
Inventors: 夏春秋
Original assignee: Shenzhen Vision Technology Co Ltd
Current assignee: Shenzhen Vision Technology Co Ltd
Priority date: 2017-09-15
Filing date: 2017-09-15
Publication date: 2017-12-08

Abstract

The present invention proposes a visuomotor control method for transferring simulation results to the real world. Its main contents are data generation and a training method. First, a series of linear paths is generated using inverse kinematics for use in training. During training, image attributes such as color and position are sampled from specified distributions, and artificially created environmental noise (distractors) is added so that the data approximates real-world data. Finally, visuomotor control is trained using a convolutional network and a long short-term memory (LSTM) network. The invention overcomes the difficulty of collecting real-world data at scale, provides a path generation method based on Cartesian space, and improves both the accuracy of visuomotor control and the scalability of its training.

Description

A visuomotor control method for transferring simulation results to the real world

Technical field

The present invention relates to the field of visual control, and more particularly to a visuomotor control method for transferring simulation results to the real world.

Background art

Vision-based robot motion control is an important method that uses visual information to apply feedback control to robot motion. It spans research fields such as machine vision, image processing, robot dynamics and control theory. The rise of deep learning in recent years, and of convolutional neural networks in particular, has made feature extraction and content analysis far more convenient and has displaced the conventional approach of recognizing image features and feeding the recognized content back. However, neural network learning depends first of all on massive amounts of training data, and real-world robot motion control data is extremely limited in quantity. It is therefore necessary to generate large numbers of computer images for training and to transfer the simulation results to the real world in order to perform visuomotor control.

Visuomotor control is mainly applied in robotics. As science and technology advance and human demands on robots keep rising, robot technology will continue to improve and make substantial leaps, and robot products will be applied across daily life and scientific research as an irreplaceable intelligent tool. Robots can be widely used in industrial production, civilian services, military operations, scientific exploration and other areas, playing an important role as a substitute for humans. In addition, in environments unsuited to human work, such as deep-sea drilling, resource exploration, and field surveying and mapping, vision-based motion control methods will have great application value.

However, visuomotor control remains challenging. First, the visual sensor cannot describe the details of a given environment according to the requirements of the task. Second, occlusion of the line of sight during robot motion reduces the flexibility of observation. In addition, when the end effector of the arm is observed online, the target pose and the surrounding background move continuously with the manipulator over a large field of view, which obscures the motion characteristics in the image. The lack of large-scale real training data also makes training the manipulator difficult.

The present invention proposes a new framework for transferring simulation results to real-world data. A series of linear paths is generated using inverse kinematics for use in training. During training, image attributes such as color and position are sampled from specified distributions, and artificially created environmental noise (distractors) is added so that the data approximates real-world data. Finally, visuomotor control is trained using a convolutional network and a long short-term memory (LSTM) network. The invention overcomes the difficulty of collecting real-world data at scale, provides a path generation method based on Cartesian space, and improves both the accuracy of visuomotor control and the scalability of its training.

Summary of the invention

To address the problem of vision-based feedback motion control in simple or complex environments, the object of the present invention is to provide a visuomotor control method for transferring simulation results to the real world, proposing a new framework for transferring simulation results to real-world data.

To solve the above problem, the present invention provides a visuomotor control method for transferring simulation results to the real world, whose main contents include:

(1) data generation;

(2) training method.

The data generation creates, without relying on real-world data, an end-to-end controller for visuomotor control using a multi-stage task approach, specifically: 1) a number of shortest operating paths are generated in simulation; 2) these path data are used to train the arm velocities; 3) a PID controller is used to map the arm velocities from 2) to arm torques; 4) domain randomization is used to make the generated data approximate real-world data.
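Step 3) maps commanded velocities to torques through a feedback controller. The following is a minimal sketch of that idea, assuming a textbook discrete PID law on a single joint; the `PID` class, the gains, the inertia value and the frictionless joint model in `track` are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID law; the gains used below are illustrative assumptions."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target_velocity: float, measured_velocity: float, dt: float) -> float:
        """Return a torque command that drives the measured velocity toward the target."""
        error = target_velocity - measured_velocity
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track(target: float, steps: int = 2000, dt: float = 0.001, inertia: float = 0.05) -> float:
    """Simulate one frictionless joint (torque = inertia * acceleration, an assumption)."""
    pid = PID(kp=2.0, ki=5.0, kd=0.001)
    velocity = 0.0
    for _ in range(steps):
        torque = pid.step(target, velocity, dt)
        velocity += (torque / inertia) * dt  # explicit Euler integration
    return velocity
```

Calling `track(0.5)` runs the loop for two simulated seconds, after which the joint velocity should have settled close to the 0.5 target. A real arm would need one such loop per joint and gains tuned to its dynamics.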

The multi-stage task approach constructs linear paths in Cartesian space and records the arm velocities, joint angles, gripper open/close actions, the position of the object (a small cube that the gripper can pick up or put down), the gripper position and the camera images. The task is divided into 5 stages, data of each kind are sampled for each stage, and the samples are combined to form the physical conditions required for the real world.

The 5 stages generate the required conditions in sequence by sampling from specified distributions, specifically:

1) a waypoint is placed above the object, and a path to it is planned and converted into the velocity domain of the arm, including adjusting the joint angles between the arm and the gripper;

2) when the gripper reaches the waypoint, the gripper performs the close operation;

3) the waypoint from 1) is set a short distance above the object, and a path linearly related to the route in 1) is planned to grasp the object and lift it upward;

4) a basket position (to receive the object) is set, and a final linear path is planned that lifts the object from the waypoint and moves it above the basket;

5) when the grasped object is above the basket, the gripper performs the open operation;

After the above 5 steps are completed, a check is made as to whether the object has fallen into the designated basket; if so, the planned paths are saved. The above steps can be repeated until the path plan is the most reasonable.
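The straight-line segments of the 5 stages can be sketched as linear interpolation between waypoints in Cartesian space. This is only an illustration: the function names, the `hover` height, the step count and the choice to place the first waypoint at the cube position are assumptions, and the inverse-kinematics step that would turn each Cartesian point into joint velocities is not shown.

```python
import numpy as np


def linear_path(start, goal, n_steps):
    """Return n_steps + 1 points evenly spaced on the segment start -> goal."""
    start = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    t = np.linspace(0.0, 1.0, n_steps + 1)[:, None]
    return (1.0 - t) * start + t * goal


def plan_episode(gripper, cube, basket, hover=0.05, n_steps=20):
    """Chain the straight segments and gripper actions of the 5-stage task.

    Stage 1: reach the waypoint at the cube; stage 2: close the gripper;
    stage 3: lift to a waypoint `hover` metres above the cube;
    stage 4: carry to a point above the basket; stage 5: open the gripper.
    """
    cube = np.asarray(cube, dtype=float)
    basket = np.asarray(basket, dtype=float)
    lift = cube + np.array([0.0, 0.0, hover])
    above_basket = np.array([basket[0], basket[1], lift[2]])
    return [
        ("reach", linear_path(gripper, cube, n_steps)),
        ("close", None),
        ("lift", linear_path(cube, lift, n_steps)),
        ("carry", linear_path(lift, above_basket, n_steps)),
        ("open", None),
    ]
```

An episode generated this way would still have to be executed in the simulator and kept only if the cube ends up in the basket, as the text describes.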

The domain randomization method, in order to bridge the gap between simulation and real-world data, enumerates and initializes the key factors that may be involved in the environment, specifically:

1) the colors of the object, basket and arm are sampled from a normal distribution whose mean is as close as possible to the real-world color mean (the mean of the red, green and blue channels);

2) the positions of the camera, light source, basket and object are sampled from a uniform distribution;

3) the length of the arm follows a uniform distribution over a small range;

4) the starting joint angles follow a normal distribution whose mean is set to the start position;

5) Perlin noise is added using sine wave signals to simulate the textures and materials of the table top and background;

6) random primitive-shaped objects following a uniform distribution are added as distractors (atypical noise);

After the above environmental factors and noise have been generated, a perturbation rule is applied so that the training of the model comes closer to domain randomization.
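The sampling scheme above can be sketched as one function that draws a randomized scene description. The dictionary keys, the numeric ranges, the standard deviations and the default color mean are illustrative assumptions rather than values from the patent, and the Perlin-noise textures of item 5) are left out.

```python
import numpy as np


def sample_scene(rng, real_rgb_mean=(0.6, 0.1, 0.1), home_joints=None):
    """Draw one randomized scene; every range below is an illustrative assumption."""
    if home_joints is None:
        home_joints = np.zeros(6)
    return {
        # 1) colors: normal around the real-world RGB mean, clipped to [0, 1]
        "cube_rgb": np.clip(rng.normal(real_rgb_mean, 0.1), 0.0, 1.0),
        # 2) positions: uniform inside a box (metres)
        "camera_xyz": rng.uniform([-0.1, -0.1, 0.9], [0.1, 0.1, 1.1]),
        "cube_xy": rng.uniform([0.1, -0.2], [0.4, 0.2]),
        # 3) arm length: uniform over a small range, as a scale factor
        "arm_length_scale": float(rng.uniform(0.98, 1.02)),
        # 4) starting joint angles: normal around the start pose (radians)
        "start_joints": rng.normal(home_joints, 0.05),
        # 6) distractors: a uniformly sampled count of primitive shapes
        "n_distractors": int(rng.integers(0, 4)),
    }
```

For example, `sample_scene(np.random.default_rng(0))` returns one scene; drawing a fresh scene per episode is what spreads the training images across colors, viewpoints and layouts.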

The perturbation rule applies randomized perturbations to the arm start position and the waypoint positions, which are the decisive factors, in order to strengthen the robustness of training, specifically: artificially created errors (non-visual errors) are added to the route planning so that the arm learns how to handle visual blind spots in the real world; at the same time, single-tone backgrounds are removed and textures and materials in multiple tones are added as backgrounds, so that the images captured by the camera vary.

The training method comprises the network structure, the network output and the loss function.

The network structure forms the learning network from a convolutional network and a long short-term memory (LSTM) network, specifically:

1) after the input layer, the image passes through 8 convolutional layers in sequence, where the kernel size of the first 7 convolutional layers is 3 × 3 and the kernel size of the 8th convolutional layer is 2 × 2;

2) the stride of every convolutional layer is set to 2;

3) the output of the last convolutional layer is flattened into a one-dimensional vector and fed into the LSTM module;

4) the final output passes through a fully connected layer of 128 neurons, and a result of dimension 1 × 15 is then output.
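Because every layer uses a stride of 2, the spatial size of the feature map roughly halves at each of the 8 layers. The short calculation below traces that reduction with the standard convolution output-size formula. The 256-pixel input and the padding of 1 on the 3 × 3 layers are assumptions chosen so that the sizes divide evenly; the text fixes only the kernel sizes and the stride.

```python
def conv_out(size, kernel, stride=2, padding=0):
    """Side length after one convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_sizes(input_size=256):
    """Trace the side length through the 8 stride-2 convolutional layers.

    input_size=256 and padding=1 on the 3x3 layers are assumptions.
    """
    sizes = [input_size]
    for _ in range(7):  # layers 1-7: 3x3 kernels
        sizes.append(conv_out(sizes[-1], kernel=3, padding=1))
    sizes.append(conv_out(sizes[-1], kernel=2))  # layer 8: 2x2 kernel, no padding
    return sizes


print(feature_sizes())  # [256, 128, 64, 32, 16, 8, 4, 2, 1]
```

Under these assumptions the last layer leaves a 1 × 1 map per channel, so flattening it yields a vector whose length equals the channel count of layer 8.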

The network output is a result of dimension 1 × 15 containing 15 values, of which values 1-6 represent the arm velocities, values 7-9 represent the classified gripper operation (open, close or no operation), values 10-12 represent the object position and values 13-15 represent the gripper position. In the test phase the object position and gripper position are not used for testing, but if training goes wrong they can be used to debug the network.
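A sketch of how such an output vector could be unpacked is given below, assuming a 6 + 3 + 3 + 3 split of the 15 values. The ordering of the three gripper classes in `ACTIONS` and the use of argmax to pick the class are assumptions.

```python
import numpy as np

ACTIONS = ("open", "close", "no-op")  # class order is an assumption


def split_output(y):
    """Split a length-15 output into its four parts (assumed 6 + 3 + 3 + 3)."""
    y = np.asarray(y, dtype=float).reshape(15)
    return {
        "velocity": y[0:6],                                  # values 1-6
        "gripper_action": ACTIONS[int(np.argmax(y[6:9]))],   # values 7-9
        "object_xyz": y[9:12],                               # auxiliary, for debugging
        "gripper_xyz": y[12:15],                             # auxiliary, for debugging
    }
```

At test time only `velocity` and `gripper_action` would drive the arm; the two position outputs serve as auxiliary signals that can be inspected when training misbehaves.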

The loss function is used to train the network to its optimum. A total loss function is computed over the arm velocity V, the gripper operation G, the gripper position GP and the object position CP.

Because all loss terms have the same dimension and order of magnitude, no scaling coefficients are used.
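Since the text names four terms and no scaling coefficients, one reading consistent with it is an unweighted sum, L = L_V + L_G + L_GP + L_CP. The sketch below implements that reading; the use of mean squared error for every term is an assumption, as the patent text does not state the form of the individual terms.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between two arrays (assumed per-term loss)."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))


def total_loss(pred, target):
    """Unweighted sum of the four terms: L = L_V + L_G + L_GP + L_CP."""
    return sum(mse(pred[k], target[k]) for k in ("V", "G", "GP", "CP"))
```

Here `pred` and `target` are dictionaries keyed by "V", "G", "GP" and "CP"; leaving the terms unweighted mirrors the statement that they share the same dimension and magnitude.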

Brief description of the drawings

Fig. 1 is a system flow chart of the visuomotor control method for transferring simulation results to the real world according to the present invention.

Fig. 2 is a schematic diagram of the correspondence between simulation results and real-world data in the visuomotor control method for transferring simulation results to the real world according to the present invention.

Detailed description of the embodiments

It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another. The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

Fig. 1 is a system flow chart of the visuomotor control method for transferring simulation results to the real world according to the present invention. The method mainly includes data generation and a training method.

Data generation: without relying on real-world data, an end-to-end controller for visuomotor control is created using a multi-stage task approach, specifically: 1) a number of shortest operating paths are generated in simulation; 2) these path data are used to train the arm velocities; 3) a PID controller is used to map the arm velocities from 2) to arm torques; 4) domain randomization is used to make the generated data approximate real-world data.

Multi-stage task approach: linear paths are constructed in Cartesian space, and the arm velocities, joint angles, gripper open/close actions, the position of the object (a small cube that the gripper can pick up or put down), the gripper position and the camera images are recorded. The task is divided into 5 stages, data of each kind are sampled for each stage, and the samples are combined to form the physical conditions required for the real world.

The 5 stages generate the required conditions in sequence by sampling from specified distributions, specifically:

2) when the gripper reaches the waypoint, the gripper performs the close operation;

Domain randomization method: in order to bridge the gap between simulation and real-world data, the key factors that may be involved in the environment are enumerated and initialized.

Perturbation rule: randomized perturbations are applied to the arm start position and the waypoint positions, which are the decisive factors, in order to strengthen the robustness of training, specifically: artificially created errors (non-visual errors) are added to the route planning so that the arm learns how to handle visual blind spots in the real world; at the same time, single-tone backgrounds are removed and textures and materials in multiple tones are added as backgrounds, so that the images captured by the camera vary.

Training method: comprises the network structure, the network output and the loss function.

Network structure: the learning network is formed from a convolutional network and a long short-term memory (LSTM) network.

Network output: a result of dimension 1 × 15 containing 15 values, of which values 1-6 represent the arm velocities, values 7-9 represent the classified gripper operation (open, close or no operation), values 10-12 represent the object position and values 13-15 represent the gripper position. In the test phase the object position and gripper position are not used for testing, but if training goes wrong they can be used to debug the network.

Loss function: used to train the network to its optimum; a total loss function is computed over the arm velocity V, the gripper operation G, the gripper position GP and the object position CP.

Fig. 2 is a schematic diagram of the correspondence between simulation results and real-world data in the visuomotor control method for transferring simulation results to the real world according to the present invention. As shown, the images correspond to one another during training. The figure shows that, starting from a single set of real-environment data, a variety of linearly related paths and backgrounds can be generated in Cartesian space to fit the characteristics of real-environment data, thereby providing a large amount of data for training the grasping action of the gripper.

For those skilled in the art, the present invention is not limited to the details of the above embodiments and may be realized in other specific forms without departing from its spirit and scope. Moreover, those skilled in the art may make various changes and modifications to the invention without departing from its spirit and scope, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention. Accordingly, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims

1. A visuomotor control method for transferring simulation results to the real world, characterized in that it mainly comprises data generation (I) and a training method (II).

2. The data generation (I) according to claim 1, characterized in that, without relying on real-world data, an end-to-end controller for visuomotor control is created using a multi-stage task approach, specifically: 1) a number of shortest operating paths are generated in simulation; 2) these path data are used to train the arm velocities; 3) a PID controller is used to map the arm velocities from 2) to arm torques; 4) domain randomization is used to make the generated data approximate real-world data.

3. The multi-stage task approach according to claim 2, characterized in that linear paths are constructed in Cartesian space, and the arm velocities, joint angles, gripper open/close actions, the position of the object (a small cube that the gripper can pick up or put down), the gripper position and the camera images are recorded; the task is divided into 5 stages, data of each kind are sampled for each stage, and the samples are combined to form the physical conditions required for the real world.

4. The 5 stages according to claim 3, characterized in that the required conditions are generated in sequence by sampling from specified distributions, specifically:

1) a waypoint is placed above the object, and a path to it is planned and converted into the velocity domain of the arm, including adjusting the joint angles between the arm and the gripper;

2) when the gripper reaches the waypoint, the gripper performs the close operation;

3) the waypoint from 1) is set a short distance above the object, and a path linearly related to the route in 1) is planned to grasp the object and lift it upward;

4) a basket position (to receive the object) is set, and a final linear path is planned that lifts the object from the waypoint and moves it above the basket;

After the above 5 steps are completed, a check is made as to whether the object has fallen into the designated basket; if so, the planned paths are saved. The above steps can be repeated until the path plan is the most reasonable.

5. The domain randomization method according to claim 2, characterized in that, in order to bridge the gap between simulation and real-world data, the key factors that may be involved in the environment are enumerated and initialized, specifically:

1) the colors of the object, basket and arm are sampled from a normal distribution whose mean is as close as possible to the real-world color mean (the mean of the red, green and blue channels);

6. The perturbation rule according to claim 5, characterized in that randomized perturbations are applied to the arm start position and the waypoint positions, which are the decisive factors, in order to strengthen the robustness of training, specifically: artificially created errors (non-visual errors) are added to the route planning so that the arm learns how to handle visual blind spots in the real world; at the same time, single-tone backgrounds are removed and textures and materials in multiple tones are added as backgrounds, so that the images captured by the camera vary.

7. The training method (II) according to claim 1, characterized in that it comprises the network structure, the network output and the loss function.

8. The network structure according to claim 7, characterized in that the learning network is formed from a convolutional network and a long short-term memory (LSTM) network, specifically:

1) after the input layer, the image passes through 8 convolutional layers in sequence, where the kernel size of the first 7 convolutional layers is 3 × 3 and the kernel size of the 8th convolutional layer is 2 × 2;

4) the final output passes through a fully connected layer of 128 neurons, and a result of dimension 1 × 15 is then output.

9. The network output according to claim 7, characterized in that the result of dimension 1 × 15 contains 15 values, of which values 1-6 represent the arm velocities, values 7-9 represent the classified gripper operation (open, close or no operation), values 10-12 represent the object position and values 13-15 represent the gripper position; in the test phase the object position and gripper position are not used for testing, but if training goes wrong they can be used to debug the network.

10. The loss function according to claim 7, characterized in that it is used to train the network to its optimum, a total loss function being computed over the arm velocity V, the gripper operation G, the gripper position GP and the object position CP.