CN111421536B - Rocker operation control method based on touch information

Info

Publication number: CN111421536B
Application number: CN202010177354.9A
Authority: CN
Inventors: 王宗涛; 方斌; 孙富春; 刘华平
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2020-03-13
Filing date: 2020-03-13
Publication date: 2021-07-09
Anticipated expiration: 2040-03-13
Also published as: CN111421536A

Abstract

The invention discloses a rocker operation control method based on touch information, comprising the following steps: constructing a rocker operation control data set, in which each piece of data consists of a motion target of the rocker, a tactile image, the joint angles of a mechanical arm and the change values of those joint angles; constructing a touch-based rocker operation control model consisting of a tactile feature extraction network and a multi-mode fusion network, wherein the tactile feature extraction network takes a tactile image as input, and the multi-mode fusion network takes as its joint input the tactile image feature, the rocker motion target value belonging to the same label as the tactile image input to the tactile feature extraction network, and the joint angles of the mechanical arm, and outputs the change value of each joint angle of the mechanical arm; training and verifying the constructed rocker operation control model; and having a high-level decision network use the trained rocker operation control model to perform real-time closed-loop control of the rocker operation according to the application scenario. By using tactile information, the invention improves the accuracy of the rocker operation.

Description

Rocker operation control method based on touch information

Technical Field

The invention belongs to the technical field of robot operation, and particularly relates to a rocker operation control method based on touch information.

Background

Robot operation has been a research hotspot in recent years. As an input device, the rocker (joystick) plays an important role in VR, gaming, mobile robotics and remote operation. Rockers originated in the gaming field, where they are mostly used to interact with the game environment, and later came to play an important role in robot control. In particular, unmanned aerial vehicles are at present generally controlled by rocker. Rockers also contribute greatly to the control of various remotely operated mobile robots.

With the development of artificial intelligence, deep reinforcement learning is steadily maturing; in particular, the DQN algorithm proposed by DeepMind in 2016 made a great contribution to the development of artificial intelligence. To give robots human-like operating ability and intelligence, the strengths of deep learning in perception and of reinforcement learning in decision making are receiving more and more attention. Robot-related research is still at the development stage, and a robot's perception of the outside world is particularly important.

Touch is an important sensing ability and plays an important role in fine robot operation. In complex environments, vision and hearing may be subject to deviation, whereas tactile signals can help ensure the accuracy of the system. Touch and vision are complementary and together provide more complete sensory information. Tactile sensors have developed from conventional resistive and capacitive types to photoelectric types, and their sensing accuracy is continuously improving.

At present, in the field of rocker operation control, tactile signals are mainly applied by adding a tactile sensor to the rocker controller to improve the human-machine interaction experience; a robot autonomously operating a rocker by means of tactile signals has not yet been reported.

Disclosure of Invention

The invention aims to improve the perception capability of a robot in operation, and provides a rocker operation control method based on touch information.

In order to achieve the purpose, the invention adopts the following technical scheme:

The invention provides a rocker operation control method based on touch information, wherein a rocker is operated by a mechanical arm, characterized in that a tactile sensor in direct contact with the rocker is arranged at the end of the mechanical arm and is used for acquiring a tactile image of the rocker; the rocker operation control method comprises the following steps:

1) building a rocker operation control dataset

Determine the motion target value range of the rocker, i.e. the range S of the rocker's output values, according to the application scenario of the rocker. Drive the rocker with the mechanical arm so that the rocker is at an arbitrary motion target value g' = (g1', g2') within the range, where g1' and g2' are the up-down and left-right motion target values of the rocker respectively, g1' ∈ S, g2' ∈ S, and read the current joint angles x_i of the mechanical arm, i = 1, 2, …, n, where n is the total number of joint angles of the mechanical arm. Apply a random change value Δx_i to each joint angle of the mechanical arm; when each joint angle reaches x_i + Δx_i, acquire the tactile image at that moment through the tactile sensor and record the motion target value of the rocker at that moment, g = (g1, g2), g1 ∈ S, g2 ∈ S; reduce the acquired tactile image to obtain the tactile image I. The motion target g of the rocker, the tactile image I, the joint angles x_i of the mechanical arm and their change values Δx_i form one piece of data, with the joint angle change values Δx_i as the label of that piece of data. Repeat the above, traversing all motion target values in the range S, construct the rocker operation control data set from all the data obtained, and divide the data into a training set and a verification set according to a certain proportion;

2) building touch-based rocker operation control model

Connect the tactile feature extraction network with the multi-mode fusion network to form the touch-based rocker operation control model. The tactile feature extraction network is a convolutional neural network that takes the tactile image I collected in step 1) as input and outputs the extracted tactile image feature j of dimension m. The multi-mode fusion network is formed by sequentially connecting a plurality of hidden layers and 1 fully connected layer. The tactile image feature j, the rocker motion target value belonging to the same label as the tactile image I input to the tactile feature extraction network, and the joint angles x_i of the mechanical arm are combined into a vector of dimension N = m + n + 2, which is taken as the input of the multi-mode fusion network; the output is an n-dimensional vector (y_1, y_2, …, y_i, …, y_n), where y_i is the change value of the i-th joint angle of the mechanical arm;

3) training and verifying constructed rocker operation control model

The tactile image I, the rocker motion target value g and the joint angles x_i of the mechanical arm belonging to the same label in the training set obtained in step 1) are input into the rocker operation control model constructed in step 2), and the model is then trained by the chain rule and the back-propagation algorithm, with a mean square error loss function constraining the training process. The training effect of the model is evaluated on the test set to assist in adjusting the parameters of the network model, yielding the trained rocker operation control model;

4) the high-level decision network performs real-time closed-loop control on the rocker operation according to the application scene by using the trained rocker operation control model

The high-level decision network learns the corresponding high-level actions by deep reinforcement learning according to the specific high-level task. The high-level actions continuously interact with the environment of the high-level task being executed; each interaction between a high-level action and the high-level environment generates a corresponding reward, and the high-level decision network finds the strategy that maximizes the accumulated reward R, which gives the current motion target value of the rocker. The current motion target value, the collected current tactile image and the current joint angles of the mechanical arm are taken as the input of the trained rocker operation control model to obtain the change value of each joint angle of the mechanical arm; each joint angle is then driven to x_i + Δx_i, so that the mechanical arm drives the rocker to the corresponding motion target value. The motion target value is linearly mapped to the corresponding high-level action, which interacts with the high-level environment, forming closed-loop control. This process is repeated until the high-level task is finished.

Further, the touch sensor comprises a shell, a camera and a lighting lamp which are arranged in the inner cavity of the shell, an elastomer which is positioned on the surface of the shell and is in contact with the rocker, and a colored coating which is attached to the surface of the elastomer; the camera is positioned on one side of the elastic body and has a certain gap with the elastic body so as to meet the imaging requirement, and is used for capturing a tactile image of the surface texture change of the elastic body; the illuminating lamps are distributed around the camera; the elastic body is made of transparent elastic material; the colored coating is in direct contact with the rocker to prevent the messy background from penetrating the elastomer and interfering with the tactile image.

The invention has the characteristics and beneficial effects that:

According to the rocker operation control method based on touch information provided by the invention, fine control of the rocker operation by the robot is achieved through slight changes in the tactile image, which improves the accuracy of the rocker operation. The invention makes use of touch, an important mode of human perception, which can provide reliable interaction information and further improves the intelligence of the robot.

Drawings

Fig. 1 is an overall flowchart of a joystick operation control method based on tactile information according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of a tactile sensor used in the embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

For better understanding of the present invention, an application example of a joystick operation control method based on tactile information proposed by the present invention is explained in detail below.

The overall flow of the rocker operation control method based on touch information is shown in Fig. 1. The rocker in this embodiment is operated by a mechanical arm, and a tactile sensor in direct contact with the rocker is arranged at the end of the mechanical arm. The orientation of the rocker is read by a wireless module (in this embodiment, a 9-channel receiver, model RF 209S). The mechanical arm is the driving mechanism of the method; the tactile sensor is fixed at its end and is moved by the mechanical arm with 3 degrees of freedom, and the repeated motion precision of the tactile sensor is kept within 0.2 mm, which ensures the precision of the low-level motion control.

The tactile sensor is used for acquiring a tactile image of the rocker, and its structure is shown in Fig. 2. It comprises a shell 2, a camera 1 and LED lamps 3 arranged in the inner cavity of the shell 2, an elastic body 4 located on the surface of the shell 2 and in contact with the rocker, and a colored coating 5 attached to the surface of the elastic body 4. The shell 2 accommodates and supports the components inside it; it is 3D-printed from a resin material, may take different shapes according to the requirements of the actual use scenario, and has a shape and size similar to those of a real adult finger. The camera 1 is located on one side of the elastic body 4, with a certain gap between them to meet the imaging requirement, and captures tactile images of the changes in the surface texture of the elastic body 4. The LED lamps 3 are distributed around the camera 1 and provide illumination, preventing insufficient light from degrading the quality of the tactile image. The elastic body 4 is made of transparent polydimethylsiloxane (PDMS) and is a block measuring 30 mm × 5 mm. The colored coating 5 covering the surface of the elastic body 4 is in direct contact with the rocker; it is made of PDMS mixed with color-changing ink and measures 30 mm × 1 mm. The colored coating 5 prevents a cluttered background from showing through the elastic body 4 and interfering with the tactile image. A solid color is generally chosen as its base color, which on the one hand shields against interference from background light and on the other hand makes the deformation of the elastic body 4 more visible in the tactile image.

The method of this embodiment specifically comprises the following steps:

1) building a rocker operation control dataset

Determine the motion target value range of the rocker, i.e. the range S of the rocker's output values (the motion target value may be the motion direction of the rocker), according to the application scenario of the rocker. Drive the rocker with the mechanical arm so that the rocker is at an arbitrary motion target value g' = (g1', g2') within the range, where g1' and g2' are the up-down and left-right motion target values of the rocker respectively, g1' ∈ S, g2' ∈ S, and read the current joint angles x_i of the mechanical arm, i = 1, 2, …, n, where n is the total number of joint angles of the mechanical arm. Apply a random change value Δx_i to each joint angle of the mechanical arm; when each joint angle reaches x_i + Δx_i, acquire the tactile image at that moment through the tactile sensor, while the wireless module records the motion target value of the rocker at that moment, g = (g1, g2), g1 ∈ S, g2 ∈ S. To reduce the number of parameters of the network used later, the acquired tactile image is reduced to obtain the tactile image I. The motion target g of the rocker, the tactile image I, the joint angles x_i of the mechanical arm and their change values Δx_i form one piece of data, with the joint angle change values Δx_i as the label of that piece of data. Repeat the above, traversing all motion target values in the range S, construct the rocker operation control data set from all the data obtained, and divide the data into a training set and a verification set according to a certain proportion (such as 7:3).
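The collection and splitting procedure of step 1) can be sketched as follows. This is a minimal illustration only: the `Sample` record, the four hardware callbacks (`read_joints`, `move_joints`, `read_image`, `read_target`), the uniform perturbation and its bound `max_delta` are all hypothetical names and assumptions, not part of the described method.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One piece of data: inputs (g, I, x_i) and its label (the changes)."""
    target: Tuple[float, float]   # g = (g1, g2), read back from the rocker
    image: List[List[int]]        # reduced tactile image I
    joint_angles: List[float]     # x_i before the perturbation
    delta: List[float]            # label: the applied changes of x_i


def collect_sample(read_joints, move_joints, read_image, read_target,
                   max_delta: float, rng: random.Random) -> Sample:
    """Apply one random perturbation and record the resulting data."""
    x = list(read_joints())
    # Assumption: perturbations are drawn uniformly from [-max_delta, max_delta].
    dx = [rng.uniform(-max_delta, max_delta) for _ in x]
    move_joints([xi + di for xi, di in zip(x, dx)])
    return Sample(read_target(), read_image(), x, dx)


def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle and split into (training set, verification set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]
```

With 5000 records and the 7:3 proportion mentioned as an example, `split_dataset` would return 3500 training records and 1500 verification records.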

In this embodiment the rocker is applied to a Snake game; the rocker has four movement directions (up, down, left and right), and the motion target value range is S = [[-190, 190], [-190, 190]]. The tactile image collected by the tactile sensor is 480 × 480 pixels, the reduced tactile image I is 100 × 100 pixels, and the rocker operation control data set contains 5000 pieces of data in total.
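The reduction from 480 × 480 to 100 × 100 pixels could be done in many ways; the text does not name one. The sketch below assumes nearest-neighbour sampling on a single-channel image stored as a list of rows, and the function name `downscale_nearest` is hypothetical.

```python
def downscale_nearest(image, out_h=100, out_w=100):
    """Reduce a 2-D image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    # Pick, for each output pixel, the source pixel nearest to its centre.
    rows = [min(in_h - 1, int((r + 0.5) * in_h / out_h)) for r in range(out_h)]
    cols = [min(in_w - 1, int((c + 0.5) * in_w / out_w)) for c in range(out_w)]
    return [[image[r][c] for c in cols] for r in rows]


# Synthetic 480 x 480 image standing in for a raw tactile image.
raw = [[(r + c) % 256 for c in range(480)] for r in range(480)]
small = downscale_nearest(raw)
print(len(small), len(small[0]))  # prints: 100 100
```

A smaller input image reduces the computation in the feature extraction network, which is the stated motivation for the reduction.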

2) Building touch-based rocker operation control model

Connect the tactile feature extraction network with the multi-mode fusion network to form the touch-based rocker operation control model. In this embodiment, the tactile feature extraction network is ResNet-16, and the dimension of the extracted tactile image feature j is m = 512. The multi-mode fusion network is formed by sequentially connecting 4 hidden layers L1 to L4 and 1 fully connected layer (the numbers of hidden layers and fully connected layers and the dimensions of the extracted features can be set according to actual needs; here the feature dimension is halved from one layer to the next, so that it does not shrink too quickly and lose useful information). The tactile image feature j, the rocker motion target value belonging to the same label as the tactile image I input to the tactile feature extraction network, and the joint angles x_i of the mechanical arm are combined into a vector of dimension N = m + n + 2 = 520, which is taken as the input of the multi-mode fusion network. The first hidden layer L1 produces a 256-dimensional first vector; the second hidden layer L2 produces a 128-dimensional second vector from the first vector; the third hidden layer L3 produces a 64-dimensional third vector from the second vector; the fourth hidden layer L4 produces a 32-dimensional fourth vector from the third vector; and the fully connected layer maps the fourth vector to a 6-dimensional vector (y_1, y_2, …, y_i, …, y_n), where y_i is the change value of the i-th joint angle of the mechanical arm.
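The layer widths of the multi-mode fusion network can be checked with a small forward pass. The sketch below is illustrative only: it uses randomly initialised weights, assumes a ReLU activation on the hidden layers (the text does not specify one), and replaces the ResNet feature extractor with a placeholder 512-dimensional feature vector. The names `LAYER_SIZES`, `init_layers` and `forward` are hypothetical.

```python
import random

# Input (m + 2 + n = 512 + 2 + 6), hidden layers L1..L4, output (n = 6).
LAYER_SIZES = [520, 256, 128, 64, 32, 6]


def init_layers(sizes, rng):
    """Random weights and zero biases for each consecutive pair of widths."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (1.0 / n_in) ** 0.5
        w = [[rng.uniform(-scale, scale) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((w, [0.0] * n_out))
    return layers


def forward(layers, feature, target, joints):
    """Concatenate feature j, target g and joint angles, then apply the layers."""
    v = list(feature) + list(target) + list(joints)
    for k, (w, b) in enumerate(layers):
        v = [sum(wi * vi for wi, vi in zip(row, v)) + bi
             for row, bi in zip(w, b)]
        if k < len(layers) - 1:
            v = [max(0.0, a) for a in v]  # assumed ReLU on hidden layers only
    return v  # predicted change of each joint angle


layers = init_layers(LAYER_SIZES, random.Random(0))
y = forward(layers, [0.1] * 512, (0.0, 100.0), [0.0] * 6)
print(len(y))  # prints: 6
```

In practice such a network would be built with a deep learning framework so that it can be trained by back-propagation; the plain-Python version here only demonstrates how the 520-dimensional input is reduced to the 6 joint-angle changes.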

3) Training and verifying constructed rocker operation control model

The tactile image I, the rocker motion target value g and the joint angles x_i of the mechanical arm belonging to the same label in the training set obtained in step 1) are input into the rocker operation control model constructed in step 2), and the model is then trained by the chain rule and the back-propagation algorithm (training essentially fits the model output y_i to the label Δx_i of the corresponding input data), with a mean square error loss function constraining the training process. The training effect of the model is evaluated on the test set to assist in adjusting the parameters of the network model, yielding the trained rocker operation control model. Specifically, the rocker motion target value, the tactile image and the current joint angles of the mechanical arm in the test set are taken as the input of the rocker operation control model, which is run to obtain the output y = (y₁, y₂, y₃, y₄, y₅, y₆); the obtained y is compared with the label Δx = (Δx₁, Δx₂, Δx₃, Δx₄, Δx₅, Δx₆) of the corresponding input data. When the average error between the two is within 0.2 mm, the trained rocker operation control model is considered to meet the set precision requirement.
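The loss and the acceptance check can be sketched as below. The mean square error follows its standard definition. The exact acceptance formula is not reproduced in the text, so the mean absolute error used in `meets_precision` is an assumed reading of "average error"; all function names are hypothetical.

```python
def mse(y, dx):
    """Mean square error between one prediction y and its label."""
    return sum((yi - di) ** 2 for yi, di in zip(y, dx)) / len(y)


def mean_abs_error(predictions, labels):
    """Average absolute difference over every joint of every sample."""
    total, count = 0.0, 0
    for y, dx in zip(predictions, labels):
        for yi, di in zip(y, dx):
            total += abs(yi - di)
            count += 1
    return total / count


def meets_precision(predictions, labels, tolerance=0.2):
    """True when the average error is within the stated tolerance."""
    return mean_abs_error(predictions, labels) <= tolerance


preds = [[0.1, 0.0, -0.2, 0.3, 0.0, 0.1]]
labels = [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
print(meets_precision(preds, labels))  # prints: True
```

Here the average absolute error is 0.7 / 6, roughly 0.117, which is below the 0.2 tolerance.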

4) The high-level decision network uses the trained rocker operation control model to perform real-time closed-loop control of the rocker operation according to the application scenario

The high-level decision network learns the corresponding high-level actions by deep reinforcement learning according to the specific high-level task. The high-level actions continuously interact with the environment of the high-level task being executed; each high-level action yields a corresponding reward after interacting with the high-level environment, and the high-level decision network is responsible for finding the strategy that maximizes the accumulated reward R. Specifically, the application scenario here is a real-time Snake game controlled by rocker operation: the input of the high-level decision network is the game picture, its output is an action in one of four directions on the game plane, the game score serves as its reward value, and its purpose is to maximize the score of the whole game. The four actions in the game are mapped to positive and negative motion target values of the rocker (for example, up, down, left and right correspond to g = (0, 100), (0, -100), (100, 0) and (-100, 0) respectively). The current motion target value obtained by the high-level decision network, the collected current tactile image and the current joint angles of the mechanical arm are taken as the input of the trained rocker operation control model to obtain the change value of each joint angle of the mechanical arm; each joint angle is then driven to x_i + Δx_i, so that the mechanical arm drives the rocker to the corresponding motion target value. The motion target value is linearly mapped to an action in one of the four directions on the game plane, i.e. a high-level action, which interacts with the high-level environment, forming closed-loop control. This process is repeated until the high-level task is finished.
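The closed loop described above can be sketched as follows. Everything besides the four example target values is an assumption made for illustration: the `policy`, `model`, `arm`, `sensor`, `rocker` and `game` objects and their methods are hypothetical interfaces, and the nearest-target lookup in `target_to_action` stands in for the linear mapping mentioned in the text.

```python
ACTION_TO_TARGET = {  # example values of g given in the text
    "up": (0, 100),
    "down": (0, -100),
    "left": (100, 0),
    "right": (-100, 0),
}


def target_to_action(target):
    """Map a rocker reading to the nearest of the four game actions."""
    def dist(action):
        g1, g2 = ACTION_TO_TARGET[action]
        return (g1 - target[0]) ** 2 + (g2 - target[1]) ** 2
    return min(ACTION_TO_TARGET, key=dist)


def control_loop(policy, model, arm, sensor, rocker, game, max_steps=1000):
    """Run the closed loop until the game ends or max_steps is reached."""
    frame, done, steps = game.reset(), False, 0
    while not done and steps < max_steps:
        goal = ACTION_TO_TARGET[policy(frame)]    # high-level decision
        joints = arm.read_joints()
        deltas = model(goal, sensor.read_image(), joints)  # low-level model
        arm.move_to([x + d for x, d in zip(joints, deltas)])
        action = target_to_action(rocker.read_target())    # back to a game action
        frame, reward, done = game.step(action)
        steps += 1
    return steps
```

Reading the rocker's actual target value back before stepping the game is what closes the loop: the game reacts to where the rocker really ended up, not to where the high-level decision network intended it to go.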

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent structures or equivalent flow transformations that are made by using the contents of the specification and the drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A rocker operation control method based on touch information is characterized in that a touch sensor in direct contact with a rocker is arranged at the tail end of the mechanical arm and used for acquiring a touch image of the rocker; the touch sensor comprises a shell, a camera and a lighting lamp which are arranged in the inner cavity of the shell, an elastomer which is positioned on the surface of the shell and is in contact with the rocker, and a colored coating which is attached to the surface of the elastomer; the camera is positioned on one side of the elastic body and has a certain gap with the elastic body so as to meet the imaging requirement, and is used for capturing a tactile image of the surface texture change of the elastic body; the illuminating lamps are distributed around the camera; the elastic body is made of transparent elastic material; the colored coating is directly contacted with the rocker and is used for preventing a disordered background from penetrating through the elastomer to generate interference on a tactile image;

the rocker operation control method includes the steps of:

1) building a rocker operation control dataset

Determining a motion target value of the rocker, namely an output value range S of the rocker according to an application scene of the rocker; the rocker is driven by the mechanical arm so that the rocker is at any motion target value g' = (g1', g2') within the motion target value range, g1' and g2' being respectively the up-down motion target value and the left-right motion target value of the rocker, g1' ∈ S, g2' ∈ S, and each joint angle x_i of the current mechanical arm is read, i = 1, 2, …, n, n being the total number of joint angles of the mechanical arm; respectively applying a random variation value Δx_i to each joint angle of the mechanical arm; when each joint angle of the mechanical arm reaches x_i + Δx_i, acquiring a touch image at the moment through the touch sensor, recording the motion target value g = (g1, g2) of the rocker at the moment, g1 ∈ S, g2 ∈ S, and reducing the acquired touch image to obtain a touch image I; the motion target g of the rocker, the tactile image I, each joint angle x_i of the mechanical arm and its variation value Δx_i are taken as a piece of data, with the joint angle variation value Δx_i as the label of the piece of data; repeating the steps, traversing all the motion target values in the motion target value range S of the rocker, constructing a rocker operation control data set by using all the obtained data, and dividing each piece of data in the data set into a training set and a verification set according to a certain proportion;

2) building touch-based rocker operation control model

Connecting the tactile feature extraction network with the multi-mode fusion network to serve as a joystick operation control model based on the tactile sense; wherein the haptic feature extraction network is a convolutional neural network, taking the tactile image I collected in the step 1) as input, and taking the extracted tactile image feature j with the dimension m as output; the multi-mode fusion network is formed by sequentially connecting a plurality of hidden layers and 1 full-connection layer; the tactile image feature j, the rocker motion target value belonging to the same label as the tactile image I input to the tactile feature extraction network, and each joint angle x_i of the mechanical arm are combined into a vector with the dimension N = m + n + 2, and the vector is taken as the input of the multi-mode fusion network to obtain an n-dimensional vector (y_1, y_2, …, y_i, …, y_n), y_i being the variation value of the i-th joint angle of the mechanical arm;

3) training and verifying constructed rocker operation control model

The tactile image I belonging to the same label in the training set obtained in the step 1), the rocker motion target value g and each joint angle x of the mechanical arm_iRespectively inputting the parameters into the rocker operation control model constructed in the step 2), and then training the rocker operation control model through a chain rule and a back propagation algorithm, wherein the loss function adopts a mean square error function to constrain the training process of the rocker operation control model; judging the training effect of the rocker operation control model by using the test set to assist in adjusting the parameters of the network model so as to obtain the trained rocker operation control model;

2. The joystick operation control method according to claim 1, wherein in step 2, the tactile feature extraction network employs ResNet-16; the multi-mode fusion network is formed by sequentially connecting 4 hidden layers L1-L4 and 1 full connection layer, and the dimensionality of each hidden layer is reduced by half in sequence.