CN114723827A - Grabbing robot target positioning system based on deep learning - Google Patents


Info

Publication number
CN114723827A
CN114723827A
Authority
CN
China
Prior art keywords
detection
image
positioning system
data processing
processing module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210460922.5A
Other languages
Chinese (zh)
Inventor
李双全
邓世航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN202210460922.5A
Publication of CN114723827A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Abstract

The invention discloses a grabbing robot target positioning system based on deep learning, and relates to the technical field of robots. The image acquisition module is connected with the data processing module through a USB 3.0 data line, and the data processing module is connected with the control module through the TCP protocol. The target detection algorithm adopted is the Yolo algorithm: in the Yolov5 network, the image enters prediction after passing through the Backbone, at which point the network performs upsampling and the image is decomposed in turn into 13 × 13, 26 × 26 and 52 × 52 grids, with three detection heads performing detection at three scales, namely large, medium and small. For the detection of small objects, an upsampling layer is added, anchor points are set, and a detection head is added. Through this improved scheme the invention detects smaller articles, facilitates rapid measurement and improves accuracy, realizing a grabbing robot target positioning system with higher precision and a certain robustness.

Description

Grabbing robot target positioning system based on deep learning
Technical Field
The invention belongs to the technical field of robots, and particularly relates to a grabbing robot target positioning system based on deep learning.
Background
Three-dimensional measurement technology is the process of measuring the three-dimensional coordinates of points on the surface of an object in a specified coordinate system, storing the data on a computer and visualizing the result. Three-dimensional measurement techniques have developed rapidly in recent years and are gradually being commercialized. In the industrial field, the machining and assembly of product components often require precise measurement and modeling of the components. For example, in the aircraft manufacturing industry, inspecting product quality with traditional two-dimensional drawings and simulation means can no longer meet the manufacturing requirements of new aircraft.
The existing target detection method is shown in fig. 1. Existing three-dimensional measurement techniques can be broadly classified into contact measurement and non-contact measurement, according to whether contact is made with the measured object. Non-contact measurement can be further divided by measurement principle into acoustic, electromagnetic and optical measurement. The most common acoustic three-dimensional measurement technique uses ultrasonic waves. Electromagnetic measurement is divided by the wave band adopted, mainly comprising CT, microwave radar and other methods.
Fig. 2 shows a three-dimensional measurement method. Among all three-dimensional measurement methods, optical three-dimensional measurement has great advantages in speed and precision. Unlike lidar, ultrasonic and infrared sensors, an optical camera has low hardware cost, a low barrier to use and a small size, and is easy to combine with other sensors. More importantly, as computers become ever more capable of processing massive amounts of information, and as the information requirements of academia and industry rise, the camera can capture more information at lower cost, so it stands out among the various sensors as the most cost-effective choice.
The existing target detection algorithms and three-dimensional measurement methods suffer from inaccurate detection, imperfect hardware, low working efficiency and poor stability.
Disclosure of Invention
To solve the problems of the background art, the invention aims to provide a grabbing robot target positioning system based on deep learning.
The invention relates to a grabbing robot target positioning system based on deep learning, which comprises an image acquisition module for acquiring images, a data processing module and a control module. The image acquisition module is connected with the data processing module through a USB 3.0 data line, and the data processing module is connected with the control module through the TCP protocol. The target detection algorithm adopted is the Yolo algorithm: in the Yolov5 network, the image enters prediction after passing through the Backbone, at which point the network performs upsampling and the image is decomposed in turn into 13 × 13, 26 × 26 and 52 × 52 grids, with three detection heads performing detection at three scales, namely large, medium and small. For the detection of small objects, an upsampling layer is added, anchor points are set, and a detection head is added to divide the image into a 104 × 104 grid and detect smaller objects. Meanwhile, a monocular vision estimation method is adopted to assist the three-dimensional measurement.
Compared with the prior art, the invention has the following beneficial effects:
Firstly, the improved scheme detects smaller articles while facilitating rapid measurement and improving accuracy.
Secondly, a grabbing robot target positioning system with higher precision and a certain robustness is realized.
Drawings
For ease of illustration, the invention is described in detail by the following detailed description and the accompanying drawings.
FIG. 1 is a flow chart of a method for detecting an object in the background art;
FIG. 2 is a flow chart of a three-dimensional measurement method in the background art;
fig. 3 is a schematic structural diagram of the present invention.
Detailed Description
In order that the objects, aspects and advantages of the invention will become more apparent, the invention is described below by way of example, with reference to the accompanying drawings. It is to be understood that such description is merely illustrative and not intended to limit the scope of the present invention. The structures, proportions and dimensions shown in the drawings and described in the specification serve only to aid understanding and reading of the disclosure; they do not limit the scope of the invention, which is defined by the claims. Any modification of structure, change of proportion or adjustment of dimensions that does not affect the efficacy or attainment of the invention falls within the scope of the disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present invention.
It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the scheme according to the present invention are shown in the drawings, and other details not so relevant to the present invention are omitted.
The specific implementation mode adopts the following technical scheme:
Firstly, the target positioning system is a system combining a target detection method with a three-dimensional measurement method, and is designed according to the following scheme.
1. A preliminary target positioning system is designed from the existing target detection and three-dimensional measurement methods, as shown in fig. 3.
2. After the preliminary system is completed, it is revised to address the problems of conventional target detection and three-dimensional measurement algorithms, improving the performance of the grabbing robot and giving the whole system a certain robustness.
3. After the system is designed, a subsequent verification is carried out to evaluate its performance, and the remaining problems are further improved.
Secondly, researching a target detection algorithm:
the target detection algorithm at the present stage is mainly based on deep learning, and can be basically divided into a one-stage target detection algorithm and a two-stage target detection algorithm, wherein the accuracy of the two-stage target detection algorithm is generally stronger than that of the one-stage target detection algorithm. However, the real-time performance of the two-stage algorithm is inferior to that of the one-stage algorithm, and the working efficiency of the grabbing robot is inevitably influenced if the real-time performance of the target detection algorithm is insufficient in consideration of the working efficiency of the grabbing robot.
2.1, a Yolo target detection algorithm:
Yolo is a very representative single-stage target detection algorithm and has been developed to its 5th generation. From the 3rd generation onward its backbone network adopts CSPDarknet53, a network framework developed from the residual network. It introduces the 1 × 1 convolution kernel; using it together with the conventional 3 × 3 kernel reduces the parameter count by nearly half compared with using 3 × 3 kernels alone, which gives the Yolo network extremely high speed. At the same time, the Yolo network has high accuracy, making it suitable as the neural network of a grabbing robot vision system.
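The parameter saving from mixing 1 × 1 kernels with 3 × 3 kernels can be checked with a little arithmetic. A minimal sketch; the channel counts (256 reduced to 128) are illustrative assumptions, not values from the patent:

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution layer (bias ignored)."""
    return c_in * c_out * k * k

# A plain 3x3 convolution keeping 256 channels:
plain = conv_params(256, 256, 3)                                  # 589,824 weights

# Bottleneck: a 1x1 kernel halves the channels, then a 3x3 restores them:
bottleneck = conv_params(256, 128, 1) + conv_params(128, 256, 3)  # 327,680 weights

print(plain, bottleneck, round(bottleneck / plain, 2))  # ratio ~0.56, i.e. nearly half
```

The exact ratio depends on the chosen channel reduction, but any 1 × 1 bottleneck of this shape cuts the weight count close to half.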
On one hand, the Yolo network detects small objects poorly. On the other hand, the Yolo network sets anchor points in advance and then regresses the anchor offset and the width and height of the object through the neural network to determine the identification frame. However, a frame identified in this way only gives the approximate range of the object; it cannot tell which points inside the frame lie on the object. Therefore, for an object with a large length-width ratio, a tilted pose leaves a large amount of background inside the frame, and the center of the frame is not necessarily on the object. This hampers target positioning.
2.2, an improved scheme:
the Yolov5 network image enters into prediction after passing through the backhaul, at this time, the network will perform up-sampling, and the image is decomposed into 13 × 13, 26 × 26, 52 × 52 parts in turn, and three detection heads respectively perform detection of three sizes, namely large, medium and small. Therefore, for the detection of small objects, an upper sampling layer can be added, an anchor point is arranged, a detection probe is added, the image is divided into 104 x 104, and the detection of the smaller objects is carried out.
To address the problem raised above, namely that the Yolo detection box cannot determine which points lie on the object, a rotation angle is added to the original data set annotations. The network is modified so that, after upsampling, the output derived from the backbone changes from n + 1 + 4 (n: the number of object classes; 1: whether an object exists at the position; 4: xywh, where xy is the correction value of the anchor point and wh are the width and height of the object) to n + 1 + 4 + 1, adding a regression term for the rotation angle so that the detection frame rotates with the object.
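The widening of the output vector can be made concrete. A sketch, assuming the usual three anchors per grid cell (the anchor count is not stated in the text):

```python
def head_channels(n_classes, with_angle=False, n_anchors=3):
    """Output channels per grid cell: for each anchor, n class scores,
    1 objectness term, 4 box terms (xywh) and optionally 1 angle term."""
    per_anchor = n_classes + 1 + 4 + (1 if with_angle else 0)
    return n_anchors * per_anchor

print(head_channels(80))                   # 3 * (80 + 1 + 4)     = 255
print(head_channels(80, with_angle=True))  # 3 * (80 + 1 + 4 + 1) = 258
```

Only the last dimension of each head's output tensor changes, so the modification leaves the rest of the network architecture untouched.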
In addition, a super-resolution reconstruction technique based on multi-frame image fusion can be employed; it is invoked when the vision system cannot detect an object, or detects it poorly, to assist image detection.
Thirdly, the robustness problem of the system:
the key of three-dimensional measurement is the acquisition of depth information, and when the system cannot acquire the depth information or acquires wrong depth information, the measured three-dimensional information is wrong. Therefore, it is necessary to add an auxiliary algorithm to improve the robustness of the system. The subject aims to adopt a monocular vision estimation method to assist three-dimensional measurement, and the purpose of improving the robustness of the system is achieved.
Fourthly, research on three-dimensional measurement:
Among traditional three-dimensional measurement methods, binocular stereo vision performs poorly in environments lacking texture, and mismatching can occur in environments with periodic texture. Structured light is very sensitive to ambient light, falls short of binocular stereo vision in resolution, and loses accuracy as distance increases. To solve these problems, binocular stereo IR depth ranging is studied. Such a camera has two IR depth sensors, so speckle structured light can assist the depth measurement of parallel binocular stereo vision: the speckle pattern marks the full field of view, making each light spot unique within it. There are therefore no unfindable feature points, nor mismatches caused by large amounts of repeated texture, which greatly improves matching and substantially increases the quality of the acquired Z-axis depth information. Moreover, the baseline of the camera can be made very small, only 50 mm. This three-dimensional measurement method is suitable as the three-dimensional measurement scheme of the grabbing robot.
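The distance-dependent accuracy described above follows from the parallel-stereo triangulation relation Z = f · B / d. A sketch using the 50 mm baseline stated in the text; the 600-pixel focal length is an assumed value for illustration only:

```python
def depth_from_disparity(baseline_m, focal_px, disparity_px):
    """Parallel binocular stereo triangulation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

B = 0.050   # 50 mm baseline, as stated in the text
f = 600.0   # assumed focal length in pixels

for d in (60.0, 30.0, 15.0):
    z = depth_from_disparity(B, f, d)
    print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

Halving the disparity doubles the estimated depth, so a fixed disparity-matching error translates into a rapidly growing depth error at range; the speckle pattern mitigates this by making the disparity matches themselves more reliable.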
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (1)

1. A grabbing robot target positioning system based on deep learning, characterized in that: the system comprises an image acquisition module, a data processing module and a control module; the image acquisition module is connected with the data processing module through a USB 3.0 data line, and the data processing module is connected with the control module through the TCP protocol; the target detection algorithm adopted is the Yolo algorithm: in the Yolov5 network, the image enters prediction after passing through the Backbone, at which point the network performs upsampling and the image is decomposed in turn into 13 × 13, 26 × 26 and 52 × 52 grids, with three detection heads performing detection at three scales, namely large, medium and small; for the detection of small objects, an upsampling layer is added, anchor points are set, and a detection head is added to divide the image into a 104 × 104 grid and detect smaller objects; meanwhile, a monocular vision estimation method is adopted to assist the three-dimensional measurement.
CN202210460922.5A 2022-04-28 2022-04-28 Grabbing robot target positioning system based on deep learning Pending CN114723827A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210460922.5A CN114723827A (en) 2022-04-28 2022-04-28 Grabbing robot target positioning system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210460922.5A CN114723827A (en) 2022-04-28 2022-04-28 Grabbing robot target positioning system based on deep learning

Publications (1)

Publication Number Publication Date
CN114723827A true CN114723827A (en) 2022-07-08

Family

ID=82246106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210460922.5A Pending CN114723827A (en) 2022-04-28 2022-04-28 Grabbing robot target positioning system based on deep learning

Country Status (1)

Country Link
CN (1) CN114723827A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111080693A (en) * 2019-11-22 2020-04-28 天津大学 Robot autonomous classification grabbing method based on YOLOv3
CN113436239A (en) * 2021-05-18 2021-09-24 中国地质大学(武汉) Monocular image three-dimensional target detection method based on depth information estimation
KR20220004607A (en) * 2020-12-25 2022-01-11 아폴로 인텔리전트 커넥티비티 (베이징) 테크놀로지 씨오., 엘티디. Target detection method, electronic device, roadside device and cloud control platform
CN114255443A (en) * 2021-12-10 2022-03-29 深圳市旗扬特种装备技术工程有限公司 Monocular positioning method, device, equipment and storage medium for traffic vehicle


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
S.ZHENKAI: "Improving yolov3 with a 4th feature detection layer" ("yolov3改进4层特征检测层"), pages 1 - 4, Retrieved from the Internet <URL:https://blog.csdn.net/weixin_44076342/article/details/106547312> *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination