CN117292199A - Segment bolt identification and positioning method for lightweight YOLOV7 - Google Patents

Segment bolt identification and positioning method for lightweight YOLOV7

Info

Publication number
CN117292199A
Authority
CN
China
Prior art keywords
lightweight
yolov7
bolt
segment
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311308300.1A
Other languages
Chinese (zh)
Inventor
肖艳秋
贺振东
王一鸣
刘洁
莫海川
王鹏鹏
崔光珍
孙春亚
黄荣杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Light Industry
Original Assignee
Zhengzhou University of Light Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University of Light Industry filed Critical Zhengzhou University of Light Industry
Priority to CN202311308300.1A priority Critical patent/CN117292199A/en
Publication of CN117292199A publication Critical patent/CN117292199A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/0002: Inspection of images, e.g. flaw detection
    • G06T7/0004: Industrial image inspection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/60: Analysis of geometric attributes
    • G06T7/66: Analysis of geometric attributes of image moments or centre of gravity
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30108: Industrial image inspection
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing


Abstract

The invention discloses a segment bolt identification and positioning method based on lightweight YOLOV7, which comprises the steps of constructing a lightweight YOLOV7 model and carrying out target identification and detection on the inner-groove bolts of segments to be spliced; the lightweight YOLOV7 model is adopted as the basis, the backbone network is replaced with MobileNetV3, and a mixed attention mechanism is introduced; ORB feature matching is combined to generate measurement target points and find the pixel position of the bolt on the target duct piece; an industrial camera is then used to acquire the position of the target bolt, and the position coordinates of the bolt are obtained from the relation between the world coordinate system and the calibrated camera coordinates. According to the invention, replacing the backbone network with MobileNetV3 increases the detection speed of the network; introducing the mixed attention mechanism increases the attention paid to the target bolt; and YOLOV7 adds a new feature fusion layer on a shallow basis to preserve feature information to the maximum extent.

Description

Segment bolt identification and positioning method for lightweight YOLOV7
Technical Field
The invention relates to automatic segment assembly technology, in particular to a segment bolt identification and positioning method based on a lightweight YOLOV7.
Background
With the continuous development of industrial automation, segment assembly has become an important link in infrastructure construction and is essential for improving production efficiency and quality. At present, it plays a key role in the construction of urban highways, railways and bridges. In the segment assembling process, identifying and positioning the segment bolts is a key task; accurate identification and positioning of the segment bolts helps workers perform the assembly work quickly and accurately, improving installation efficiency and quality. However, the conventional manual identification method not only requires a great deal of manpower and is time-consuming and labor-intensive, but is also susceptible to human error.
In recent years, rapid developments in computer vision and deep learning have provided new solutions for automated segment bolt identification and positioning. Among them, object detection technology is widely applied to object recognition and positioning tasks in industrial scenes. In particular, deep-learning-based target detection methods, such as the R-CNN and YOLO (You Only Look Once) series of models, have become a hot spot of research and application owing to their fast detection speed and high accuracy.
Accurate positioning of the duct piece bolts is key to realizing automatic duct piece assembly. Wada proposed a method that combines the six degrees of freedom of rotation, heave, slip, pitch, yaw and roll with a laser sensor to improve the efficiency of measuring the segment. Wang et al. designed a novel hydraulic system based on electro-hydraulic proportional control technology, established kinematic and dynamic models, determined the displacement of each actuator, and finally calculated the position of the segment. This method demands high precision from the control algorithm and sensors, has weak anti-interference capability, and is not suitable for positioning segment bolts against a complex background. Zhang et al. established a mathematical model of the moving object, performed real-time positioning detection of the object throughout the whole process, and fused machine vision with position information to feed back the object's position in real time. Such target position detection methods can be used for positioning the segment bolts.
With the continuous advancement of deep learning research in the field of vision, target recognition algorithms such as the YOLO series and Faster R-CNN have emerged for target object detection. From an object detected by deep learning, the pixel information of the object on the camera's imaging plane can be obtained. Du et al. proposed a real-time identification and detection method for rolling non-cooperative targets, combining the detected images with a PnP algorithm to obtain the 6D poses of the targets, selecting segmented images using depth information, and transmitting the relative positions of the capture points to a manipulator. Finally, physical experiments under different illumination conditions were completed on a six-degree-of-freedom air-floating platform using a rotating non-cooperative target. This approach can be applied to the identification of duct pieces and, combined with other sensor information, can measure the coordinates of the target object in a three-dimensional coordinate system to complete pose measurement.
Detection of the segment bolts is critical to the whole assembly process. The design of YOLOV7 is simple and effective: by converting the target detection task into a regression problem, it avoids the candidate-box generation and screening process of traditional two-stage detection methods and reduces the amount of computation and complexity. The method uses a single neural network to simultaneously predict the position and category of the target, and offers good interpretability and practicality. However, the current identification of bolts in a segment faces two key problems: 1) the background is complex and data sets are scarce; 2) model training is slow and accuracy is low.
Disclosure of Invention
In order to solve the defects in the prior art, the invention aims to provide a light-weight YOLOV7 segment bolt identification and positioning method.
In order to achieve the purpose of the invention, the technical scheme adopted by the invention is as follows:
a method for identifying and positioning a lightweight YOLOV7 segment bolt comprises the following steps:
(1) Constructing a lightweight YOLOV7 model, and carrying out target identification detection on bolts in the inner grooves of the segments to be spliced; adopting a lightweight YOLOV7 model as a basic target detector, replacing a backbone network with MobileNetV3, and introducing a mixed attention mechanism;
(2) Generating measurement target points by combining the improved lightweight YOLOV7 algorithm with ORB feature matching, and finding the pixel position of the bolt on the target duct piece; then acquiring the position of the target bolt with an industrial camera, and obtaining the accurate position coordinates of the bolt in the three-dimensional world by combining the acquired depth information and pixel information.
Further, in step (1), the improved lightweight YOLOV7 network structure includes an input layer, a backbone network, a neck network and a regression output layer, and the backbone network includes an improved MobileNetV3 module and an SPPCSPC module.
Further, the improved MobileNetV3 network comprises an inverted residual structure, a mixed attention module CBAM, an intermediate expansion module, an up-sampling module, a perceptron structure and a dilated (hole) convolution.
Further, the mixed attention module comprises a channel attention module and a spatial attention module, which carry out channel attention and spatial attention respectively; the channel information of the feature map is aggregated by two pooling operations, and the results are concatenated and convolved by a standard convolution layer to generate an attention map, with the two modules connected in series.
Further, the inverted residual structure includes a 1x1 convolution kernel for channel expansion, followed by a dw (depthwise) convolution kernel, and finally a 1x1 convolution kernel for channel reduction.
Further, in the improved multi-layer perceptron structure in the MobileNetV3 module, a 1×1 convolution layer is added after the 5×5 convolution layer in the bottleneck module to serve as the fully connected layer of the perceptron, an h-swish activation function is introduced to form a perceptron embedded in the deep network, and a dilated convolution is introduced into the MobileNetV3 module.
Further, in step (2), an industrial camera is mounted on the segment erector, and an industrial camera coordinate system O_A-X_AY_AZ_A is established, where O_A is the camera optical center; the internal and external parameters of the camera are calibrated, the parameter matrix of the camera is acquired, and the accurate coordinates of the segment bolt in the three-dimensional world are obtained according to the relationship between the world coordinate system and the calibrated camera coordinates.
Further, the internal parameter matrix k of the industrial camera has the form:

k = [ k_x   0    u_0
      0     k_y  v_0
      0     0    1  ]

where k_x and k_y are the scale factors of the camera along the X-axis and Y-axis, and (u_0, v_0) is the intersection point of the optical axis of the industrial camera with the imaging plane;

describing the relation between the bolt, with camera-frame coordinates (x_1, y_1, z_1), and its image point by the internal reference matrix, the pixel coordinates (u, v) satisfy:

z_1 · [u, v, 1]^T = k · [x_1, y_1, z_1]^T
further, the process of calibrating the parameters outside the camera is expressed by the following formula:
P=R*X+T
where P is the point coordinates in the camera coordinate system, R is the rotation matrix, X is the point coordinates in the world coordinate system, and T is the translation vector.
Compared with the prior art, the invention adopts a lightweight YOLOV7 model as the basic target detector and replaces the backbone network with MobileNetV3, which increases the detection speed of the network; a mixed attention mechanism is introduced, and by fusing spatial attention and channel attention the attention paid to the target bolt during detection is increased, improving the performance of the model in the segment bolt recognition and positioning task; and YOLOV7 adds a new feature fusion layer on a shallow basis to preserve feature information to the maximum extent.
Drawings
FIG. 1 is an improved lightweight YOLOV7 network;
FIG. 2 is a modified MobileNet V3 network;
fig. 3 is a mixed attention module CBAM.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples. The following examples are only for more clearly illustrating the technical solutions of the present invention and are not intended to limit the scope of protection of the present application.
The invention relates to a method for identifying and positioning a lightweight YOLOV7 segment bolt, which comprises the following steps:
(1) Constructing a lightweight YOLOV7 model, and carrying out target identification detection on bolts in the inner grooves of the segments to be spliced; the lightweight YOLOV7 model is adopted as a basic target detector, a backbone network is replaced by MobileNetV3, and a mixed attention mechanism is introduced to improve the performance of the model in segment bolt recognition and positioning tasks;
the improved lightweight YOLOV7 network overall network structure diagram is shown in fig. 1, firstly, the backbone network of YOLOV7 is replaced by mobilenet v3, the self-contained attention module SE of mobilenet v3 is replaced by a mixed attention module, and the attention degree of a target bolt is increased while the detection precision of the network is improved. Secondly, because the size of the target bolt is greatly changed in the assembling process, in order to enhance the characteristic fusion effect of the target, an additional pre-measuring head and the residual connection from a shallower backbone network are added so as to reserve the characteristic information to the maximum extent. Meanwhile, the low-level and high-resolution characteristic information is introduced into the characteristic fusion layer, so that the added prediction head is more sensitive to a long-distance segment bolt target. The four different scale features are more suitable for the scale change of the segment bolts.
The network structure of the improved YOLOV7 comprises four parts: an input layer (Input), a backbone network (Backbone), a neck network (Neck) and a regression output layer (Head). The improved backbone network includes a MobileNetV3 module and an SPPCSPC module.
The SPPCSPC module adds several parallel MaxPool operations within a series of convolutions, avoiding problems such as image distortion caused by image processing operations and preventing the convolutional neural network from extracting duplicate features from the picture. The SPP part extracts features from the same feature map at different scales, which helps improve detection precision, while the CSP part reduces the amount of computation and improves inference speed.
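For illustration, the following is a minimal PyTorch sketch of an SPPCSPC-style block of the kind described above: a CSP-style split in which one branch runs convolutions plus several parallel max-pools while the other acts as a shortcut. The kernel sizes (5, 9, 13), channel widths and layer names are assumptions for the sketch, not values taken from the patent.

```python
import torch
import torch.nn as nn

class Conv(nn.Module):
    """Conv-BN-SiLU block used throughout YOLOv7-style networks."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class SPPCSPC(nn.Module):
    """Spatial pyramid pooling with a CSP-style split: one branch runs
    convolutions plus parallel max-pools, the other is a shortcut."""
    def __init__(self, c_in, c_out, pool_sizes=(5, 9, 13)):
        super().__init__()
        c_ = c_out // 2
        self.cv1 = Conv(c_in, c_, 1)
        self.cv2 = Conv(c_in, c_, 1)           # shortcut (CSP) branch
        self.cv3 = Conv(c_, c_, 3)
        self.cv4 = Conv(c_, c_, 1)
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in pool_sizes)
        self.cv5 = Conv(c_ * (len(pool_sizes) + 1), c_, 1)
        self.cv6 = Conv(c_, c_, 3)
        self.cv7 = Conv(c_ * 2, c_out, 1)

    def forward(self, x):
        y = self.cv4(self.cv3(self.cv1(x)))
        y = torch.cat([y] + [p(y) for p in self.pools], dim=1)
        y = self.cv6(self.cv5(y))
        return self.cv7(torch.cat([y, self.cv2(x)], dim=1))
```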
The backbone network also comprises a Focus module: before the image enters the backbone network, features of different levels are extracted from the image through depthwise convolution, and the feature map is downsampled through the slicing operation in the Focus module so that the original image information is retained as much as possible.
The neck network adopts a structure of combining a feature pyramid network FPN and a path aggregation network PAN.
As shown in fig. 2, the modified MobileNetV3 network includes an inverted residual structure, a mixed attention module CBAM, an intermediate expansion module and an upsampling module. The MLP layer in the network model is a perceptron structure constructed on the basis of a 5×5 convolution; network layers can be added flexibly, and the fully connected layer consists of a 1×1 convolution. At the same time, dilated convolution is introduced into the last two bottleneck modules of MobileNetV3 to obtain receptive fields of different sizes and richer multi-scale information.
MobileNetV3 uses neural architecture search to automatically find an optimal network structure. By searching a large space of candidate network structures, an efficient structure suitable for mobile devices can be found, and the searched structure can be transferred to different tasks and data sets. MobileNetV3 introduces an inverted residual structure to construct depthwise separable convolution blocks, which increases the nonlinear transformation capability while keeping the model lightweight. The inverted residual structure includes a 1x1 convolution kernel for channel expansion, followed by a dw (depthwise) convolution kernel, and finally a 1x1 convolution kernel for channel reduction.
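As a concrete illustration of this structure, here is a minimal PyTorch sketch of an inverted residual block with an optional dilation, following the 1x1 expansion, depthwise convolution and 1x1 projection described above. The h-swish activation follows the description; the remaining defaults (kernel size, names) are assumptions for the sketch.

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV3-style inverted residual: 1x1 expansion -> depthwise conv
    -> 1x1 projection, with an optional dilation for the last bottlenecks."""
    def __init__(self, c_in, c_exp, c_out, k=3, stride=1, dilation=1):
        super().__init__()
        pad = dilation * (k - 1) // 2
        self.use_shortcut = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_exp, 1, bias=False),       # 1x1 channel expansion
            nn.BatchNorm2d(c_exp),
            nn.Hardswish(),
            nn.Conv2d(c_exp, c_exp, k, stride, pad,      # depthwise (dw) convolution
                      groups=c_exp, dilation=dilation, bias=False),
            nn.BatchNorm2d(c_exp),
            nn.Hardswish(),
            nn.Conv2d(c_exp, c_out, 1, bias=False),      # 1x1 channel reduction
            nn.BatchNorm2d(c_out),
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_shortcut else y
```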
The mixed attention module CBAM, as shown in fig. 3, comprises two independent sub-modules, a channel attention module (Channel Attention Module, CAM) and a spatial attention module (Spatial Attention Module, SAM), which perform channel attention and spatial attention respectively. The channel information of the feature map is aggregated by two pooling operations, and the results are concatenated and convolved by a standard convolution layer to generate an attention map, with the two modules connected in series.
Given an intermediate feature map F ∈ R^(C×H×W), CBAM derives a one-dimensional channel attention map M_c ∈ R^(C×1×1) and a two-dimensional spatial attention map M_s ∈ R^(1×H×W), which are applied in sequence:

F' = M_c(F) ⊗ F
F'' = M_s(F') ⊗ F'

where ⊗ denotes element-wise multiplication.
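A minimal PyTorch sketch of the CBAM module as defined by the two formulas above is given below; the reduction ratio of 16 and the 7x7 spatial-attention kernel are common defaults assumed here, not values stated in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Channel attention (CAM): global avg- and max-pooled descriptors pass
    through a shared MLP and are summed into a per-channel weight M_c."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        return torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    """Spatial attention (SAM): channel-wise avg and max maps are concatenated
    and convolved into a single-channel spatial weight map M_s."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """CAM and SAM applied in series: F' = Mc(F) * F, F'' = Ms(F') * F'."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        x = self.ca(x) * x
        return self.sa(x) * x
```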
(2) Generating measurement target points by combining the improved lightweight YOLOV7 algorithm with ORB feature matching, and finding the pixel position of the bolt on the target duct piece; then acquiring the position of the target bolt with an industrial camera, and obtaining the accurate position coordinates of the bolt in the three-dimensional world by combining the acquired depth information and pixel information.
The improved ORB feature matching algorithm can perform stable feature matching with scale invariance and rotation invariance; a matching library is built from bolt pictures taken at different angles, so that duct piece bolts at different angles and distances can be matched. The improved ORB feature matching algorithm comprises the following steps:
1) Detecting characteristic points;
Feature points are extracted with the FAST-9 algorithm. For an image block of an intra-segment groove bolt, the moments of the block are defined as:

m_pq = Σ_(x,y) x^p · y^q · I(x, y)

The centroid of the image block is found from the moments:

C = ( m_10 / m_00 , m_01 / m_00 )

A direction vector OC is constructed from the center O of the image block to its centroid C, and the feature point orientation is defined as:

θ = atan2(m_01, m_10)
2) Computing the feature point descriptors;

This is realized with the BRIEF algorithm: an S×S neighborhood window is taken centered on the feature point, two points x and y are selected in the window, their pixel values p(x) and p(y) are compared, and the following assignment is made:

τ(p; x, y) = 1 if p(x) < p(y), and 0 otherwise
3) Matching the characteristic points;
and finding out the corresponding relation of the feature points between different images, measuring the distances of descriptors for all the feature points in the two images, and then sequencing to obtain the nearest one as a matching point.
The industrial camera is installed on the segment erector, and an industrial camera coordinate system O_A-X_AY_AZ_A is established, where O_A is the camera optical center. The internal and external parameters of the camera are calibrated with Zhang's calibration method to obtain the parameter matrices of the camera. By combining the acquired depth information and pixel information, the accurate coordinates of the segment bolt in the three-dimensional world can be obtained according to the relation between the world coordinate system and the calibrated camera coordinates, completing the automatic assembly of the segment.
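Zhang's calibration can be performed with OpenCV using a planar checkerboard; a minimal sketch is given below, in which the board dimensions, square size and image path pattern are assumptions for illustration.

```python
import glob
import cv2
import numpy as np

def calibrate_with_checkerboard(image_glob, board_size=(9, 6), square_mm=25.0):
    """Zhang-style calibration from several checkerboard views: collect
    object/image point pairs, then solve for the intrinsics and distortion."""
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_mm

    obj_points, img_points, shape = [], [], None
    for path in glob.glob(image_glob):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
            shape = gray.shape[::-1]

    # K holds k_x, k_y, u_0, v_0; rvecs/tvecs are the per-view extrinsics.
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, shape, None, None)
    return K, dist
```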
(2.1) Camera internal parameter calibration;
the camera imaging principle is aperture imaging, and an object in the three-dimensional world is mapped into an inverted real image on an imaging plane provided with a photosensitive element through a camera aperture. The real image converts the optical signal into an electric signal through a photosensitive element, and a digital image is obtained through conversion and amplification processing. Obtaining an industrial camera internal parameter matrix k, which is in the form of:
k = [ k_x   0    u_0
      0     k_y  v_0
      0     0    1  ]

where k_x and k_y are the scale factors of the camera along the X-axis and Y-axis, and (u_0, v_0) is the intersection point of the optical axis of the industrial camera with the imaging plane. For example, if the bolt has coordinates (x_1, y_1, z_1) in the camera coordinate system, then describing the relation between the bolt and its image point with the internal reference matrix, the pixel coordinates (u, v) satisfy:

z_1 · [u, v, 1]^T = k · [x_1, y_1, z_1]^T
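A short NumPy sketch of this intrinsic projection follows; the numbers used for k_x, k_y, u_0 and v_0 are purely illustrative, not calibration results from the patent.

```python
import numpy as np

def project_to_pixels(point_cam, K):
    """Project a bolt point given in the camera frame to pixel coordinates
    (u, v) with the intrinsic matrix K = [[k_x, 0, u_0], [0, k_y, v_0], [0, 0, 1]]."""
    x1, y1, z1 = point_cam
    u, v, w = K @ np.array([x1, y1, z1])
    return u / w, v / w

# Illustrative values only:
K = np.array([[1200.0,    0.0, 640.0],
              [   0.0, 1200.0, 512.0],
              [   0.0,    0.0,   1.0]])
u, v = project_to_pixels((300.0, 200.0, 700.0), K)
print(round(u, 1), round(v, 1))
```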
(2.2) Camera external parameter calibration
External parameter calibration of the camera refers to the process of determining the position and orientation of the camera in the world coordinate system. Through external parameter calibration, the camera coordinate system and the world coordinate system can be brought into correspondence, so that the position of an object in the world coordinate system can be determined from the image. The external parameters of the camera are typically represented by a rotation matrix (Rotation Matrix) and a translation vector (Translation Vector): the rotation matrix describes the rotational relationship between the camera coordinate system and the world coordinate system, and the translation vector describes the translational relationship between them.
The process of external parameter calibration can be expressed by the following formula:
P=R*X+T
where P is the point coordinates in the camera coordinate system, R is the rotation matrix, X is the point coordinates in the world coordinate system, and T is the translation vector. The rotation matrix R and translation vector T can be solved by the known point coordinates X in the world coordinate system and the corresponding point coordinates P in the camera coordinate system.
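The relation P = R*X + T can be sketched directly, and R and T can be recovered from known world-to-pixel correspondences. One common way to do this is OpenCV's PnP solver, shown below as an assumed illustration rather than the patent's own solving procedure.

```python
import cv2
import numpy as np

def world_to_camera(X_world, R, T):
    """P = R * X + T: map a world-frame point into the camera frame."""
    return R @ np.asarray(X_world, dtype=float) + np.asarray(T, dtype=float).ravel()

def estimate_extrinsics(world_points, pixel_points, K, dist=None):
    """Recover R and T from known world points and their pixel projections
    (at least four correspondences are needed)."""
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(world_points, dtype=np.float32),
        np.asarray(pixel_points, dtype=np.float32),
        K, dist)
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> rotation matrix
    return R, tvec
```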
(2.3) The positional relationship of the camera in the world coordinate system is described with the camera's internal and external parameter matrices as follows:

z_c · [u, v, 1]^T = M_1 · M_2 · [X_w, Y_w, Z_w, 1]^T

where R and t respectively describe the rotation of the world coordinate axes relative to the camera coordinate axes and the translation of the world coordinate origin relative to the camera optical center; M_1 is called the internal parameter matrix of the camera, and M_2 = [R t] is called the external parameter matrix of the camera.
In order to verify the effectiveness of the improved YOLOV7 algorithm, segment bolt images under different backgrounds were selected as the data set, with each image containing at least one segment inner-groove bolt. Because the segment assembly site environment is complex and the sample size is small, the data were expanded to 2000 samples using geometric transformations (flipping, rotation, cropping, deformation and scaling) and labeled one by one to create the data set.
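A minimal OpenCV sketch of the geometric augmentations mentioned above (flip, rotate, crop, deform/scale) is given here; the specific angle, scale and crop ranges are assumptions, and in practice the bounding-box labels must be transformed in the same way as the image.

```python
import random
import cv2

def augment_geometric(image):
    """Apply a random flip, rotation, scaling and crop to one bolt image."""
    h, w = image.shape[:2]
    out = image.copy()
    if random.random() < 0.5:                        # horizontal flip
        out = cv2.flip(out, 1)
    angle = random.uniform(-15, 15)                  # small rotation
    scale = random.uniform(0.8, 1.2)                 # scaling / deformation
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    out = cv2.warpAffine(out, M, (w, h), borderMode=cv2.BORDER_REFLECT)
    # random crop, then resize back to the original size
    ch, cw = int(h * 0.9), int(w * 0.9)
    y0 = random.randint(0, h - ch)
    x0 = random.randint(0, w - cw)
    return cv2.resize(out[y0:y0 + ch, x0:x0 + cw], (w, h))
```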
The training set is fed into the neural network with 4 samples per batch; the accuracy fluctuates greatly over the first 20 iterations, settles at about 0.8 as the loss decreases, and overfitting appears at 100 iterations. Training the neural network with the added mixed attention mechanism on the same data set produces the same overfitting, so the data are expanded to 2000 samples by data augmentation and the mixed attention mechanism is added for testing.
Although training is slightly slower after the attention mechanism is added, the training effect is better than the previous result. For the same segment image, when the neural network predicts a 'backup segment', the network's attention to the key region before and after adding the attention mechanism is observed. Without the attention mechanism, the neural network focuses not only on part of the "target segment" region but also on extraneous regions in some images when identifying the 'backup segment', so the attention mechanism is added to improve this problem.
The industrial camera is installed on the duct piece splicing machine (segment erector). The lifting mechanism of the splicing machine consists of left and right lifting hydraulic cylinders, and the industrial camera is fixed at the lower end of the right lifting hydraulic cylinder; the optical axis of the industrial camera is parallel to the axis of the lifting hydraulic cylinder, so that the camera can rotate and lift together with the rotating mechanism and the right lifting hydraulic cylinder. An industrial camera coordinate system O_A-X_AY_AZ_A is established, where O_A is the industrial camera optical center, the Y_A axis is parallel to the axis of the translation mechanism, the Z_A axis is parallel to the axis of the lifting mechanism, and the X_A axis follows from the left-hand rule; bolt information is then acquired at different positions and angles with the industrial camera.
The positions of the segment bolts are measured from different angles and distances, and the measurement results of the positions of the segment bolts are shown in a table 1.
TABLE 1 numerical table of actual errors of segment bolts
No. Actual coordinates (x, y, z)/mm Measured coordinates (x, y, z)/mm Per-axis error/mm
1 (-180,360,423) (-181.215,359.303,421.656) (1.215,0.697,1.344)
2 (300,200,700) (299.426,199.156,701.223) (0.574,0.844,1.223)
3 (400,326,532) (400.568,325.344,533.662) (0.568,0.656,1.662)
4 (500,486,320) (500.582,487.102,320.633) (0.582,1.102,0.633)
5 (-200,100,500) (-200.326,101.546,500.692) (0.326,1.546,0.692)
6 (350,230,450) (350.852,231.521,450.237) (0.852,1.521,0.273)
As can be seen from the above measurement results, the error of the measured values is no more than 3 mm on each of the three axes. Therefore, the measuring method can meet the requirements of segment grabbing.
Compared with the prior art, the lightweight YOLOV7 model is designed and optimized, improving the computational efficiency and real-time performance of the model; the backbone network is replaced with MobileNetV3, which increases the detection speed of the network; in addition, a mixed attention mechanism is introduced, and fusing spatial attention and channel attention enhances the model's ability to recognize and position the key segment bolts. The invention collects and constructs a segment bolt data set for model training and evaluation; the recognition accuracy of the improved YOLOV7 algorithm reaches 96%, the maximum error of the segment bolt position coordinates does not exceed 3 mm, and the industrial requirements are met.
While the applicant has described and illustrated the embodiments of the present invention in detail with reference to the drawings, it should be understood by those skilled in the art that the above embodiments are only preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not to limit the scope of the present invention, but any improvements or modifications based on the spirit of the present invention should fall within the scope of the present invention.

Claims (9)

1. The method for identifying and positioning the lightweight YOLOV7 duct piece bolt is characterized by comprising the following steps of:
(1) Constructing a lightweight YOLOV7 model, and carrying out target identification detection on bolts in the inner grooves of the segments to be spliced; adopting a lightweight YOLOV7 model as a basic target detector, replacing a backbone network with MobileNetV3, and introducing a mixed attention mechanism;
(2) Generating measurement target points by combining the improved lightweight YOLOV7 algorithm with ORB feature matching, and finding the pixel position of the bolt on the target duct piece; then acquiring the position of the target bolt with an industrial camera, and obtaining the accurate position coordinates of the bolt in the three-dimensional world by combining the acquired depth information and pixel information.
2. The lightweight YOLOV7 segment bolt identification and positioning method of claim 1, wherein in step (1) the improved lightweight YOLOV7 network structure comprises an input layer, a backbone network, a neck network and a regression output layer, the backbone network comprising an improved MobileNetV3 module and an SPPCSPC module.
3. The method for identifying and positioning the segment bolts of the lightweight YOLOV7 according to claim 2, wherein the improved MobileNetV3 network comprises an inverted residual structure, a mixed attention module CBAM, an intermediate expansion module, an up-sampling module, a perceptron structure and a dilated (hole) convolution.
4. The method for recognizing and positioning a segment bolt of a lightweight YOLOV7 according to claim 3, wherein the mixed attention module includes a channel attention module and a spatial attention module for channel attention and spatial attention respectively; the channel information of the feature map is aggregated by two pooling operations, and the results are concatenated and convolved by a standard convolution layer to generate an attention map, with the two modules connected in series.
5. A lightweight YOLOV7 segment bolt identification and location method according to claim 3 wherein the inverted residual structure comprises a 1x1 convolution kernel for channel number expansion followed by a dw convolution kernel and finally a 1x1 convolution kernel for channel reduction.
6. The method for recognizing and positioning the segment bolts of the lightweight YOLOV7 according to claim 3, wherein, in the improved multi-layer perceptron structure in the MobileNetV3 module, a 1×1 convolution layer is added after the 5×5 convolution layer in the bottleneck module to serve as the fully connected layer of the perceptron, an h-swish activation function is introduced to form a perceptron embedded in the deep network, and a dilated convolution is introduced into the MobileNetV3 module.
7. The method for recognizing and positioning a segment bolt of a lightweight YOLOV7 according to claim 1, wherein in step (2) an industrial camera is mounted on the segment erector and an industrial camera coordinate system O_A-X_AY_AZ_A is established, where O_A is the camera optical center; the internal and external parameters of the camera are calibrated, the parameter matrix of the camera is acquired, and the accurate coordinates of the segment bolt in the three-dimensional world are obtained according to the relationship between the world coordinate system and the calibrated camera coordinates.
8. The method for identifying and positioning the segment bolts of the lightweight YOLOV7 according to claim 7, wherein the internal parameter matrix k of the industrial camera has the form:

k = [ k_x   0    u_0
      0     k_y  v_0
      0     0    1  ]

where k_x and k_y are the scale factors of the camera along the X-axis and Y-axis, and (u_0, v_0) is the intersection point of the optical axis of the industrial camera with the imaging plane;

describing the relation between the bolt, with camera-frame coordinates (x_1, y_1, z_1), and its image point by the internal reference matrix, the pixel coordinates (u, v) satisfy:

z_1 · [u, v, 1]^T = k · [x_1, y_1, z_1]^T
9. The method for identifying and positioning the segment bolts of the lightweight YOLOV7 according to claim 7, wherein the process of calibrating the camera's external parameters is represented by the following formula:
P=R*X+T
where P is the point coordinates in the camera coordinate system, R is the rotation matrix, X is the point coordinates in the world coordinate system, and T is the translation vector.
CN202311308300.1A 2023-10-10 2023-10-10 Segment bolt identification and positioning method for lightweight YOLOV7 Pending CN117292199A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311308300.1A CN117292199A (en) 2023-10-10 2023-10-10 Segment bolt identification and positioning method for lightweight YOLOV7

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311308300.1A CN117292199A (en) 2023-10-10 2023-10-10 Segment bolt identification and positioning method for lightweight YOLOV7

Publications (1)

Publication Number Publication Date
CN117292199A true CN117292199A (en) 2023-12-26

Family

ID=89256942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311308300.1A Pending CN117292199A (en) 2023-10-10 2023-10-10 Segment bolt identification and positioning method for lightweight YOLOV7

Country Status (1)

Country Link
CN (1) CN117292199A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination