CN114972882A - Wear surface damage depth estimation method and system based on multiple attention mechanism - Google Patents
Wear surface damage depth estimation method and system based on multiple attention mechanism
- Publication number
- CN114972882A (application CN202210689847.XA)
- Authority
- CN
- China
- Prior art keywords
- wear surface
- depth
- damage
- relu
- estimation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V10/764—Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
- G06N3/045—Neural networks; architecture: combinations of networks
- G06N3/084—Neural networks; learning methods: backpropagation, e.g. using gradient descent
- G06V10/26—Image preprocessing: segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
- G06V10/774—Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
Abstract
The invention discloses a wear surface damage depth estimation method and system based on a multiple attention mechanism. A wear surface basic feature extraction layer is constructed by taking the four convolution blocks of a ResNet-50 coding layer as the backbone and combining them with two Conv-ReLU convolution blocks; a damaged-area segmentation branch network and a depth-information estimation branch network are fused to form a wear surface depth estimation model, whose loss function is obtained by weighting. Wear surface images with typical damaged areas are selected as training samples and, with the loss function as the optimization target, the model is trained by the adaptive moment estimation method; a single wear surface image input to the trained model yields a damaged-area segmentation result map and a depth information result map of the wear surface. The method and system effectively estimate three-dimensional depth information from a single wear surface image, addressing the high acquisition difficulty, low efficiency and high complexity of depth information in the technical field of wear surface analysis.
Description
Technical Field
The invention belongs to the technical field of wear surface analysis within machine fault diagnosis, and particularly relates to a wear surface damage depth estimation method and system based on a multiple attention mechanism.
Background
Wear of friction pairs reduces the operational reliability and stability of mechanical equipment and may even lead to serious failures. The wear surface is a direct product of the wear process, and the morphology of its damaged areas characterizes the wear evolution mechanism and the wear severity. Wear surface analysis is therefore considered the most direct and reliable means of condition monitoring and fault diagnosis for critical tribological systems. Driven by the "predict and prevent" maintenance philosophy, wear surface analysis is rapidly developing toward in-situ, three-dimensional techniques, with in-situ on-machine detection based on the industrial endoscope becoming the main technical means. However, the complicated shapes and inconsistent sizes of damage restrict the accuracy and efficiency of wear surface analysis, and acquiring three-dimensional morphology information from a single two-dimensional surface image remains a difficult research problem.
Machine-vision-based wear surface topography analysis extracts three-dimensional topography information from two-dimensional wear surface images. For example, taking wear surface images acquired by a scanning electron microscope as the research object, three-dimensional surface reconstruction has been realized with multi-view geometric constraints; fusing shape-from-shading with stereo vision enables estimation of wear surface depth information; complex-wavelet-enhanced shape-from-shading has been used to acquire the three-dimensional surface roughness of milled mechanical parts; and photometric stereo has been applied innovatively to extract three-dimensional topography information of in-situ wear surfaces. These methods, however, depend on auxiliary information such as multiple views, idealized assumptions, or multiple light sources, while the complex industrial environment of mechanical equipment and the endoscope imaging system make a multi-view vision system difficult to build and images difficult to acquire, limiting the applicability of such methods to endoscope-based wear surface detection. Three-dimensional topography reconstruction therefore remains an important task in wear surface analysis.
In recent years, monocular depth estimation, which establishes a mapping between the pixel values and depth values of a single two-dimensional image, has offered a promising direction for wear surface depth estimation. However, blurred damaged-area edges and inaccurate damage-shape estimation persist in depth estimation of wear surface damaged areas. The key reason is that existing monocular depth estimation models process the entire wear surface with equal weight, while the damaged area occupies only a small portion of the surface, so the three-dimensional topography of the damaged area cannot be reconstructed effectively.
In general, existing three-dimensional wear surface reconstruction techniques have achieved some engineering success in condition monitoring and fault diagnosis. However, the limited endoscope volume and the complex imaging environment inside mechanical equipment restrict multi-view and multi-light-source three-dimensional topography acquisition, while single-image depth estimation models suffer from blurred damaged-area edges and inaccurate damage topography estimation, reducing the accuracy of three-dimensional topography extraction for wear surface damaged areas.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the defects of the prior art, a wear surface damage depth estimation method and system based on a multiple attention mechanism, in which multiple attention mechanisms are introduced into a U-Net architecture to extract feature maps that attend more strongly to the damaged area of the wear surface, realizing damage depth estimation from a single wear surface image and providing a more effective way to acquire three-dimensional topography information for wear surface analysis.
The invention adopts the following technical scheme:
the method for estimating the damage depth of the wear surface based on the multi-attention machine system comprises the following steps:
s1, constructing a wear surface basic feature extraction layer by taking four layers of convolution blocks in a ResNet-50 coding layer as a main body and combining the convolution blocks of two layers of Conv-ReLU;
s2, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed in the step S1 by combining the U-Net structure and the wear surface characteristics, and fusing the damaged area segmentation branch network and the depth information estimation branch network to form a wear surface depth estimation model;
s3, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged region segmentation network and the depth information estimation network, and obtaining a loss function of the wear surface depth estimation model constructed in the step S2 in a weighting mode;
s4, acquiring a two-dimensional wear surface image, and making a corresponding damage area marking map and a depth map; selecting a wear surface image with a typical damage area as a training sample, taking the loss function constructed in the step S3 as an optimization target, training the wear surface depth estimation model constructed in the step S2 by adopting an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damage area segmentation result graph and a depth information result graph of the wear surface.
Specifically, step S1 includes:
s101, constructing two convolution blocks with a Conv-ReLU standard structure, and realizing primary feature extraction of the wear surface image;
s102, extracting Res1, Res2, Res3 and Res4 in the ResNet-50 feature extraction network, wherein the Res1, Res2, Res3 and Res4 in the ResNet-50 feature extraction network share 4 layers of convolution blocks as high-level semantic feature extraction blocks of the worn surface, and combining the two Conv-ReLU convolution blocks in the step S101 to establish a basic feature extraction network to realize basic feature extraction of the worn surface image.
Further, in step S101, constructing two convolution blocks with the standard structure of Conv-ReLU specifically includes:
The first and second convolution blocks both adopt the Conv-ReLU structure for primary image feature extraction, and each contains two convolution layers. Each layer of the first block performs convolution with 3 kernels of size 3×3 and stride 1, followed by ReLU nonlinear feature mapping, yielding the F0 feature map. The two layers of the second block use 64 kernels of size 3×3 with strides 2 and 1 respectively, followed by ReLU nonlinear feature mapping, yielding the F1 feature map; a 64-channel feature map is finally output to the subsequent four convolution blocks.
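As a rough illustration of the Conv-ReLU blocks described above, the following numpy sketch applies two same-padded 3×3 convolutions, each followed by a ReLU. The kernel values are random and the sizes are toy-scale; this is an assumption-laden sketch, not the patent's implementation:

```python
import numpy as np

def conv2d(x, kernels, stride=1):
    """'Same'-padded 2-D convolution: x is (H, W, C_in), kernels is (k, k, C_in, C_out)."""
    k = kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h_out = (x.shape[0] - 1) // stride + 1
    w_out = (x.shape[1] - 1) // stride + 1
    out = np.zeros((h_out, w_out, kernels.shape[3]))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[i * stride:i * stride + k, j * stride:j * stride + k, :]
            out[i, j] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out

def conv_relu_block(x, k1, k2):
    """Two Conv-ReLU layers, mirroring the patent's first convolution block."""
    return np.maximum(conv2d(np.maximum(conv2d(x, k1), 0), k2), 0)

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8, 3))          # toy 8x8 3-channel wear-surface patch
k1 = rng.standard_normal((3, 3, 3, 3)) * 0.1  # 3 kernels of size 3x3, stride 1
k2 = rng.standard_normal((3, 3, 3, 3)) * 0.1
f0 = conv_relu_block(img, k1, k2)
print(f0.shape)  # spatial size preserved, 3 output channels
```

With stride 1 and same padding, spatial resolution is preserved, which is why the F0 map keeps the input image size.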
Specifically, in step S2, a damaged region segmentation branch network is established by using convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU, and a damaged region segmentation result graph corresponding to the size of the two-dimensional wear surface is extracted; and establishing a depth information estimation branch network based on the U-Net network architecture, and estimating the depth information of the damaged area.
Further, the establishing of the damaged area division branch network specifically includes:
the convolution layers with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU are connected in series; for a first structure Conv-BN-ReLU, firstly carrying out convolution operation through a convolution kernel with the size of 3 multiplied by 3 and a step size stride equal to 1, then carrying out normalization processing on characteristics by adopting BN batch normalization operation, accelerating network convergence, and finally carrying out characteristic nonlinear mapping by adopting a ReLU activation function to obtain a characteristic diagram after noise suppression; for the second structure Convtransp-BN-ReLU, the difference from the structure Conv-BN-ReLU is that a Convtransp deconvolution operation of 3 × 3 size is used, and the step size stride is 2, finally achieving a 2-fold enlargement of the input feature map; then cascading the feature graph with the feature graph from the basic feature extraction layer; and finally, mapping the P0 feature maps of the 32 channels to a damaged area segmentation map of the 3-dimensional channels by adopting a convolution block with the structure of Conv-ReLU.
Further, the establishment of the depth information estimation branch network based on the U-Net network architecture specifically includes:
Convolution blocks with the Conv-BN-ReLU and Convtransp-BN-ReLU structures are adopted for upsampling, doubling the size of the input feature map, which serves as input to a coordinate attention module. The enlarged feature map is passed through the coordinate attention module to obtain features containing position-coordinate information, which are concatenated with the features extracted by the efficient pyramid module. Meanwhile, the efficient pyramid segmentation attention module extracts features carrying local damaged-area information from the coding-layer feature map; fusing and concatenating these with the coordinate-attention features yields a wear surface feature map carrying position, spatial and channel information.
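To make the coordinate-attention idea concrete, here is a deliberately simplified numpy sketch: the real coordinate attention module concatenates the two direction-wise pooled descriptors and passes them through a shared 1×1 convolution with BN before splitting; this sketch keeps only the core idea of per-row and per-column pooling that injects positional context (the weight matrices are random stand-ins):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def coordinate_attention(x, w_h, w_w):
    """Minimal coordinate-attention sketch: x is (H, W, C).
    Direction-wise average pooling yields per-row and per-column
    descriptors that re-weight the feature map with positional context."""
    pool_h = x.mean(axis=1)          # (H, C): pooled along the width
    pool_w = x.mean(axis=0)          # (W, C): pooled along the height
    attn_h = sigmoid(pool_h @ w_h)   # (H, C) row attention
    attn_w = sigmoid(pool_w @ w_w)   # (W, C) column attention
    return x * attn_h[:, None, :] * attn_w[None, :, :]

rng = np.random.default_rng(3)
feat = rng.standard_normal((8, 8, 4))
w = rng.standard_normal((4, 4)) * 0.1
out = coordinate_attention(feat, w, w)
print(out.shape)  # (8, 8, 4): shape preserved, position-aware re-weighting
```

Because the output shape matches the input, the attended map can be concatenated directly with the pyramid-attention features, as the branch network requires.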
Specifically, step S3 includes:
s301, constructing a depth estimation loss function of self-adaptive distribution of weight of a damaged area
S302, to address the difficult problem of predicted-edge blurring caused by large depth changes at damaged-area edges, and to improve the estimation accuracy of edge depth values, constructing an edge detection loss function using a three-level two-dimensional Haar wavelet transform;
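The Haar-based edge loss above can be sketched in numpy as follows. The patent specifies a three-level 2-D Haar transform; comparing only the high-frequency (detail) sub-bands of the predicted and reference depth maps, with equal weighting across levels, is an assumption on my part:

```python
import numpy as np

def haar2d(x):
    """One level of the 2-D Haar transform; returns (LL, LH, HL, HH) sub-bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 4
    lh = (a + b - c - d) / 4   # horizontal-edge detail
    hl = (a - b + c - d) / 4   # vertical-edge detail
    hh = (a - b - c + d) / 4   # diagonal detail
    return ll, lh, hl, hh

def edge_loss(pred, target, levels=3):
    """MSE over the detail sub-bands of a three-level Haar decomposition."""
    loss = 0.0
    for _ in range(levels):
        (llp, *dp), (llt, *dt) = haar2d(pred), haar2d(target)
        loss += sum(np.mean((p - t) ** 2) for p, t in zip(dp, dt))
        pred, target = llp, llt
    return loss

rng = np.random.default_rng(1)
d = rng.standard_normal((16, 16))
print(edge_loss(d, d))                 # identical maps -> 0.0
print(edge_loss(d, d + 1.0) < 1e-12)   # a constant offset has no edge content
```

Note that a constant depth offset leaves the detail sub-bands unchanged, so this loss penalizes edge-shape errors specifically rather than absolute depth bias, which is left to the depth MSE term.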
S303, dividing the wear surface into three classes (background, scratch, and pit) and selecting a three-class cross-entropy function as the loss function of the damaged-area segmentation branch;
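A three-class pixel-wise cross-entropy of this kind can be written compactly in numpy; the exact reduction (mean over pixels) is an assumption, since the patent does not spell it out:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def seg_cross_entropy(logits, labels, n_classes=3):
    """Mean pixel-wise cross-entropy for {background, scratch, pit} segmentation.
    logits: (H, W, 3) raw network outputs; labels: (H, W) ints in {0, 1, 2}."""
    p = softmax(logits)
    onehot = np.eye(n_classes)[labels]
    return -np.mean(np.sum(onehot * np.log(p + 1e-12), axis=-1))

rng = np.random.default_rng(2)
labels = rng.integers(0, 3, size=(4, 4))
perfect = np.eye(3)[labels] * 20.0   # near-one-hot logits on the true class
print(round(seg_cross_entropy(perfect, labels), 6))  # ~0.0
```

When the logits strongly favor the true class at every pixel, the loss approaches zero, which is the behavior the segmentation branch is trained toward.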
S304, adopting a structural consistency loss function to improve the similarity between the predicted depth map and the measurement of the confocal laser microscope;
S305, obtaining the loss function of the wear surface depth estimation model by weighted summation of the depth-information mean square error loss of step S301, the edge detection loss function of step S302, the damaged-area segmentation loss function of step S303, and the structural consistency loss function of step S304.
In particular, the loss function of the wear surface depth estimation model is constructed as follows:
where y represents the estimated wear surface depth map, the reference depth map is measured by a confocal laser microscope, p is the predicted pixel class with a corresponding ground-truth pixel class, and λ is the weight coefficient controlling the edge loss term of the depth map.
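The formula itself does not survive in this text rendering. Based on the four components named in step S3 and the λ defined above, a plausible reconstruction is the following, writing the measured (confocal-microscope) depth map as $y^{*}$ and the ground-truth class map as $p^{*}$; taking the non-edge weights as 1 is an assumption:

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{depth}}\left(y, y^{*}\right)
  \;+\; \lambda\,\mathcal{L}_{\mathrm{edge}}\left(y, y^{*}\right)
  \;+\; \mathcal{L}_{\mathrm{seg}}\left(p, p^{*}\right)
  \;+\; \mathcal{L}_{\mathrm{ssim}}\left(y, y^{*}\right)
```

Here $\mathcal{L}_{\mathrm{depth}}$ is the depth-information mean square error, $\mathcal{L}_{\mathrm{edge}}$ the Haar-wavelet edge detection loss, $\mathcal{L}_{\mathrm{seg}}$ the three-class cross-entropy, and $\mathcal{L}_{\mathrm{ssim}}$ the structural consistency loss.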
Specifically, step S4 includes:
s401, acquiring a two-dimensional wear surface image with a damaged area, acquiring a depth image of the corresponding area and a mark image of the corresponding damaged area, and manufacturing a training sample and a test sample;
s402, using ResNet-50 network weight trained on an ImageNet data set as an encoder initialization weight parameter, and training a wear surface depth estimation model by adopting the training sample manufactured in the step S401;
S403, setting the learning rate, training and optimizing the wear surface depth estimation model of step S402 with the adaptive moment estimation method, and inputting the test samples prepared in step S401 into the trained model, realizing wear surface damaged-area depth estimation guided by multiple-attention surface feature extraction and damaged-area localization.
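The adaptive moment estimation (Adam) update used for training can be sketched in a few lines; the hyperparameters below are the commonly used defaults, not values stated in the patent, and the quadratic objective is purely illustrative:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One update of adaptive moment estimation (Adam)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 from theta = 3.0
theta, m, v = 3.0, 0.0, 0.0
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
print(abs(theta) < 0.1)  # converges toward the minimum at 0
```

The per-parameter step normalization by the second moment is what lets a single learning rate work across layers of very different gradient scales, which is why Adam is a natural fit for training the fused two-branch model.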
In a second aspect, an embodiment of the present invention provides a wear surface damage depth estimation system based on a multiple attention mechanism, comprising:
an extraction module, for constructing a wear surface basic feature extraction layer by taking the four convolution blocks of a ResNet-50 coding layer as the backbone and combining them with two Conv-ReLU convolution blocks;
a fusion module, for constructing a damaged-area segmentation branch network and a depth-information estimation branch network based on the wear surface basic feature extraction layer built by the extraction module, combining the U-Net architecture with the characteristics of the wear surface, and fusing the two branch networks into a wear surface depth estimation model;
a function module, for constructing the depth-information mean square error loss, damage-detection cross entropy loss, edge-detection mean square error loss, and structural consistency loss based on the damaged-area segmentation network and the depth-information estimation network, and obtaining the loss function of the wear surface depth estimation model built by the fusion module by weighting;
an estimation module, for acquiring two-dimensional wear surface images and preparing the corresponding damaged-area label maps and depth maps; selecting wear surface images with typical damaged areas as training samples, taking the loss function built by the function module as the optimization target, training the wear surface depth estimation model built by the fusion module with the adaptive moment estimation method, and taking a single wear surface image as input to the trained model to obtain a damaged-area segmentation result map and a depth information result map of the wear surface.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a wear surface damage depth estimation method based on a multi-attention machine system, which takes a single two-dimensional wear surface image as a research object, selects a ResNet-50 coding layer as a basic coding layer trunk and extracts characteristic graphs of wear surfaces under different scales; establishing a damaged area segmentation branch network and a depth information estimation branch network by combining an efficient pyramid segmentation attention module (EPSA) module, a Coordinate Attention (CA) module and the like to detect a damaged area of the wear surface and estimate depth information of the damaged area; the network model adopts the weighted sum of four types of loss functions including depth information mean square error loss, damage detection cross entropy loss, edge detection mean square error loss and structure consistency loss as an overall loss function, parameter training is carried out by using an Adam optimization algorithm to obtain a final surface depth estimation model, the depth information estimation of a damaged area of a wear surface is realized, and more effective information is provided for wear mechanism and state analysis
Further, step S1 constructs a basic feature extraction network from the Conv-ReLU convolution blocks and the ResNet-50 feature extraction convolution blocks, extracting semantic features of the wear surface and providing semantic feature maps at different scales for the subsequent damaged-area segmentation and depth estimation networks, which helps improve the depth estimation accuracy for damaged areas.
Further, step S101 adopts two Conv-ReLU convolution blocks for primary semantic feature extraction of the wear surface, producing a 64-channel F1 feature map from the 3-channel damage surface image; this provides rich primary semantic features for the subsequent ResNet-50 feature extraction convolution blocks, which expand the 64-channel F1 feature map to the 2048-channel feature map F5, realizing multi-level semantic feature extraction of the wear surface.
Further, in step S2, the damaged-area segmentation branch network built from Conv-BN-ReLU and Convtransp-BN-ReLU convolution blocks extracts a damaged-area segmentation result map matching the size of the two-dimensional wear surface, providing additional damaged-area feature maps for the depth estimation network, constraining and localizing the damaged area, and improving the depth estimation accuracy there; the depth-information estimation branch network, built on the U-Net architecture, makes full use of the multi-level semantic features of step S1 and the damaged-area segmentation feature maps to improve the depth estimation accuracy for damaged areas.
Further, the damaged-area segmentation branch network is constructed to obtain the position of the damaged area and its corresponding feature maps, providing rich damaged-area features for the subsequent depth estimation branch network.
Further, based on the U-Net architecture, the multi-level semantic features extracted in step S1, the feature maps from the damaged-area segmentation branch, and the features extracted by the multiple attention modules are continuously fused and superimposed; upsampling progressively restores the feature map to a depth estimation result map of the same size as the input wear surface image while preserving fine texture detail of the damaged area.
Further, in step S3, a basic depth loss function for the damaged area is first constructed in step S301 to constrain the difference between the depth estimation result and the label depth map acquired by LSCM; an edge detection loss function is then added in step S302 to sharpen the estimated damaged-area edges; a segmentation loss function classifies the estimated pixels to segment the damaged area, from which the depth information is finally estimated and extracted; and a structural consistency loss function improves the overall accuracy of the final estimated damaged-area depth image.
Further, the final loss function is composed of the depth-information mean square error loss, the damage-detection cross entropy loss, the edge-detection mean square error loss, and the structural consistency loss. The depth-information mean square error loss, as the loss of the depth estimation branch, ensures the accuracy of the depth map; the damage-detection cross entropy loss accurately segments the damaged area; the edge-detection mean square error loss improves the edge definition of the damaged area in the depth map; and the structural consistency loss improves the overall topography accuracy of the final wear surface depth map.
Further, the wear surface damaged-area label maps and depth maps are prepared as input data for the designed depth estimation network, and the loss function designed in step S3 serves as the optimization target, ensuring that the estimates of the final model are closer to the depth maps acquired by the confocal laser microscope; training the wear surface depth estimation model of step S2 with the adaptive moment estimation method improves the convergence speed of the network, so the depth estimation model trains quickly and yields better depth estimation results.
It is understood that the beneficial effects of the second aspect can be referred to the related description of the first aspect, and are not described herein again.
In conclusion, the monocular depth estimation model effectively estimates three-dimensional depth information from a single wear surface image, solving the problems of high acquisition difficulty, low efficiency and high complexity of depth information in the technical field of wear surface analysis.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a framework diagram of a wear surface depth estimation step;
FIG. 2 is a flow chart of the production of a two-dimensional wear surface map, a wear surface depth map, and a damaged area signature map;
FIG. 3 is a three-dimensional visualization of typical wear surface damage maps and their corresponding depth maps, damage region signature maps, and depths;
FIG. 4 is a network architecture diagram of a wear surface damage region depth estimation model based on multiple self-attention mechanism fusion;
FIG. 5 is a comparison of loss values during model training with different attention mechanism models;
fig. 6 is a depth estimation result map of three types of wear surfaces and a corresponding depth three-dimensional visualization map.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be understood that the terms "comprises" and/or "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, including such combinations; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe preset ranges, etc. in embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish preset ranges from each other. For example, the first preset range may also be referred to as a second preset range, and similarly, the second preset range may also be referred to as the first preset range, without departing from the scope of the embodiments of the present invention.
The word "if" as used herein may be interpreted as "upon", "when", "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined", "in response to a determination", "when detected (a stated condition or event)" or "in response to detecting (a stated condition or event)", depending on the context.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a wear surface damage depth estimation method based on a multi-attention mechanism, which takes a two-dimensional wear surface image as the research object, makes a corresponding damage area marking map, and makes a corresponding standard depth map with a laser confocal microscope. On this basis, a dual-task wear surface depth estimation network model is constructed on a U-Net framework: a ResNet-50 network serves as the basic feature extraction network, extracting basic wear surface features that are input into a dual-branch network for damaged area segmentation and depth information estimation; a Coordinate Attention (CA) module and an Efficient Pyramid Segmentation Attention (EPSA) module, together with the feature maps of the damaged area segmentation branch, are merged into the depth estimation branch to obtain feature maps carrying damaged area information. The constructed network model adopts the depth-information mean square error loss, damage-detection cross entropy loss, edge-detection mean square error loss and structural consistency loss function, weighted and summed into the overall loss function, and the model is trained with the Adam optimization algorithm to obtain the final wear surface depth estimation model, realizing depth estimation of the wear surface damaged area. Based on a monocular depth estimation model, the method effectively realizes estimation of the three-dimensional depth information of a single wear surface image, and solves the problems of high difficulty, low efficiency and high complexity of depth information acquisition in the technical field of wear surface analysis.
Referring to fig. 1, a depth estimation model of a wear surface damaged area based on the fusion of multiple self-attention mechanisms is shown. Three-dimensional morphology information of the wear surface is the basis of abrasive particle analysis techniques and directly affects the precision of wear surface damage detection and state identification. However, three-dimensional reconstruction methods based on multiple views, multiple light sources and the like are difficult to apply in the small, narrow measurement environments of an industrial endoscope, and monocular estimation of wear surface damage depth information still suffers from inaccurate shape estimation of the damaged area, blurred edges and similar problems, greatly reducing the damage morphology reconstruction accuracy of monocular depth estimation models in practical applications. In the present method, multiple attention mechanisms are integrated into the coding layer of the depth estimation branch based on a U-Net network framework to improve the characterization capability of the wear surface feature map for the damaged area; a damaged area segmentation branch is then introduced into the wear surface depth estimation model to provide additional damaged area localization feature maps for the depth estimation branch; finally, a dual-branch task model is constructed, realizing detection and depth information estimation of the wear surface damaged area.
The invention relates to a wear surface damage depth estimation method based on a multi-attention mechanism, which comprises the following steps:
s1, constructing a wear surface basic feature extraction layer by taking four layers of convolution blocks in a ResNet-50 coding layer as a backbone and combining the convolution blocks of two layers of Conv-ReLU to realize automatic extraction of wear surface multi-scale features as feature input in the step S2;
s101, constructing two convolution blocks with a Conv-ReLU standard structure, and realizing primary feature extraction of the wear surface image;
referring to fig. 2, the basic feature extraction layer includes six convolution blocks, of which the first and second adopt convolution blocks with the Conv-ReLU structure to realize primary feature extraction of the wear surface image;
the construction of two convolution blocks with the standard structure Conv-ReLU specifically comprises the following steps:
the first convolution block and the second convolution block adopt convolution blocks with the structure of Conv-ReLU to carry out primary image feature extraction, and each convolution block comprises two layers of convolution operation; each layer of the first convolution adopts 3 convolution kernels with the size of 3 multiplied by 3 and the step size stride being 1 to carry out convolution operation, and then adopts a ReLU activation function to carry out characteristic nonlinear mapping to obtain an F0 characteristic diagram; the second convolution block adopts 64 convolution kernels with the size of 3 × 3 in each layer, the step length is stride 2, stride 1 respectively, the convolution operation is carried out, then the ReLU activation function is adopted to carry out characteristic nonlinear mapping, an F1 characteristic diagram is obtained, and finally the characteristic diagram of 64 channels is output to the following four convolution blocks.
S102, Res1, Res2, Res3 and Res4 of the ResNet-50 feature extraction network, four convolution blocks in total, are taken as the high-level semantic feature extraction blocks for the wear surface and combined with the two Conv-ReLU convolution blocks of step S101 to establish the basic feature extraction network, realizing basic feature extraction of the wear surface image.
The last four convolution blocks, i.e. Res1, Res2, Res3 and Res4 of the ResNet-50 feature extraction network, perform further wear surface feature extraction; the numbers of feature channels they output are 256, 512, 1024 and 2048 respectively, yielding the F2-F5 feature maps and realizing high-level feature extraction of the wear surface image.
S2, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed in the step S1 by combining the U-Net structure and the wear surface characteristics, and fusing the damaged area segmentation branch network and the depth information estimation branch network to form a wear surface depth estimation model;
s201, establishing a damaged area segmentation branch by adopting convolution blocks with structures of Conv-BN-ReLU and Convtransp-BN-ReLU, and extracting a damaged area segmentation result graph corresponding to the size of the two-dimensional wear surface.
The damaged area segmentation branch acts as the decoder of the U-Net framework, and its forward propagation proceeds in the direction opposite to that of the encoder. When the decoder is constructed, each step applies an upsampling operation that enlarges the input feature layer by a factor of 2; after five upsampling operations and skip connection operations, the features are finally mapped into a damaged area segmentation result map of the same size as the wear surface image.
The specific steps of establishing the damaged area division branch network are as follows:
s2011, upsampling operation
This structure mainly consists of Conv-BN-ReLU and Convtransp-BN-ReLU convolution layers connected in series. In the first structure, Conv-BN-ReLU, a convolution with kernel size 3×3 and stride 1 is applied first; BN batch normalization then normalizes the features and accelerates network convergence, and finally a ReLU activation function performs nonlinear feature mapping to obtain a noise-suppressed feature map. The second structure, Convtransp-BN-ReLU, differs in that a 3×3 Convtransp deconvolution with stride 2 is used, finally achieving a 2-fold enlargement of the input feature map;
s2012, jump connection operation
After the upsampling operation, the resulting feature map is concatenated with the feature map from the basic feature extraction layer, so that the feature map retains higher resolution during upsampling.
S2013, convolution Block 1 operation
Finally, a convolution block with the Conv-ReLU structure maps the 32-channel P0 feature map to a 3-channel damaged area segmentation map.
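The upsampling block, skip connection and final mapping of steps S2011-S2013 can be sketched as follows. The padding and output_padding values are assumptions chosen so the stride-2 deconvolution gives an exact 2-fold enlargement, and the channel counts are illustrative.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """Conv-BN-ReLU followed by Convtransp-BN-ReLU: 2x upsampling (step S2011)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(                      # 3x3 convolution, stride 1
            nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.up = nn.Sequential(                        # 3x3 deconvolution, stride 2
            nn.ConvTranspose2d(out_ch, out_ch, 3, stride=2,
                               padding=1, output_padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x, skip=None):
        x = self.up(self.conv(x))
        if skip is not None:                            # skip connection (S2012)
            x = torch.cat([x, skip], dim=1)
        return x

# Final head (S2013): map the 32-channel P0 map to a 3-channel segmentation map
head = nn.Sequential(nn.Conv2d(32, 3, 3, padding=1), nn.ReLU(inplace=True))

up = UpBlock(2048, 1024)
x = torch.randn(1, 2048, 4, 4)
skip = torch.randn(1, 1024, 8, 8)
y = up(x, skip)                  # doubled spatial size, concatenated channels
seg = head(torch.randn(1, 32, 64, 64))
```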
S202, establishing a depth information estimation branch network based on a U-Net network architecture to estimate depth information of a damaged area;
the depth information estimation branch structure also corresponds to the coding layer, and convolution blocks of Conv-BN-ReLU and Convtransp-BN-ReLU perform upsampling to restore the feature map to the same size as the wear surface image. To fully extract feature images carrying local damage area information, a Coordinate Attention (CA) module and an Efficient Pyramid Segmentation Attention (EPSA) module are introduced respectively, and the feature map from the damaged area segmentation branch is merged in the last convolution block operation, further strengthening the attention paid to typical wear area information during wear surface depth estimation and finally realizing damage area depth information estimation.
S2021, adopting convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU as an upsampling operation;
the features from the basic feature coding layer are upsampled to obtain a feature map enlarged by 2 times, which is input into the Coordinate Attention (CA) module for further attention feature extraction, acquiring a feature image containing position coordinate information.
S2022, embedding a Coordinate Attention (CA) module, and acquiring a feature image containing position coordinate information;
during the skip connection, the basic coding layer features at the corresponding positions are processed by the Efficient Pyramid Segmentation Attention (EPSA) module to obtain feature images with local damage area information, which are then fused and concatenated with the feature images containing position coordinate information, yielding a wear surface feature map with damage area position information, spatial information and channel information.
S2023, extracting the feature image with the local damage area information from the feature image of the coding layer through an Efficient Pyramid Segmentation Attention (EPSA) module, and fusing and cascading the feature image with the features extracted by the CA module to obtain the wear surface feature image with position information, space information and channel information.
After five rounds of the upsampling-CA-EPSA-skip-connection concatenation operation, the feature maps from the damaged area segmentation network are fused in, and finally a convolution block with the Conv-ReLU structure maps the 64-channel C0 feature map to the 1-channel wear surface depth map.
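The Coordinate Attention module referenced in steps S2021-S2023 can be sketched as below, following the published CA design (directional pooling along height and width, a shared 1×1 transform, then per-direction sigmoid gates). The reduction ratio and hidden width are assumed hyperparameters, not values taken from the patent.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate Attention: encodes position along H and W into channel gates."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)               # (b, c, h, 1): pool over W
        x_w = x.mean(dim=2, keepdim=True)               # (b, c, 1, w): pool over H
        x_w = x_w.permute(0, 1, 3, 2)                   # (b, c, w, 1)
        y = torch.cat([x_h, x_w], dim=2)                # joint (h + w) encoding
        y = self.act(self.bn(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))           # gate along height
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # gate along width
        return x * a_h * a_w                            # position-aware reweighting

out = CoordinateAttention(64)(torch.randn(2, 64, 16, 16))
```

The module preserves the input shape, so it can be dropped after any upsampling step without changing the surrounding channel arithmetic.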
S3, aiming at the damaged area segmentation and depth information estimation double-branch structure constructed in the step S2, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function, and obtaining an integral model loss function in a weighting mode to improve the depth estimation precision of the damaged area of the model;
the loss function is the target of network optimization training; network parameter optimization is guided by back-propagating the errors between the predictions of the damage segmentation map and the depth map and their real labeled maps. The constructed damaged region segmentation branch and depth information estimation branch correspond to the damage-detection cross entropy loss and the depth-information mean square error loss respectively; to address the problems of blurred edges and blurred damaged-area morphology in the predicted depth map, a Haar-wavelet-based edge-detection mean square error loss and a structural consistency loss function are adopted, and the overall model loss function is constructed by weighting.
S301, by calculating the mean square error between the estimated wear surface depth map and the depth map obtained by the laser confocal microscope, the difference between the predicted depth map and the target values is constrained; more weight is assigned to the damaged area to counter the imbalanced distribution of wear surface areas, and a depth estimation loss function with adaptive allocation of damaged-area weights is constructed, improving the accuracy of the model's damaged-area depth information estimation, as shown in formulas (1) and (2);
a = Num_other / Num_damage (2)
where i denotes the coordinate position of a pixel point, N the number of image pixel points, y_i the estimated depth value of the wear surface, and ŷ_i the depth value measured by the laser confocal microscope; a is a hyperparameter controlling the weight of each pixel point, and Num_other and Num_damage denote the numbers of pixel points in the other (non-damage) areas and the damage areas respectively;
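Formula (2) balances the damage and non-damage pixel counts. A minimal NumPy sketch of the weighted depth loss follows; since the body of formula (1) is not reproduced here, the per-pixel weighting scheme (weight a on damage pixels, 1 elsewhere) is an assumption consistent with the surrounding description.

```python
import numpy as np

def adaptive_depth_mse(y_pred, y_true, damage_mask):
    """Weighted MSE: damage pixels get weight a = Num_other / Num_damage."""
    a = (~damage_mask).sum() / damage_mask.sum()        # formula (2)
    w = np.where(damage_mask, a, 1.0)                   # assumed weighting scheme
    return float(np.mean(w * (y_pred - y_true) ** 2))

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
true = np.array([[1.0, 2.0], [3.0, 5.0]])
mask = np.array([[False, False], [False, True]])        # one damage pixel
loss = adaptive_depth_mse(pred, true, mask)             # a = 3, squared error 1 at mask
```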
s302, to address the problem of blurred predicted edges caused by the large depth variation at damaged-area edges, and to improve the estimation precision of the damaged-area edge depth values, a three-level two-dimensional Haar wavelet transform is adopted to construct the edge detection loss function, as shown in formula (3);
where V, C and H correspond to the vertical, diagonal and horizontal high-frequency coefficients of the depth image respectively, L is the low-frequency coefficient, y denotes the estimated wear surface depth map, and ŷ denotes the depth map measured by the laser confocal microscope.
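The Haar-based edge loss of formula (3) can be sketched as follows. This self-contained NumPy version implements a three-level 2-D Haar decomposition by hand and penalizes the mean square error of the high-frequency (H, V, C) coefficients; it is an interpretation of the formula rather than a verbatim implementation.

```python
import numpy as np

def haar_step(x):
    """One 2-D Haar level: returns (LL, (H, V, D)) for an even-sized array."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0          # low-frequency approximation L
    hh = (a - b + c - d) / 2.0          # horizontal high-frequency H
    vv = (a + b - c - d) / 2.0          # vertical high-frequency V
    dd = (a - b - c + d) / 2.0          # diagonal high-frequency C
    return ll, (hh, vv, dd)

def haar_edge_loss(y_pred, y_true, levels=3):
    """MSE over high-frequency Haar coefficients of predicted vs. measured maps."""
    loss, n = 0.0, 0
    for _ in range(levels):
        y_pred, hp = haar_step(y_pred)
        y_true, ht = haar_step(y_true)
        for p, t in zip(hp, ht):
            loss += np.sum((p - t) ** 2)
            n += p.size
    return loss / n

img = np.random.rand(32, 32)
zero_loss = haar_edge_loss(img, img.copy())   # identical maps: no edge error
```

Because only detail coefficients enter the loss, a constant depth offset between prediction and measurement is not penalized here; that global discrepancy is handled by the depth MSE term instead.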
S303, considering that the typical damage types of the wear surface are scratches and pits, dividing the wear surface area into three types of background, scratches and pits, and selecting a three-classification cross entropy function as a loss function of a damage area segmentation network branch, as shown in a formula (4);
where N is the number of image pixel points and M is the number of region categories, equal to 3; p_ic is a region label vector of length 3 taking the values 0 and 1, and p̂_ic denotes the predicted probability that the sample belongs to class c.
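The three-class cross entropy of formula (4) can be sketched as below (a hypothetical pixel-wise NumPy version; the clipping constant is added only for numerical safety and is not part of the formula).

```python
import numpy as np

def segmentation_ce(p_true, p_pred, eps=1e-12):
    """Pixel-wise 3-class cross entropy: background, scratch, pit.

    p_true: (N, 3) one-hot region labels; p_pred: (N, 3) predicted probabilities.
    """
    p_pred = np.clip(p_pred, eps, 1.0)
    return float(-np.mean(np.sum(p_true * np.log(p_pred), axis=1)))

labels = np.array([[1, 0, 0], [0, 1, 0]])               # background, scratch
perfect = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # exact prediction
uniform = np.full((2, 3), 1.0 / 3.0)                    # uninformative prediction
```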
S304, in order to improve the similarity between the predicted depth map and the measurement result of the laser confocal microscope, a structural consistency loss function is adopted, as shown in a formula (5);
where μ_y and μ_ŷ denote the means of y and ŷ respectively, σ_y² and σ_ŷ² denote their variances, and σ_yŷ denotes the covariance of y and ŷ; c_1 and c_2 are constants preventing the denominator from being zero; y denotes the estimated wear surface depth map and ŷ the depth map measured by the laser confocal microscope.
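A minimal global-SSIM sketch of the structural consistency term of formula (5) follows; taking the loss as 1 − SSIM, and the particular constants c1 and c2, are assumptions, since the formula body is not reproduced here.

```python
import numpy as np

def ssim_loss(y, y_hat, c1=1e-4, c2=9e-4):
    """Structural consistency loss: 1 - SSIM(y, y_hat) over the whole map."""
    mu_y, mu_h = y.mean(), y_hat.mean()
    var_y, var_h = y.var(), y_hat.var()
    cov = ((y - mu_y) * (y_hat - mu_h)).mean()          # covariance sigma_{y,yhat}
    ssim = ((2 * mu_y * mu_h + c1) * (2 * cov + c2)) / \
           ((mu_y ** 2 + mu_h ** 2 + c1) * (var_y + var_h + c2))
    return 1.0 - ssim

depth = np.linspace(0.0, 1.0, 64).reshape(8, 8)
identical = ssim_loss(depth, depth.copy())    # 0 for identical maps
```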
S305, obtaining a loss function of the whole model through a weighted summation mode on the basis of depth information mean square error loss, damage detection cross entropy loss, edge detection mean square error loss and structure consistency loss function, as shown in formula (6).
where λ is a weight coefficient controlling the edge loss term of the depth map.
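The weighted combination of formula (6) can be sketched as below; per the description, λ scales only the edge term, while the unit weights on the remaining terms are an assumption since the full formula is not reproduced here.

```python
def total_loss(l_depth, l_ce, l_edge, l_ssim, lam=0.5):
    """Overall loss: weighted sum of the four branch losses (formula (6)).

    lam is the hyperparameter λ for the depth-map edge loss; the other
    terms are given unit weight here as an assumption.
    """
    return l_depth + l_ce + lam * l_edge + l_ssim

overall = total_loss(0.8, 0.2, 0.4, 0.1, lam=0.5)
```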
S4, acquiring a two-dimensional wear surface image, and making a corresponding damage area marking map and a depth map; selecting not less than 500 groups of wear surface images with typical damage areas as training samples, taking the model loss function constructed in the step S3 as an optimization target, training the constructed wear surface depth estimation model by adopting an adaptive moment estimation method (Adam), taking a single wear surface image as the input of the model based on the trained depth estimation model, and finally outputting a damage area segmentation result graph and a depth information result graph of the wear surface.
Referring to fig. 3, a surface image at a given position of the surface to be measured is acquired with a surface-image in-situ acquisition system, and the damaged area of the surface image is marked with image labeling software to obtain the damaged area marking map; fig. 3 shows three types of measured wear surface maps.
S401, acquiring a two-dimensional wear surface image with a damaged area through a handheld digital microscope, acquiring a depth image of a corresponding area through a laser confocal microscope, acquiring a marking image of the corresponding damaged area through image marking software, and realizing the manufacture of training and testing samples;
s402, using ResNet-50 network weights trained on the ImageNet data set as an encoder initialization weight parameter, and further training a wear surface depth estimation model by adopting the training sample manufactured in the step S401;
and S403, setting learning rate parameters, optimizing and training a wear surface depth estimation model by adopting an adaptive moment estimation method (Adam), and realizing wear surface damage region depth information estimation based on multiple attention mechanism surface feature extraction and damage region positioning guidance.
The learning rate parameter is set to 0.0003, and the constructed wear surface depth estimation model is trained and optimized with the adaptive moment estimation method (Adam); the number of training iterations and the number of input images per batch are 200 and 2 respectively. The convergence rates of the network models under different attention-mechanism combinations are compared, with the training process shown in FIG. 5. In this way, depth information estimation of the wear surface damaged area based on multi-attention-mechanism surface feature extraction and damaged area localization guidance is realized.
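The training configuration of step S403 can be sketched as follows. A toy stand-in model replaces the full depth estimation network; the learning rate, Adam optimizer, iteration count and batch size follow the text, while the data and model here are purely illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 1, 3, padding=1))   # stand-in for the full model
optimizer = torch.optim.Adam(model.parameters(), lr=0.0003)  # lr per step S403
criterion = nn.MSELoss()                               # depth MSE term only

x = torch.randn(2, 3, 16, 16)       # batch of 2 images, as in the text
target = torch.randn(2, 1, 16, 16)  # stand-in for confocal depth maps

losses = []
for epoch in range(200):            # 200 training iterations, as in the text
    optimizer.zero_grad()
    loss = criterion(model(x), target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```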
In another embodiment of the present invention, a wear surface damage depth estimation system based on a multi-attention mechanism is provided. The system can be used to implement the above wear surface damage depth estimation method, and specifically comprises an extraction module, a fusion module, a function module and an estimation module.
The extraction module is used for constructing a wear surface basic feature extraction layer by taking four layers of convolution blocks in a ResNet-50 coding layer as a backbone and combining them with two Conv-ReLU convolution blocks;
the fusion module is used for constructing a damaged area segmentation branch network and a depth information estimation branch network based on a wear surface basic feature extraction layer constructed by the extraction module by combining a U-Net structure architecture and wear surface characteristics, and fusing the damaged area segmentation branch network and the depth information estimation branch network to serve as a wear surface depth estimation model;
the function module is used for constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining a loss function of the wear surface depth estimation model constructed by the fusion module in a weighting mode;
the estimation module is used for acquiring a two-dimensional wear surface image and making a corresponding damage area marking map and a depth map; selecting a wear surface image with a typical damage area as a training sample, taking a loss function constructed by a function module as an optimization target, training a wear surface depth estimation model constructed by a fusion module by adopting an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damage area segmentation result image and a depth information result image of the wear surface.
The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention.
The present invention compares the variation of the training convergence loss value between the model without attention and models with different attention-mechanism combinations, as shown in fig. 5. Number 1 denotes the model without any attention mechanism, number 2 the model with only the CA coordinate attention module, number 3 the model with only the EPSA efficient pyramid segmentation attention module, and number 4 the depth estimation model with both the CA and EPSA modules. As the figure shows, model number 4 converges fastest during training and attains the minimum loss value of 38.6, while models 1, 2 and 3 reach 83.2, 69.1 and 45.6 respectively, further proving the superiority of the model constructed by the method.
Based on the model trained with the attention-mechanism combination of number 4, three damaged wear surface maps (scratch, pit, and combined scratch-and-pit maps) are input and the corresponding depth maps are predicted. To verify the effectiveness of the constructed wear surface damaged-area depth estimation model, the differences between the wear surface depth estimation results of different methods and the depth maps acquired by the laser confocal microscope are compared, with the results shown in fig. 6. The damaged-area depth estimation model constructed by the invention achieves the minimum estimation error: compared with the depth map acquired by the laser confocal microscope, its root mean square error (RMSE) is 1.49 μm, far smaller than the 61.64 μm of the SE-Net method and the 41.33 μm of the BS-Net method, further proving the effectiveness of the constructed model.
In conclusion, the wear surface damage depth estimation method and system based on the multi-attention mechanism construct a dual-task branch model for damaged area segmentation and depth information estimation, realizing synchronous acquisition of the depth information of the wear surface damaged area together with its position and damage type under a single U-Net network framework, and providing effective guide information for wear-surface-based state monitoring and fault diagnosis. A comprehensive loss function taking the depth-information mean square error loss, damage-detection cross entropy loss, edge-detection mean square error loss and structural consistency loss as the optimization target is constructed, enhancing the sensitivity to depth estimation deviations in the damaged area and thereby improving the precision of the damaged-area depth information estimated by the model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.
Claims (10)
1. A method for estimating the damage depth of a wear surface based on a multi-attention mechanism, characterized by comprising the following steps:
s1, constructing a wear surface basic feature extraction layer by taking four layers of convolution blocks in a ResNet-50 coding layer as a backbone and combining the convolution blocks of two layers of Conv-ReLU;
S2, based on the wear surface basic feature extraction layer constructed in step S1, and combining the U-Net architecture with the characteristics of wear surfaces, constructing a damaged region segmentation branch network and a depth information estimation branch network, and fusing the two branch networks into a wear surface depth estimation model;
S3, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss, and a structural consistency loss function based on the damaged region segmentation network and the depth information estimation network, and obtaining the loss function of the wear surface depth estimation model constructed in step S2 by weighted summation;
S4, acquiring two-dimensional wear surface images and producing the corresponding damage region annotation maps and depth maps; selecting wear surface images with typical damage regions as training samples; taking the loss function constructed in step S3 as the optimization target, training the wear surface depth estimation model constructed in step S2 with the adaptive moment estimation method; and taking a single wear surface image as input to the trained model to obtain the damage region segmentation result map and depth information result map of the wear surface.
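The overall structure of steps S1–S2 (a shared encoder feeding two fused decoder branches) could be sketched as below. This is an illustrative skeleton, not code from the patent; the class and module names (`WearSurfaceDepthNet`, `encoder`, `seg_branch`, `depth_branch`) are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class WearSurfaceDepthNet(nn.Module):
    """Hypothetical skeleton: shared feature extractor plus two branches."""

    def __init__(self, encoder, seg_branch, depth_branch):
        super().__init__()
        self.encoder = encoder            # S1: ResNet-50 based feature extraction layer
        self.seg_branch = seg_branch      # S2: damaged region segmentation branch
        self.depth_branch = depth_branch  # S2: depth information estimation branch

    def forward(self, image):
        feats = self.encoder(image)
        seg_logits = self.seg_branch(feats)           # 3-class damage map
        depth = self.depth_branch(feats, seg_logits)  # depth guided by damage cues
        return depth, seg_logits
```

Passing the segmentation output into the depth branch is one plausible way to realize the "damage region localization guidance" of step S403; the patent does not specify the exact wiring.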
2. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 1, wherein step S1 specifically comprises:
S101, constructing two convolution blocks with the standard Conv-ReLU structure to perform primary feature extraction on the wear surface image;
S102, extracting the four convolution stages Res1, Res2, Res3, and Res4 of the ResNet-50 feature extraction network as high-level semantic feature extraction blocks for the wear surface, and combining them with the two Conv-ReLU convolution blocks of step S101 to establish the basic feature extraction network for the wear surface image.
3. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 2, wherein in step S101, constructing the two convolution blocks with the standard Conv-ReLU structure specifically comprises:
the first and second convolution blocks both adopt the Conv-ReLU structure for primary image feature extraction, each containing two convolution layers; each layer of the first convolution block performs convolution with 3 kernels of size 3 × 3 and stride 1, followed by nonlinear feature mapping with a ReLU activation function, yielding the F0 feature map; the second convolution block performs convolution with 64 kernels of size 3 × 3, with strides 2 and 1 respectively, followed by nonlinear feature mapping with a ReLU activation function, yielding the F1 feature map; the resulting 64-channel feature map is output to the subsequent four convolution stages.
4. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 1, wherein in step S2, the damaged region segmentation branch network is established with convolution blocks of Conv-BN-ReLU and Convtransp-BN-ReLU structure, producing a damaged region segmentation result map matching the size of the two-dimensional wear surface image; and the depth information estimation branch network is established on the U-Net architecture to estimate the depth information of the damaged region.
5. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 4, wherein establishing the damaged region segmentation branch network specifically comprises:
the convolution layers with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU are connected in series; for a first structure Conv-BN-ReLU, firstly carrying out convolution operation through a convolution kernel with the size of 3 multiplied by 3 and a step size stride equal to 1, then carrying out normalization processing on characteristics by adopting BN batch normalization operation, accelerating network convergence, and finally carrying out characteristic nonlinear mapping by adopting a ReLU activation function to obtain a characteristic diagram after noise suppression; for the second structure Convtransp-BN-ReLU, the difference from the structure Conv-BN-ReLU is that a Convtransp deconvolution operation of 3 × 3 size is used, and the step size stride is 2, which finally achieves a 2-fold enlargement of the input feature map; then cascading the feature graph with the feature graph from the basic feature extraction layer; and finally, mapping the P0 feature maps of the 32 channels to a damaged area segmentation map of the 3-dimensional channels by adopting a convolution block with the structure of Conv-ReLU.
6. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 4, wherein establishing the depth information estimation branch network on the U-Net architecture specifically comprises:
adopting convolution blocks of Conv-BN-ReLU and Convtransp-BN-ReLU structure as the upsampling operation, enlarging the input feature map by a factor of 2 as input to a coordinate attention module; feeding the enlarged feature map into the coordinate attention module to obtain a feature map containing position coordinate information, which is concatenated with the feature map extracted by the efficient pyramid split attention module; and extracting, from the encoder feature map, a feature map carrying local damage region information through the efficient pyramid split attention module, then fusing and concatenating it with the features extracted by the coordinate attention module to obtain a wear surface feature map carrying position, spatial, and channel information.
7. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 1, wherein step S3 specifically comprises:
S301, constructing a depth estimation loss function with adaptive weight distribution over the damaged region;
S302, to address the problem of blurred edge predictions caused by large depth variation at damage region edges, and to improve the estimation accuracy of edge depth values, constructing an edge detection loss function using a three-level two-dimensional Haar wavelet transform;
S303, dividing the wear surface into three classes, background, scratch, and pit, and selecting a three-class cross entropy function as the loss function of the damaged region segmentation branch;
S304, adopting a structural consistency loss function to improve the similarity between the predicted depth map and the laser confocal microscope measurement;
S305, obtaining the loss function of the wear surface depth estimation model by weighted summation of the depth information mean square error loss from step S301, the edge detection loss function from step S302, the damaged region segmentation branch loss function from step S303, and the structural consistency loss function from step S304.
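The four loss terms of steps S301–S305 could be combined as below. This is a loose proxy, not the patent's formula: the Haar decomposition is approximated with average-pooling high/low-pass splits, the structural term with a pooled L1 proxy, and the weights are illustrative.

```python
import torch
import torch.nn.functional as F

def haar_edge_loss(pred, gt, levels=3):
    # S302 proxy: compare high-frequency residue of prediction vs. ground
    # truth over three decomposition levels (Haar LL band ~ 2x2 average).
    loss = 0.0
    for _ in range(levels):
        pl, gl = F.avg_pool2d(pred, 2), F.avg_pool2d(gt, 2)  # low-pass bands
        ph = pred - F.interpolate(pl, scale_factor=2, mode='nearest')
        gh = gt - F.interpolate(gl, scale_factor=2, mode='nearest')
        loss = loss + F.mse_loss(ph, gh)  # edge/detail discrepancy
        pred, gt = pl, gl
    return loss

def total_loss(depth_pred, depth_gt, seg_logits, seg_gt,
               weights=(1.0, 1.0, 0.5, 0.5)):
    w_depth, w_seg, w_edge, w_struct = weights  # illustrative weights
    # S301: depth MSE, up-weighting damaged pixels (classes > 0)
    region_w = 1.0 + (seg_gt > 0).float().unsqueeze(1)
    l_depth = (region_w * (depth_pred - depth_gt) ** 2).mean()
    # S303: three-class cross entropy for the segmentation branch
    l_seg = F.cross_entropy(seg_logits, seg_gt)
    # S302: Haar-wavelet edge detection loss
    l_edge = haar_edge_loss(depth_pred, depth_gt)
    # S304: crude structural consistency proxy over local means
    l_struct = F.l1_loss(F.avg_pool2d(depth_pred, 3, 1, 1),
                         F.avg_pool2d(depth_gt, 3, 1, 1))
    return w_depth * l_depth + w_seg * l_seg + w_edge * l_edge + w_struct * l_struct
```

A faithful implementation would replace the structural proxy with an SSIM-style term and the pooling splits with a true 2-D Haar transform; the weighted-sum structure of S305 is what the sketch is meant to show.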
8. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 1, wherein the loss function of the wear surface depth estimation model is:
9. The wear surface damage depth estimation method based on a multi-attention mechanism according to claim 1, wherein step S4 specifically comprises:
S401, acquiring two-dimensional wear surface images with damage regions, acquiring the depth maps and damage region annotation maps of the corresponding regions, and producing training and test samples;
S402, using ResNet-50 network weights trained on the ImageNet dataset as the encoder's initialization parameters, and training the wear surface depth estimation model with the training samples produced in step S401;
S403, setting the learning rate parameters, optimizing the wear surface depth estimation model of step S402 with the adaptive moment estimation method, and inputting the test samples produced in step S401 into the trained model, thereby realizing wear surface damage region depth estimation guided by multi-attention surface feature extraction and damage region localization.
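Steps S402–S403 amount to a standard supervised loop with an Adam ("adaptive moment estimation") optimizer. A minimal sketch, assuming a two-output model and a `(image, depth, labels)` loader; the simple MSE + cross-entropy loss here stands in for the full weighted loss of claim 7.

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=50, lr=1e-4, device='cpu'):
    """Hypothetical S402-S403 loop; `model` returns (depth_pred, seg_logits)."""
    model.to(device)
    # adaptive moment estimation = Adam
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for image, depth_gt, seg_gt in loader:
            image = image.to(device)
            depth_gt, seg_gt = depth_gt.to(device), seg_gt.to(device)
            depth_pred, seg_logits = model(image)
            # stand-in for the patent's weighted four-term loss
            loss = F.mse_loss(depth_pred, depth_gt) \
                 + F.cross_entropy(seg_logits, seg_gt)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

For S402's initialization, the encoder would be built with `torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1)` so the residual stages start from ImageNet-pretrained parameters.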
10. A multi-attention mechanism based wear surface damage depth estimation system, comprising:
an extraction module for constructing a wear surface basic feature extraction layer with the four convolution stages of a ResNet-50 encoder as the backbone, combined with two convolution blocks of Conv-ReLU structure;
a fusion module for constructing a damaged region segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed by the extraction module, combining the U-Net architecture with the characteristics of wear surfaces, and fusing the two branch networks into a wear surface depth estimation model;
a function module for constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss, and a structural consistency loss function based on the damaged region segmentation network and the depth information estimation network, and obtaining the loss function of the wear surface depth estimation model constructed by the fusion module by weighted summation;
an estimation module for acquiring two-dimensional wear surface images and producing the corresponding damage region annotation maps and depth maps; selecting wear surface images with typical damage regions as training samples; taking the loss function constructed by the function module as the optimization target, training the wear surface depth estimation model constructed by the fusion module with the adaptive moment estimation method; and taking a single wear surface image as input to the trained model to obtain the damage region segmentation result map and depth information result map of the wear surface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210689847.XA CN114972882B (en) | 2022-06-17 | 2022-06-17 | Wear surface damage depth estimation method and system based on multi-attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114972882A true CN114972882A (en) | 2022-08-30 |
CN114972882B CN114972882B (en) | 2024-03-01 |
Family
ID=82964067
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210689847.XA Active CN114972882B (en) | 2022-06-17 | 2022-06-17 | Wear surface damage depth estimation method and system based on multi-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114972882B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115471505A (en) * | 2022-11-14 | 2022-12-13 | 华联机械集团有限公司 | Intelligent carton sealing machine regulation and control method based on visual identification |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2020103715A4 (en) * | 2020-11-27 | 2021-02-11 | Beijing University Of Posts And Telecommunications | Method of monocular depth estimation based on joint self-attention mechanism |
CN112381770A (en) * | 2020-11-03 | 2021-02-19 | 西安交通大学 | Wear surface three-dimensional topography measuring method based on fusion convolution neural network |
CN112967327A (en) * | 2021-03-04 | 2021-06-15 | 国网河北省电力有限公司检修分公司 | Monocular depth method based on combined self-attention mechanism |
CN114119694A (en) * | 2021-11-10 | 2022-03-01 | 中国石油大学(华东) | Improved U-Net based self-supervision monocular depth estimation algorithm |
Non-Patent Citations (2)
Title |
---|
SHUO WANG等: "Optimized CNN model for identifying similar 3D wear particles in few samples", WEAR, 15 November 2020 (2020-11-15) * |
岑仕杰;何元烈;陈小聪;: "结合注意力与无监督深度学习的单目深度估计", 广东工业大学学报, no. 04, 14 July 2020 (2020-07-14) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||