CN114972882A - Wear surface damage depth estimation method and system based on multi-attention mechanism - Google Patents

Wear surface damage depth estimation method and system based on multi-attention mechanism

Info

Publication number
CN114972882A
CN114972882A (application CN202210689847.XA); granted publication CN114972882B
Authority
CN
China
Prior art keywords
wear surface
depth
damage
relu
estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210689847.XA
Other languages
Chinese (zh)
Other versions
CN114972882B (en)
Inventor
王硕
邵涛
武通海
王青华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202210689847.XA
Publication of CN114972882A
Application granted
Publication of CN114972882B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning; classification, e.g. of video objects
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06V 10/26: Image preprocessing; segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level

Abstract

The invention discloses a wear surface damage depth estimation method and system based on a multi-attention mechanism. A wear surface basic feature extraction layer is constructed by taking the four convolution blocks of a ResNet-50 encoding layer as the backbone and combining them with two Conv-ReLU convolution blocks; a damaged area segmentation branch network and a depth information estimation branch network are fused to form a wear surface depth estimation model; the loss function of the wear surface depth estimation model is obtained by weighting; wear surface images with typical damaged areas are selected as training samples, the loss function is taken as the optimization target, the wear surface depth estimation model is trained by an adaptive moment estimation method, and a single wear surface image is input into the trained model to obtain a damaged area segmentation result map and a depth information result map of the wear surface. The method and system effectively estimate three-dimensional depth information from a single wear surface image, and solve the problems of difficult, inefficient and complex depth information acquisition in the technical field of wear surface analysis.

Description

Wear surface damage depth estimation method and system based on multi-attention mechanism
Technical Field
The invention belongs to the technical field of wear surface analysis within machine fault diagnosis, and particularly relates to a wear surface damage depth estimation method and system based on a multi-attention mechanism.
Background
The wear of friction pairs reduces the operational reliability and stability of mechanical equipment and may even lead to serious failures. The wear surface is a direct product of the wear process, and the morphology of its damaged areas characterizes the wear evolution mechanism and the wear severity. Wear surface analysis is therefore considered the most direct and reliable technical means for condition monitoring and fault diagnosis of critical tribological systems. Driven by the "predict and prevent" maintenance concept for mechanical equipment, wear surface analysis is rapidly developing toward in-situ and three-dimensional measurement, and in-situ on-machine detection based on industrial endoscopes has become a main technical means. However, the accuracy and efficiency of wear surface analysis are restricted by the complicated shapes and inconsistent sizes of damage, and acquiring three-dimensional morphology information from a single two-dimensional surface image has become a difficult point in wear surface analysis research.
Machine-vision-based wear surface topography analysis extracts three-dimensional topography information from two-dimensional wear surface images. For example, taking wear surface images obtained by a scanning electron microscope as the research object, surface three-dimensional reconstruction has been realized with a multi-view geometric constraint method. A fusion of shape-from-shading and stereo vision enables estimation of wear surface depth information. A complex-wavelet-enhanced shape-from-shading method has been used to acquire the three-dimensional surface roughness of milled mechanical parts. In addition, photometric stereo vision has been applied innovatively to the extraction of three-dimensional topography information of in-situ wear surfaces. However, these methods depend on auxiliary information such as multiple views, multiple idealized assumptions and multiple light sources, while the complex industrial environment of mechanical equipment and the endoscope imaging system make it difficult to build multi-view systems and acquire the required images, limiting the application of such methods in endoscope-based wear surface detection. For this reason, three-dimensional topography reconstruction remains an important task for wear surface analysis.
In recent years, monocular depth estimation, which establishes a mapping between the pixel values and the depth values of a single two-dimensional image, has offered a promising research direction for depth estimation of a single wear surface. However, the problems of blurred damaged-area edges and inaccurate damage shape estimation remain in depth estimation of wear surface damaged areas. The key reason is that existing monocular depth estimation models process the entire wear surface with equal weight, whereas the damaged area occupies only a small part of the wear surface, so its three-dimensional morphology cannot be effectively reconstructed.
In general, the existing three-dimensional reconstruction techniques for wear surfaces have achieved a certain engineering effect in condition monitoring and fault diagnosis. However, owing to the limited endoscope volume and the complex imaging environment inside mechanical equipment, three-dimensional topography acquisition techniques relying on multi-view and multi-light-source assistance are of limited applicability, and single-image depth estimation models suffer from blurred edges of wear damage areas and inaccurate damage topography estimation, which reduces the accuracy of extracting the three-dimensional topography of wear surface damaged areas.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the defects in the prior art, a wear surface damage depth estimation method and system based on a multiple attention mechanism, in which the multiple attention mechanism is introduced into a U-net network architecture to extract feature maps that pay greater attention to the damaged areas of the wear surface, so as to realize damage depth information estimation from a single wear surface image and provide a more effective means of acquiring three-dimensional topography information for wear surface analysis.
The invention adopts the following technical scheme:
the method for estimating the damage depth of the wear surface based on the multi-attention machine system comprises the following steps:
S1, constructing a wear surface basic feature extraction layer by taking the four convolution blocks of a ResNet-50 encoding layer as the backbone and combining them with two Conv-ReLU convolution blocks;
S2, combining the U-Net architecture with the characteristics of wear surfaces, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed in step S1, and fusing the two branch networks to form a wear surface depth estimation model;
S3, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining the loss function of the wear surface depth estimation model constructed in step S2 by weighting;
S4, acquiring two-dimensional wear surface images and making corresponding damaged area marking maps and depth maps; selecting wear surface images with typical damaged areas as training samples, taking the loss function constructed in step S3 as the optimization target, training the wear surface depth estimation model constructed in step S2 by an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damaged area segmentation result map and a depth information result map of the wear surface.
Specifically, step S1 includes:
S101, constructing two convolution blocks with the standard Conv-ReLU structure to realize primary feature extraction of the wear surface image;
S102, extracting the four convolution blocks Res1, Res2, Res3 and Res4 of the ResNet-50 feature extraction network as high-level semantic feature extraction blocks for the wear surface, and combining them with the two Conv-ReLU convolution blocks of step S101 to establish the basic feature extraction network, realizing basic feature extraction of the wear surface image.
Further, in step S101, constructing the two convolution blocks with the standard Conv-ReLU structure specifically comprises:
The first and second convolution blocks adopt the Conv-ReLU structure for primary image feature extraction, and each contains two convolution layers. Each layer of the first convolution block performs convolution with 3 kernels of size 3×3 and stride 1, followed by a ReLU activation function for nonlinear feature mapping, yielding the F0 feature map. Each layer of the second convolution block performs convolution with 64 kernels of size 3×3, with strides of 2 and 1 respectively, followed by a ReLU activation function, yielding the F1 feature map; finally, a 64-channel feature map is output to the subsequent four convolution blocks.
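For illustration only, the following is a minimal PyTorch sketch of such a basic feature extraction layer, assuming torchvision's ResNet-50 supplies the Res1-Res4 blocks (layer1-layer4); the module names and any details not stated above are assumptions rather than the patented implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50  # torchvision >= 0.13 assumed for the weights argument

class BasicFeatureExtractor(nn.Module):
    """Sketch of the wear surface basic feature extraction layer:
    two Conv-ReLU blocks followed by the four ResNet-50 blocks (Res1-Res4)."""
    def __init__(self, pretrained=True):
        super().__init__()
        # First Conv-ReLU block: two 3x3 convolutions, 3 channels, stride 1 -> F0
        self.block0 = nn.Sequential(
            nn.Conv2d(3, 3, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(3, 3, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )
        # Second Conv-ReLU block: two 3x3 convolutions, 64 channels, strides 2 and 1 -> F1
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )
        # Res1-Res4 taken from ResNet-50 (output channels 256, 512, 1024, 2048)
        backbone = resnet50(weights="IMAGENET1K_V1" if pretrained else None)
        self.res1, self.res2 = backbone.layer1, backbone.layer2
        self.res3, self.res4 = backbone.layer3, backbone.layer4

    def forward(self, x):
        f0 = self.block0(x)
        f1 = self.block1(f0)
        f2 = self.res1(f1)   # 256 channels
        f3 = self.res2(f2)   # 512 channels
        f4 = self.res3(f3)   # 1024 channels
        f5 = self.res4(f4)   # 2048 channels
        return f0, f1, f2, f3, f4, f5
```

In this sketch the F1 feature map feeds ResNet-50's layer1 directly, so the feature resolutions differ slightly from the canonical ResNet-50 stem; the text does not specify how this interface is handled, so that choice is an assumption.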
Specifically, in step S2, the damaged area segmentation branch network is established using convolution blocks with the Conv-BN-ReLU and Convtransp-BN-ReLU structures, and a damaged area segmentation result map matching the size of the two-dimensional wear surface is extracted; the depth information estimation branch network is established based on the U-Net network architecture to estimate the depth information of the damaged area.
Further, establishing the damaged area segmentation branch network specifically comprises:
Convolution layers with the Conv-BN-ReLU and Convtransp-BN-ReLU structures are connected in series. For the first structure, Conv-BN-ReLU, a convolution with a 3×3 kernel and stride 1 is performed first, the features are then normalized by a BN batch normalization operation to accelerate network convergence, and finally a ReLU activation function performs nonlinear feature mapping, yielding a noise-suppressed feature map. The second structure, Convtransp-BN-ReLU, differs from Conv-BN-ReLU only in that a 3×3 Convtransp deconvolution with stride 2 is used, enlarging the input feature map by a factor of 2. The resulting feature map is then concatenated with the feature map from the basic feature extraction layer. Finally, a convolution block with the Conv-ReLU structure maps the 32-channel P0 feature maps to the 3-channel damaged area segmentation map.
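A minimal PyTorch sketch of these two block types and of one decoder step of the segmentation branch is given below; the padding choices and the SegUpBlock/seg_head names are assumptions made for illustration, not the patented implementation.

```python
import torch
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch):
    # Conv-BN-ReLU: 3x3 convolution, stride 1, batch normalization, ReLU
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def convtransp_bn_relu(in_ch, out_ch):
    # Convtransp-BN-ReLU: 3x3 transposed convolution, stride 2 (2x upsampling)
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, 3, stride=2, padding=1, output_padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SegUpBlock(nn.Module):
    """One decoder step of the damaged area segmentation branch (sketch):
    Conv-BN-ReLU + Convtransp-BN-ReLU (2x upsampling), then skip concatenation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.refine = conv_bn_relu(in_ch, out_ch)
        self.up = convtransp_bn_relu(out_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(self.refine(x))
        return torch.cat([x, skip], dim=1)  # skip connection with encoder features

# Final mapping of the 32-channel P0 feature map to the 3-class segmentation map
seg_head = nn.Sequential(nn.Conv2d(32, 3, 3, padding=1), nn.ReLU(inplace=True))
```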
Further, establishing the depth information estimation branch network based on the U-Net network architecture specifically comprises:
Convolution blocks with the Conv-BN-ReLU and Convtransp-BN-ReLU structures are adopted as the upsampling operation, enlarging the input feature map by a factor of 2 as the input of a coordinate attention module; the enlarged feature map is fed into the coordinate attention module to obtain a feature image containing position coordinate information, which is then concatenated with the feature map extracted by the efficient pyramid segmentation attention module; the feature image with local damaged area information is extracted from the encoding-layer feature map by the efficient pyramid segmentation attention module and fused and concatenated with the features extracted by the coordinate attention module, yielding a wear surface feature image with position information, spatial information and channel information.
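The sketch below illustrates one such decoder step in PyTorch. Because the text does not detail the internals of the coordinate attention (CA) and efficient pyramid segmentation attention (EPSA) modules, they are passed in as arguments; the DepthUpBlock name and the exact layer ordering are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DepthUpBlock(nn.Module):
    """One decoder step of the depth estimation branch (sketch): 2x upsampling of the
    decoder features, coordinate attention on the result, EPSA attention on the
    corresponding encoder features, then concatenation. The CA and EPSA modules are
    injected because their internals are not specified here; any nn.Module with a
    matching channel count (e.g. nn.Identity() for a smoke test) can be substituted."""
    def __init__(self, in_ch, out_ch, ca_module, epsa_module):
        super().__init__()
        self.up = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(out_ch, out_ch, 3, stride=2, padding=1, output_padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
        self.ca = ca_module      # coordinate attention: position coordinate information
        self.epsa = epsa_module  # EPSA: local damaged area information from the encoder

    def forward(self, x, encoder_feat):
        x = self.ca(self.up(x))             # upsample by 2x, then coordinate attention
        skip = self.epsa(encoder_feat)      # attention on the encoder features
        return torch.cat([x, skip], dim=1)  # position + spatial + channel information
```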
Specifically, step S3 includes:
S301, constructing a depth estimation loss function $\mathcal{L}_{depth}$ with adaptive weight distribution for the damaged area;
S302, in view of the predicted-edge blurring caused by the large depth variation at the edge of the damaged area, and in order to improve the estimation accuracy of the damaged area edge depth values, constructing an edge detection loss function $\mathcal{L}_{edge}$ using a three-level two-dimensional Haar wavelet transform;
S303, dividing the wear surface into three classes (background, scratch and pit) and selecting a three-class cross entropy function as the loss function $\mathcal{L}_{seg}$ of the damaged area segmentation network branch;
S304, adopting a structure consistency loss function $\mathcal{L}_{ssim}$ to improve the similarity between the predicted depth map and the laser confocal microscope measurement;
S305, obtaining the loss function of the wear surface depth estimation model by weighted summation of the depth information mean square error loss obtained in step S301, the edge detection loss function obtained in step S302, the loss function of the damaged area segmentation network branch obtained in step S303 and the structure consistency loss function obtained in step S304.
In particular, the loss function $\mathcal{L}$ of the wear surface depth estimation model is:

$$\mathcal{L} = \mathcal{L}_{depth}\left(y, y^{*}\right) + \lambda\,\mathcal{L}_{edge}\left(y, y^{*}\right) + \mathcal{L}_{seg}\left(p, p^{*}\right) + \mathcal{L}_{ssim}\left(y, y^{*}\right)$$

where $y$ denotes the estimated wear surface depth map, $y^{*}$ the depth map measured by the laser confocal microscope, $p$ the predicted pixel class, $p^{*}$ the actual pixel class, and $\lambda$ the weight coefficient controlling the edge loss term of the depth map.
Specifically, step S4 includes:
S401, acquiring two-dimensional wear surface images with damaged areas, acquiring depth images of the corresponding areas and marking images of the corresponding damaged areas, and making the training samples and test samples;
S402, using the ResNet-50 network weights trained on the ImageNet data set as the encoder initialization weight parameters, and training the wear surface depth estimation model with the training samples made in step S401;
S403, setting the learning rate parameter, optimizing and training the wear surface depth estimation model of step S402 by the adaptive moment estimation method, and inputting the test samples made in step S401 into the optimally trained wear surface depth estimation model, realizing wear surface damaged area depth information estimation based on multi-attention surface feature extraction and damaged area localization guidance.
In a second aspect, an embodiment of the present invention provides a wear surface damage depth estimation system based on a multi-attention mechanism, comprising:
the extraction module is used for constructing a wear surface basic feature extraction layer by taking the four convolution blocks of a ResNet-50 encoding layer as the backbone and combining them with two Conv-ReLU convolution blocks;
the fusion module is used for combining the U-Net architecture with the characteristics of wear surfaces, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed by the extraction module, and fusing the two branch networks as the wear surface depth estimation model;
the function module is used for constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining a loss function of the wear surface depth estimation model constructed by the fusion module in a weighting mode;
the estimation module is used for acquiring a two-dimensional wear surface image and making a corresponding damage area marking map and a depth map; selecting a wear surface image with a typical damage area as a training sample, taking a loss function constructed by a function module as an optimization target, training a wear surface depth estimation model constructed by a fusion module by adopting an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damage area segmentation result image and a depth information result image of the wear surface.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a wear surface damage depth estimation method based on a multi-attention machine system, which takes a single two-dimensional wear surface image as a research object, selects a ResNet-50 coding layer as a basic coding layer trunk and extracts characteristic graphs of wear surfaces under different scales; establishing a damaged area segmentation branch network and a depth information estimation branch network by combining an efficient pyramid segmentation attention module (EPSA) module, a Coordinate Attention (CA) module and the like to detect a damaged area of the wear surface and estimate depth information of the damaged area; the network model adopts the weighted sum of four types of loss functions including depth information mean square error loss, damage detection cross entropy loss, edge detection mean square error loss and structure consistency loss as an overall loss function, parameter training is carried out by using an Adam optimization algorithm to obtain a final surface depth estimation model, the depth information estimation of a damaged area of a wear surface is realized, and more effective information is provided for wear mechanism and state analysis
Further, step S1 constructs a basic feature extraction network structure based on the Conv-ReLU convolution blocks and the ResNet-50 feature extraction convolution blocks, so as to extract semantic features of the wear surface, provide semantic feature maps of different scales for the subsequent damaged area segmentation network and depth estimation network, and help improve the damaged area depth estimation accuracy.
Further, step S101 adopts two Conv-ReLU convolution blocks for primary semantic feature extraction of the wear surface, extracting a 64-channel F1 feature map from the 3-channel damaged surface input and providing rich primary semantic features for the subsequent ResNet-50 feature extraction convolution blocks, which take the 64-channel F1 feature map up to the 2048-channel F5 feature map, thereby realizing multi-level semantic feature extraction of the wear surface.
Further, in step S2, a damaged region segmentation branch network is established by using convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU, and a damaged region segmentation result map corresponding to the size of the two-dimensional wear surface is extracted, so as to provide more damaged region feature maps for the depth estimation network, constrain and position the damaged region, and improve the depth estimation accuracy of the depth estimation network for the damaged region; and establishing a depth information estimation branch network based on the U-Net network architecture, and fully utilizing the multi-level semantic features and the damaged area segmentation feature map in the step S1 to realize and improve the depth information estimation precision of the damaged area.
Further, a damage region segmentation branch network is constructed to obtain the position of the damage region and a corresponding characteristic diagram, so that rich damage region characteristic diagrams are provided for a subsequent depth estimation branch network.
Further, based on a U-Net network architecture, the multi-level semantic features extracted in step S1, the feature map extracted by the damaged region segmentation branch network, and the features extracted by the multi-attention mechanism module are continuously fused and superimposed, the feature map is continuously restored to the depth estimation result map with the same size as the input worn surface image by using upsampling, and meanwhile, higher damaged region texture detail information can be maintained.
Further, in step S3, a basic depth loss function $\mathcal{L}_{depth}$ for the damaged area is first constructed in step S301 to constrain the difference between the depth estimation result and the label depth map acquired by the LSCM; next, an edge detection loss function $\mathcal{L}_{edge}$ is added on the basis of step S301 to improve the sharpness of the estimated damaged area edges; after steps S301 and S302, a loss function $\mathcal{L}_{seg}$ is employed to classify the estimated pixels and segment the damaged area, so that the depth information of the damaged area can finally be estimated and extracted; finally, a structure consistency loss function $\mathcal{L}_{ssim}$ is adopted to improve the overall accuracy of the final estimated depth image of the damaged area.
Further, the final loss function is composed of the depth information mean square error loss, the damage detection cross entropy loss, the edge detection mean square error loss and the structure consistency loss function. The depth information mean square error loss $\mathcal{L}_{depth}$ is the loss function of the depth estimation branch and ensures the accuracy of the depth map estimation; the damage detection cross entropy loss $\mathcal{L}_{seg}$ accurately segments the damaged area; the edge detection mean square error loss $\mathcal{L}_{edge}$ is used to improve the edge sharpness of the damaged area in the depth map; and the structure consistency loss function is used to improve the overall topography accuracy of the final wear surface depth map.
Further, the wear surface damaged area marking maps and depth maps are made to serve as training data for the designed depth estimation network, and the loss function designed in step S3 is used as the optimization target to ensure that the result estimated by the final model is closer to the depth map acquired by the laser confocal microscope; the wear surface depth estimation model constructed in step S2 is trained by the adaptive moment estimation method to improve the convergence rate of the network, so that the depth estimation model can be trained quickly and a better depth estimation result can be obtained.
It is understood that the beneficial effects of the second aspect can be referred to the related description of the first aspect, and are not described herein again.
In conclusion, based on the monocular depth estimation model, the invention effectively estimates three-dimensional depth information from a single wear surface image and solves the problems of difficult, inefficient and complex depth information acquisition in the technical field of wear surface analysis.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a framework diagram of a wear surface depth estimation step;
FIG. 2 is a flow chart of the production of a two-dimensional wear surface map, a wear surface depth map, and a damaged area signature map;
FIG. 3 is a three-dimensional visualization of typical wear surface damage maps and their corresponding depth maps, damage region signature maps, and depths;
FIG. 4 is a network architecture diagram of a wear surface damage region depth estimation model based on multiple self-attention mechanism fusion;
FIG. 5 is a comparison of loss values during model training with different attention mechanism models;
fig. 6 is a depth estimation result map of three types of wear surfaces and a corresponding depth three-dimensional visualization map.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be understood that the terms "comprises" and/or "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and including such combinations, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe preset ranges, etc. in embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish preset ranges from each other. For example, the first preset range may also be referred to as a second preset range, and similarly, the second preset range may also be referred to as the first preset range, without departing from the scope of the embodiments of the present invention.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a wear surface damage depth estimation method based on a multi-attention mechanism, which takes a two-dimensional wear surface image as the research object, makes a corresponding damaged area marking map, and makes a corresponding standard depth map with a laser confocal microscope; on this basis, a dual-task wear surface depth estimation network model is constructed based on the U-Net framework; a ResNet-50 network serves as the basic feature extraction network to extract the wear surface basic features, which are input into the damaged area segmentation and depth information estimation dual-branch network; the feature maps of a coordinate attention (CA) module, an efficient pyramid segmentation attention (EPSA) module and the damaged area segmentation branch are merged into the depth estimation network branch to obtain feature maps containing damaged area information; the constructed network model adopts the weighted sum of the depth information mean square error loss, damage detection cross entropy loss, edge detection mean square error loss and structure consistency loss function as the overall loss function, and the final wear surface depth estimation model is obtained by training the model with the Adam optimization algorithm, realizing depth estimation of the wear surface damaged area; the method is based on a monocular depth estimation model, effectively estimates three-dimensional depth information from a single wear surface image, and solves the problems of difficult, inefficient and complex depth information acquisition in the technical field of wear surface analysis.
Referring to fig. 1, a depth estimation model of a wear surface damaged area based on the fusion of multiple self-attention mechanisms is shown. Three-dimensional morphology information of the wear surface is the basis of the abrasive particle analysis technique and directly affects the precision of wear surface damage detection and state identification. However, three-dimensional reconstruction methods based on multiple views, multiple light sources and the like are difficult to apply in the small and narrow measurement environment of an industrial endoscope, and wear surface damage depth estimation based on monocular depth estimation still suffers from inaccurate damaged area shape estimation and blurred edges, which greatly reduces the damage shape reconstruction accuracy of monocular depth estimation models in practical applications. In the invention, a multiple attention mechanism is integrated into the encoding layer of the depth estimation branch based on the U-Net network framework to improve the ability of the wear surface feature maps to characterize the damaged area; a damaged area segmentation branch is then introduced into the wear surface depth estimation model to provide additional damaged area localization feature maps for the depth estimation branch; finally, a dual-branch task model is constructed to realize detection of the wear surface damaged area and estimation of its depth information.
The wear surface damage depth estimation method based on the multi-attention mechanism of the invention comprises the following steps:
S1, constructing a wear surface basic feature extraction layer by taking the four convolution blocks of a ResNet-50 encoding layer as the backbone and combining them with two Conv-ReLU convolution blocks, so as to realize automatic extraction of multi-scale wear surface features as the feature input for step S2;
S101, constructing two convolution blocks with the standard Conv-ReLU structure to realize primary feature extraction of the wear surface image;
referring to fig. 2, the basic feature extraction layer includes six volume blocks, wherein the first and second volume blocks adopt convolution blocks with a structure of Conv-ReLU to realize primary feature extraction of the wear surface image;
the construction of two convolution blocks with the standard structure Conv-ReLU specifically comprises the following steps:
The first and second convolution blocks adopt the Conv-ReLU structure for primary image feature extraction, and each contains two convolution layers. Each layer of the first convolution block performs convolution with 3 kernels of size 3×3 and stride 1, followed by a ReLU activation function for nonlinear feature mapping, yielding the F0 feature map. Each layer of the second convolution block performs convolution with 64 kernels of size 3×3, with strides of 2 and 1 respectively, followed by a ReLU activation function, yielding the F1 feature map; finally, a 64-channel feature map is output to the subsequent four convolution blocks.
S102, extracting the four convolution blocks Res1, Res2, Res3 and Res4 of the ResNet-50 feature extraction network as high-level semantic feature extraction blocks for the wear surface, and combining them with the two Conv-ReLU convolution blocks of step S101 to establish the basic feature extraction network, realizing basic feature extraction of the wear surface image.
The last four convolution blocks adopt the four ResNet-50 blocks Res1, Res2, Res3 and Res4 for further wear surface feature extraction; the numbers of feature channels output by these blocks are 256, 512, 1024 and 2048 respectively, yielding the F2-F5 feature maps and realizing high-level feature extraction of the wear surface image.
S2, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed in the step S1 by combining the U-Net structure and the wear surface characteristics, and fusing the damaged area segmentation branch network and the depth information estimation branch network to form a wear surface depth estimation model;
s201, establishing a damaged area segmentation branch by adopting convolution blocks with structures of Conv-BN-ReLU and Convtransp-BN-ReLU, and extracting a damaged area segmentation result graph corresponding to the size of the two-dimensional wear surface.
The damaged area segmentation branch is a decoder of the U-Net framework, and its forward propagation process is the reverse of the encoder. In constructing the decoder, each step applies an upsampling operation that enlarges the input feature layer by a factor of 2; after five upsampling operations and skip connection operations, the features are finally mapped into a damaged area segmentation result map of the same size as the wear surface image.
The specific steps of establishing the damaged area division branch network are as follows:
s2011, upsampling operation
The structure is mainly composed of convolution layers of Conv-BN-ReLU and Convtransp-BN-ReLU which are connected in series; for a first structure Conv-BN-ReLU, firstly carrying out convolution operation through a convolution kernel with the size of 3 multiplied by 3 and a step size stride equal to 1, then carrying out normalization processing on characteristics by adopting BN batch normalization operation, accelerating network convergence, and finally carrying out characteristic nonlinear mapping by adopting a ReLU activation function to obtain a characteristic diagram after noise suppression; for the second structure Convtransp-BN-ReLU, the difference from the structure Conv-BN-ReLU is that a Convtransp deconvolution operation of 3 × 3 size is used, and the step size stride is 2, finally achieving a 2-fold enlargement of the input feature map;
S2012, skip connection operation
After the upsampling operation, the feature map is concatenated with the feature map from the basic feature extraction layer, so that the feature maps retain higher resolution during upsampling.
S2013, convolution Block 1 operation
Finally, a convolution block with the Conv-ReLU structure maps the 32-channel P0 feature maps to the 3-channel damaged area segmentation map.
S202, establishing a depth information estimation branch network based on a U-Net network architecture to estimate depth information of a damaged area;
the depth information estimation branch structure also corresponds to the coding layer, and the convolution blocks using Conv-BN-ReLU and Convtransp-BN-ReLU perform an upsampling operation to restore the feature map size to be the same as the wear surface image. In order to fully extract the characteristic image with the local damage area information, a Coordinate Attention (CA) module and an Efficient Pyramid Segmentation Attention (EPSA) module are respectively introduced, and a characteristic map from a damage area segmentation branch is merged in the last convolution block operation, so that the attention to the typical wear area information of the wear surface in the wear surface depth estimation process is further strengthened, and the damage area depth information estimation is finally realized.
S2021, adopting convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU as an upsampling operation;
The feature map obtained by upsampling the features from the basic encoding layer (enlarged by a factor of 2) is input into a Coordinate Attention (CA) module for further attention feature extraction, and a feature image containing position coordinate information is acquired.
S2022, embedding a Coordinate Attention (CA) module, and acquiring a feature image containing position coordinate information;
During the skip connection, the basic encoding-layer features at the corresponding positions are processed by an Efficient Pyramid Segmentation Attention (EPSA) module to obtain feature images with local damaged area information, which are then fused and concatenated with the feature images containing position coordinate information, yielding a wear surface feature map with damaged area position information, spatial information and channel information.
S2023, extracting the feature image with the local damage area information from the feature image of the coding layer through an Efficient Pyramid Segmentation Attention (EPSA) module, and fusing and cascading the feature image with the features extracted by the CA module to obtain the wear surface feature image with position information, space information and channel information.
After five rounds of the upsampling-CA-EPSA-skip-connection cascade operation, the feature maps from the damaged area segmentation network are fused, and finally the 64-channel C0 feature maps are mapped to the single-channel wear surface depth map by a convolution block with the Conv-ReLU structure.
S3, aiming at the damaged area segmentation and depth information estimation double-branch structure constructed in the step S2, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function, and obtaining an integral model loss function in a weighting mode to improve the depth estimation precision of the damaged area of the model;
the loss function is the target of network optimization training, and network parameter optimization is guided by back propagation through errors between prediction results of the damage segmentation graph and the depth graph and a real labeled graph. The constructed damaged region segmentation branch and the depth information estimation branch respectively correspond to damage detection cross entropy loss and depth information mean square error loss, and in order to solve the problems of edge blurring and damaged region morphology blurring of a predicted depth map, a Haar wavelet-based edge detection mean square error loss and structure consistency loss function are adopted, and an integral model loss function is constructed in a weighting mode.
S301, by calculating the mean square error between the estimated depth map of the worn surface and the depth map obtained by the laser confocal microscope, the difference between the predicted depth map and a target value is constrained, more weights are distributed to the damaged area, the problem that the worn surface area is not uniformly distributed is solved, and a depth estimation loss function of self-adaptive distribution of the weights of the damaged area is constructed, so that the accuracy of estimation of the depth information of the damaged area of the model is improved. As shown in formula (1) and formula (2);
Figure BDA0003701204660000141
a=Num other areas /Num damage areas (2)
wherein i represents the coordinate position of the pixel point, N represents the number of the image pixel points, and y i An estimated depth value representing the wear surface,
Figure BDA0003701204660000142
representing the depth value measured by a laser confocal microscope, wherein a is a super parameter for controlling the weight of each pixel point, and Num represents the number of the pixel points in the area;
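A minimal PyTorch sketch of this weighted mean square error, under the assumption just stated that the per-pixel weight equals a on damaged-area pixels and 1 elsewhere:

```python
import torch

def depth_loss(pred, target, damage_mask):
    """Adaptive-weight MSE depth loss (sketch). damage_mask is 1 on damaged-area
    pixels and 0 elsewhere; a = Num_other / Num_damage as in formula (2)."""
    num_damage = damage_mask.sum().clamp(min=1)
    num_other = (1 - damage_mask).sum()
    a = num_other / num_damage
    weights = torch.where(damage_mask.bool(), a, torch.ones_like(pred))
    return (weights * (pred - target) ** 2).mean()
```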
S302, in view of the predicted-edge blurring caused by the large depth variation at the edge of the damaged area, and in order to improve the estimation accuracy of the damaged area edge depth values, an edge detection loss function is constructed using a three-level two-dimensional Haar wavelet transform, as shown in formula (3):

$$\mathcal{L}_{edge} = \frac{1}{N}\sum_{j=1}^{3}\left(\left\|V_j(y)-V_j(y^{*})\right\|^2 + \left\|C_j(y)-C_j(y^{*})\right\|^2 + \left\|H_j(y)-H_j(y^{*})\right\|^2\right) \tag{3}$$

where $V$, $C$ and $H$ correspond to the vertical, diagonal and horizontal high-frequency coefficients of the depth image at each decomposition level $j$, $L$ is the low-frequency coefficient, $y$ denotes the estimated wear surface depth map, and $y^{*}$ denotes the depth map measured by the laser confocal microscope.
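A PyTorch sketch of such an edge loss follows, under the assumption that the loss accumulates the mean square error of the high-frequency Haar sub-bands over three decomposition levels; the single-level Haar transform is implemented with fixed 2×2 strided filters.

```python
import torch
import torch.nn.functional as F

def haar_dwt(x):
    """One level of a 2D Haar transform via fixed 2x2 strided filters.
    Returns (LL, LH, HL, HH): low-frequency, horizontal, vertical, diagonal bands."""
    k = 0.5 * torch.tensor([[[[1., 1.], [1., 1.]]],     # LL
                            [[[1., 1.], [-1., -1.]]],   # LH (horizontal detail)
                            [[[1., -1.], [1., -1.]]],   # HL (vertical detail)
                            [[[1., -1.], [-1., 1.]]]],  # HH (diagonal detail)
                           device=x.device, dtype=x.dtype)
    out = F.conv2d(x, k, stride=2)          # x: (B, 1, H, W) depth map
    return out[:, 0:1], out[:, 1:2], out[:, 2:3], out[:, 3:4]

def edge_loss(pred, target, levels=3):
    """Haar-wavelet edge loss (sketch, assumed form): MSE of the high-frequency
    sub-bands accumulated over `levels` decomposition levels."""
    loss = 0.0
    for _ in range(levels):
        pred, ph, pv, pd = haar_dwt(pred)
        target, th, tv, td = haar_dwt(target)
        loss = loss + F.mse_loss(ph, th) + F.mse_loss(pv, tv) + F.mse_loss(pd, td)
    return loss
```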
S303, considering that the typical damage types of the wear surface are scratches and pits, the wear surface is divided into three classes (background, scratch and pit), and a three-class cross entropy function is selected as the loss function of the damaged area segmentation network branch, as shown in formula (4):

$$\mathcal{L}_{seg} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} p_{ic}\log\left(\hat{p}_{ic}\right) \tag{4}$$

where $N$ is the number of image pixels, $M = 3$ is the number of area classes, $p_{ic}$ is the area label vector of length 3 taking values 0 and 1, and $\hat{p}_{ic}$ is the predicted probability that sample $i$ belongs to class $c$.
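In PyTorch this three-class pixel-wise cross entropy can be sketched with the built-in criterion, assuming the segmentation branch outputs raw 3-channel logits and the marking map stores class indices (0 = background, 1 = scratch, 2 = pit):

```python
import torch.nn as nn

# seg_logits: (B, 3, H, W) raw outputs of the damaged area segmentation branch
# seg_labels: (B, H, W) integer class map: 0 = background, 1 = scratch, 2 = pit
seg_criterion = nn.CrossEntropyLoss()

def seg_loss(seg_logits, seg_labels):
    return seg_criterion(seg_logits, seg_labels)
```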
S304, in order to improve the similarity between the predicted depth map and the laser confocal microscope measurement, a structure consistency loss function is adopted, as shown in formula (5):

$$\mathcal{L}_{ssim} = 1 - \frac{\left(2\mu_{y}\mu_{y^{*}} + c_1\right)\left(2\sigma_{y y^{*}} + c_2\right)}{\left(\mu_{y}^2 + \mu_{y^{*}}^2 + c_1\right)\left(\sigma_{y}^2 + \sigma_{y^{*}}^2 + c_2\right)} \tag{5}$$

where $\mu_{y}$ and $\mu_{y^{*}}$ denote the means of $y$ and $y^{*}$ respectively, $\sigma_{y}^2$ and $\sigma_{y^{*}}^2$ denote their variances, $\sigma_{y y^{*}}$ denotes the covariance of $y$ and $y^{*}$, $c_1$ and $c_2$ are constants that prevent the denominator from being zero, $y$ denotes the estimated wear surface depth map, and $y^{*}$ denotes the depth map measured by the laser confocal microscope.
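A whole-image PyTorch sketch of such a structural consistency loss follows; practical SSIM implementations usually operate on local windows, so this simplified global form and the constant values are illustrative assumptions:

```python
import torch

def ssim_loss(pred, target, c1=1e-4, c2=9e-4):
    """Structure consistency loss (sketch): 1 - SSIM computed over the whole image.
    c1 and c2 are small stabilizing constants (values here are illustrative)."""
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(unbiased=False), target.var(unbiased=False)
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / \
           ((mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1 - ssim
```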
S305, the loss function of the overall model is obtained by weighted summation of the depth information mean square error loss, damage detection cross entropy loss, edge detection mean square error loss and structure consistency loss function, as shown in formula (6):

$$\mathcal{L} = \mathcal{L}_{depth} + \lambda\,\mathcal{L}_{edge} + \mathcal{L}_{seg} + \mathcal{L}_{ssim} \tag{6}$$

where $\lambda$ is the weight coefficient controlling the edge loss term of the depth map.
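Combining the sketches above, the total training loss might be assembled as follows; only the single coefficient λ named in the text is used, and weighting the remaining terms by 1 is an assumption:

```python
def total_loss(depth_pred, depth_gt, seg_logits, seg_labels, damage_mask, lam=1.0):
    """Weighted sum of the four loss terms (sketch), following formula (6);
    lam plays the role of the edge-loss weight coefficient lambda."""
    return (depth_loss(depth_pred, depth_gt, damage_mask)
            + lam * edge_loss(depth_pred, depth_gt)
            + seg_loss(seg_logits, seg_labels)
            + ssim_loss(depth_pred, depth_gt))
```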
S4, acquiring two-dimensional wear surface images and making corresponding damaged area marking maps and depth maps; selecting not less than 500 groups of wear surface images with typical damaged areas as training samples, taking the model loss function constructed in step S3 as the optimization target, training the constructed wear surface depth estimation model by the adaptive moment estimation method (Adam), and, with the trained depth estimation model, taking a single wear surface image as the model input and finally outputting a damaged area segmentation result map and a depth information result map of the wear surface.
Referring to fig. 3, a surface image of a given position on the surface to be measured is acquired with an in-situ surface image acquisition system, and the damaged area of the surface image is marked with image labeling software to obtain the damaged area marking map; fig. 4 shows three types of measured wear surface maps.
S401, acquiring a two-dimensional wear surface image with a damaged area through a handheld digital microscope, acquiring a depth image of a corresponding area through a laser confocal microscope, acquiring a marking image of the corresponding damaged area through image marking software, and realizing the manufacture of training and testing samples;
S402, using the ResNet-50 network weights trained on the ImageNet data set as the encoder initialization weight parameters, and further training the wear surface depth estimation model with the training samples made in step S401;
S403, setting the learning rate parameter, optimizing and training the wear surface depth estimation model by the adaptive moment estimation method (Adam), and realizing wear surface damaged area depth information estimation based on multi-attention surface feature extraction and damaged area localization guidance.
The learning rate parameter is set to 0.0003, and the constructed wear surface depth estimation model is trained and optimized using the adaptive moment estimation method (Adam), with 200 training iterations and 2 input images per batch. The convergence rates of the network models under different attention mechanism combinations are compared, and the training process is shown in FIG. 5. In this way, depth information estimation of the wear surface damaged area based on multi-attention surface feature extraction and damaged area localization guidance is realized.
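A minimal training-loop sketch consistent with these settings (Adam, learning rate 0.0003, 200 iterations, batch size 2) is given below; the dataset and the dual-branch model interface are placeholder assumptions, and total_loss refers to the composite loss sketched earlier.

```python
import torch
from torch.utils.data import DataLoader

# The dataset is assumed to yield (image, depth map, segmentation label map,
# damage mask) tuples, and the model to return (depth_pred, seg_logits);
# neither interface is specified in the original text.
def train(model, dataset, epochs=200, batch_size=2, lr=3e-4, device="cuda"):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for epoch in range(epochs):
        for image, depth_gt, seg_labels, damage_mask in loader:
            image, depth_gt = image.to(device), depth_gt.to(device)
            seg_labels, damage_mask = seg_labels.to(device), damage_mask.to(device)
            depth_pred, seg_logits = model(image)
            loss = total_loss(depth_pred, depth_gt, seg_logits, seg_labels, damage_mask)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```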
In another embodiment of the present invention, a wear surface damage depth estimation system based on a multi-attention mechanism is provided, which can be used to implement the above wear surface damage depth estimation method based on the multi-attention mechanism; specifically, the system includes an extraction module, a fusion module, a function module and an estimation module.
The extraction module is used for constructing a wear surface basic feature extraction layer by taking the four convolution blocks of a ResNet-50 encoding layer as the backbone and combining them with two Conv-ReLU convolution blocks;
The fusion module is used for combining the U-Net architecture with the characteristics of wear surfaces, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed by the extraction module, and fusing the two branch networks as the wear surface depth estimation model;
the function module is used for constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining a loss function of the wear surface depth estimation model constructed by the fusion module in a weighting mode;
the estimation module is used for acquiring a two-dimensional wear surface image and making a corresponding damage area marking map and a depth map; selecting a wear surface image with a typical damage area as a training sample, taking a loss function constructed by a function module as an optimization target, training a wear surface depth estimation model constructed by a fusion module by adopting an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damage area segmentation result image and a depth information result image of the wear surface.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present invention compares the variation of the training loss values of models without the attention mechanism and with different combinations of attention mechanisms, as shown in fig. 5. Number 1 denotes the model without any attention mechanism, number 2 the model with only the CA coordinate attention module, number 3 the model with only the EPSA efficient pyramid segmentation attention module, and number 4 the depth estimation model with both the CA and EPSA attention modules. As can be seen from the figure, number 4 converges fastest during training and attains the minimum loss error value of 38.6, while the errors of numbers 1, 2 and 3 are 83.2, 69.1 and 45.6 respectively, which further demonstrates the superiority of the model constructed by the invention.
Based on the model trained with the attention combination of Number 4, three damaged wear surface maps, namely a scratch map, a pit map, and a combined scratch-and-pit map, are input, and the corresponding depth maps are predicted. To verify the effectiveness of the constructed damaged area depth estimation model, the differences between the wear surface depth estimation results of different methods and the depth maps acquired by a laser confocal microscope are compared, with the results shown in fig. 6. As can be seen from the figure, the damaged area depth estimation model constructed by the invention achieves the smallest estimation error: compared with the depth map acquired by the laser confocal microscope, its root mean square error (RMSE) is 1.49 μm, far smaller than the 61.64 μm of the SE-Net method and the 41.33 μm of the BS-Net method, which further proves the effectiveness of the model constructed by the invention.
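The RMSE figures quoted above compare each predicted depth map against the laser confocal reference pixel by pixel; a small helper of the following kind could compute it. The function name and the optional damage-region mask are illustrative assumptions, not part of the patent.

```python
# Hypothetical helper for the RMSE metric quoted above (units follow the depth maps,
# i.e. micrometres); the optional mask argument is an assumption.
import numpy as np


def depth_rmse(pred_depth, confocal_depth, mask=None):
    """RMSE between a predicted depth map and a laser-confocal reference map."""
    diff = (np.asarray(pred_depth, dtype=np.float64)
            - np.asarray(confocal_depth, dtype=np.float64))
    if mask is not None:
        diff = diff[np.asarray(mask, dtype=bool)]  # restrict to damaged-area pixels
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    pred = np.random.rand(256, 256) * 50.0   # stand-in predicted depth map, in micrometres
    ref = pred + np.random.randn(256, 256)   # stand-in confocal measurement
    print(f"RMSE = {depth_rmse(pred, ref):.2f} um")
```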
In conclusion, the wear surface damage depth estimation method and system based on the multi-attention mechanism construct a dual-task branch model for damaged area segmentation and depth information estimation, so that the depth information of the damaged area of the wear surface and the corresponding damage location and damage type are obtained synchronously within a single U-Net framework, providing effective guidance for wear-surface-based condition monitoring and fault diagnosis. A comprehensive loss function taking the depth information mean square error loss, the damage detection cross entropy loss, the edge detection mean square error loss and the structure consistency loss as optimization targets is constructed, which enhances the sensitivity to depth estimation deviations in the damaged area and thereby improves the accuracy of the depth information estimated by the model for the damaged area.
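As a rough illustration of how the four loss terms named above might be weighted into a single optimization target, consider the following sketch. The weight values and the placeholder handling of the structure consistency term are assumptions, since the exact formulas are not reproduced in this text.

```python
# Hypothetical composite loss combining the four terms named above; the individual
# term implementations and the weights w_* are placeholders, not the patented formulas.
import torch
import torch.nn.functional as F


def composite_loss(depth_pred, depth_gt, seg_logits, seg_gt,
                   edge_pred, edge_gt, ssim_term,
                   w_depth=1.0, w_seg=1.0, w_edge=0.5, w_ssim=0.5):
    l_depth = F.mse_loss(depth_pred, depth_gt)    # depth information mean square error loss
    l_seg = F.cross_entropy(seg_logits, seg_gt)   # damage detection cross entropy loss
    l_edge = F.mse_loss(edge_pred, edge_gt)       # edge detection mean square error loss
    l_struct = ssim_term                          # structure consistency term (precomputed here)
    return w_depth * l_depth + w_seg * l_seg + w_edge * l_edge + w_ssim * l_struct


if __name__ == "__main__":
    d_pred, d_gt = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
    s_logits = torch.randn(2, 3, 64, 64)
    s_gt = torch.randint(0, 3, (2, 64, 64))
    e_pred, e_gt = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
    ssim_term = torch.tensor(0.1)  # stand-in structure-consistency value
    print(composite_loss(d_pred, d_gt, s_logits, s_gt, e_pred, e_gt, ssim_term))
```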
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. The method for estimating the damage depth of a wear surface based on a multi-attention mechanism is characterized by comprising the following steps:
S1, constructing a wear surface basic feature extraction layer by taking the four convolution blocks of the ResNet-50 coding layer as the main body and combining them with two Conv-ReLU convolution blocks;
S2, constructing a damaged area segmentation branch network and a depth information estimation branch network based on the wear surface basic feature extraction layer constructed in step S1, combining the U-Net architecture with the characteristics of the wear surface, and fusing the two branch networks to form a wear surface depth estimation model;
S3, constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining the loss function of the wear surface depth estimation model constructed in step S2 by weighting these terms;
S4, acquiring two-dimensional wear surface images and preparing the corresponding damaged area label maps and depth maps; selecting wear surface images with typical damaged areas as training samples, taking the loss function constructed in step S3 as the optimization target, training the wear surface depth estimation model constructed in step S2 by the adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain the damaged area segmentation result map and the depth information result map of the wear surface.
2. The method for estimating the damage depth of a wear surface based on a multi-attention mechanism according to claim 1, wherein step S1 is specifically as follows:
S101, constructing two convolution blocks with the standard Conv-ReLU structure to realize preliminary feature extraction from the wear surface image;
S102, extracting the four convolution stages Res1, Res2, Res3 and Res4 of the ResNet-50 feature extraction network as high-level semantic feature extraction blocks for the wear surface, and combining them with the two Conv-ReLU convolution blocks of step S101 to establish the basic feature extraction network, thereby realizing basic feature extraction from the wear surface image.
3. The method for estimating damage depth of a wear surface based on a multi-attention mechanism as claimed in claim 2, wherein in step S101, constructing two convolution blocks with a standard structure of Conv-ReLU is specifically:
the first convolution block and the second convolution block adopt convolution blocks with the structure of Conv-ReLU to perform primary image feature extraction, and each convolution block comprises two layers of convolution operation; performing convolution operation on each layer of a first convolution block by adopting 3 convolution kernels with the size of 3 multiplied by 3 and a step size stride of 1, and performing characteristic nonlinear mapping by adopting a ReLU activation function to obtain an F0 characteristic diagram; the second convolution block carries out convolution operation by adopting 64 convolution kernels with the size of 3 × 3 and the step length of stride 2 and stride 1 respectively, then carries out characteristic nonlinear mapping by adopting a ReLU activation function to obtain an F1 characteristic diagram, and finally outputs the characteristic diagram of 64 channels to the following four convolution blocks.
4. The method for estimating damage depth to a wear surface based on a multi-attention mechanism as claimed in claim 1, wherein in step S2, a damage region segmentation branch network is established by using convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU, and a damage region segmentation result map corresponding to the size of the two-dimensional wear surface is extracted; and establishing a depth information estimation branch network based on the U-Net network architecture, and estimating the depth information of the damaged area.
5. The method for estimating the damage depth of the wear surface based on the multi-attention mechanism according to claim 4, wherein the establishing of the damage region segmentation branch network specifically comprises:
the convolution layers with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU are connected in series; for a first structure Conv-BN-ReLU, firstly carrying out convolution operation through a convolution kernel with the size of 3 multiplied by 3 and a step size stride equal to 1, then carrying out normalization processing on characteristics by adopting BN batch normalization operation, accelerating network convergence, and finally carrying out characteristic nonlinear mapping by adopting a ReLU activation function to obtain a characteristic diagram after noise suppression; for the second structure Convtransp-BN-ReLU, the difference from the structure Conv-BN-ReLU is that a Convtransp deconvolution operation of 3 × 3 size is used, and the step size stride is 2, which finally achieves a 2-fold enlargement of the input feature map; then cascading the feature graph with the feature graph from the basic feature extraction layer; and finally, mapping the P0 feature maps of the 32 channels to a damaged area segmentation map of the 3-dimensional channels by adopting a convolution block with the structure of Conv-ReLU.
6. The method for estimating the damage depth of the wear surface based on the multi-attention mechanism according to claim 4, wherein the establishing of the depth information estimation branch network based on the U-Net network architecture is specifically as follows:
convolution blocks with the structures of Conv-BN-ReLU and Convtransp-BN-ReLU are adopted as upsampling operation; expanding the input feature map by two times to be used as the input of a coordinate attention module; embedding the amplified feature map into a coordinate attention module, acquiring a feature image containing position coordinate information, and performing cascade combination on the feature image and the feature map extracted by the efficient pyramid module; and extracting a characteristic image with local damage area information from the characteristic image of the coding layer through a high-efficiency pyramid segmentation attention module, and fusing and cascading the characteristic image with the characteristics extracted by the coordinate attention module to obtain a wear surface characteristic image with position information, space information and channel information.
7. The method for estimating the damage depth of a wear surface based on a multi-attention mechanism according to claim 1, wherein step S3 is specifically as follows:
S301, constructing a depth estimation loss function with adaptive weight distribution over the damaged area;
S302, aiming at the problem of blurred edge prediction caused by the large depth variation at the edge of the damaged area, constructing an edge detection loss function based on a three-level two-dimensional Haar wavelet transform in order to improve the estimation accuracy of the edge depth values of the damaged area;
S303, dividing the wear surface into three classes, namely background, scratch and pit, and selecting a three-class cross entropy function as the loss function of the damaged area segmentation network branch;
S304, adopting a structure consistency loss function to improve the similarity between the predicted depth map and the measurement result of the laser confocal microscope;
and S305, obtaining a loss function of the wear surface depth estimation model through a weighted summation mode on the basis of the depth information mean square error loss obtained in the step S301, the edge detection loss function obtained in the step S302, the loss function of the damaged area segmentation network branch obtained in the step S303 and the structure consistency loss function obtained in the step S304.
8. The method for estimating the damage depth of a wear surface based on a multi-attention mechanism according to claim 1, wherein the loss function of the wear surface depth estimation model is obtained by weighted combination of the depth estimation loss, the edge detection loss, the damaged area segmentation loss and the structure consistency loss, wherein y represents the estimated wear surface depth map, ŷ represents the depth map measured by the laser confocal microscope, p is the predicted pixel class, p̂ is the actual pixel class, and λ is the weight coefficient controlling the edge loss term of the depth map.
9. The method for estimating the damage depth of a wear surface based on a multi-attention mechanism according to claim 1, wherein step S4 is specifically as follows:
S401, acquiring two-dimensional wear surface images with damaged areas, acquiring the depth maps and the damaged area label maps of the corresponding areas, and producing training samples and test samples;
S402, using the ResNet-50 network weights trained on the ImageNet data set as the initialization weight parameters of the encoder, and training the wear surface depth estimation model with the training samples produced in step S401;
and S403, setting the learning rate parameters, optimizing and training the wear surface depth estimation model of step S402 by the adaptive moment estimation method, and inputting the test samples produced in step S401 into the optimally trained wear surface depth estimation model, thereby realizing the estimation of wear surface damaged area depth information based on multi-attention surface feature extraction and damaged area positioning guidance.
10. A multi-attention mechanism based wear surface damage depth estimation system, comprising:
the extraction module is used for constructing a wear surface basic feature extraction layer by taking four layers of convolution blocks in a ResNet-50 coding layer as a main body and combining the convolution blocks of two layers of Conv-ReLU;
the fusion module is used for constructing a damaged area segmentation branch network and a depth information estimation branch network based on a wear surface basic feature extraction layer constructed by the extraction module by combining a U-Net structure architecture and wear surface characteristics, and fusing the damaged area segmentation branch network and the depth information estimation branch network to serve as a wear surface depth estimation model;
the function module is used for constructing a depth information mean square error loss, a damage detection cross entropy loss, an edge detection mean square error loss and a structure consistency loss function based on the damaged area segmentation network and the depth information estimation network, and obtaining a loss function of the wear surface depth estimation model constructed by the fusion module in a weighting mode;
the estimation module is used for acquiring a two-dimensional wear surface image and making a corresponding damage area marking map and a depth map; selecting a wear surface image with a typical damage area as a training sample, taking a loss function constructed by a function module as an optimization target, training a wear surface depth estimation model constructed by a fusion module by adopting an adaptive moment estimation method, and taking a single wear surface image as the input of the trained wear surface depth estimation model to obtain a damage area segmentation result image and a depth information result image of the wear surface.
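Purely as an illustrative sketch of the Conv-BN-ReLU and Convtransp (transposed convolution)-BN-ReLU blocks recited in claims 4 and 5, the following hypothetical PyTorch code shows a decoder stage that doubles the spatial resolution, concatenates the encoder skip features, and maps refined 32-channel features to a 3-channel damaged area segmentation map. Channel counts other than the final 32-to-3 mapping are assumptions and do not reproduce the patented network.

```python
# Hypothetical decoder-stage sketch for the blocks recited in claims 4-5
# (channel counts assumed; not the patented implementation).
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch):
    # 3 x 3 convolution with stride 1, batch normalization, then ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


def convtransp_bn_relu(in_ch, out_ch):
    # 3 x 3 transposed convolution with stride 2: doubles the spatial resolution.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SegDecoderStage(nn.Module):
    """Upsample, concatenate the encoder skip features, then refine."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = convtransp_bn_relu(in_ch, out_ch)
        self.refine = conv_bn_relu(out_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(x)
        return self.refine(torch.cat([x, skip], dim=1))


if __name__ == "__main__":
    stage = SegDecoderStage(in_ch=64, skip_ch=64, out_ch=32)
    head = nn.Sequential(nn.Conv2d(32, 3, kernel_size=3, padding=1), nn.ReLU())
    x, skip = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 64, 64)
    print(head(stage(x, skip)).shape)  # torch.Size([1, 3, 64, 64])
```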
CN202210689847.XA 2022-06-17 2022-06-17 Wear surface damage depth estimation method and system based on multi-attention mechanism Active CN114972882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210689847.XA CN114972882B (en) 2022-06-17 2022-06-17 Wear surface damage depth estimation method and system based on multi-attention mechanism

Publications (2)

Publication Number Publication Date
CN114972882A true CN114972882A (en) 2022-08-30
CN114972882B CN114972882B (en) 2024-03-01

Family

ID=82964067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210689847.XA Active CN114972882B (en) 2022-06-17 2022-06-17 Wear surface damage depth estimation method and system based on multi-attention mechanism

Country Status (1)

Country Link
CN (1) CN114972882B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471505A (en) * 2022-11-14 2022-12-13 华联机械集团有限公司 Intelligent carton sealing machine regulation and control method based on visual identification

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381770A (en) * 2020-11-03 2021-02-19 西安交通大学 Wear surface three-dimensional topography measuring method based on fusion convolution neural network
AU2020103715A4 (en) * 2020-11-27 2021-02-11 Beijing University Of Posts And Telecommunications Method of monocular depth estimation based on joint self-attention mechanism
CN112967327A (en) * 2021-03-04 2021-06-15 国网河北省电力有限公司检修分公司 Monocular depth method based on combined self-attention mechanism
CN114119694A (en) * 2021-11-10 2022-03-01 中国石油大学(华东) Improved U-Net based self-supervision monocular depth estimation algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHUO WANG et al.: "Optimized CNN model for identifying similar 3D wear particles in few samples", WEAR, 15 November 2020 (2020-11-15) *
CEN SHIJIE; HE YUANLIE; CHEN XIAOCONG: "Monocular depth estimation combining attention and unsupervised deep learning", Journal of Guangdong University of Technology, no. 04, 14 July 2020 (2020-07-14) *

Also Published As

Publication number Publication date
CN114972882B (en) 2024-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant