CN115147347A

CN115147347A - Method for detecting surface defects of malleable cast iron pipe fitting facing edge calculation

Info

Publication number: CN115147347A
Application number: CN202210429986.9A
Authority: CN
Inventors: 江先亮; 白杰
Original assignee: Ningbo University
Current assignee: Ningbo University
Priority date: 2022-04-22
Filing date: 2022-04-22
Publication date: 2022-10-04

Abstract

The invention relates to an edge-computing-oriented method for detecting surface defects of malleable cast iron pipe fittings, which comprises the following steps: collecting a plurality of malleable cast iron pipe fitting surface defect images; labeling the defects in the collected images to obtain a data set; taking 80% of the data set as a training set and 20% as a test set; constructing an expansion compression residual bottleneck network, in which one feature extraction branch adopts an expand-then-compress convolution strategy to extract features of the internal defects of the malleable cast iron pipe fitting from an input feature map I and to suppress background noise information, and the other feature extraction branch adopts a compress-then-expand convolution strategy to extract features of the edge profile of the malleable cast iron pipe fitting from the input feature map I; and, after the two feature extraction branches output their respective feature maps, introducing an attention mechanism to obtain an expansion compression feature extraction backbone network fused with the attention mechanism. The method has the advantages of high detection precision, low computational complexity and low computational cost.

Description

Method for detecting surface defects of malleable cast iron pipe fitting facing edge calculation

Technical Field

The invention relates to the technical field of automatic defect detection, in particular to a method for detecting surface defects of malleable cast iron pipes facing edge calculation.

Background

Malleable cast iron pipe fittings (Malleable Iron, MI) are widely used in pipe network systems for fire protection, water supply, domestic heating, gas supply and the like because of their excellent wear resistance, impact resistance and ductility. However, surface defects such as sand holes, dents, missing material, flashes, ridges and spots inevitably occur during production, as shown in fig. 1-1, and affect subsequent production and processing; a fitting with even a minor defect that escapes detection can create unpredictable safety hazards in the actual pipe network system.

In actual pipe fitting surface defect inspection, nearly three quarters of the workers in the entire factory are employed to inspect product quality. Manual visual inspection is the most common inspection mode, but it suffers from low efficiency, high labor intensity, high false detection and missed detection rates, high labor cost, and susceptibility to the subjective judgment of workers. It is worth noting that when the defect size of a malleable cast iron pipe fitting is smaller than 0.5 mm and there is no large optical deformation, the defect features cannot be determined by the human eye, so manual inspection cannot meet the requirements of large-scale production of malleable cast iron pipe fittings. At present, defect detection methods based on convolutional neural networks are widely studied in academia; such methods can identify various defects after training on a certain number of defect data samples, which reduces labor cost. Therefore, a method for detecting surface defects of malleable cast iron pipe fittings in real time on edge devices is of great significance.

Owing to their strong feature extraction capabilities, Convolutional Neural Networks (CNNs) are widely used in complex industrial environments where objects differ greatly in shape, size, texture, color, background, layout and imaging illumination. Such a network can directly compute the category, the localization result and the category confidence of a defect object in the input image, and avoids the complex process of manually designing a feature extractor and tuning parameters that traditional detection algorithms require.

Because a heavy volume of malleable iron (MI) fittings must be inspected, a CNN used directly as the feature extractor in the defect detection field often cannot meet the detection speed requirement. Therefore, many studies make lightweight improvements to the CNN structure for surface defect detection, and the existing methods for building lightweight CNN structures can be divided into three categories: detection methods based on lightweight modules, detection methods based on Neural Architecture Search (NAS for short), and detection methods based on compressing pre-trained models with various techniques.

The detection method based on lightweight modules constructs a network from efficient operation units such as point-by-point convolution, separable convolution and group convolution, and reduces the parameter count and computational complexity of the model, thereby reducing its detection latency. The detection method based on neural architecture search introduces Reinforcement Learning (RL) to search for a lightweight CNN architecture with high precision. However, the search space of this approach is mainly focused on the structure at the unit level, and the same unit may be reused in all layers, which can lead to an exponential increase in computational cost. The detection method based on compressing pre-trained models with various techniques complements the other two methods: a trained model can be further optimized by techniques such as quantization (Quantization), pruning (Pruning) and distillation (Distillation). This approach is inherently a strategy that improves network efficiency at the expense of accuracy, and it requires complex processing steps. At present, advanced detection methods based on lightweight modules usually introduce an attention mechanism, so that the network can learn the weight of each channel in a feature map on its own and improve detection accuracy by weighting.
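The parameter saving that separable convolution brings can be made concrete with a small calculation. The sketch below is only an illustration: the layer sizes (64 input channels, 128 output channels, a 3 x 3 kernel) are hypothetical and do not come from this document, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 point-by-point convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # the 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3  # hypothetical layer sizes
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For these assumed sizes the separable layer needs roughly one eighth of the weights of the standard layer, which is the kind of reduction the lightweight-module methods rely on.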

In summary, the detection methods based on lightweight modules and on compressing pre-trained models focus on using different strategies to reduce model complexity, whereas the detection method based on neural architecture search focuses on improving detection precision under limited computing resources. The first two methods therefore often fail to take detection accuracy into account, while the last does not take into account the cost of storing the model on the computing unit.

Disclosure of Invention

The invention aims to solve the technical problem of providing the edge-calculation-oriented method for detecting the surface defects of the malleable cast iron pipe fittings, which has high detection precision, low calculation complexity and low calculation cost.

The technical scheme adopted by the invention is that the method for detecting the surface defects of the malleable cast iron pipe fitting facing to the edge calculation comprises the following steps:

S1, collecting a plurality of malleable cast iron pipe fitting surface defect images at a fixed collecting height by using industrial CCD cameras of different models and different ambient light source angles, wherein each image contains at least one surface defect of a malleable cast iron pipe fitting;

S2, labeling the defects in the images collected in step S1 to obtain a label file corresponding to each image, the label files forming a data set;

S3, taking 80% of the data set obtained in step S2 as a training set and 20% as a test set;

S4, constructing an expansion compression residual bottleneck network, wherein the network comprises two feature extraction branches and the input feature map is denoted I,

projecting the input feature map I into the two feature extraction branches respectively, wherein one feature extraction branch adopts an expand-then-compress convolution strategy to extract features of the internal defects of the malleable cast iron pipe fitting from the input feature map I and to suppress background noise information; the other feature extraction branch adopts a compress-then-expand convolution strategy to extract features of the edge profile of the malleable cast iron pipe fitting from the input feature map I; the two feature extraction branches each output a feature map, the two output feature maps are fused by element-wise summation along the channel direction, and a fused feature map that has not been recalibrated is output;

S5, on the basis of the expansion compression residual bottleneck network constructed in step S4, introducing an attention mechanism after the two feature extraction branches output their respective feature maps, to obtain an expansion compression feature extraction backbone network fused with the attention mechanism;

S6, obtaining a malleable cast iron pipe fitting surface defect detection model from the expansion compression feature extraction backbone network fused with the attention mechanism obtained in step S5;

S7, training the detection model obtained in step S6 with the training set divided in step S3 to obtain a trained malleable cast iron pipe fitting surface defect detection model;

S8, testing the trained detection model obtained in step S7 with the test set divided in step S3, and adjusting the model parameters to obtain an optimized malleable cast iron pipe fitting surface defect detection model;

S9, inputting images of malleable cast iron pipe fittings captured in real time into the optimized detection model obtained in step S8, and detecting the surface defects of the malleable cast iron pipe fittings in real time.

Preferably, in step S2, the method for labeling the defects on the surface defect images of the multiple malleable cast iron pipes collected in step S1 to obtain the label file comprises the following specific steps: marking each defect on the image of the defect on the surface of the malleable cast iron pipe fitting by using a boundary frame and a class label, and storing the marked image as a label file in a JSON format;

Preferably, in step S4, the specific process by which one feature extraction branch adopts the expand-then-compress convolution strategy to extract features of the internal defects of the malleable cast iron pipe fitting from the input feature map I and to suppress background noise information includes the following steps:

S4-01, mapping the input feature map I to a high-dimensional subspace through a 1 × 1 point-by-point convolution and applying a non-linear activation to the result, wherein e represents the expansion coefficient;

S4-02, extracting the feature information of each input channel of the S4-01 output through a 3 × 3 depthwise convolution and applying a non-linear activation to the result;

S4-03, compressing the number of channels of the S4-02 output through a 1 × 1 point-by-point convolution and applying a Linear activation operation to the result.

Preferably, in step S4, the specific process by which the other feature extraction branch adopts the compress-then-expand convolution strategy to extract features of the edge profile of the malleable cast iron pipe fitting from the input feature map I includes the following steps:

S4-11, mapping the input feature map I to a low-dimensional subspace through a 1 × 1 point-by-point convolution and applying a non-linear activation to the result, wherein s represents the compression coefficient;

S4-12, applying a 3 × 3 depthwise convolution to each input channel of the S4-11 output and applying a non-linear activation to the result;

S4-13, expanding the number of channels of the S4-12 output through a 1 × 1 point-by-point convolution and applying a Swish non-linear activation operation to the result.

Preferably, in step S4, the specific process of fusing the feature maps output by the two feature extraction branches by element-wise summation along the channel direction and outputting the fused feature map is as follows: the output feature maps of the two feature extraction branches and the identity mapping of the corresponding input feature map are fused by element-wise summation, and the fused feature map is output.

Preferably, in step S5, on the basis of the expansion compression residual bottleneck network constructed in step S4, the specific process of obtaining the expansion compression feature extraction backbone network fused with the attention mechanism includes the following steps:

S5-1, integration: to give every neuron of the fused feature map a global view, the information from the output feature maps of the two feature extraction branches is integrated into each neuron of the fused feature map by element-wise summation, yielding a new fused feature map;

S5-2, recalibration: a. generating the channel-wise statistical information s of the new fused feature map by using a global average pooling layer, wherein the c-th element of s is computed from the c-th channel of the new fused feature map;

b. constructing a shrinkage feature z from s through a fully connected layer with batch normalization and an activation function, where β() represents the batch normalization layer and σ() represents the Mish activation function;

c. adaptively recalibrating the output feature maps of the two feature extraction branches by using a SoftMax operator, the calibration weight of each branch channel being guided by the shrinkage feature z, wherein u and v are the channel-domain attention weight vectors of the two branch output feature maps respectively; in particular, U_c denotes the c-th row of U, and u_c is the c-th element of u;

S5-3, reintegration: the channel-domain attention weight vectors u and v calculated in S5-2 are used to re-integrate the output feature maps of the two feature extraction branches respectively, obtaining the final recalibrated fused feature map O; the feature O_c of the c-th channel of O is the sum of the c-th channels of the two branch output feature maps weighted by u_c and v_c respectively, wherein u_c + v_c = 1 and O = [O_1, O_2, …, O_C].

Compared with the prior art, the invention has the beneficial effects that: compared with the detection method based on the lightweight module and the detection method based on the pre-training model compressed by various skills, the method has lower model complexity; compared with a detection method based on neural architecture search, the method has the advantages that the storage cost of the operation unit is more efficiently utilized; in addition, the method has higher detection precision on the surface defects of the malleable cast iron pipes, and the complexity of the model can be adjusted according to the calculation force of the edge equipment.

Drawings

Fig. 1 is a schematic structural diagram of an expansion compression residual bottleneck network in the method for detecting surface defects of malleable cast iron pipes facing edge calculation;

fig. 2 is a schematic structural diagram of an expansion and compression feature extraction backbone network integrated with an attention mechanism in the edge calculation-oriented malleable cast iron pipe surface defect detection method of the present invention;

fig. 3 is a schematic diagram illustrating classification and labeling of surface defects of malleable cast iron pipes in step S2 according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the operation steps of Labelme in an embodiment of the present invention;

fig. 5 is a graph of test results obtained from tests performed on all models using the same input resolution 224 × 224 in an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings in combination with specific embodiments so that those skilled in the art can practice the invention with reference to the description, and the scope of the invention is not limited to the specific embodiments.

The embodiment of the invention provides a method for detecting surface defects of malleable cast iron pipes facing edge calculation, which comprises the following steps:

S1, collecting a plurality of malleable cast iron pipe fitting surface defect images at a fixed collecting height (15 cm) by using industrial CCD cameras of different models (such as IMX 700 CMOS, FA and SONY) and different ambient light source angles (any light source angle at which the defects are visible), wherein each image contains at least one surface defect of a malleable cast iron pipe fitting;

S4, constructing an expansion compression residual bottleneck network (ESNet), as shown in FIG. 1, wherein the network comprises two feature extraction branches and the input feature map is denoted I,

projecting the input feature map I into the two feature extraction branches respectively, wherein one feature extraction branch adopts an expand-then-compress convolution strategy to extract features of the internal defects of the malleable cast iron pipe fitting from the input feature map I and to suppress background noise information; the other feature extraction branch adopts a compress-then-expand convolution strategy to extract features of the edge profile of the malleable cast iron pipe fitting from the input feature map I; the two feature extraction branches each output a feature map, the two output feature maps are fused by element-wise summation along the channel direction, and a fused feature map that has not been recalibrated is output;

S5, on the basis of the expansion compression residual bottleneck network constructed in step S4, introducing an attention mechanism after the two feature extraction branches output their respective feature maps, to obtain an expansion compression feature extraction backbone network fused with the attention mechanism, as shown in FIG. 2;

S8, testing the trained malleable cast iron pipe fitting surface defect detection model obtained in step S7 with the test set divided in step S3, and adjusting the model parameters to obtain an optimized malleable cast iron pipe fitting surface defect detection model;

Preferably, in step S2, the specific process of labeling the defects in the malleable cast iron pipe fitting surface defect images collected in step S1 to obtain the label files is as follows: each defect in a surface defect image is marked with a bounding box and a class label, and the labeled image is saved as a label file in JSON (JavaScript Object Notation) format. The data set shown in fig. 3 contains 9 types of defects; each defect instance is labeled with a bounding box and a class label, and each defect image finally corresponds to one label file in JSON format. After discussion with the quality inspection workers, the following labeling rules were set:

First, an irregular, large-area defect is labeled in full with one large bounding box rather than with several smaller bounding boxes;

Second, when several types of defects overlap at the same position, the defect with the largest area is labeled preferentially. In particular, concave defects and damage defects usually appear together: when the bounding boxes of the two defect types are in a containment relationship, the bounding box takes the class of the defect with the largest area; when the bounding boxes of the two defect types intersect, each bounding box fits its own defect area as closely as possible;

Third, Sand Hole and Appr.5% (about five percent) defects are easily confused: the defect area of a Sand Hole accounts for between 0.25% and 2.5% of the defect image, whereas the defect area of an Appr.5% defect accounts for between 2.5% and 7.5% of the defect image.
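The area thresholds that separate Sand Hole from Appr.5% can be checked mechanically. The following is a minimal sketch, not part of the described method: the function names are hypothetical, and assigning the shared 2.5% boundary to Appr.5% is an assumption, since the text does not say which class owns that value.

```python
def box_area_fraction(x_min, y_min, x_max, y_max, img_w, img_h):
    """Fraction of the image area covered by an axis-aligned bounding box."""
    return ((x_max - x_min) * (y_max - y_min)) / float(img_w * img_h)


def sand_hole_or_appr5(fraction):
    """Suggest a class from the area fraction, or None if outside both ranges.

    Assumption: the shared 2.5% boundary is assigned to Appr.5%.
    """
    if 0.0025 <= fraction < 0.025:
        return "Sand Hole"
    if 0.025 <= fraction <= 0.075:
        return "Appr.5%"
    return None


if __name__ == "__main__":
    # Hypothetical 20 x 20 pixel box in a 224 x 224 image.
    frac = box_area_fraction(10, 10, 30, 30, 224, 224)
    print(f"{frac:.4%}", sand_hole_or_appr5(frac))
```

A check like this would only flag candidate labels; the final class still follows the labeling rules agreed with the quality inspection workers.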

Under the guidance of these labeling rules, the malleable cast iron pipe fitting defect images are labeled with the open-source tool Labelme, as shown in fig. 4. The labeling process mainly comprises the following steps: (1) starting the software and opening the folder that holds the defect data set (Open Dir); (2) clicking the Edit button and selecting Create Rectangle, that is, selecting a rectangular bounding box for labeling; (3) selecting a defect area with the left mouse button; (4) entering the defect category in the edit box that appears, according to the labeling rules; (5) clicking the confirmation button to finish labeling one defect image; (6) clicking Next Image to label the next image, and repeating operations (3) - (6) until the whole data set is labeled.
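Since detection is later run in the Darknet framework, the rectangle annotations in a Labelme JSON file would typically be converted to normalized `class x_center y_center width height` rows. The sketch below assumes the usual Labelme fields (`shapes`, `label`, `points`, `shape_type`, `imageWidth`, `imageHeight`); the function name and the class list are hypothetical, and the conversion step itself is not described in this document.

```python
import json


def labelme_to_yolo(labelme_json: str, class_names: list) -> list:
    """Convert rectangle shapes in a Labelme JSON string to YOLO-style rows.

    Each row is (class_index, x_center, y_center, width, height), with the
    four box values normalized by the image width and height.
    """
    data = json.loads(labelme_json)
    img_w, img_h = data["imageWidth"], data["imageHeight"]
    rows = []
    for shape in data["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue  # only bounding boxes are used by the labeling rules
        (x1, y1), (x2, y2) = shape["points"]
        x_min, x_max = sorted((x1, x2))
        y_min, y_max = sorted((y1, y2))
        rows.append((
            class_names.index(shape["label"]),
            (x_min + x_max) / 2 / img_w,
            (y_min + y_max) / 2 / img_h,
            (x_max - x_min) / img_w,
            (y_max - y_min) / img_h,
        ))
    return rows


if __name__ == "__main__":
    example = json.dumps({
        "imageWidth": 200,
        "imageHeight": 100,
        "shapes": [{"label": "Sand Hole", "shape_type": "rectangle",
                    "points": [[50, 20], [150, 60]]}],
    })
    print(labelme_to_yolo(example, ["Sand Hole", "Collapse"]))
```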

Preferably, in step S4, as shown in fig. 1, the specific process by which one feature extraction branch adopts the expand-then-compress convolution strategy to extract features of the internal defects of the malleable cast iron pipe fitting from the input feature map I and to suppress background noise information includes the following steps:

S4-01, mapping the input feature map I to a high-dimensional subspace through a 1 × 1 point-by-point convolution and applying a non-linear activation to the result, wherein e represents the expansion coefficient;

S4-02, extracting the feature information of each input channel of the S4-01 output through a 3 × 3 depthwise convolution and applying a non-linear activation to the result;

S4-03, compressing the number of channels of the S4-02 output through a 1 × 1 point-by-point convolution; to reduce the loss of information inside the feature map after the compression operation, a Linear activation operation is applied to the result.

Preferably, in step S4, as shown in fig. 1, the specific process by which the other feature extraction branch adopts the compress-then-expand convolution strategy to extract features of the edge profile of the malleable cast iron pipe fitting from the input feature map I includes the following steps:

S4-11, mapping the input feature map I to a low-dimensional subspace through a 1 × 1 point-by-point convolution and applying a non-linear activation to the result, wherein s represents the compression coefficient;

S4-12, applying a 3 × 3 depthwise convolution to each input channel of the S4-11 output and applying a non-linear activation to the result;

S4-13, expanding the number of channels of the S4-12 output through a 1 × 1 point-by-point convolution; unlike the second point-by-point convolution in the expansion branch, a Swish non-linear activation operation is applied to the result.

Preferably, in step S4, the specific process of fusing the feature maps output by the two feature extraction branches by element-wise summation along the channel direction and outputting the fused feature map is as follows: the output feature maps of the two feature extraction branches and the identity mapping of the corresponding input feature map are fused by element-wise summation, and the fused feature map is output.
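The two branches and their fusion in step S4 can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions rather than the patented implementation: batch normalization is omitted, Swish is assumed for the two unspecified non-linear activations in each branch, the weights are random, and all function names are hypothetical.

```python
import numpy as np


def pointwise_conv(x, w):
    """1 x 1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def depthwise_conv3x3(x, k):
    """3 x 3 depthwise convolution, stride 1, zero padding.

    x is (C, H, W); k is (C, 3, 3), one filter per channel.
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += k[:, dy, dx][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out


def swish(x):
    return x / (1.0 + np.exp(-x))


def branch(x, w_in, k_dw, w_out, final_act):
    """Point-by-point conv -> depthwise conv -> point-by-point conv."""
    y = swish(pointwise_conv(x, w_in))    # S4-01 / S4-11 (activation assumed)
    y = swish(depthwise_conv3x3(y, k_dw)) # S4-02 / S4-12 (activation assumed)
    return final_act(pointwise_conv(y, w_out))  # S4-03 / S4-13


def es_block(x, params_e, params_s):
    """Return the un-recalibrated fused map and both branch outputs."""
    out_e = branch(x, *params_e, final_act=lambda t: t)  # linear activation
    out_s = branch(x, *params_s, final_act=swish)        # Swish activation
    return x + out_e + out_s, out_e, out_s


def make_params(rng, c, mid):
    """Random weights for one branch with `mid` intermediate channels."""
    return (rng.normal(size=(mid, c)) * 0.1,
            rng.normal(size=(mid, 3, 3)) * 0.1,
            rng.normal(size=(c, mid)) * 0.1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c, e, s = 8, 4, 2  # e = 4 and s = 2 as in the experiments; c is arbitrary
    x = rng.normal(size=(c, 16, 16))
    fused, out_e, out_s = es_block(x, make_params(rng, c, c * e),
                                   make_params(rng, c, c // s))
    print(fused.shape)
```

The expansion branch works with `c * e` intermediate channels and the compression branch with `c // s`, so the two branches see the same input at very different widths before their outputs are summed with the identity path.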

Preferably, in step S5, on the basis of the expansion compression residual bottleneck network constructed in step S4, the specific process of introducing an attention mechanism after the two feature extraction branches output their respective feature maps, to obtain the expansion compression feature extraction backbone network fused with the attention mechanism shown in fig. 2, includes the following steps:

S5-1, integration: to give every neuron of the fused feature map a global view (a neuron here refers to the value obtained by summing the elements inside each channel of the feature map), the information from the output feature maps of the two feature extraction branches is integrated into each neuron of the fused feature map by element-wise summation, yielding a new fused feature map;

S5-2, recalibration: a. generating the channel-wise statistical information s of the new fused feature map by using a global average pooling layer, wherein the c-th element of s is computed from the c-th channel of the new fused feature map;

b. constructing a shrinkage feature z from s through a fully connected layer with batch normalization and an activation function, where β() represents the batch normalization layer (Batch Normalization) and σ() represents the Mish activation function; the subsequent experimental section also investigates the effect of the reduction coefficient r on model performance;

c. adaptively recalibrating the output feature maps of the two feature extraction branches by using a SoftMax operator, the calibration weight of each branch channel being guided by the shrinkage feature z, wherein u and v are the channel-domain attention weight vectors of the two branch output feature maps respectively; U_c denotes the c-th row of U, and u_c is the c-th element of u;

S5-3, reintegration: the channel-domain attention weight vectors u and v calculated in S5-2 are used to re-integrate the output feature maps of the two feature extraction branches respectively, obtaining the final recalibrated fused feature map O; the feature O_c of the c-th channel of O is the sum of the c-th channels of the two branch output feature maps weighted by u_c and v_c respectively, wherein u_c + v_c = 1 and O = [O_1, O_2, …, O_C].
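Steps S5-1 to S5-3 follow the familiar selective-kernel pattern, which can be sketched in NumPy as below. The sketch makes several assumptions: batch normalization β() is omitted, the weight matrices are random stand-ins, the name `w_v` for the second branch's projection is chosen here by analogy with U, and the statistics vector is called `stats` to avoid confusion with the compression coefficient s.

```python
import numpy as np


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * np.tanh(np.log1p(np.exp(x)))


def selective_fusion(out_e, out_s, w_fc, w_u, w_v):
    """Recalibrate two branch outputs of shape (C, H, W).

    w_fc is (C // r, C); w_u and w_v are (C, C // r).
    Returns the recalibrated map O and the weight vectors u and v.
    """
    fused = out_e + out_s                  # S5-1: element-wise summation
    stats = fused.mean(axis=(1, 2))        # S5-2a: global average pooling
    z = mish(w_fc @ stats)                 # S5-2b: shrinkage feature (BN omitted)
    logits = np.stack([w_u @ z, w_v @ z])  # S5-2c: one logit per branch, channel
    logits -= logits.max(axis=0, keepdims=True)  # for numerical stability
    weights = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    u, v = weights                         # SoftMax across the two branches
    out = u[:, None, None] * out_e + v[:, None, None] * out_s  # S5-3
    return out, u, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c, r = 16, 4  # hypothetical sizes; the experiments use r = 16
    out_e = rng.normal(size=(c, 8, 8))
    out_s = rng.normal(size=(c, 8, 8))
    w_fc = rng.normal(size=(c // r, c))
    w_u = rng.normal(size=(c, c // r))
    w_v = rng.normal(size=(c, c // r))
    o, u, v = selective_fusion(out_e, out_s, w_fc, w_u, w_v)
    print(o.shape, bool(np.allclose(u + v, 1.0)))
```

Because the SoftMax is taken across the two branches for each channel, u_c + v_c = 1 holds by construction, matching the constraint stated in S5-3.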

The English name of the expansion compression feature extraction backbone network fused with the attention mechanism is recorded as: ES-MobileNet; the English name of the expansion compression feature extraction backbone network is recorded as: ES-MobileNet;

Table 1 gives the detailed structure of ES-MobileNet. In Table 1, e. and s. respectively denote the number of channels of the output feature maps of the expansion branch and the compression branch after the first point-by-point convolution expands or compresses them; SK indicates whether a Selective Kernel is present in the block; NL denotes the type of non-linear activation function used, where S represents Swish and R represents ReLU; ESNet represents the expansion compression bottleneck network with the fused attention mechanism provided by the invention; conv2d represents a standard two-dimensional convolution operation; s, k and t respectively represent the stride of the convolution kernel, the kernel size, and the number of times the block is repeated.

Table 1. Detailed structure of ES-MobileNet:

The data set IIDD of this embodiment has 4020 malleable cast iron pipe fitting surface defect images containing 6313 defect instances in total. The entire data set contains 9 types of defects: Sand Hole, Long Peg, Appr.5% (5% surface missing), Abn. Peg (seam ridge), Appr.50% (50% surface missing), Collapse (depression), Appr.25% (25% surface missing), Appr.15% (15% surface missing) and Cast Waste (casting scrap). The number and proportion of instances of each defect type are given in Table 2, and FIG. 3 shows sample images of each type. In the evaluation experiments, 80% of each category is randomly selected as the training set and the remaining 20% as the test set.
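A per-category random 80/20 split of this kind can be sketched as follows. This is a simplified sketch: it assumes each sample carries a single class label, whereas an IIDD image may contain several defect instances, and the function name, seed and file names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_ratio=0.8, seed=0):
    """Split (path, label) pairs so each label keeps roughly train_ratio."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_class):
        paths = by_class[label]
        rng.shuffle(paths)
        cut = round(len(paths) * train_ratio)
        train += [(p, label) for p in paths[:cut]]
        test += [(p, label) for p in paths[cut:]]
    return train, test


if __name__ == "__main__":
    # Hypothetical file names and labels, for illustration only.
    samples = [(f"img_{i:04d}.jpg", "Sand Hole") for i in range(10)]
    samples += [(f"img_{i:04d}.jpg", "Collapse") for i in range(10, 15)]
    train, test = stratified_split(samples)
    print(len(train), len(test))
```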

Table 2. Statistics of the 9 common surface defect types of malleable cast iron pipe fittings in IIDD:

The ES-MobileNet defined in Table 1 is designed based on ESNet. We use the proposed ES-MobileNet as the feature extractor and perform defect detection in the Darknet framework. The default hyper-parameters are as follows: the input image is resized to 224 on each side; the number of training steps is 20000; the batch size and mini-batch size are 64 and 16, respectively; a step-decay learning rate schedule is adopted, with an initial learning rate of 0.001 that is multiplied by factors of 0.1, 10, 0.1 and 0.1 at steps 1100, 15000 and 18000, respectively; the number of warm-up steps is 1000; the momentum and weight decay are set to 0.949 and 0.0005, respectively. All convolutional layers except the last two use batch normalization. All models are trained and evaluated on a Windows PC with a GeForce RTX 3070 and 8 GB of memory. In particular, the FPS of all models is evaluated on an Nvidia Jetson Nano with a 128-core Maxwell GPU and 4 GB of memory.
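A warm-up followed by step decay can be sketched as below. The sketch is illustrative only: the polynomial warm-up with power 4 is an assumption modeled on Darknet's burn-in behavior, and the milestone and scale pairs in the example are placeholders, not a restatement of the schedule used in the experiments.

```python
def learning_rate(step, base_lr=0.001, warmup_steps=1000, warmup_power=4.0,
                  milestones=(), scales=()):
    """Learning rate at a given step: polynomial warm-up, then step decay.

    During warm-up the rate rises as base_lr * (step / warmup_steps) ** power.
    Afterwards the rate is multiplied by each scale whose milestone has passed.
    """
    if step < warmup_steps:
        return base_lr * (step / warmup_steps) ** warmup_power
    lr = base_lr
    for milestone, scale in zip(milestones, scales):
        if step >= milestone:
            lr *= scale
    return lr


if __name__ == "__main__":
    # Placeholder milestones and scales, chosen only for illustration.
    for step in (0, 500, 1000, 16000, 19000):
        lr = learning_rate(step, milestones=(15000, 18000), scales=(0.1, 0.1))
        print(step, lr)
```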

The performance of the expansion compression residual bottleneck network is evaluated by comparing the mean average precision (mAP) at an IoU threshold of 0.5, Precision, Recall, F1-score, FLOPs, parameter count (#Params) and FPS (frames per second). In general, mAP measures the accuracy over all classes in the data set, FPS measures the inference speed, Recall measures how well the predicted boxes cover the ground-truth boxes, F1-score evaluates the overall performance of a model, FLOPs quantifies the computational time complexity of the model, and #Params quantifies its space complexity.
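For reference, the box-level quantities behind these metrics can be written out in a few lines. The sketch below uses the standard definitions of IoU, Precision, Recall and F1-score; the box format `(x_min, y_min, x_max, y_max)` and the example counts are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, Recall and F1-score from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # A prediction counts as a true positive when its IoU with a
    # ground-truth box of the same class is at least 0.5.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```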

We use YOLOv2 and SSD (a detection algorithm in the field of computer vision) as baselines, and also evaluate and compare the detection performance of a modified version of YOLOv2 named ES-YOLO (a model in which all standard convolutions in the YOLOv2 prediction layer are replaced with ESNet). The results are shown in Table 3. The input resolution of YOLOv2 and ES-YOLO is 416 × 416, and e, s and r of ESNet in ES-YOLO are 4, 2 and 16, respectively. Compared with the original YOLOv2, ES-YOLO reduces the parameter count to about 1/8 and the computational cost to about 1/10.

Table 3. Performance comparison of ES-YOLO with other large networks on the IIDD malleable cast iron pipe fitting surface defect detection task:

Models	mAP	#Params	BFLOPs	F1-score
SSD300	37.16	23.19M	60.338	0.31
SSD512	61.22	23.19M	117.694	0.82
YOLOv2	57.26	38.15M	20.015	0.74
ES-YOLO	58.68	4.71M	1.981	0.77
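The reduction factors quoted for ES-YOLO follow directly from the YOLOv2 and ES-YOLO rows of Table 3, as the short check below shows. The dictionary layout and variable names are only for illustration.

```python
# Values copied from the YOLOv2 and ES-YOLO rows of Table 3.
table3 = {
    "YOLOv2": {"params_m": 38.15, "bflops": 20.015},
    "ES-YOLO": {"params_m": 4.71, "bflops": 1.981},
}

param_ratio = table3["YOLOv2"]["params_m"] / table3["ES-YOLO"]["params_m"]
flops_ratio = table3["YOLOv2"]["bflops"] / table3["ES-YOLO"]["bflops"]

print(f"parameters reduced by a factor of about {param_ratio:.1f}")
print(f"BFLOPs reduced by a factor of about {flops_ratio:.1f}")
```

Both factors come out slightly above 8 and 10 respectively, consistent with the approximate 1/8 and 1/10 figures.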

In the comparative experiments, the plain ES-MobileNet (the expansion compression feature extraction backbone network) differs from the attention-fused ES-MobileNet (the expansion compression feature extraction backbone network fused with the attention mechanism) only in that the former removes all attention mechanisms from the latter (that is, removes SK in Table 1); both retain the expansion branch and the compression branch of the invention.

In the present example, we selected NAS-based small networks (e.g., NASNet, MnasNet-A, EfficientNet-B0, and MobileNetV3-Large) and hand-designed small networks (e.g., Tiny-YOLO, MobileNetV2, ShuffleNetV1, and ShuffleNetV2) as baselines and compared them with the hand-designed small network ES-MobileNet proposed by the present invention. In the experiment, all small networks were evaluated under the YOLO framework at an input resolution of 224, with e, s, and r for the ES bottleneck in our model being 4, 2, and 16, respectively. We do not compare model performance in other frameworks, such as Fast-RCNN, because our focus is on mobile/real-time models. As shown in Table 4, ES-MobileNet achieves a similar mAP but a better FPS than EfficientNet-B0 and MobileNetV2, with fewer parameters and a lower computational cost. It is worth noting that, with the help of the Selective Kernel, the proposed ES-MobileNet achieves the best mAP at a negligible cost in computational complexity. For a more intuitive view, a detailed illustration is given in Fig. 5.

To investigate the trade-off between the performance and the computational cost of the selective kernel, experiments were performed with ES-MobileNet on a series of different r values at fixed e = 4 and s = 2. As shown in Table 5, the model achieved its best result at r = 8, while setting r = 16 provides a good balance between accuracy and complexity.
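
To give a rough sense of why r trades accuracy against cost, the sketch below counts the weights of the fully connected layers in a selective-kernel style attention block. It assumes that the compact feature has width d = C // r, that one fully connected layer maps C channels to d, and that each of the two branches has its own layer mapping d back to C. This layout, the function name and the channel count of 64 are assumptions for illustration, not values stated in this document.

```python
def sk_fc_weights(channels: int, r: int, branches: int = 2) -> int:
    """Weight count of the attention FC layers under the assumed layout.

    Biases and normalisation parameters are ignored.
    """
    d = max(channels // r, 1)             # assumed width of the compact feature z
    squeeze = channels * d                # C -> d
    excite = branches * d * channels      # d -> C, one layer per branch
    return squeeze + excite


# Hypothetical channel count of 64, swept over a few reduction ratios.
for r in (4, 8, 16, 32):
    print(f"r={r:>2}: {sk_fc_weights(64, r)} weights")
```

Under these assumptions, doubling r halves the attention weights, which matches the direction of the trade-off described above: a smaller r gives the attention block more capacity, a larger r makes it cheaper.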

The embodiment of the invention performed experiments on a series of different s values of ES-MobileNet at fixed e = 4. As the comparison in Table 6 shows, increasing s does not monotonically improve mAP, while a larger s significantly reduces the computational complexity of the model; setting s = 2 achieves the best accuracy at an acceptable complexity.

The embodiment of the invention also performed experiments on a series of different e values of ES-MobileNet at fixed s = 2, and the comparison in Table 7 shows that increasing e does not monotonically increase mAP. Notably, varying the expansion factor has a greater impact on the space and computational complexity of the model than varying the contraction factor.

Table 4. Performance comparison of ES-MobileNet with selective kernel and other small networks on the IIDD dataset MI defect detection task:

Table 5. Influence of the selective kernel on ES-MobileNet under different scaling factors r. Here, "original" refers to ES-MobileNet:

Table 6. Effect of different contraction factors s on ES-MobileNet, with fixed e = 4:

Models	mAP	#Params	BFLOPs
4e0s	60.62	0.75M	0.612
4e1s	60.58	0.78M	0.738
4e2s	61.98	0.75M	0.675
4e3s	58.71	0.75M	0.654
4e4s	57.22	0.75M	0.644

Table 7. Effect of different expansion factors e on ES-MobileNet, with fixed s = 2:

Models	mAP	#Params	BFLOPs
0e2s	50.76	0.25M	0.176
1e2s	53.83	0.78M	0.301
2e2s	56.54	4.06M	0.426
3e2s	58.54	6.08M	0.551
4e2s	61.98	0.75M	0.675
5e2s	61.45	0.90M	0.800
6e2s	60.92	1.05M	0.925
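
The BFLOPs columns of Tables 6 and 7 can be differenced to see how strongly each factor drives computational cost. The sketch below copies those reported values and computes the change per step; the variable names are illustrative, and the interpretation in the closing note is only what this arithmetic suggests.

```python
# BFLOPs from Table 7 (fixed s = 2), keyed by expansion factor e.
bflops_by_e = {0: 0.176, 1: 0.301, 2: 0.426, 3: 0.551, 4: 0.675, 5: 0.800, 6: 0.925}
# BFLOPs from Table 6 (fixed e = 4), keyed by contraction factor s (s >= 1 only).
bflops_by_s = {1: 0.738, 2: 0.675, 3: 0.654, 4: 0.644}

# Cost added by each unit increase of e.
e_steps = [round(bflops_by_e[e + 1] - bflops_by_e[e], 3) for e in range(6)]
# Cost removed by each unit increase of s.
s_steps = [round(bflops_by_s[s] - bflops_by_s[s + 1], 3) for s in range(1, 4)]
# Total cost removed when s goes from 1 to 4.
s_total = round(bflops_by_s[1] - bflops_by_s[4], 3)

print("BFLOPs added per step of e:  ", e_steps)
print("BFLOPs removed per step of s:", s_steps)
print("BFLOPs removed from s=1 to s=4:", s_total)
```

On these figures, each unit increase of e adds about 0.125 BFLOPs, whereas raising s from 1 to 4 removes about 0.094 BFLOPs in total, which is consistent with the expansion factor having the larger effect on computational cost.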

Claims

1. A method for detecting surface defects of malleable cast iron pipes facing edge calculation, characterized by comprising the following steps:

S1, collecting a plurality of malleable cast iron pipe fitting surface defect images;

S2, marking the defects on the plurality of malleable cast iron pipe fitting surface defect images collected in step S1, obtaining a label file corresponding to each malleable cast iron pipe fitting surface defect image, and forming the label files into a data set;

S3, taking 80% of the data set obtained in step S2 as a training set and 20% as a testing set;

S4, constructing an expansion compression residual bottleneck network, wherein the expansion compression residual bottleneck network comprises two feature extraction branches, and an input feature map is denoted I,

projecting the input feature map I into the two feature extraction branches respectively, wherein one feature extraction branch adopts a convolution strategy of first expanding and then compressing, and is used for extracting features of internal defects of the malleable cast iron pipe fitting on the input feature map I and suppressing background noise information; the other feature extraction branch adopts a convolution strategy of first compressing and then expanding, and is used for extracting features of the edge profile of the malleable cast iron pipe fitting on the input feature map I; the two feature extraction branches each output a feature map after feature extraction, the feature maps output by the two feature extraction branches are fused by element-wise summation along the channel direction, and a fused feature map that has not been recalibrated is output after fusion;

2. The method for detecting surface defects of malleable cast iron pipes facing edge calculation according to claim 1, characterized in that: in step S2, the specific process of marking the defects on the plurality of malleable cast iron pipe fitting surface defect images collected in step S1 and obtaining the label files is as follows: each defect on a malleable cast iron pipe fitting surface defect image is marked with a bounding box and a class label, and the marked image is stored as a label file in JSON (JavaScript Object Notation) format.

3. The method for detecting surface defects of malleable cast iron pipes facing edge calculation according to claim 1 or 2, characterized in that: in step S4, the specific process by which one of the feature extraction branches adopts the convolution strategy of first expanding and then compressing to extract the features of the internal defects of the malleable cast iron pipe fitting on the input feature map I and suppress the background noise information comprises the following steps:

S4-01, […], the non-linear activation output being denoted […], wherein e represents the expansion coefficient;

S4-02, extracting features of […] by a 3 × 3 depthwise convolution, the non-linear activation output being denoted […];

S4-03, compressing the number of channels of […] by a 1 × 1 pointwise convolution, the result being denoted […]; a linear activation operation is applied to […], and the obtained result is denoted […].

4. The method for detecting surface defects of malleable cast iron pipes facing edge calculation according to claim 3, characterized in that: in step S4, the specific process by which the other feature extraction branch adopts the convolution strategy of first compressing and then expanding to extract the features of the edge profile of the malleable cast iron pipe fitting on the input feature map I comprises the following steps:

S4-11, […], the non-linear activation output being denoted […], wherein s represents the compression coefficient;

S4-12, applying a 3 × 3 depthwise convolution to each input channel of […], the result being denoted […], and the non-linear activation output being denoted […];

S4-13, expanding the number of channels of […] by a 1 × 1 pointwise convolution, the result being denoted […]; […].

5. The method for detecting surface defects of malleable cast iron pipes facing edge calculation according to claim 4, characterized in that: in step S4, the specific process by which the feature maps output by the two feature extraction branches are fused by element-wise summation along the channel direction and the fused feature map is output is as follows: the output feature maps of the two feature extraction branches and the identity mapping of the corresponding input feature map are fused by element-wise summation and a fused feature map is output, the output fused feature map being denoted […], wherein […].

6. The method for detecting surface defects of malleable cast iron pipes facing edge calculation according to claim 5, characterized in that: in step S5, on the basis of the expansion compression residual bottleneck network constructed in step S4, the specific process of obtaining the expansion compression feature extraction backbone network fused with the attention mechanism comprises the following steps:

S5-1, […]: the information of […] and […] is integrated into each neuron of the fused feature map to obtain a new fused feature map, the expression of the new fused feature map being: […], wherein […];

S5-2, recalibration: a. a global average pooling layer […] is used to generate the quantized statistical information s of the new fused feature map […]; the expression mapping channel c of the new fused feature map […] to the c-th element of s is: […];

b. a compact feature z is constructed by a fully connected layer […] with an activation function, its expression being […] and […];

wherein U, […] and u, […] are respectively […]; U_c represents the c-th row of U, and u_c is the c-th element of u; […] and […];

a final recalibrated fused feature map O is obtained, the expression of the feature O_c of the c-th channel of the recalibrated fused feature map O being: […]

wherein u_c + v_c = 1, O = [O_1, O_2, …, O_C],