CN111862042B - Pipeline contour detection method based on full convolution neural network - Google Patents

Pipeline contour detection method based on full convolution neural network

Info

Publication number
CN111862042B
CN111862042B (application CN202010703954.4A)
Authority
CN
China
Prior art keywords
pipeline
contour
image
images
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010703954.4A
Other languages
Chinese (zh)
Other versions
CN111862042A (en)
Inventor
孙军华 (Sun Junhua)
程晓琦 (Cheng Xiaoqi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202010703954.4A
Publication of CN111862042A
Application granted
Publication of CN111862042B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F17 STORING OR DISTRIBUTING GASES OR LIQUIDS
    • F17D PIPE-LINE SYSTEMS; PIPE-LINES
    • F17D5/00 Protection or supervision of installations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70 Circuitry for compensating brightness variation in the scene
    • H04N23/73 Circuitry for compensating brightness variation in the scene by influencing the exposure time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Abstract

The invention discloses a pipeline contour detection method based on a full convolution neural network, which comprises the following steps: a multi-exposure fusion algorithm is adopted to obtain a high dynamic range image of the pipeline, and fine labeling of the pipeline contour is completed on this basis; the pipeline multi-exposure sequence images are taken as the input of the full convolution neural network, and these images provide pipeline contour information over different dynamic ranges so as to ensure the integrity of shallow features; an objective function is constructed from the difference between the contour prediction map and the dilated contour label map and the difference between the contour label map and the dilated contour prediction map, and training with this objective function helps eliminate the adverse effects on training caused by the pixel deviation and jagged morphology of the pipeline contour labels. The invention can effectively solve the problem of pipeline contour detection under a complex background, and the detection results have high accuracy and integrity.

Description

Pipeline contour detection method based on full convolution neural network
Technical Field
The invention belongs to the technical field of image processing, relates to a pipeline contour detection method, and in particular relates to a pipeline contour detection method based on a full convolution neural network.
Background
Pipelines are widely used in aerospace, automotive, shipbuilding and other fields as carriers for transporting liquids or gases such as fuel, coolant and lubricant, and are usually metal pipelines. In optical two-dimensional images, pipeline contours usually appear as edges containing certain shallow features (mainly gradient features) and deep features (including texture, shape, spatial relationships, etc.); these contour edges are an important basis for optical three-dimensional reconstruction and dimensional measurement of the pipeline. Currently, contour detection methods for optical images fall mainly into the following three types:
(1) Contour detection method based on shallow gradient characteristics
Using such methods for pipeline contour detection requires the contour to appear as a step edge; that is, detection relies heavily on the shallow gradient characteristics of the contour. For example, Sun Junhua et al. in patent CN 108801175A "A high-precision spatial pipeline measurement system and method" and Liu Shaoli et al. in patent CN 109583377A "A control method, device and upper computer for pipeline model reconstruction" both adopt this approach to realize pipeline contour detection. However, to ensure the accuracy and integrity of contour detection, such methods typically require ideal light-field conditions constructed with a backlight in order to obtain high-contrast pipeline images with sharp contour edges. Therefore, they can only meet the contour detection requirements of a single off-line pipeline and cannot be used for on-line contour detection of complex pipeline systems.
(2) Contour detection method based on artificial features and machine learning technology
Because such methods incorporate deeper contour features of the target, such as texture, shape and spatial relationships, they achieve a certain effect in general-scene contour detection tasks compared with methods based on shallow gradient features. For example, Chen Ke, in patent CN 104361367A "Image contour detection algorithm based on machine learning method", uses this approach to detect contour information in images of complex environments, and it can also be applied to pipeline contour detection. However, extensive expertise is required to finely hand-craft models of the deep features of the pipeline; the design difficulty is high and the algorithm loses robustness. In practical applications it is therefore difficult to guarantee the stability, accuracy and integrity of pipeline contour detection.
(3) Contour detection method based on deep learning technology
Currently, deep learning techniques have achieved excellent performance in various fields. The fully convolutional neural network can adapt to image inputs of different sizes and complete pixel-level dense classification, so it is widely applied to tasks such as image segmentation and edge detection. For example, Wang Jun et al. in patent CN 111028217A "An image crack segmentation method based on a full convolution neural network" and Jiang Wen et al. in patent CN 110705623A "A sea-sky-line detection method based on a full convolution neural network" adopt full convolution neural networks to realize crack detection and sea-sky-line detection in images respectively, with good results. Nevertheless, the above methods still struggle to achieve good results in pipeline contour detection, mainly for the following two reasons:
(1) The imaging quality of the pipeline is poor. Especially for metal pipelines, reflections, uneven illumination and similar problems cause parts of the scene to exceed the dynamic range of the camera, appearing as oversaturated or dark areas in the image. During detection, a single low dynamic range image therefore cannot provide clear and complete pipeline contour information;
(2) The accuracy of the contour labels is poor. It is difficult to manually complete accurate labeling of pipeline contours on the basis of low dynamic range images, and binarized contour labels naturally have a jagged morphology rather than the ideal smooth contour, which directly affects the training and detection performance of the neural network.
Disclosure of Invention
The technical problem solved by the invention is as follows: a pipeline contour detection method based on a full convolution neural network is provided which can acquire complete contour information of smooth metal pipeline surfaces and accurately label pipeline contours, complete the training of the full convolution neural network on this basis, and ensure the integrity and accuracy of pipeline contour detection.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows: a pipeline contour detection method based on a full convolution neural network, the method comprising the steps of:
step 1: acquiring pipeline multi-exposure sequence images under a complex background through image acquisition equipment, and dividing the pipeline images into a training set and a testing set;
step 2: fusing each group of acquired pipeline multi-exposure sequence images into a single high dynamic range image, and finishing fine labeling of pipeline contours based on the high dynamic range images;
step 3: constructing a full convolution neural network, taking the pipeline multi-exposure sequence images as the input of the full convolution neural network, the output being a pipeline contour prediction map;
step 4: constructing an objective function by calculating the difference between the contour prediction map and the dilated contour label map and the difference between the contour label map and the dilated contour prediction map, and realizing the training of the full convolution neural network by jointly minimizing this difference term and the binary cross entropy;
step 5: in the actual detection process, the same number of acquired pipeline multi-exposure sequence images are input into the full convolution neural network parameter model obtained through training, and the output is the pipeline contour detection result.
The specific method for acquiring the pipeline multi-exposure sequence images in step 1 is as follows: the image acquisition equipment is fixed, the camera exposure times are set in sequence to $t/2^n, t/2^{n-1}, \ldots, t/2, t, 2t, \ldots, 2^{n-1}t, 2^n t$, and image acquisition is performed, ensuring that each group of pipeline multi-exposure sequence images contains $2n+1$ pipeline images with different dynamic ranges, where $n$ and $t$ are used to control the number of acquired images and the exposure time respectively.
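For illustration, this exposure ladder can be generated programmatically. The following Python sketch is not part of the patent and its function name is hypothetical; with $t = 32768$ µs and $n = 4$ it reproduces the nine exposure times used in the embodiment described later.

```python
# Hypothetical helper (not from the patent): build the 2n+1 exposure times
# t/2^n, ..., t/2, t, 2t, ..., 2^n * t that bracket the normal exposure t.
def exposure_ladder(t_us: float, n: int) -> list[float]:
    assert n >= 1, "the method requires n >= 1"
    return [t_us * 2.0 ** k for k in range(-n, n + 1)]

# Example: t = 32768 us, n = 4 gives the embodiment's nine exposures,
# 2048 us up to 524288 us.
print(exposure_ladder(32768, 4))
```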
The fusion of the pipeline multi-exposure sequence images into a high dynamic range image in step 2 comprises the following steps:
Step 2.1: model the relationship between exposure time, gray information and radiance in the pipeline scene. Let the radiance corresponding to each pixel point in the scene be constant, $E_{ij}$, $i = 1, 2, \ldots, M$, $j = 1, 2, \ldots, N$, where $M$ and $N$ denote an image of $M$ rows and $N$ columns; let the exposure times set during acquisition of the pipeline multi-exposure sequence images be $\Delta t_k$, $k = 1, 2, \ldots, K$, where $K$ is the number of images in the multi-exposure sequence, and let the corresponding single image be $I_k$, in which the pixel gray value corresponding to $E_{ij}$ in the $k$-th image is $I_{ijk}$. The exposure time, gray information and radiance in the scene are then related by

$$I_{ijk} = f(E_{ij}\,\Delta t_k)$$

where $f$ is the response curve of the camera. Taking the logarithm of both sides further gives

$$g(I_{ijk}) = \ln E_{ij} + \ln \Delta t_k$$

where $g = \ln f^{-1}$.
Step 2.2: estimate the camera response curve. This is obtained by minimizing the following objective function:

$$\mathcal{O} = \sum_{i,j,k}\left\{ w(I_{ijk})\left[ g(I_{ijk}) - \ln E_{ij} - \ln\Delta t_k \right]\right\}^2 + \lambda\sum_{z}\left[ w(z)\,g''(z)\right]^2$$

where $\lambda$ is a smoothing factor, $w(I_{ijk})$ is a weighting function, $g''$ denotes the second derivative of $g$, and $z$ ranges over the gray values in the closed interval $[\min(I_{ijk})+1, \max(I_{ijk})-1]$, with $w(z)$ the weight coefficient of the weighting function at gray value $z$.
The use of the weighting function helps reduce the influence of unreliable pixels, i.e. oversaturated or underexposed pixel points, on the result.
Step 2.3: estimate the radiance in the pipeline scene. After the solution of the camera response curve is complete, the radiance $E_{ij}$ can be recovered; but in order to estimate $E_{ij}$ accurately, the different radiance values need to be weighted as follows:

$$\ln E_{ij} = \frac{\sum_{k=1}^{K} w(I_{ijk})\left[ g(I_{ijk}) - \ln\Delta t_k\right]}{\sum_{k=1}^{K} w(I_{ijk})}$$

The estimated $E_{ij}$ thus constitutes a high dynamic range image describing the illumination intensity of the scene.
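As a concrete reference, the following NumPy sketch assembles the radiance map of steps 2.1 to 2.3 in the style of the Debevec radiance-map recovery cited among this patent's non-patent references. The solved response curve $g$ is assumed given, and the hat-shaped weight is one common choice of $w$, not necessarily the patent's.

```python
# Minimal sketch of weighted radiance-map estimation (steps 2.1-2.3),
# assuming the log-inverse response curve g has already been solved.
import numpy as np

def hat_weight(z: np.ndarray, z_min: int = 0, z_max: int = 255) -> np.ndarray:
    # Hat-shaped weight: small near 0 and 255, i.e. it down-weights
    # underexposed and oversaturated (unreliable) pixels.
    mid = 0.5 * (z_min + z_max)
    return np.where(z <= mid, z - z_min, z_max - z)

def radiance_map(images: np.ndarray, log_dt: np.ndarray, g: np.ndarray) -> np.ndarray:
    """images: (K, M, N) uint8 exposure stack; log_dt: (K,) ln(exposure time);
    g: (256,) solved response curve. Returns ln E as an (M, N) float map."""
    w = hat_weight(images.astype(np.float64))                 # w(I_ijk)
    num = np.sum(w * (g[images] - log_dt[:, None, None]), axis=0)
    den = np.sum(w, axis=0)
    return num / np.maximum(den, 1e-8)                        # ln E_ij
```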
The full convolution neural network constructed in step 3 is formed by cascading a compression channel and an expansion channel. The compression channel comprises 5 encoding blocks $EB_1 \sim EB_5$; each encoding block contains two 3×3 padded convolution layers, and each convolution layer is followed by 1 BatchNorm layer and 1 ReLU layer. Adjacent encoding blocks are connected by a max-pooling layer that realizes downsampling, and the number of channels of the feature map is doubled after each downsampling. The expansion channel contains 5 decoding blocks $DB_5 \sim DB_1$, where $DB_5$ is $EB_5$; the decoding blocks have the same network layer structure as the encoding blocks, and adjacent decoding blocks are connected by deconvolution, which realizes upsampling of the image in scale space, with the number of feature-map channels halved after each upsampling. Finally, the output of $DB_1$ also passes through a 1×1 convolution, which is used to map the multi-channel feature map to a single-channel output, i.e. to achieve dense two-class classification of each pixel point.
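The following PyTorch sketch is one plausible reading of this architecture, not a definitive implementation: the channel widths (64 up to 1024), the U-Net-style skip connections between matching scales, and the sigmoid output activation are assumptions, since the text fixes only the block types, the doubling/halving of channels, and the 1×1 output convolution.

```python
# Hedged sketch of the compression/expansion network of step 3.
import torch
import torch.nn as nn

def enc_block(c_in: int, c_out: int) -> nn.Sequential:
    # Two 3x3 padded convolutions, each followed by BatchNorm and ReLU.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class TubeContourFCN(nn.Module):
    def __init__(self, in_ch: int = 9, base: int = 64):
        super().__init__()
        chs = [base * 2 ** k for k in range(5)]        # assumed widths 64..1024
        self.encs = nn.ModuleList()                    # EB1..EB5 (DB5 is EB5)
        prev = in_ch
        for c in chs:
            self.encs.append(enc_block(prev, c))
            prev = c
        self.pool = nn.MaxPool2d(2)                    # downsampling between EBs
        self.ups = nn.ModuleList(                      # deconvolutions between DBs
            nn.ConvTranspose2d(chs[k], chs[k - 1], 2, stride=2) for k in range(4, 0, -1)
        )
        self.decs = nn.ModuleList(                     # DB4..DB1, same layout as EBs
            enc_block(chs[k - 1] * 2, chs[k - 1]) for k in range(4, 0, -1)
        )
        self.head = nn.Conv2d(chs[0], 1, 1)            # 1x1 conv, dense 2-class output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skips = []
        for i, eb in enumerate(self.encs):
            x = eb(x)
            if i < 4:                                  # keep skips; assumed, U-Net style
                skips.append(x)
                x = self.pool(x)
        for up, db, skip in zip(self.ups, self.decs, reversed(skips)):
            x = db(torch.cat([up(x), skip], dim=1))
        return torch.sigmoid(self.head(x))             # contour prediction map P
```

Input height and width must be divisible by 16 (four poolings), which the 1024×1280 images of the embodiment satisfy; e.g. `TubeContourFCN()(torch.rand(1, 9, 256, 320))` returns a map of the same spatial size.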
The specific calculation step of the objective function in the step 4 is as follows:
step 4.1: a morphological dilation operation is performed on the pipeline contour prediction map $P$ and the label map $T$ output by the neural network, respectively, to increase the contour width; that is, a window of chosen size slides over the image, and the value at the window's center point becomes the maximum value over all pixel points within the window. This process is expressed as:

$$\tilde{P}_{i,j} = \max_{i',j'} P_{i+i',\,j+j'}, \qquad \tilde{T}_{i,j} = \max_{i',j'} T_{i+i',\,j+j'}$$

where $i$ and $j$ respectively denote the $i$-th row and $j$-th column of the image, and $i'$ and $j'$ take values in $\{-1, 0, 1\}$;
step 4.2: step 4.1 yields the dilated contour prediction map $\tilde{P}$ and the dilated contour label map $\tilde{T}$; difference operations are then performed between $P$ and $\tilde{T}$, and between $T$ and $\tilde{P}$:

$$M = \phi(T - \tilde{P}), \qquad R = \phi(P - \tilde{T})$$

where $\phi(\cdot)$ sets pixel points with negative differences to zero, and the difference maps $M$ and $R$ represent, respectively, the missing contours and the redundant contour portions of the contour prediction map $P$;
step 4.3: finally, the objective function used for training consists of two parts: one part is obtained by calculating the proportion of the missing and redundant contours in the total contour, and the other part is the binary cross entropy:

$$\mathcal{L} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( M_{ij} + R_{ij}\right)}{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( T_{ij} + P_{ij}\right)} - \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[ T_{ij}\ln P_{ij} + (1 - T_{ij})\ln(1 - P_{ij})\right]$$

Since the dilation operation is introduced into the calculation of the objective function, its result still reflects the missing and redundant contours in the predicted contour; minimizing the objective function therefore helps eliminate the adverse effects of the pixel deviation and jagged morphology of the pipeline contour labels on training.
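A minimal PyTorch rendering of this objective might look as follows. The 3×3 max pooling stands in for the morphological dilation of step 4.1, `clamp(min=0)` plays the role of zeroing negative differences, and the exact normalization of the missing/redundant term is an assumption consistent with the "proportion in the total contour" wording above.

```python
# Hedged sketch of the step-4 objective function.
import torch
import torch.nn.functional as F

def dilate3x3(x: torch.Tensor) -> torch.Tensor:
    # The window center takes the max over its 3x3 neighbourhood (step 4.1).
    return F.max_pool2d(x, kernel_size=3, stride=1, padding=1)

def contour_loss(pred: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
    """pred, label: (B, 1, H, W) tensors with values in [0, 1]."""
    p_dil, t_dil = dilate3x3(pred), dilate3x3(label)
    missing = torch.clamp(label - p_dil, min=0)    # M: contours absent from prediction
    redundant = torch.clamp(pred - t_dil, min=0)   # R: spurious predicted contours
    ratio = (missing.sum() + redundant.sum()) / (label.sum() + pred.sum() + 1e-8)
    bce = F.binary_cross_entropy(pred, label)      # second term of the objective
    return ratio + bce
```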
Among the acquired $2n+1$ pipeline multi-exposure sequence images, the middle image, i.e. the pipeline image acquired when the exposure time is set to $t$, is in a normal exposure state; the pipeline images acquired with exposure times $t/2^n, t/2^{n-1}, \ldots, t/2$ are underexposed, and the pipeline images acquired with exposure times $2t, \ldots, 2^{n-1}t, 2^n t$ are overexposed. The value of $n$ is greater than or equal to 1, and the larger it is, the more sufficient the pipeline contour information that can be provided.
The input of the first encoding block $EB_1$ of the full convolution neural network is the pipeline multi-exposure sequence images, so the number of channels of its input feature map matches the number of images in each group of pipeline multi-exposure sequences.
Compared with the prior art, the invention has the advantages that:
according to the invention, the pipeline multi-exposure sequence images in the data set are fused into the high dynamic range image, and the image can accurately reflect the radiance of each pixel point in the scene, so that the fine labeling of the pipeline outline can be realized based on the high dynamic range image; the built full convolution neural network adopts multiple exposure sequence images of pipelines as input, and the images can provide pipeline contour information with different dynamic ranges so as to ensure the integrity of shallow features; in addition, an objective function is constructed by calculating the difference value of the contour prediction graph and the expanded contour label graph and the difference value of the contour label graph and the expanded contour prediction graph in the training process, and the objective function is used for reflecting the missing and redundant contours in the prediction graph, so that the minimization of the objective function is helpful for eliminating the adverse effect of pixel deviation and saw tooth morphology of the pipeline contour label on the training.
Drawings
FIG. 1 is a flow chart of a pipeline contour detection method based on a full convolution neural network;
FIG. 2 is a set of pipeline multi-exposure sequence images acquired in an embodiment;
FIG. 3 (a) is a high dynamic range image obtained by fusing multi-exposure sequence images in an embodiment;
FIG. 3 (b) is a line profile label made based on FIG. 3 (a);
FIG. 4 is a schematic diagram of a fully-convolutional neural network in accordance with the present invention;
FIG. 5 shows the result of pipeline profile detection in an embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.
Fig. 1 is a general implementation flowchart of a pipeline contour detection method based on a full convolution neural network, which specifically includes the following steps:
step 1: an image acquisition system consisting of an industrial camera, an industrial lens and an LED ring light source is used to complete acquisition of the pipeline contour data set. The acquisition objects are 7 metal pipelines of different shapes, of which 4 are used to acquire training-set images and 3 to acquire test-set images. During acquisition, 1-3 metal pipelines are placed at random in the camera field of view against an optical platform background, together with a number of interfering objects such as bolts, nuts and washers;
step 2: in each image acquisition process, the camera exposure times are set in sequence to 2048 µs, 4096 µs, 8192 µs, 16384 µs, 32768 µs, 65536 µs, 131072 µs, 262144 µs and 524288 µs; that is, the camera acquires pipeline images at 9 different exposure times, of which the first four are underexposed images, the fifth is the normally exposed image and the last four are overexposed images. The image size is 1024×1280; one group of pipeline multi-exposure sequence images is shown in FIG. 2. The acquisition process is repeated to collect 40 groups of pipeline multi-exposure images, of which 30 groups are used to train the full convolution neural network and 10 groups to test the network parameter model;
step 3: the relationship between exposure time, gray information and radiance in the pipeline scene is modeled. A total of 9 images are acquired per scene; the radiance corresponding to each pixel point in the scene is set constant as $E_{ij}$, $i = 1, 2, \ldots, 1024$, $j = 1, 2, \ldots, 1280$, the exposure times set during acquisition of the pipeline multi-exposure sequence images are $\Delta t_k$, $k = 1, 2, \ldots, 9$, and the corresponding single images are $I_k$, in which the pixel gray value corresponding to $E_{ij}$ in the $k$-th image is $I_{ijk}$. The exposure, gray information and radiance in the scene are then related by

$$I_{ijk} = f(E_{ij}\,\Delta t_k)$$

where $f$ is the response curve of the camera. Taking the logarithm of both sides further gives

$$g(I_{ijk}) = \ln E_{ij} + \ln \Delta t_k$$

where $g = \ln f^{-1}$;
Step 4: estimating a camera response curve; this problem is obtained by minimizing the following objective function:
Figure BDA0002593989440000061
wherein λ is a smoothing factor, w (I ijk ) As a weight function, g' represents the second derivative of the function g, and in this embodiment, the value range of z is 1-254;
step 5: the irradiance in the piping scene is estimated. After completion of the solution of the response curve of the camera, the emittance E is completed ij But in order to accurately estimate E ij The different radiation values need to be weighted as follows:
Figure BDA0002593989440000062
to this end, estimated E ij Namely, a high dynamic range image describing the illumination intensity of a scene, and a high dynamic range image obtained by fusing the images shown in fig. 2 is shown in fig. 3 (a);
step 6: the image-processing tool Photoshop is used to trace a path along the pipeline contour in the high dynamic range image of each scene, and the path is then stroked with a stroke width of 2 pixels to complete the production of the pipeline contour label. FIG. 3(b) shows the pipeline contour label made on the basis of FIG. 3(a);
step 7: a full convolution neural network is constructed, formed by cascading a compression channel and an expansion channel, as shown in FIG. 4. The compression channel comprises 5 encoding blocks $EB_1 \sim EB_5$; each encoding block contains two 3×3 padded convolution layers, each followed by 1 BatchNorm layer and 1 ReLU layer. Adjacent encoding blocks are connected by a max-pooling layer that realizes downsampling, and the number of channels of the feature map is doubled after each downsampling. The expansion channel contains 5 decoding blocks $DB_5 \sim DB_1$, where $DB_5$ is $EB_5$; the decoding blocks have the same network layer structure as the encoding blocks, adjacent decoding blocks are connected by deconvolution, realizing upsampling of the image in scale space, with the number of feature-map channels halved after each upsampling. Finally, the output of $DB_1$ also passes through a 1×1 convolution that maps the multi-channel feature map to a single-channel output, i.e. realizes dense two-class classification of each pixel point. The input of the first encoding block $EB_1$ is the pipeline multi-exposure sequence images, so its input channel number is set to 9;
step 8: a morphological dilation operation is performed on the pipeline contour prediction map $P$ and the label map $T$ output by the neural network, respectively, to increase the contour width; that is, a window of size 3×3 slides over the image, and the value at the window's center point becomes the maximum over all pixel points within the window:

$$\tilde{P}_{i,j} = \max_{i',j'} P_{i+i',\,j+j'}, \qquad \tilde{T}_{i,j} = \max_{i',j'} T_{i+i',\,j+j'}$$

where $i$ and $j$ respectively denote the $i$-th row and $j$-th column of the image, and $i'$ and $j'$ take values in $\{-1, 0, 1\}$.
Step 9: obtaining an expanded contour prediction graph through the step 8
Figure BDA0002593989440000072
And expanded profile label map->
Figure BDA0002593989440000073
For P and->
Figure BDA0002593989440000074
T and />
Figure BDA0002593989440000075
Performing difference operation, namely:
Figure BDA0002593989440000076
wherein ,
Figure BDA0002593989440000077
the function of the method is that pixel points with negative differences are set to be zero, and the difference images M and R respectively represent missing contours and redundant contour parts in the contour prediction image P;
step 10: the objective function used in training consists of two parts: one part is obtained by calculating the proportion of the missing and redundant contours in the total contour, and the other part is the binary cross entropy:

$$\mathcal{L} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( M_{ij} + R_{ij}\right)}{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( T_{ij} + P_{ij}\right)} - \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[ T_{ij}\ln P_{ij} + (1 - T_{ij})\ln(1 - P_{ij})\right]$$

Since the dilation operation is introduced into the calculation of the objective function, its result still reflects the missing and redundant contours in the predicted contour; minimizing the objective function therefore helps eliminate the adverse effects of the pixel deviation and jagged morphology of the pipeline contour labels on training;
step 11: the training set produced in steps 1-6 is fed in order into the network built in step 7 for iterative training, with 3 groups of images forming one batch per iteration. Before the data are input to the network, they are first normalized with a mean of 0.46 and a variance of 0.1; data enhancement operations are also applied so that the network generalizes better and overfitting is avoided. The data set is augmented mainly as follows: random horizontal and vertical flips with probability 0.5, random rotation of ±45°, horizontal and vertical translation of ±0.2 times, random scaling of ±1.2 times, and shear of ±10°. During iteration, the current prediction error is calculated according to steps 8-10 and training uses the Adam optimizer. The initial learning rate is set to 0.01 and is multiplied by 0.9 every 70 epochs, for a total of 700 epochs;
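An illustrative training skeleton for this step is sketched below. It assumes the `TubeContourFCN` and `contour_loss` sketches given earlier, uses placeholder random tensors standing in for the 30 training groups, and omits the augmentation pipeline for brevity; only the batch size, optimizer, schedule and epoch count follow the embodiment.

```python
# Hedged training skeleton for step 11 (plumbing and data are assumptions).
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"
# Placeholder data: 9-channel multi-exposure stacks (already normalized to
# mean 0.46, variance 0.1, i.e. std ~0.316) and sparse binary contour labels.
train_set = TensorDataset(torch.rand(30, 9, 256, 320),
                          (torch.rand(30, 1, 256, 320) > 0.99).float())
loader = DataLoader(train_set, batch_size=3, shuffle=True)   # 3 groups per batch

model = TubeContourFCN(in_ch=9).to(device)
opt = torch.optim.Adam(model.parameters(), lr=0.01)          # initial lr 0.01
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=70, gamma=0.9)  # x0.9 / 70 epochs

for epoch in range(700):                                     # 700 epochs in total
    for stacks, labels in loader:
        pred = model(stacks.to(device))
        loss = contour_loss(pred, labels.to(device))         # steps 8-10
        opt.zero_grad()
        loss.backward()
        opt.step()
    sched.step()
```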
step 12: after training is complete, the full convolution neural network parameter model for pipeline contour detection is obtained. The pipeline multi-exposure sequence images in the test set are input into this parameter model to obtain the pipeline contour detection results; FIG. 5 shows some of the pipeline contour detection results. Using mean average precision (mAP) and the global-threshold maximum F-measure (MF-ODS) as evaluation indexes, the test results show that the mAP of the pipeline contour detection reaches 85.7% and the MF-ODS reaches 80.1%.
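For reference, a simplified MF-ODS computation is sketched below: a single threshold is swept over the whole test set (the "optimal dataset scale") and the best F-measure is reported. Standard edge-detection benchmarks additionally match predictions to labels within a small spatial tolerance, which this sketch omits, so it is an approximation rather than the exact metric used above.

```python
# Simplified MF-ODS: one global threshold over all test images.
import numpy as np

def mf_ods(preds: list[np.ndarray], labels: list[np.ndarray], steps: int = 99) -> float:
    best = 0.0
    for thr in np.linspace(0.01, 0.99, steps):
        tp = fp = fn = 0
        for p, t in zip(preds, labels):          # accumulate over the dataset
            b = p >= thr
            tp += np.sum(b & (t > 0))
            fp += np.sum(b & (t == 0))
            fn += np.sum(~b & (t > 0))
        prec = tp / max(tp + fp, 1)
        rec = tp / max(tp + fn, 1)
        if prec + rec > 0:
            best = max(best, 2 * prec * rec / (prec + rec))
    return best
```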
While the foregoing has described illustrative embodiments of the present invention so as to facilitate understanding by those skilled in the art, it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, all changes fall within the scope of protection of the invention as long as they remain within the spirit and scope of the invention as defined and determined by the appended claims.

Claims (6)

1. The pipeline contour detection method based on the full convolution neural network is characterized by comprising the following steps of:
step 1: acquiring pipeline multi-exposure sequence images under a complex background through image acquisition equipment, and dividing the pipeline images into a training set and a testing set;
step 2: fusing each group of acquired pipeline multi-exposure sequence images into a single high dynamic range image, and finishing fine labeling of pipeline contours based on the high dynamic range images;
step 3: constructing a full convolution neural network, taking the pipeline multi-exposure sequence images as the input of the full convolution neural network, the output being a pipeline contour prediction map;
step 4: constructing an objective function by calculating the difference between the contour prediction map and the dilated contour label map and the difference between the contour label map and the dilated contour prediction map, and realizing the training of the full convolution neural network by jointly minimizing this difference term and the binary cross entropy; the specific calculation steps of the objective function in step 4 are as follows:
step 4.1: a morphological dilation operation is performed on the pipeline contour prediction map $P$ and the label map $T$ output by the neural network, respectively, to increase the contour width; that is, a window of chosen size slides over the image, and the value at the window's center point becomes the maximum value over all pixel points within the window. This process is expressed as:

$$\tilde{P}_{i,j} = \max_{i',j'} P_{i+i',\,j+j'}, \qquad \tilde{T}_{i,j} = \max_{i',j'} T_{i+i',\,j+j'}$$

where $i$ and $j$ respectively denote the $i$-th row and $j$-th column of the image, and $i'$ and $j'$ take values in $\{-1, 0, 1\}$;
step 4.2: step 4.1 yields the dilated contour prediction map $\tilde{P}$ and the dilated contour label map $\tilde{T}$; difference operations are then performed between $P$ and $\tilde{T}$, and between $T$ and $\tilde{P}$:

$$M = \phi(T - \tilde{P}), \qquad R = \phi(P - \tilde{T})$$

where $\phi(\cdot)$ sets pixel points with negative differences to zero, and the difference maps $M$ and $R$ represent, respectively, the missing contours and the redundant contour portions of the contour prediction map $P$;
step 4.3: finally, the objective function used for training consists of two parts: one part is obtained by calculating the proportion of the missing and redundant contours in the total contour, and the other part is the binary cross entropy:

$$\mathcal{L} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( M_{ij} + R_{ij}\right)}{\sum_{i=1}^{M}\sum_{j=1}^{N}\left( T_{ij} + P_{ij}\right)} - \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[ T_{ij}\ln P_{ij} + (1 - T_{ij})\ln(1 - P_{ij})\right]$$

since the dilation operation is introduced into the calculation of the objective function, its result still reflects the missing and redundant contours in the predicted contour, so minimizing the objective function helps eliminate the adverse effects of the pixel deviation and jagged morphology of the pipeline contour labels on training; $M$ and $N$ denote an image of $M$ rows and $N$ columns;
step 5: in the actual detection process, the same number of acquired pipeline multi-exposure sequence images are input into the full convolution neural network parameter model obtained through training, and the output is the pipeline contour detection result.
2. The pipeline contour detection method according to claim 1, wherein the specific method for acquiring the pipeline multi-exposure sequence images in step 1 is as follows: the image acquisition equipment is fixed, the camera exposure times are set in sequence to $t/2^n, t/2^{n-1}, \ldots, t/2, t, 2t, \ldots, 2^{n-1}t, 2^n t$, and image acquisition is performed, ensuring that each group of pipeline multi-exposure sequence images contains $2n+1$ pipeline images with different dynamic ranges, where $n$ and $t$ are respectively used to control the number of acquired images and the exposure time.
3. The method of claim 1, wherein the fusion of the pipeline multi-exposure sequence images into a high dynamic range image in step 2 comprises:
step 2.1: modeling the relationship between exposure time, gray information and radiance in the pipeline scene; the radiance corresponding to each pixel point in the scene is set constant as $E_{ij}$, $i = 1, 2, \ldots, M$, $j = 1, 2, \ldots, N$, where $M$ and $N$ denote an image of $M$ rows and $N$ columns, the exposure times set during acquisition of the pipeline multi-exposure sequence images are $\Delta t_k$, $k = 1, 2, \ldots, K$, where $K$ is the number of images in the multi-exposure sequence, and the corresponding single image is $I_k$, in which the pixel gray value corresponding to $E_{ij}$ in the $k$-th image is $I_{ijk}$; the exposure time, gray information and radiance in the scene are then related by

$$I_{ijk} = f(E_{ij}\,\Delta t_k)$$

where $f$ is the response curve of the camera; taking the logarithm of both sides further gives

$$g(I_{ijk}) = \ln E_{ij} + \ln \Delta t_k$$

where $g = \ln f^{-1}$;
step 2.2: estimating the camera response curve, obtained by minimizing the following objective function:

$$\mathcal{O} = \sum_{i,j,k}\left\{ w(I_{ijk})\left[ g(I_{ijk}) - \ln E_{ij} - \ln\Delta t_k \right]\right\}^2 + \lambda\sum_{z}\left[ w(z)\,g''(z)\right]^2$$

where $\lambda$ is a smoothing factor, $w(I_{ijk})$ is a weighting function, $g''$ denotes the second derivative of $g$, $z$ ranges over the gray values in the closed interval $[\min(I_{ijk})+1, \max(I_{ijk})-1]$, and $w(z)$ is the weight coefficient of the weighting function at gray value $z$; the use of the weighting function helps reduce the influence of unreliable pixels, i.e. oversaturated or underexposed pixel points, on the result;
step 2.3: estimating the radiance in the pipeline scene; after the solution of the camera response curve is complete, the radiance $E_{ij}$ can be recovered, but in order to estimate $E_{ij}$ accurately the different radiance values need to be weighted as follows:

$$\ln E_{ij} = \frac{\sum_{k=1}^{K} w(I_{ijk})\left[ g(I_{ijk}) - \ln\Delta t_k\right]}{\sum_{k=1}^{K} w(I_{ijk})}$$

the estimated $E_{ij}$ thus constitutes a high dynamic range image describing the illumination intensity of the scene.
4. The pipeline contour detection method according to claim 1, wherein the full convolution neural network constructed in step 3 is formed by cascading a compression channel and an expansion channel; the compression channel comprises 5 encoding blocks $EB_1 \sim EB_5$, each encoding block contains two 3×3 padded convolution layers, each convolution layer is followed by 1 BatchNorm layer and 1 ReLU layer, adjacent encoding blocks are connected by a max-pooling layer that realizes downsampling, and the number of channels of the feature map is doubled after each downsampling; the expansion channel contains 5 decoding blocks $DB_5 \sim DB_1$, where $DB_5$ is $EB_5$, the decoding blocks have the same network layer structure as the encoding blocks, adjacent decoding blocks are connected by deconvolution, which realizes upsampling of the image in scale space, and the number of feature-map channels is halved after each upsampling; finally, the output of $DB_1$ also passes through a 1×1 convolution, which is used to map the multi-channel feature map to a single-channel output, i.e. to achieve dense two-class classification of each pixel point.
5. The pipeline contour detection method according to claim 2, wherein, among the $2n+1$ images acquired in step 1, the middle image, i.e. the pipeline image acquired when the exposure time is set to $t$, is in a normal exposure state; the pipeline images acquired with exposure times $t/2^n, t/2^{n-1}, \ldots, t/2$ are underexposed, and the pipeline images acquired with exposure times $2t, \ldots, 2^{n-1}t, 2^n t$ are overexposed; the value of $n$ is greater than or equal to 1, and the larger it is, the more sufficient the pipeline contour information that can be provided.
6. The method according to claim 4, wherein in step 3 the input of the first encoding block $EB_1$ of the full convolution neural network is the pipeline multi-exposure sequence images, so that the number of channels of the input feature map matches the number of pipeline multi-exposure sequence images in each group.
CN202010703954.4A 2020-07-21 2020-07-21 Pipeline contour detection method based on full convolution neural network Active CN111862042B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010703954.4A CN111862042B (en) 2020-07-21 2020-07-21 Pipeline contour detection method based on full convolution neural network

Publications (2)

Publication Number Publication Date
CN111862042A CN111862042A (en) 2020-10-30
CN111862042B (en) 2023-05-23

Family

ID=73001783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010703954.4A Active CN111862042B (en) 2020-07-21 2020-07-21 Pipeline contour detection method based on full convolution neural network

Country Status (1)

Country Link
CN (1) CN111862042B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105472265A (en) * 2015-12-04 2016-04-06 中国神华能源股份有限公司 Device and method for obtaining high dynamic range image
CN108801175A (en) * 2018-06-29 2018-11-13 北京航空航天大学 A kind of high-precision spatial pipeline measuring system and method
CN109146849A (en) * 2018-07-26 2019-01-04 昆明理工大学 A kind of road surface crack detection method based on convolutional neural networks and image recognition
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN109636790A (en) * 2018-12-13 2019-04-16 北京理工大学 A kind of recognition methods of pipeline structure and device
CN109903292A (en) * 2019-01-24 2019-06-18 西安交通大学 A kind of three-dimensional image segmentation method and system based on full convolutional neural networks

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
A Fully Convolutional Network-Based Tube Contour Detection Method Using Multi-Exposure Images; Xiaoqi Cheng et al.; Sensors; 2021-06-14; full text *
Naturalness index for a tone-mapped high dynamic range image; Yang Song et al.; Applied Optics; 2016-12-10; vol. 55, no. 35; pp. 10084-10091 *
Recovering high dynamic range radiance maps from photographs; Paul E. Debevec et al.; Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques; 1997-08-08; full text *
Stereo vision measurement method for bent pipes based on an accurate axis-projection model; Sun Junhua et al.; Aeronautical Manufacturing Technology (航空制造技术); 2019-03-01; vol. 62, no. 5; pp. 40-45 *
Research on multi-exposure finger vein images; Wang Chen; China Master's Theses Full-text Database, Information Science and Technology; 2019-08-15; no. 08; full text *
Principles and applications of high dynamic range images; mg5077; Docin (豆丁网); 2015-11-30; full text *

Also Published As

Publication number Publication date
CN111862042A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
Molina On the hierarchical Bayesian approach to image restoration: applications to astronomical images
CN112434586B (en) Multi-complex scene target detection method based on domain self-adaptive learning
CN113033432A (en) Remote sensing image residential area extraction method based on progressive supervision
CN115018888A (en) Optical flow unsupervised estimation method based on Transformer
CN114663578A (en) Multi-target scene polarization three-dimensional imaging method based on deep learning
CN111862042B (en) Pipeline contour detection method based on full convolution neural network
Yi et al. MFAF-Net: image dehazing with multi-level features and adaptive fusion
Stent et al. Precise deterministic change detection for smooth surfaces
CN111311515B (en) Depth image rapid iterative restoration method for automatic detection of error region
CN116385520A (en) Wear surface topography luminosity three-dimensional reconstruction method and system integrating full light source images
CN116993612A (en) Nonlinear distortion correction method for fisheye lens
CN116703885A (en) Swin transducer-based surface defect detection method and system
Sankpal et al. A review on image enhancement and color correction techniques for underwater images
CN115578256A (en) Unmanned aerial vehicle aerial insulator infrared video panorama splicing method and system
Wu et al. Method of image quality improvement for atmospheric turbulence degradation sequence based on graph Laplacian filter and nonrigid registration
CN112561807B (en) End-to-end radial distortion correction method based on convolutional neural network
CN115661451A (en) Deep learning single-frame infrared small target high-resolution segmentation method
Dong et al. Lightweight and edge-preserving speckle matching network for precise single-shot 3D shape measurement
CN109102476B (en) Multispectral image defocusing fuzzy kernel estimation method based on circle of confusion fitting
Banday Efficient Object Removal and Region Filling Image Refurbishing Approach
Feng et al. Automatic exposure control method for 3D shape measurement of specular surface based on digital fringe projection
CN117496499B (en) Method and system for identifying and compensating false depth edges in 3D structured light imaging
CN115775302B (en) Transformer-based three-dimensional reconstruction method for high-reflectivity object
Yang et al. An End-to-End Pyramid Convolutional Neural Network for Dehazing
CN117635679A (en) Curved surface efficient reconstruction method and device based on pre-training diffusion probability model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant