Disclosure of Invention
To improve the accuracy of target pedestrian image segmentation, the invention provides a cascaded superpixel-based target pedestrian segmentation method, which provides more accurate preprocessing information for the subsequent work of a computer vision system by establishing a target pedestrian segmentation model.
The invention achieves the above aim by the following technical scheme. The cascaded superpixel-based target pedestrian segmentation method comprises the following steps:
step 1: send the source image to the instance segmentation channel, output the instance-segmented image, then split the instance-segmented image and extract each single-target region and segmentation result:
(M,R,S0)=MASKRCNN(I0) (1)
where MASKRCNN is the instance segmentation function, I0 is the input source image, R is a single-target region split and extracted from the instance-segmented image, S0 is a segmentation result split and extracted from the instance-segmented image, and M is the instance-segmented image;
wherein:
the instance-segmented image M is the unprocessed image obtained after instance segmentation of the source image;
the single-target region R is a single-target region image split and extracted from the instance-segmented image M, and its extent is larger than that of the target detection box; formula (2) gives the number of single-target regions R:
B = A ± X (A, B, X ∈ ℕ⁺) (2)
where the source image contains A target objects, B is the number of single-target regions R after instance segmentation, and X is the number of falsely detected targets of instance segmentation;
the segmentation result S0 is a contour image split and extracted from the instance-segmented image M; formula (3) gives the number of segmentation results S0:
N = J ± X (N, J, X ∈ ℕ⁺) (3)
where the instance-segmented image M contains J target objects, N is the number of segmentation results S0 after splitting and extraction, and X is the number of falsely detected targets of instance segmentation;
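As an illustration of step 1, the following minimal sketch uses torchvision's pre-trained Mask R-CNN as a stand-in for the MASKRCNN function of formula (1); the function name extract_instances, the score and mask thresholds, and the fixed padding that makes R larger than the detection box are assumptions of the sketch, not details fixed by the invention.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Pre-trained Mask R-CNN as a stand-in for the instance segmentation channel.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def extract_instances(image, score_thresh=0.5, person_label=1, pad=16):
    """Sketch of formula (1): split the raw instance segmentation M into
    single-target regions R and per-instance segmentation results S0."""
    with torch.no_grad():
        M = model([to_tensor(image)])[0]          # raw instance segmentation M
    w, h = image.size                             # PIL image: (width, height)
    regions, results = [], []
    for box, label, score, mask in zip(M["boxes"], M["labels"],
                                       M["scores"], M["masks"]):
        if label.item() != person_label or score.item() < score_thresh:
            continue                              # keep pedestrian detections only
        x1, y1, x2, y2 = (int(v) for v in box)
        # R is taken larger than the detection box; a fixed pad is an assumption.
        regions.append((max(0, x1 - pad), max(0, y1 - pad),
                        min(w, x2 + pad), min(h, y2 + pad)))
        results.append((mask[0] > 0.5).cpu().numpy())   # segmentation result S0
    return M, regions, results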
step 2: send the single-target region R to the superpixel segmentation channel and output a labelled superpixel-segmented image:
QGK=SLIC(R) (4)
where SLIC is the superpixel segmentation function, R is a single-target region extracted from the instance-segmented image, and QGK is the superpixel-segmented image containing K labelled superpixels;
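A corresponding sketch of step 2 (formula (4)), assuming scikit-image's SLIC as the superpixel segmentation function; the default parameter values merely echo the ranges used in the embodiments below (150 to 225 segmentation blocks).

import numpy as np
from skimage.segmentation import slic

def superpixel_channel(region_image: np.ndarray,
                       n_segments: int = 225,
                       compactness: float = 10.0) -> np.ndarray:
    """Sketch of formula (4): return the labelled superpixel image Q_GK of a
    single-target region R as an integer label map with about K = n_segments
    superpixels (labels start at 1)."""
    return slic(region_image, n_segments=n_segments,
                compactness=compactness, start_label=1)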
step 3: merge superpixel blocks with similar features among adjacent superpixel blocks in the superpixel-segmented image QGK, so that the K superpixel blocks in the superpixel-segmented image are replaced by N superpixel coloring information blocks, and finally reconstruct a more accurate target object contour:
PN=Cslic(QGK) (5)
where Cslic is the SLIC merging function, QGK is the superpixel-segmented image with K labelled superpixels, and PN is the reconstructed target object contour;
step 4: fuse the segmentation result S0 and the target object contour PN, and reconstruct the cascaded segmentation fused image Ei:
Ei=NSST(PN,S0) (6)
where NSST is the non-subsampled shearlet transform multi-scale analysis function, PN is the target object contour, S0 is the segmentation result split and extracted from the instance-segmented image, and Ei is the reconstructed final fused image;
The image fusion uses an energy-filtering high/low-frequency fusion rule: the registered segmentation result S0 and target object contour PN are pre-fused with the energy-filtering high/low-frequency fusion rule. In the low-frequency information fusion, the low-frequency coefficients are fused with a fusion rule based on an image guided filter, giving the low-frequency fusion coefficients. In the high-frequency information fusion, guided by the superpixel labels of QGK, coefficients sharing the same label are gathered into a super-coefficient block, and the spatial frequency of each super-coefficient block is computed to obtain the high-frequency fusion coefficients. Finally, the inverse NSST is applied to the high- and low-frequency fusion coefficients to reconstruct the final fused image Ei.
Further, the superpixel block feature merging steps are as follows:
1) set and order the superpixel blocks, and compute the feature difference in color and spatial distance between adjacent superpixel blocks in the image by the following formulas (a code sketch follows this list):
DLAB(Ri) = √((li − lj)² + (ai − aj)² + (bi − bj)²)
DXY(Ri) = √((xi − xj)² + (yi − yj)²)
D(Ri) = δ·DLAB(Ri) + (1 − δ)·DXY(Ri) (9)
In formula (9), the LAB vectors adopt the CIELAB color space model; DLAB(Ri) is the color-space distance between superpixel blocks; R' denotes the non-target region; li and lj are the pixel lightness components; ai, aj, bi, bj are the color components; DXY(Ri) is the position-space distance; xi, xj, yi, yj are the spatial coordinate values of the pixels; D(Ri) is the superpixel distance; δ is the distance weight coefficient, δ ∈ (0, 1);
2) compare the computed result with a preset threshold: if the feature difference is smaller than the threshold, merge the target superpixel block with its adjacent superpixel block; if it is larger than the threshold, skip the target superpixel block and continue the feature check on the next superpixel block;
the superpixel region correlation is determined from the superpixel distance by the following formula:
C(Ri)=1-exp(-D(Ri)) (10)
In formula (10), C(Ri) represents the superpixel region correlation and D(Ri) is the superpixel distance; the superpixel distance is inversely related to the region correlation. Whether superpixel blocks carry the feature information of the same target is determined from this correlation;
after the region correlation of all superpixels is computed, a region correlation threshold is obtained with the maximum between-class variance (Otsu) method, and all superpixel blocks meeting the correlation threshold are extracted as target superpixels; the formula is as follows:
R* = { Ri | C(Ri) ≥ ε·θ } (11)
In formula (11), R* represents the finally acquired set of target superpixels, Ri is the target superpixel at index i, C(Ri) represents the superpixel region correlation, θ is the region correlation threshold obtained by the maximum between-class variance method, and ε is the correlation threshold coefficient, with ε = 0.5. When ε = 0.5 the feature information is well divided into distinct pixel sets: each resulting subset forms a region corresponding to the real scene, the interior of each region has consistent attributes, and adjacent regions do not share those attributes;
3) iterate the above steps until all superpixel blocks in the image have completed one round of feature comparison, at which point a first merging result image is generated;
4) before the second merging pass, refresh the superpixel blocks' feature information and re-order them, then take the first merging result as the object of the merging operation, until the superpixel blocks in the first merging result have completed feature comparison and a second merging result image is generated.
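The merging and target extraction procedure above can be sketched as follows under the reconstructed formulas (9) to (11). Mean CIELAB color and centroid as the per-block features, distances measured against a single reference block rather than over the full adjacency graph, and scikit-image's rgb2lab and threshold_otsu are all assumptions of this sketch.

import numpy as np
from skimage.color import rgb2lab
from skimage.filters import threshold_otsu

def superpixel_features(region_image, labels):
    """Mean CIELAB vector and centroid for every labelled superpixel block."""
    lab = rgb2lab(region_image)
    feats = {}
    for k in np.unique(labels):
        ys, xs = np.nonzero(labels == k)
        feats[k] = (lab[ys, xs].mean(axis=0),             # (l, a, b)
                    np.array([xs.mean(), ys.mean()]))     # (x, y)
    return feats

def superpixel_distance(fi, fj, delta=0.5):
    """Formula (9): weighted color-space plus position-space distance."""
    d_lab = np.linalg.norm(fi[0] - fj[0])
    d_xy = np.linalg.norm(fi[1] - fj[1])
    return delta * d_lab + (1.0 - delta) * d_xy

def target_superpixels(region_image, labels, delta=0.5, eps=0.5):
    """Formulas (10)-(11): C(Ri) = 1 - exp(-D(Ri)), then a threshold from the
    maximum between-class variance (Otsu) method scaled by eps selects R*."""
    feats = superpixel_features(region_image, labels)
    keys = sorted(feats)
    ref = feats[keys[0]]                      # reference block: an assumption
    corr = np.array([1.0 - np.exp(-superpixel_distance(feats[k], ref, delta))
                     for k in keys])
    theta = threshold_otsu(corr)              # region correlation threshold
    return [k for k, c in zip(keys, corr) if c >= eps * theta]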
Through the above technical scheme, the invention has the following beneficial effects: existing image segmentation methods essentially segment the source image directly, so the extracted target feature results are not accurate enough; in particular, the edge contour of the segmented target is unsatisfactory. The method performs cascaded segmentation of the source image with a cascaded superpixel segmentation scheme and finally uses the energy-filtering high/low-frequency fusion rule to achieve a sparse representation of the image in every direction and at every scale, overcoming the pseudo-Gibbs effect, ultimately improving the segmentation precision of image preprocessing and providing a useful segmentation basis for subsequent recognition and tracking.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments:
A logic block diagram of the cascaded superpixel-based target pedestrian segmentation method is shown in FIG. 1; the specific implementation steps are as follows:
Step 1: send the source image to the instance segmentation channel, output the instance-segmented image, and split and extract the single-target region R and the segmentation result S0 on the basis of the instance-segmented image;
Step 2: send the single-target region R to the superpixel segmentation channel and output the labelled superpixel-segmented image QGK;
Step 3: merge superpixel blocks with similar features among adjacent superpixel blocks in the superpixel-segmented image QGK, and merge and reconstruct them into a more accurate target object contour PN;
Step 4: fuse the segmentation result S0 and the target object contour PN, and reconstruct the final cascaded segmentation fused image Ei.
The specific scheme is as follows:
Unlike existing target pedestrian segmentation algorithms, the invention provides a cascaded superpixel-based target pedestrian segmentation method, which provides more accurate preprocessing information for the subsequent work of a computer vision system by establishing a target pedestrian segmentation model.
The invention realizes the above aim by the following technical scheme:
Step 1: send the source image to the instance segmentation channel, output the instance-segmented image, and split and extract each single-target region and segmentation result on the basis of the instance-segmented image.
(M,R,S0)=MASKRCNN(I0) (1)
where MASKRCNN is the instance segmentation function, I0 is the input source image (whose length and width are multiples of 2^6), R is a single-target region extracted from the instance-segmented image, S0 is the segmentation result split and extracted in instance segmentation, and M is the instance-segmented image.
Definition 1: the instance-segmented image M is the unprocessed image obtained after instance segmentation of the source image.
Definition 2: the single-target region R is an image split and extracted from the instance-segmented image M, and its extent is necessarily larger than that of the target detection box. Formula (2) gives the number of single-target regions R:
B = A ± X (A, B, X ∈ ℕ⁺) (2)
where the source image contains A target objects, B is the number of single-target regions R after instance segmentation, and X represents the number of falsely detected targets of instance segmentation.
Definition 3: the segmentation result S0 is a contour image split and extracted from the instance-segmented image M; formula (3) gives the number of segmentation results S0:
N = J ± X (N, J, X ∈ ℕ⁺) (3)
Here the instance-segmented image M contains J target objects; after splitting and extraction there are N segmentation results S0, where X represents the number of falsely detected targets of instance segmentation.
Step 2: send the single-target region R to the superpixel segmentation channel and output a labelled superpixel-segmented image.
QGK=SLIC(R) (4)
where SLIC is the superpixel segmentation function, R is a single-target region extracted from the instance-segmented image, and QGK is the superpixel-segmented image containing K labelled superpixels.
Step 3: merge superpixel blocks with similar features among adjacent superpixel blocks in the superpixel-segmented image QGK, so that the K superpixel blocks in the superpixel-segmented image are replaced by N superpixel coloring information blocks, and finally reconstruct a more accurate target object contour.
PN=Cslic(QGK) (5)
where Cslic is the SLIC merging function, QGK is the superpixel-segmented image with K labelled superpixels, and PN is the reconstructed target object contour.
Step 4: fuse the segmentation result S0 and the target object contour PN, and reconstruct the final cascaded segmentation fused image Ei.
Ei=NSST(PN,S0) (6)
where NSST is the non-subsampled shearlet transform multi-scale analysis function, PN is the target object contour, S0 is the segmentation result split and extracted in instance segmentation, and Ei is the reconstructed final fused image.
Energy-filtering high/low-frequency fusion rule: the registered segmentation result S0 and target object contour PN are pre-fused with the energy-filtering high/low-frequency fusion rule. In the low-frequency information fusion, the low-frequency coefficients are fused with a fusion rule based on an image guided filter, giving the low-frequency fusion coefficients. In the high-frequency information fusion, guided by the superpixel labels of QGK, coefficients sharing the same label are gathered into a super-coefficient block, and the spatial frequency of each super-coefficient block is computed to obtain the high-frequency fusion coefficients. Finally, the inverse NSST is applied to the high- and low-frequency fusion coefficients to reconstruct the final fused image Ei.
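The fusion rule can be sketched as follows. No standard NSST implementation is assumed here: PyWavelets' single-level Haar DWT stands in for the NSST decomposition, and box-filtered energy weights stand in for the image-guided-filter low-frequency rule, so this is only a structural illustration of the label-wise super-coefficient blocks and the spatial-frequency selection.

import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def spatial_frequency(coeffs: np.ndarray) -> float:
    """Simplified spatial frequency of a super-coefficient block, computed
    over the flattened coefficients (an assumption of this sketch)."""
    c = coeffs.ravel()
    if c.size < 2:
        return 0.0
    return float(np.sqrt(np.mean(np.diff(c) ** 2)))

def fuse(P_N: np.ndarray, S0: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Sketch of formula (6): fuse the contour P_N and segmentation result S0,
    with a Haar DWT standing in for the NSST (inputs assumed even-sized,
    consistent with the 2^6-multiple source dimensions above)."""
    lA, (lH, lV, lD) = pywt.dwt2(P_N.astype(float), "haar")
    rA, (rH, rV, rD) = pywt.dwt2(S0.astype(float), "haar")

    # Low-frequency rule (guided-filter stand-in): keep the coefficient with
    # the larger locally filtered energy.
    fA = np.where(uniform_filter(lA ** 2, 5) >= uniform_filter(rA ** 2, 5),
                  lA, rA)

    # High-frequency rule: gather coefficients sharing a superpixel label into
    # a super-coefficient block and keep the block with the larger spatial
    # frequency. Haar sub-bands are half-size, so the label map is downsampled.
    lab = labels[::2, ::2]
    fused_high = []
    for a, b in ((lH, rH), (lV, rV), (lD, rD)):
        out = np.empty_like(a)
        for k in np.unique(lab):
            m = lab == k
            out[m] = a[m] if spatial_frequency(a[m]) >= spatial_frequency(b[m]) else b[m]
        fused_high.append(out)

    # Inverse transform reconstructs the fused image Ei.
    return pywt.idwt2((fA, tuple(fused_high)), "haar")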
The superpixel block feature merging steps are as follows:
1) set and order the superpixel blocks, and compute the feature difference in color and spatial distance between adjacent superpixel blocks in the image by the following formulas:
DLAB(Ri) = √((li − lj)² + (ai − aj)² + (bi − bj)²)
DXY(Ri) = √((xi − xj)² + (yi − yj)²)
D(Ri) = δ·DLAB(Ri) + (1 − δ)·DXY(Ri) (9)
In formula (9), the LAB vectors adopt the CIELAB color space model; DLAB(Ri) is the color-space distance between superpixel blocks; R' denotes the non-target region; li and lj are the pixel lightness components; ai, aj, bi, bj are the color components; DXY(Ri) is the position-space distance; xi, xj, yi, yj are the spatial coordinate values of the pixels; D(Ri) is the superpixel distance; δ is the distance weight coefficient, δ ∈ (0, 1);
2) compare the computed result with a preset threshold: if the feature difference is smaller than the threshold, merge the target superpixel block with its adjacent superpixel block; if it is larger than the threshold, skip the target superpixel block and continue the feature check on the next superpixel block;
the superpixel region correlation is determined from the superpixel distance by the following formula:
C(Ri)=1-exp(-D(Ri)) (10)
In formula (10), C(Ri) represents the superpixel region correlation and D(Ri) is the superpixel distance; the superpixel distance is inversely related to the region correlation. Whether superpixel blocks carry the feature information of the same target is determined from this correlation;
after the region correlation of all superpixels is computed, a region correlation threshold is obtained with the maximum between-class variance (Otsu) method, and all superpixel blocks meeting the correlation threshold are extracted as target superpixels; the formula is as follows:
R* = { Ri | C(Ri) ≥ ε·θ } (11)
In formula (11), R* represents the finally acquired set of target superpixels, Ri is the target superpixel at index i, C(Ri) represents the superpixel region correlation, θ is the region correlation threshold obtained by the maximum between-class variance method, and ε is the correlation threshold coefficient, with ε = 0.5. When ε = 0.5 the feature information is well divided into distinct pixel sets: each resulting subset forms a region corresponding to the real scene, the interior of each region has consistent attributes, and adjacent regions do not share those attributes;
3) iterate the above steps until all superpixel blocks in the image have completed one round of feature comparison, at which point a first merging result image is generated;
4) before the second merging pass, refresh the superpixel blocks' feature information and re-order them, then take the first merging result as the object of the merging operation, until the superpixel blocks in the first merging result have completed feature comparison and a second merging result image is generated.
Existing image segmentation methods essentially segment the source image directly, so the extracted target feature results are not accurate enough; in particular, the edge contour of the segmented target is unsatisfactory. The method performs cascaded segmentation of the source image with a cascaded superpixel segmentation scheme and finally uses the energy-filtering high/low-frequency fusion rule to achieve a sparse representation of the image in every direction and at every scale, overcoming the pseudo-Gibbs effect, ultimately improving the segmentation precision of image preprocessing and providing a useful segmentation basis for subsequent recognition and tracking. That is to say, using Mask-RCNN alone suffers from false detection of small targets, missed detections, and inaccurate segmentation of overlapping parts, so the method instead segments the image by establishing a cascaded superpixel segmentation system. First, instance segmentation is performed with Mask-RCNN, and the single-target region R and segmentation result S0 are split and extracted after segmentation; then the single-target region R is superpixel-segmented to obtain the superpixel-segmented image QGK; finally, the corresponding fusion rules are formulated to fuse the superpixel segmentation result with the segmentation result S0, reconstructing the final fused image Ei. By performing superpixel single-target segmentation on the result of Mask-RCNN instance segmentation, the method improves segmentation precision and provides more accurate preprocessing information for the subsequent work of a computer vision system.
Example 1:
Single-target pedestrian segmentation from a vehicle-mounted viewing angle
A vehicle-mounted camera model is established from the geometric relations: the height of the target in the image plane is h, the height of the target in the real world is 168 cm, the focal length of the camera is 12.25 cm, and the actual distance between the target and the camera is 145 cm. The pedestrian target moves in the video at about 1.5 m/s and keeps moving in a straight line without changing speed. Through instance segmentation it can be observed that the segmentation of the target person's contour in FIG. 4 is not accurate enough. To raise the accuracy on this basis, the result is sent to the superpixel input channel with the spatial distance weight set to 65, the number of segmentation blocks set to 225 and the initial step size set to 5, and superpixel segmentation is performed. After segmentation, the registered single-target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the contour accuracy of the cascaded segmentation and fusion is markedly improved.
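A hypothetical end-to-end run of this embodiment with the sketches above; the file name and the mapping of the stated parameters onto skimage arguments (spatial distance weight 65 as compactness, 225 segmentation blocks; skimage derives the initial step size internally) are assumptions.

import numpy as np
from PIL import Image

I0 = Image.open("onboard_frame.png").convert("RGB")   # hypothetical video frame
M, regions, results = extract_instances(I0)

for (x1, y1, x2, y2), S0 in zip(regions, results):
    R = np.asarray(I0)[y1:y2, x1:x2]                  # single-target region
    Q_GK = superpixel_channel(R, n_segments=225, compactness=65.0)
    P_N = np.isin(Q_GK, target_superpixels(R, Q_GK)).astype(float)
    Ei = fuse(P_N, S0[y1:y2, x1:x2].astype(float), Q_GK)  # fused image Ei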
Example 2:
Dual-target pedestrian segmentation from a vehicle-mounted viewing angle
A vehicle-mounted camera model is established from the geometric relations: the height of the target in the image plane is h, the real-world height of target A is 168 cm and of target B is 165 cm, the focal length of the camera is 11.25 cm, the actual distance between target A and the camera is 120 cm, and between target B and the camera 195 cm. The video contains the two pedestrian targets moving towards each other at about 1.4 m/s, both keeping straight-line motion without changing speed. Through instance segmentation it can be observed that the segmentation of the target persons' contours in FIG. 8 is not accurate enough. To raise the accuracy on this basis, the result is sent to the superpixel input channel with the spatial distance weight set to 75, the number of segmentation blocks set to 150 and the initial step size set to 5, and superpixel segmentation is performed. After segmentation, the registered single-target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the contour accuracy of the cascaded segmentation and fusion is markedly improved.
Example 3:
Dual-target pedestrian segmentation in a dim environment
A camera model is established from the geometric relations: the height of the target in the image plane is h, the real-world height of target A is 175 cm and of target B is 165 cm, the focal length of the camera is 12.45 cm, the actual distance between target A and the camera is 115 cm, and between target B and the camera 105 cm. The two pedestrian targets A and B move towards each other at about 0.5 m/s, both keeping straight-line motion without changing speed. Through instance segmentation it can be observed that the segmentation of the target persons' contours in FIG. 11 is not accurate enough. To raise the accuracy on this basis, the result is sent to the superpixel input channel with the spatial distance weight set to 80, the number of segmentation blocks set to 200 and the initial step size set to 6, and superpixel segmentation is performed. After segmentation, the registered single-target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the contour accuracy of the cascaded segmentation and fusion is markedly improved.