CN112541930A - Image super-pixel target pedestrian segmentation method based on cascade connection
- Publication number
- CN112541930A (Application No. CN201910900391.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- segmentation
- target
- super
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20192—Edge enhancement; Edge preservation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Abstract
The invention belongs to the technical field of image segmentation, and particularly relates to a cascade-based image superpixel target pedestrian segmentation method comprising the following steps: S1, performing first-stage coarse-grained segmentation with Mask R-CNN, and extracting a target region and a segmentation result in combination with an energy filtering criterion; S2, sending each single target region to a superpixel segmentation channel and outputting a labeled superpixel result; S3, calculating the feature-similarity differences between adjacent superpixel blocks in the superpixel segmentation image, merging similar blocks, and aggregating them into a more accurate target object contour; S4, fusing the target result extracted by the first-stage coarse-grained segmentation with the target object contour extracted by the second-stage fine-grained segmentation under an energy high/low-frequency fusion rule, and reconstructing the final fused image.
Description
Technical Field
The invention belongs to the technical field of image segmentation, and particularly relates to a cascade-based image superpixel target pedestrian segmentation method.
Background
Target pedestrian segmentation is an important research subject in the field of machine vision. With the development of autonomous vehicles and driver-assistance systems, how to protect the safety of pedestrians and vehicles has become a current research hotspot. Image segmentation is a preprocessing stage of image recognition and computer vision systems: effectively segmenting an image provides target features for recognition and tracking, and avoids the interference of redundant information with subsequent image processing.
At present, object segmentation based on color features can be divided into two categories, color images and grayscale images, and image segmentation techniques can be divided by generation principle into two categories, region-based and feature-based. For different fields, adopting a suitable image segmentation method can effectively improve segmentation precision. The current mainstream segmentation methods fall mainly into region methods, threshold methods, edge methods and interactive segmentation methods. The patent with application number CN201910020730, entitled "An image segmentation method and apparatus", discloses obtaining images to be segmented from a first and a second image, inputting them into a pre-trained image segmentation model, and computing correlation parameters of the segmented images by extracting their feature vectors so as to predict the segmentation. The patent with application number CN201811472018, entitled "Chest segmentation and processing method, system and electronic device", discloses training an image data set with a deep learning algorithm to obtain a deep-learning-based image segmentation model, processing the image to be segmented with this model to obtain the segmented lung region and thoracic region, and finally calculating the lung-chest ratio from the two regions. That method focuses on calculating the lung-chest ratio, but its segmentation results are still not precise enough; if the precision could be improved, the calculated lung-chest ratio would clearly be more accurate. The patent with application number CN201810860392, entitled "A target detection network design method fusing image segmentation features", discloses combining the general target detection framework Mask-RCNN with image segmentation features to achieve small-target image segmentation. When applied to small targets, its segmentation precision is relatively high, but for large targets it is still not accurate enough. The patent with application number CN201810860454, entitled "A pedestrian detection method based on Mask-RCNN", addresses the case where in-vehicle targets are falsely detected as pedestrians; it combines Mask-RCNN's ability to perform target detection and target segmentation simultaneously and proposes an optimization algorithm combined with the target segmentation result.
Disclosure of Invention
In order to improve the accuracy of target pedestrian image segmentation, the invention provides a cascade-based image superpixel target pedestrian segmentation method which, by establishing a target pedestrian segmentation model, provides more accurate preprocessing information for the subsequent work of a computer vision system.
The invention realizes the above aim by the following technical scheme: a cascade-based image superpixel target pedestrian segmentation method comprises the following steps:
step 1, sending a source image to an instance segmentation channel, outputting an instance segmentation image, splitting the instance segmentation image, and extracting a single target region and a segmentation result:
(M,R,S0)=MASKRCNN(I0) (1)
wherein MASKRCNN is the instance segmentation function, I0 is the input source image, R is the single target region split and extracted from the instance segmentation image, S0 is the segmentation result split and extracted from the instance segmentation image, and M is the instance segmentation image;
wherein:
the instance segmentation image M is the unprocessed image obtained after instance segmentation of the source image;
the single target region R is a single-target region image obtained by splitting and extracting the instance segmentation image M, and its extent is larger than that of the target detection frame; formula (2) gives the calculation of the number of single target regions R:
B=A±X(A∈(0,N+),B∈(0,N+),X∈(0,N+)) (2)
the source image contains A target objects and yields B single target regions R after instance segmentation, where X represents the number of falsely detected targets of the instance segmentation;
the segmentation result S0 is the contour image obtained by splitting and extracting the instance segmentation image M, and formula (3) gives the calculation of the number of segmentation results S0:
N=J±X(N∈(0,N+),J∈(0,N+),X∈(0,N+)) (3)
the instance segmentation image M contains J target objects and yields N segmentation results S0 after splitting and extraction, where X represents the number of falsely detected targets of the instance segmentation;
step 2, sending the single target region R to a superpixel segmentation channel, and outputting a labeled superpixel segmentation image;
QGK=SLIC(R) (4)
wherein SLIC is the superpixel segmentation function, R is the single target region extracted from the instance segmentation image, and QGK is the superpixel segmentation image containing K labeled superpixels;
step 3, merging adjacent superpixel blocks with similar features in the superpixel segmentation image QGK, replacing the K superpixel blocks in the superpixel segmentation image with N superpixel coloring information blocks, and finally reconstructing a more accurate target object contour;
PN=Cslic(QGK) (5)
wherein Cslic is the SLIC merging function, QGK is the superpixel segmentation image containing K labeled superpixels, and PN is the reconstructed target object contour;
step 4, fusing the segmentation result S0 and the target object contour PN, and reconstructing the cascaded segmentation fused image Ei.
Ei=NSST(PN,S0) (6)
wherein NSST is the non-subsampled shearlet transform multi-scale analysis function, PN is the target object contour, S0 is the segmentation result split and extracted from the instance segmentation image, and Ei is the reconstructed final fused image;
the image fusion uses an energy-filtering high/low-frequency fusion rule: the registered segmentation result S0 and the target object contour PN are pre-fused under the energy-filtering high/low-frequency fusion rule. In the low-frequency information fusion, the low-frequency coefficients are fused with a fusion rule based on an image-guided filter, yielding the low-frequency fusion coefficient. In the high-frequency information fusion, guided by the superpixel image QGK, the coefficients carrying the same label are gathered into super-coefficient blocks, and the spatial frequency of each super-coefficient block is computed to obtain the high-frequency fusion coefficient. Finally, the inverse NSST is applied to the high-frequency and low-frequency fusion coefficients to reconstruct the final fused image Ei.
Further, the super-pixel block feature merging step is as follows:
1) Set and order the superpixel blocks, and calculate the color and spatial-distance feature differences of adjacent superpixel blocks in the image with the following formulas:
DLAB(Ri) = sqrt((li − lj)² + (ai − aj)² + (bi − bj)²), DXY(Ri) = sqrt((xi − xj)² + (yi − yj)²), D(Ri) = DLAB(Ri) + δ·DXY(Ri) (9)
In formula (9), the LAB vectors adopt the CIELAB color space model; DLAB(Ri) is the color-space distance between superpixel blocks, R' denotes the non-target region, li and lj are the pixel luminance components, ai, aj, bi, bj are the color components, DXY(Ri) is the positional spatial distance, xi, xj, yi, yj are the spatial coordinate values of the pixels, D(Ri) is the superpixel distance, and δ is the distance weight coefficient with δ ∈ (0, 1);
2) compare the calculation result with a preset threshold; if the feature difference is smaller than the threshold, merge the target superpixel block with the adjacent superpixel block; if it is larger than the threshold, ignore the target superpixel block and continue the feature check with the next superpixel block;
determining the correlation degree of the super-pixel area according to the super-pixel distance, wherein the calculation formula is as follows:
C(Ri)=1-exp(-D(Ri)) (10)
in formula (10), C(Ri) represents the superpixel region correlation and D(Ri) is the superpixel distance, which is inversely related to the region correlation; whether superpixel blocks conform to the feature information of the same target is determined according to this correlation;
according to the computed region correlations of all superpixels, a region correlation threshold T is calculated with the maximum between-class variance (Otsu) method, and all superpixel blocks meeting the correlation threshold are extracted as target superpixels, with the calculation formula:
R* = { Ri | C(Ri) ≥ ε·T } (11)
in formula (11), R* represents the finally acquired set of target superpixels, Ri is the target superpixel at i, C(Ri) represents the superpixel region correlation, T is the region correlation threshold, and ε is the correlation threshold coefficient, ε = 0.5; when ε = 0.5, the feature information is well divided into different pixel sets, each obtained subset forms a region corresponding to the real scene, the interior of each region has consistent attributes, and adjacent regions do not share them;
3) iterate the above steps until all superpixel blocks in the image have completed one round of feature comparison, at which point the first merging result image is generated;
4) before the second merging, refresh the feature information and re-sort the superpixel blocks; then take the first merging result as the object of the merging operation, until the superpixel blocks in the first merging result complete feature comparison and the second merging result image is generated.
Through the above technical scheme, the invention has the following beneficial effects: existing image segmentation methods basically segment the source image directly, the extracted target feature results are not accurate enough, and in particular the edge contour of the segmented target features is not ideal. The invention performs cascaded segmentation of the source image with a cascade-based superpixel segmentation method and finally applies the energy-filtering high/low-frequency fusion rule to achieve a sparse representation of the image in every direction and at every scale, overcoming the pseudo-Gibbs effect, ultimately improving the segmentation precision of the image preprocessing and providing a useful segmentation basis for subsequent recognition and tracking.
Drawings
FIG. 1 is a block diagram of cascaded superpixel splitting logic;
FIG. 2 is a simplified model diagram of imaging by a vehicle-mounted camera;
FIG. 3 is a vehicle-mounted perspective single-pedestrian source image;
FIG. 4 is a vehicle-mounted perspective single-pedestrian instance segmentation image;
FIG. 5 is a vehicle-mounted perspective single-pedestrian superpixel segmentation image;
FIG. 6 is a schematic diagram of a vehicle-mounted perspective single-target pedestrian fusion marker;
FIG. 7 is a vehicle-mounted perspective dual-target pedestrian source image;
FIG. 8 is a vehicle-mounted perspective dual-target pedestrian instance segmentation image;
FIG. 9 is a schematic diagram of a vehicle-mounted perspective dual-target pedestrian fusion marker;
FIG. 10 is a dim-environment dual-target pedestrian source image;
FIG. 11 is a dim-environment dual-target pedestrian instance segmentation image;
FIG. 12 is a schematic diagram of a dim-environment dual-target pedestrian fusion marker.
Detailed Description
The invention is further described below with reference to the accompanying drawings and the specific implementation steps:
a logic block diagram of a cascade-based image super-pixel target pedestrian segmentation method is shown in FIG. 1, and the method comprises the following specific implementation steps:
Step 1: send the source image to the instance segmentation channel, output the instance segmentation image, and split and extract a single target region R and a segmentation result S0 on the basis of the instance segmentation image;
Step 2: send the single target region R to the superpixel segmentation channel, and output the labeled superpixel segmentation image QGK;
Step 3: merge adjacent superpixel blocks with similar features in the superpixel segmentation image QGK and reconstruct the more accurate target object contour PN;
Step 4: fuse the segmentation result S0 and the target object contour PN, and reconstruct the final cascaded segmentation fused image Ei.
The specific scheme is as follows:
the invention is different from the existing target pedestrian segmentation algorithm, provides a cascading-type super-pixel-based target pedestrian segmentation method, and provides more accurate preprocessing information for the follow-up work of a computer vision system by establishing a target pedestrian segmentation model.
The invention realizes the above aim by the following technical scheme:
Step 1, send a source image to the instance segmentation channel, output an instance segmentation image, and split and extract a single target region and a segmentation result on the basis of the instance segmentation image.
(M,R,S0)=MASKRCNN(I0) (1)
wherein MASKRCNN is the instance segmentation function, I0 is the input source image (whose length and width are multiples of 2^6), R is the single target region extracted from the instance segmentation image, S0 is the segmentation result split and extracted in the instance segmentation, and M is the instance segmentation image.
Definition 1: the instance segmentation image M is the unprocessed image obtained after instance segmentation of the source image.
Definition 2: the single target region R is an image obtained by splitting and extracting the instance segmentation image M, and its extent is necessarily larger than that of the target detection frame. Formula (2) gives the calculation of the number of single target regions R:
B=A±X(A∈(0,N+),B∈(0,N+),X∈(0,N+)) (2)
the source image contains A target objects and yields B single target regions R after instance segmentation, where X represents the number of falsely detected targets of the instance segmentation.
Definition 3: the segmentation result S0 is the contour image obtained by splitting and extracting the instance segmentation image M, and formula (3) gives the calculation of the number of segmentation results S0:
N=J±X(N∈(0,N+),J∈(0,N+),X∈(0,N+)) (3)
here, the example segmented image M contains J target objects, and after the splitting extraction, N segmentation results S0In the formula, X represents the number of detection errors of the example division target.
Step 2, send the single target region R to the superpixel segmentation channel, and output a labeled superpixel segmentation image.
QGK=SLIC(R) (4)
wherein SLIC is the superpixel segmentation function, R is the single target region extracted from the instance segmentation image, and QGK is the superpixel segmentation image containing K labeled superpixels.
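A minimal sketch of this step, assuming scikit-image's SLIC implementation stands in for the superpixel segmentation function; the default segment count here is a placeholder (the embodiments below use 150-225 blocks):

```python
# Sketch of step 2: labeled superpixel segmentation of a single target region R.
# scikit-image's SLIC is assumed; parameter values are placeholders.
import numpy as np
from skimage.segmentation import slic

def superpixel_segment(region_rgb: np.ndarray, n_segments: int = 225,
                       compactness: float = 10.0) -> np.ndarray:
    """region_rgb: (H, W, 3) float image in [0, 1].
    Returns Q_GK: an (H, W) integer map with K superpixel labels."""
    return slic(region_rgb, n_segments=n_segments,
                compactness=compactness, start_label=1)
```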
Step 3, merge adjacent superpixel blocks with similar features in the superpixel segmentation image QGK, so that the K superpixel blocks in the superpixel segmentation image are replaced by N superpixel coloring information blocks, and finally reconstruct a more accurate target object contour.
PN=Cslic(QGK) (5)
wherein Cslic is the SLIC merging function, QGK is the superpixel segmentation image containing K labeled superpixels, and PN is the reconstructed target object contour.
Step 4, fuse the segmentation result S0 and the target object contour PN, and reconstruct the final cascaded segmentation fused image Ei.
Ei=NSST(PN,S0) (6)
wherein NSST is the non-subsampled shearlet transform multi-scale analysis function, PN is the target object contour, S0 is the segmentation result split and extracted in the instance segmentation, and Ei is the reconstructed final fused image.
Energy-filtering high/low-frequency fusion rule: the registered segmentation result S0 and the target object contour PN are pre-fused under the energy-filtering high/low-frequency fusion rule. In the low-frequency information fusion, the low-frequency coefficients are fused with a fusion rule based on an image-guided filter, yielding the low-frequency fusion coefficient. In the high-frequency information fusion, guided by the superpixel image QGK, the coefficients carrying the same label are gathered into super-coefficient blocks, and the spatial frequency of each super-coefficient block is computed to obtain the high-frequency fusion coefficient. Finally, the inverse NSST is applied to the high-frequency and low-frequency fusion coefficients to reconstruct the final fused image Ei.
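No standard NSST package is readily available in Python, so the sketch below substitutes a one-level stationary wavelet transform from PyWavelets for the NSST decomposition and approximates the guided-filter low-frequency rule with local-energy weighting; both substitutions, and every parameter, are assumptions made only to illustrate the high/low-frequency split and the per-superpixel spatial-frequency selection.

```python
# Sketch of step 4's fusion: SWT stands in for NSST, local-energy weighting
# stands in for the image-guided-filter low-frequency rule.
import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def block_spatial_frequency(c: np.ndarray, m: np.ndarray) -> float:
    """Spatial frequency sqrt(RF^2 + CF^2) of coefficients c inside mask m."""
    dr = np.diff(c, axis=1)[m[:, 1:] & m[:, :-1]]   # row-direction differences
    dc = np.diff(c, axis=0)[m[1:, :] & m[:-1, :]]   # column-direction differences
    rf = np.sqrt(np.mean(dr ** 2)) if dr.size else 0.0
    cf = np.sqrt(np.mean(dc ** 2)) if dc.size else 0.0
    return float(np.hypot(rf, cf))

def fuse(pn: np.ndarray, s0: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """pn, s0: float (H, W) images with even H, W; labels: Q_GK label map."""
    (a1, (h1, v1, d1)), = pywt.swt2(pn, "db2", level=1)
    (a2, (h2, v2, d2)), = pywt.swt2(s0, "db2", level=1)

    # Low frequency: local-energy weighted average of the approximation bands.
    e1 = uniform_filter(a1 ** 2, size=7)
    e2 = uniform_filter(a2 ** 2, size=7)
    w = e1 / (e1 + e2 + 1e-12)
    a = w * a1 + (1.0 - w) * a2

    # High frequency: per-superpixel super-coefficient blocks; pick the source
    # whose block has the larger spatial frequency.
    fused = []
    for c1, c2 in ((h1, h2), (v1, v2), (d1, d2)):
        c = c2.copy()
        for lab in np.unique(labels):
            m = labels == lab
            if block_spatial_frequency(c1, m) >= block_spatial_frequency(c2, m):
                c[m] = c1[m]
        fused.append(c)

    return pywt.iswt2([(a, tuple(fused))], "db2")   # reconstruct E_i
```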
The super pixel block feature merging step is as follows:
1) Set and order the superpixel blocks, and calculate the color and spatial-distance feature differences of adjacent superpixel blocks in the image with the following formulas:
DLAB(Ri) = sqrt((li − lj)² + (ai − aj)² + (bi − bj)²), DXY(Ri) = sqrt((xi − xj)² + (yi − yj)²), D(Ri) = DLAB(Ri) + δ·DXY(Ri) (9)
In formula (9), the LAB vectors adopt the CIELAB color space model; DLAB(Ri) is the color-space distance between superpixel blocks, R' denotes the non-target region, li and lj are the pixel luminance components, ai, aj, bi, bj are the color components, DXY(Ri) is the positional spatial distance, xi, xj, yi, yj are the spatial coordinate values of the pixels, D(Ri) is the superpixel distance, and δ is the distance weight coefficient with δ ∈ (0, 1);
2) compare the calculation result with a preset threshold; if the feature difference is smaller than the threshold, merge the target superpixel block with the adjacent superpixel block; if it is larger than the threshold, ignore the target superpixel block and continue the feature check with the next superpixel block;
determining the correlation degree of the super-pixel area according to the super-pixel distance, wherein the calculation formula is as follows:
C(Ri)=1-exp(-D(Ri)) (10)
in formula (10), C(Ri) represents the superpixel region correlation and D(Ri) is the superpixel distance, which is inversely related to the region correlation; whether superpixel blocks conform to the feature information of the same target is determined according to this correlation;
according to the computed region correlations of all superpixels, a region correlation threshold T is calculated with the maximum between-class variance (Otsu) method, and all superpixel blocks meeting the correlation threshold are extracted as target superpixels, with the calculation formula:
R* = { Ri | C(Ri) ≥ ε·T } (11)
in formula (11), R* represents the finally acquired set of target superpixels, Ri is the target superpixel at i, C(Ri) represents the superpixel region correlation, T is the region correlation threshold, and ε is the correlation threshold coefficient, ε = 0.5; when ε = 0.5, the feature information is well divided into different pixel sets, each obtained subset forms a region corresponding to the real scene, the interior of each region has consistent attributes, and adjacent regions do not share them;
3) iterate the above steps until all superpixel blocks in the image have completed one round of feature comparison, at which point the first merging result image is generated;
4) before the second merging, refresh the feature information and re-sort the superpixel blocks; then take the first merging result as the object of the merging operation, until the superpixel blocks in the first merging result complete feature comparison and the second merging result image is generated.
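The merging criterion above can be sketched as follows, assuming per-superpixel mean CIELAB and mean-coordinate features and a linear combined distance D = D_LAB + δ·D_XY; the feature aggregation, the adjacency handling and the use of ε·T as the acceptance threshold are illustrative assumptions layered on the variable definitions, not the invention's exact procedure.

```python
# Sketch of step 3's merging criterion: CIELAB color distance, spatial
# distance, combined distance D, correlation C = 1 - exp(-D) (formula (10)),
# and an Otsu threshold over the correlations (maximum between-class variance).
import numpy as np
from skimage.color import rgb2lab
from skimage.filters import threshold_otsu

def region_features(region_rgb: np.ndarray, labels: np.ndarray) -> dict:
    """Mean (l, a, b) and mean (x, y) per superpixel label (an assumption)."""
    lab = rgb2lab(region_rgb)                     # expects float RGB in [0, 1]
    ys, xs = np.mgrid[0:labels.shape[0], 0:labels.shape[1]]
    return {k: (lab[labels == k].mean(axis=0),
                np.array([xs[labels == k].mean(), ys[labels == k].mean()]))
            for k in np.unique(labels)}

def pair_correlation(feats: dict, i: int, j: int, delta: float = 0.5) -> float:
    """C(R_i) for an adjacent pair, with D = D_LAB + delta * D_XY assumed."""
    d_lab = np.linalg.norm(feats[i][0] - feats[j][0])   # D_LAB
    d_xy = np.linalg.norm(feats[i][1] - feats[j][1])    # D_XY
    return 1.0 - np.exp(-(d_lab + delta * d_xy))        # formula (10)

def target_superpixels(correlations: dict, eps: float = 0.5) -> set:
    """Keep pairs whose correlation meets eps * T, T from Otsu's method."""
    t = threshold_otsu(np.asarray(list(correlations.values())))
    return {pair for pair, c in correlations.items() if c >= eps * t}
```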
Existing image segmentation methods basically segment the source image directly, the extracted target feature results are not accurate enough, and in particular the edge contour of the segmented target features is not ideal. The invention performs cascaded segmentation of the source image with a cascade-based superpixel segmentation method and finally applies the energy-filtering high/low-frequency fusion rule to achieve a sparse representation of the image in every direction and at every scale, overcoming the pseudo-Gibbs effect, ultimately improving the segmentation precision of the image preprocessing and providing a useful segmentation basis for subsequent recognition and tracking. That is to say, using Mask-RCNN alone suffers from small-target false detection, missed detection, inaccurate segmentation of overlapping parts and similar problems, so the invention segments the image by establishing a cascaded superpixel segmentation system. First, instance segmentation is performed with Mask-RCNN, and the single target region R and segmentation result S0 are split and extracted after segmentation; then superpixel segmentation is applied to the single target region R to obtain the superpixel segmentation image QGK; finally, corresponding fusion rules are formulated to fuse the superpixel segmentation image QGK with the segmentation result S0 and reconstruct the final fused image Ei. By performing superpixel single-target segmentation on the result of the Mask-RCNN instance segmentation, the segmentation precision can be improved, providing more accurate preprocessing information for the subsequent work of a computer vision system.
Example 1:
Vehicle-mounted perspective single-target pedestrian segmentation scenario
A vehicle-mounted camera model is established using the geometric relation: the height of the target in the image plane is h, the height of the target in the real world is 168 cm, the camera focal length is 12.25 cm, and the actual distance between the target and the camera is 145 cm; the pedestrian target moves at about 1.5 m/s in the video and keeps moving in a straight line at constant speed. Through instance segmentation, it can be observed that the contour segmentation of the target person in FIG. 4 is not accurate enough. To improve the precision on this basis, the region is sent to the superpixel input channel, the spatial distance weight is set to 65, the number of segmentation blocks to 225 and the initial step size to 5, and superpixel segmentation is performed. After segmentation, the registered single target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the image contour accuracy of the cascaded segmentation-fusion is clearly improved.
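If the geometric relation is taken to be the standard pinhole projection (an assumption; the embodiment does not spell the model out), the stated numbers give an image-plane target height of h = f·H/d = (12.25 cm × 168 cm) / 145 cm ≈ 14.2 cm by similar triangles, and the corresponding figures for Examples 2 and 3 follow the same way.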
Example 2:
Vehicle-mounted perspective dual-target pedestrian segmentation scenario
A vehicle-mounted camera model is established using the geometric relation: the height of a target in the image plane is h, the real-world height of target A is 168 cm and of target B is 165 cm, the camera focal length is 11.25 cm, the actual distance between target A and the camera is 120 cm, and between target B and the camera 195 cm. The video contains the two pedestrian targets, moving toward each other at about 1.4 m/s, both keeping straight-line motion at constant speed. Through instance segmentation, it can be observed that the contour segmentation of the target persons in FIG. 8 is not accurate enough. To improve the precision on this basis, the regions are sent to the superpixel input channel, the spatial distance weight is set to 75, the number of segmentation blocks to 150 and the initial step size to 5, and superpixel segmentation is performed. After segmentation, the registered single target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the image contour accuracy of the cascaded segmentation-fusion is clearly improved.
Example 3:
Dim-environment dual-target pedestrian segmentation scenario
A camera model is established using the geometric relation: the height of the target in the image plane is h, the real-world height of target A is 175 cm and of target B is 165 cm, the camera focal length is 12.45 cm, the actual distance between target A and the camera is 115 cm, and between target B and the camera 105 cm. The two pedestrian targets A and B move toward each other at about 0.5 m/s, both keeping straight-line motion at constant speed. Through instance segmentation, it can be observed that the contour segmentation of the target persons in FIG. 11 is not accurate enough. On this basis the precision is improved: the regions are sent to the superpixel input channel, the spatial distance weight is set to 80, the number of segmentation blocks to 200 and the initial step size to 6, and superpixel segmentation is performed. After segmentation, the registered single target region R and segmentation result S0 are fused under the energy-filtering high/low-frequency fusion rule to reconstruct the final fused image Ei; the image contour accuracy of the cascaded segmentation-fusion is clearly improved.
Claims (2)
1. A cascade-based image superpixel target pedestrian segmentation method, characterized by comprising the following steps:
step 1, sending a source image to an instance segmentation channel, outputting an instance segmentation image, splitting the instance segmentation image, and extracting a single target region and a segmentation result:
(M,R,S0)=MASKRCNN(I0) (1)
wherein MASKRCNN is the instance segmentation function, I0 is the input source image, R is the single target region split and extracted from the instance segmentation image, S0 is the segmentation result split and extracted from the instance segmentation image, and M is the instance segmentation image;
wherein:
the instance segmentation image M is the unprocessed image obtained after instance segmentation of the source image;
the single target region R is a single-target region image obtained by splitting and extracting the instance segmentation image M, and its extent is larger than that of the target detection frame; formula (2) gives the calculation of the number of single target regions R:
B=A±X(A∈(0,N+),B∈(0,N+),X∈(0,N+)) (2)
the source image contains A target objects and yields B single target regions R after instance segmentation, where X represents the number of falsely detected targets of the instance segmentation;
the segmentation result S0 is the contour image obtained by splitting and extracting the instance segmentation image M, and formula (3) gives the calculation of the number of segmentation results S0:
N=J±X(N∈(0,N+),J∈(0,N+),X∈(0,N+)) (3)
the instance segmentation image M contains J target objects and yields N segmentation results S0 after splitting and extraction, where X represents the number of falsely detected targets of the instance segmentation;
step 2, sending the single target region R to a superpixel segmentation channel, and outputting a labeled superpixel segmentation image;
QGK=SLIC(R) (4)
wherein SLIC is the superpixel segmentation function, R is the single target region extracted from the instance segmentation image, and QGK is the superpixel segmentation image containing K labeled superpixels;
step 3, merging adjacent superpixel blocks with similar features in the superpixel segmentation image QGK, replacing the K superpixel blocks in the superpixel segmentation image with N superpixel coloring information blocks, and finally reconstructing a more accurate target object contour;
PN=Cslic(QGK) (5)
wherein Cslic is the SLIC merging function, QGK is the superpixel segmentation image containing K labeled superpixels, and PN is the reconstructed target object contour;
step 4, fusing the segmentation result S0 and the target object contour PN, and reconstructing the cascaded segmentation fused image Ei.
Ei=NSST(PN,S0) (6)
wherein NSST is the non-subsampled shearlet transform multi-scale analysis function, PN is the target object contour, S0 is the segmentation result split and extracted from the instance segmentation image, and Ei is the reconstructed final fused image;
the image fusion uses an energy-filtering high/low-frequency fusion rule: the registered segmentation result S0 and the target object contour PN are pre-fused under the energy-filtering high/low-frequency fusion rule. In the low-frequency information fusion, the low-frequency coefficients are fused with a fusion rule based on an image-guided filter, yielding the low-frequency fusion coefficient. In the high-frequency information fusion, guided by the superpixel image QGK, the coefficients carrying the same label are gathered into super-coefficient blocks, and the spatial frequency of each super-coefficient block is computed to obtain the high-frequency fusion coefficient. Finally, the inverse NSST is applied to the high-frequency and low-frequency fusion coefficients to reconstruct the final fused image Ei.
2. The cascade-based image superpixel target pedestrian segmentation method as claimed in claim 1, characterized in that the superpixel block feature merging step is as follows:
1) Set and order the superpixel blocks, and calculate the color and spatial-distance feature differences of adjacent superpixel blocks in the image with the following formulas:
DLAB(Ri) = sqrt((li − lj)² + (ai − aj)² + (bi − bj)²), DXY(Ri) = sqrt((xi − xj)² + (yi − yj)²), D(Ri) = DLAB(Ri) + δ·DXY(Ri) (9)
In formula (9), the LAB vectors adopt the CIELAB color space model; DLAB(Ri) is the color-space distance between superpixel blocks, R' denotes the non-target region, li and lj are the pixel luminance components, ai, aj, bi, bj are the color components, DXY(Ri) is the positional spatial distance, xi, xj, yi, yj are the spatial coordinate values of the pixels, D(Ri) is the superpixel distance, and δ is the distance weight coefficient with δ ∈ (0, 1);
2) compare the calculation result with a preset threshold; if the feature difference is smaller than the threshold, merge the target superpixel block with the adjacent superpixel block; if it is larger than the threshold, ignore the target superpixel block and continue the feature check with the next superpixel block;
determining the correlation degree of the super-pixel area according to the super-pixel distance, wherein the calculation formula is as follows:
C(Ri)=1-exp(-D(Ri)) (10)
in formula (10), C(Ri) represents the superpixel region correlation and D(Ri) is the superpixel distance, which is inversely related to the region correlation; whether superpixel blocks conform to the feature information of the same target is determined according to this correlation;
according to the computed region correlations of all superpixels, a region correlation threshold T is calculated with the maximum between-class variance (Otsu) method, and all superpixel blocks meeting the correlation threshold are extracted as target superpixels, with the calculation formula:
R* = { Ri | C(Ri) ≥ ε·T } (11)
in formula (11), R* represents the finally acquired set of target superpixels, Ri is the target superpixel at i, C(Ri) represents the superpixel region correlation, T is the region correlation threshold, and ε is the correlation threshold coefficient, ε = 0.5; when ε = 0.5, the feature information is well divided into different pixel sets, each obtained subset forms a region corresponding to the real scene, the interior of each region has consistent attributes, and adjacent regions do not share them;
3) iterate the above steps until all superpixel blocks in the image have completed one round of feature comparison, at which point the first merging result image is generated;
4) before the second merging, refresh the feature information and re-sort the superpixel blocks; then take the first merging result as the object of the merging operation, until the superpixel blocks in the first merging result complete feature comparison and the second merging result image is generated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910900391.5A CN112541930A (en) | 2019-09-23 | 2019-09-23 | Image super-pixel target pedestrian segmentation method based on cascade connection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910900391.5A CN112541930A (en) | 2019-09-23 | 2019-09-23 | Image super-pixel target pedestrian segmentation method based on cascade connection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112541930A true CN112541930A (en) | 2021-03-23 |
Family
ID=75012996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910900391.5A Pending CN112541930A (en) | 2019-09-23 | 2019-09-23 | Image super-pixel target pedestrian segmentation method based on cascade connection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541930A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298809A (en) * | 2021-06-25 | 2021-08-24 | 成都飞机工业(集团)有限责任公司 | Composite material ultrasonic image defect detection method based on deep learning and superpixel segmentation |
CN113807349A (en) * | 2021-09-06 | 2021-12-17 | 海南大学 | Multi-view target identification method and system based on Internet of things |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150234863A1 (en) * | 2014-02-18 | 2015-08-20 | Environmental Systems Research Institute (ESRI) | Automated feature extraction from imagery |
CN106780505A (en) * | 2016-06-20 | 2017-05-31 | 大连民族大学 | Super-pixel well-marked target detection algorithm based on region energy |
US20180247126A1 (en) * | 2017-02-24 | 2018-08-30 | Beihang University | Method and system for detecting and segmenting primary video objects with neighborhood reversibility |
CN109360163A (en) * | 2018-09-26 | 2019-02-19 | 深圳积木易搭科技技术有限公司 | A kind of fusion method and emerging system of high dynamic range images |
- 2019-09-23: Application CN201910900391.5A filed; patent CN112541930A status: active, Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150234863A1 (en) * | 2014-02-18 | 2015-08-20 | Environmental Systems Research Institute (ESRI) | Automated feature extraction from imagery |
CN106780505A (en) * | 2016-06-20 | 2017-05-31 | 大连民族大学 | Super-pixel well-marked target detection algorithm based on region energy |
US20180247126A1 (en) * | 2017-02-24 | 2018-08-30 | Beihang University | Method and system for detecting and segmenting primary video objects with neighborhood reversibility |
CN109360163A (en) * | 2018-09-26 | 2019-02-19 | 深圳积木易搭科技技术有限公司 | A kind of fusion method and emerging system of high dynamic range images |
Non-Patent Citations (2)
Title |
---|
DAWEI YANG et al.: "A Superpixel Segmentation Algorithm with Region Correlation Saliency Analysis for Video Pedestrian Detection", Proceedings of the 36th Chinese Control Conference |
MA Xue et al.: "Coarse and Fine Grained Superpixel Pedestrian Target Segmentation Algorithm", Journal of Dalian Minzu University |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298809A (en) * | 2021-06-25 | 2021-08-24 | 成都飞机工业(集团)有限责任公司 | Composite material ultrasonic image defect detection method based on deep learning and superpixel segmentation |
CN113298809B (en) * | 2021-06-25 | 2022-04-08 | 成都飞机工业(集团)有限责任公司 | Composite material ultrasonic image defect detection method based on deep learning and superpixel segmentation |
CN113807349A (en) * | 2021-09-06 | 2021-12-17 | 海南大学 | Multi-view target identification method and system based on Internet of things |
CN113807349B (en) * | 2021-09-06 | 2023-06-20 | 海南大学 | Multi-view target identification method and system based on Internet of things |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210323