CN112613427B - Road obstacle detection method based on visual information flow partition projection coding model - Google Patents
Road obstacle detection method based on visual information flow partition projection coding model
- Publication number
- CN112613427B (application CN202011578651.0A)
- Authority
- CN
- China
- Prior art keywords
- response
- map
- contour
- formula
- azimuth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/60—Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/513—Sparse representations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a road obstacle detection method based on a visual information flow partition projection coding model. First, a parallel visual pathway model is constructed in a V1 topological projection contour global perception unit, and the brightness edges and color edges of a road traffic map are extracted in parallel to obtain a topological projection contour map representing the overall features of an obstacle and an optimal azimuth index matrix. Next, a visual information difference enhancement model is constructed in a V4 sparse coding fine feature extraction unit, and the position response difference and the azimuth response difference are used to contrast-enhance the topological projection contour map, yielding a refined contour map. Then, an adaptive-size sparse coding model is proposed that focuses on the obstacle contour features according to the sparseness of the refined contour map, producing a pooling map representing the local features of the obstacle. Finally, the feedback regulation mechanisms of different brain regions of the visual cortex are simulated: the pooling map is fed back across visual areas to correct the topological projection contour map, and the two are fused to obtain the final obstacle contour result.
Description
Technical Field
The invention belongs to the field of machine vision, and particularly relates to a road obstacle detection method based on a visual information flow partition projection coding model.
Background
Road obstacle detection is one of the subtasks of intelligent traffic systems. For vehicles traveling at high speed, obstacles take many forms, such as pedestrians, road traffic signs, falling rocks, and cargo spilled from vehicles. Road obstacle detection is of great importance for the safe driving of vehicles and the efficient management of traffic. As low-dimensional spatial features of an image, contours are an efficient representation of the target subject. Acquiring obstacle contours from a road traffic map improves the speed and accuracy of subsequent tasks such as obstacle classification and recognition.
Contours detected by traditional image processing techniques based on mathematical differential operators often contain a large amount of texture information and cannot distinguish road traffic obstacles from complex backgrounds well. With the surge of research results in biological vision and the rapid development of neural computation, target perception methods based on biomimetic mechanisms have attracted wide attention. Two-dimensional Gabor energy models simulate the orientation sensitivity of the classical receptive field in the primary visual cortex to edge segments; difference-of-Gaussians models simulate the physiological structure of the non-classical receptive field, and isotropic and anisotropic suppression methods have been proposed to effectively suppress texture. In addition, modeling of the color antagonism mechanism in the visual pathway has enabled effective extraction of contour information from color images, and improved sparseness measures have been used on this basis to further suppress texture. These contour detection methods model the processing of information flow in the visual pathway using various biological vision mechanisms, but they simplify the projection of information flow within the pathway and the processing performed by the higher visual cortex: physiological characteristics such as neuronal information coding and feedback regulation are simulated as a black box on top of a single-pathway model. For road obstacle targets in complex backgrounds, which usually lack a structured contour shape, this is unfavorable for effective target extraction.
Disclosure of Invention
The invention provides a road obstacle detection method based on a visual information flow partition projection coding model.
The model provided by the invention is composed of a V1 topological projection contour global perception unit and a V4 sparse coding fine feature extraction unit, which simulate the front-stage characteristics of the V1 layer and the coding characteristics of the V4 layer, respectively. First, a parallel visual pathway model is constructed in the V1 topological projection contour global perception unit: the brightness edges and color edges of the road traffic map are extracted in parallel, and a topological projection contour map E(x, y) representing the overall features of the obstacle and an optimal azimuth index matrix Θ(x, y) are obtained at the V1 layer. Then a visual information difference enhancement model is constructed in the V4 sparse coding fine feature extraction unit; the position response difference and the azimuth response difference are used for texture suppression and contour enhancement of the topological projection contour map E(x, y), respectively, yielding a refined contour map E_t(x, y). Next, an adaptive-size sparse coding model is proposed that focuses on the obstacle contour features according to the sparseness of E_t(x, y), producing at the V4 layer a pooling map E_s(x, y) representing the local features of the obstacle. Finally, the feedback regulation mechanisms of different brain regions of the visual cortex are simulated: the pooling map E_s(x, y) is fed back across visual areas to correct the topological projection contour map E(x, y), and the two are fused to obtain the final obstacle contour result.
Compared with the prior art, the invention has the following effects:
The invention constructs a parallel visual pathway model and simulates the partitioned projection characteristic of information flow to obtain a preliminary perception of the road obstacle contour. Considering the different detail sensitivities of M-type and P-type ganglion cells to the brightness and color information flows, and the partitioned projection of the two information flows in the primary visual pathway, a parallel visual pathway model is constructed at the front stage of the V1 layer: the brightness edges and color edges of the road traffic map are extracted separately and fused into a primary contour response. This model ensures the completeness of obstacle contour perception in the road traffic map.
The invention provides a visual information difference enhancement model that uses the physiological characteristics of visual receptive fields to contrast-enhance the primary contour response. Considering the orientation sensitivity of the classical receptive field, the azimuth response difference is computed from the optimal-azimuth and orthogonal-azimuth information of the primary contour response to enhance contour pixels; the position response difference, computed from the response difference between the classical and non-classical receptive fields, suppresses texture pixels. The model refines the primary contour response by enhancing contrast.
The invention provides a novel sparse coding method that adaptively selects the sparse kernel size. Considering the speed and accuracy with which the visual nerve centers perceive the spatial position of a target subject, a sparse coding model is constructed in which the kernel size is selected adaptively according to the sparseness of the pixel distribution in the refined contour map, removing information redundancy while focusing on the obstacle contour features. Compared with traditional sparse coding with a fixed kernel size, the proposed adaptive-size sparse coding better matches the dynamic focusing of the biological visual system on target contour features.
A road obstacle detection method based on an information flow partition projection coding model is provided. First, the partitioned projection characteristics of different information flows are simulated: a parallel visual pathway model is constructed, the brightness edges and color edges of the road traffic map are extracted separately, and they are fused at the V1 layer into a primary contour response. Then a visual information difference enhancement model is constructed from the physiological characteristics of visual receptive fields, and the primary contour response is refined using the position response difference and the azimuth response difference. Next, the dynamic focusing of the visual nerve centers on target contour features is simulated by an adaptive-size sparse coding model, yielding a pooling map. Finally, the feedback regulation mechanism between visual layers is simulated: the pooling map is fed back across visual areas to correct the primary contour response, and fusion yields the final obstacle contour result.
Drawings
Fig. 1 is a flowchart of a road obstacle detection method according to the present invention.
FIG. 2 is a schematic diagram of the optimal orientation and orthogonal orientation of the receptive field.
The specific embodiments are described as follows:
note that: taking E (x, y) as an example, where (x, y) represents the two-dimensional coordinate position of the pixel in the image E, E (x, y) represents the pixel value at the coordinate position (x, y) in the image E, and the steps will not be described.
A specific embodiment of the present invention will be described with reference to fig. 1 and 2.
Step (1): construct the parallel visual pathway model and extract the topological projection contour map E(x, y) and the optimal azimuth index matrix Θ(x, y). The road traffic map to be detected is decomposed into a brightness component I(x, y) and red, green, and blue color components R(x, y), G(x, y), and B(x, y); each component has m rows and n columns. The parallel visual pathway model consists of a brightness pathway and a color pathway, which extract the brightness edges and color edges of the road traffic map, respectively.
Step 1.1 The mathematical model of the brightness pathway is shown in formula (1):
where σ and γ denote the size and ellipticity of the classical receptive field, with default values 2 and 0.5, respectively; θ_i denotes the selective orientation, defaulting to 8 equally spaced orientations, i.e. θ_i ∈ {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}; * denotes the convolution operation; and e(x, y; θ_i) denotes the brightness-component edge response at pixel coordinate (x, y) for selective orientation θ_i.
For each pixel, the maximum brightness-component edge response over all selective orientations is taken and linearly normalized as output, while the corresponding optimal orientation index is recorded, giving the contour response E_L(x, y) of the brightness pathway and the optimal azimuth index matrix Θ_L(x, y), as shown in formula (2):
where N(·) denotes the linear normalization operation.
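To make the brightness pathway concrete, the following minimal Python sketch is offered. Formula (1) is not reproduced above, so the oriented filter is assumed to be a two-dimensional Gabor kernel (the background section cites a 2-D Gabor energy model); the kernel wavelength and support size in `gabor_kernel` are assumptions, while σ = 2, γ = 0.5, and the 8 equally spaced orientations follow the text.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(theta, sigma=2.0, gamma=0.5, wavelength=None, size=None):
    """Odd-symmetric 2-D Gabor kernel. sigma and gamma follow the patent's
    defaults; the wavelength and support size are assumptions."""
    wavelength = wavelength or 4.0 * sigma
    size = size or (int(6 * sigma) | 1)              # odd support, ~3*sigma per side
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)       # coordinates rotated by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * np.sin(2 * np.pi * xr / wavelength)

def brightness_pathway(I, sigma=2.0, gamma=0.5):
    """Formulas (1)-(2): per-orientation edge responses e(x, y; theta_i), the
    linearly normalized maximum over orientations, and the argmax index."""
    thetas = np.deg2rad(np.arange(8) * 45.0)         # theta_i = 0, 45, ..., 315 degrees
    stack = np.stack([np.abs(convolve(I.astype(float), gabor_kernel(t, sigma, gamma)))
                      for t in thetas])
    Theta_L = stack.argmax(axis=0)                   # optimal azimuth index matrix
    E = stack.max(axis=0)
    E_L = (E - E.min()) / (E.max() - E.min() + 1e-12)  # N(.): linear normalization
    return E_L, Theta_L
```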
Step 1.2 The color pathway is modeled using the color antagonism mechanism and is divided into four types of antagonistic channels: R-on/G-off, G-on/R-off, B-on/Y-off, and Y-on/B-off, where the yellow component Y(x, y) = (R(x, y) + G(x, y))/2. Take the R-on/G-off antagonistic channel as an example. First, the action of cone cells is simulated: the R(x, y) and G(x, y) components are each smoothed with a Gaussian filter (default variance 1). The edge response of the single-antagonistic receptive field is then computed from the smoothed components and denoted S_RG(x, y), as shown in formula (3):
then simulating the action of the double antagonistic receptive fields, calculating the position of pixel coordinates (x, y) and the selective azimuth is theta i Corresponding color component edge response d RG (x,y;θ i ) As shown in formula (4):
for each pixel, selecting the maximum value of the edge response of the color component corresponding to all the selective orientations, linearly normalizing the maximum value to be used as output, and simultaneously recording the corresponding optimal orientation index to obtain the edge response D of the R-on/G-off type antagonism channel RG (x, y) and an optimal azimuth index matrix Θ RG (x, y) as shown in formula (5):
similar to the R-on/G-off type antagonistic channels, the edge response D of the other three types of antagonistic channels is calculated GR 、D BY 、D YB And an optimal azimuth index matrix Θ GR 、Θ BY 、Θ YB Or by calculation. For each pixel coordinate position, taking the maximum value of edge responses in four antagonistic channels as output, and simultaneously recording the corresponding optimal azimuth index to obtain the contour response E of the color channel C (x, y) and an optimal azimuth index matrix Θ C (x, y) as shown in formula (6):
and 1.3, respectively fusing the contour response of the brightness path and the color path and the optimal azimuth index matrix, and simulating the front-stage characteristic of the visual cortex V1 functional layer to obtain a topological projection contour map E (x, y) and the optimal azimuth index matrix theta (x, y). As shown in formula (7):
step (2) constructing a visual information difference enhancement model, and carrying out contrast enhancement on the topological projection profile E (x, y) by utilizing the position response difference and the azimuth response difference to obtain a refined profile E t (x, y). First, for the topology projection profile E (x, y) obtained in step (1), a Gaussian function G (x, y; σ) and a Gaussian difference function DoG are used, respectively + (x, y; sigma) convolving with E (x, y) to obtain classical receptive field visual input L C (x, y; sigma) and non-classical receptive field visual input L N (x, y; sigma) as shown in formula (8):
where the kernels G(x, y; σ) and DoG⁺(x, y; σ) are defined in formula (9).
the default value of sigma in the formula (8) is the same as that in the formula (1), and L in the formula C (x, y; sigma) and L N (x, y; sigma) performing a difference operation to obtain a position response difference DeltaL (x, y; sigma) as shown in formula (10):
ΔL(x, y; σ) = max{L_C(x, y; σ) − L_N(x, y; σ), 0}   (10)
then, an orthogonal azimuth index matrix Θ of the topology projection profile E (x, y) is calculated + (x, y) as shown in formula (11):
Θ⁺(x, y) = (Θ(x, y) + 2) mod 8   (11)
where mod denotes the remainder operation. Θ(x, y), obtained in step (1), stores the optimal azimuth at pixel coordinate (x, y) as an index with value range {0, 1, 2, 3, 4, 5, 6, 7}: {0, 4} denote the horizontal optimal azimuth, {2, 6} the vertical optimal azimuth, {1, 5} the forward-diagonal optimal azimuth, and {3, 7} the backward-diagonal optimal azimuth. Θ⁺(x, y) denotes the orthogonal azimuth complementary to the optimal azimuth Θ(x, y): the horizontal azimuth is complementary to the vertical azimuth, and the left-diagonal azimuth is complementary to the right-diagonal azimuth. The relationship between the optimal azimuth and the orthogonal azimuth is shown in fig. 2.
For each pixel coordinate (x, y) in the topological projection contour map E, the sum of the two neighboring pixel values along a given azimuth is used as the response information. The optimal azimuth response O_Θ(x, y) and the orthogonal azimuth response O_Θ⁺(x, y) are computed, and their difference gives the azimuth response difference ΔO(x, y), as shown in formula (12):
finally, contrast enhancement is carried out on E (x, y) to obtain a refined contour map E t (x, y) as shown in formula (13):
E_t(x, y) = E(x, y) + E_enha(x, y) − E_inhi(x, y)   (13)
where E_enha(x, y) denotes the result of linearly normalizing the azimuth response difference ΔO(x, y), used to enhance contour pixels, and E_inhi(x, y) denotes the result of exponentially normalizing the position response difference ΔL(x, y; σ), used to suppress background pixels.
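Continuing the sketch, the refinement step might look as follows. The DoG⁺ surround scale, the rectification in formula (12), the neighbour-offset table for each azimuth index, and the exact exponential normalization are all assumptions where the formulas above are not reproduced.

```python
# (dy, dx) neighbour offsets per azimuth index 0..7 (assumed mapping, cf. fig. 2):
# {0, 4} horizontal, {2, 6} vertical, {1, 5} and {3, 7} the two diagonals.
OFFSETS = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1),
           4: (0, -1), 5: (1, -1), 6: (1, 0), 7: (1, 1)}

def azimuth_response(E, Theta_idx):
    """Sum of the two neighbours of each pixel along its azimuth index."""
    H, W = E.shape
    Ep = np.pad(E, 1, mode='edge')
    out = np.zeros_like(E)
    for i, (dy, dx) in OFFSETS.items():
        m = Theta_idx == i
        out[m] = (Ep[1 + dy:H + 1 + dy, 1 + dx:W + 1 + dx][m]
                  + Ep[1 - dy:H + 1 - dy, 1 - dx:W + 1 - dx][m])
    return out

def refine(E, Theta, sigma=2.0):
    """Formulas (8)-(13): position response difference for texture suppression,
    azimuth response difference for contour enhancement."""
    L_C = gaussian_filter(E, sigma)                           # classical RF input
    L_N = np.maximum(L_C - gaussian_filter(E, 4 * sigma), 0)  # DoG+ surround; 4*sigma assumed
    dL = np.maximum(L_C - L_N, 0.0)                           # formula (10)
    O_opt = azimuth_response(E, Theta)
    O_orth = azimuth_response(E, (Theta + 2) % 8)             # formula (11)
    dO = np.maximum(O_opt - O_orth, 0.0)                      # formula (12); rectification assumed
    E_enha = (dO - dO.min()) / (dO.max() - dO.min() + 1e-12)  # linear normalization
    E_inhi = 1.0 - np.exp(-dL / (dL.mean() + 1e-12))          # assumed exponential normalization
    return np.clip(E + E_enha - E_inhi, 0.0, None)            # formula (13)
```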
Step (3): construct the adaptive-size sparse coding model and sparsely encode the refined contour map E_t(x, y) to obtain the pooling map E_s(x, y). For the refined contour map E_t(x, y) obtained in step (2), an adaptive threshold thresh is computed by the maximum between-class variance (Otsu) method; the map is binarized and the proportion of contour pixels is counted, as shown in formula (14):
where the count(·) operation counts the number of pixels with value 1 in the binarized refined contour map.
Then the sparse kernel size of the sparse coding model is computed. To simplify calculation, the sparse kernel window is chosen to be square, and the window size is selected adaptively according to the contour pixel proportion, as shown in formula (15):
where w_1, w_2, w_3 denote sparse kernel window sizes of different scales; the invention sets w_1 = 3, w_2 = 5, w_3 = 7. thresh1 and thresh2 are threshold parameters that measure the sparseness of the refined contour map E_t, with default values thresh1 = 0.2 and thresh2 = 0.1. The sparse kernel window size w_s is selected adaptively according to the relative magnitudes of the contour pixel proportion in E_t and the threshold parameters. Before sparse coding, the boundary pixels of the refined contour map E_t are filled by mirror symmetry. For each pixel of E_t, the average of all pixels in the window centered on it is computed as the sparse coding output of that pixel, with the window size used as the window moving step, realizing a spatially sparse expression of the visual information, as shown in formula (16):
where w and h are the lateral and longitudinal offsets, respectively, and w_s is the sparse kernel window size and also the window moving step; floor(·) denotes the rounding-down function, and E_s(x, y) is the pooling map after sparsification.
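A sketch of the adaptive-size pooling follows. skimage's `threshold_otsu` stands in for the maximum between-class variance threshold (they are the same method); which side of thresh1/thresh2 the contour-pixel proportion must fall on to select each window size is an assumption, since formula (15) is not reproduced.

```python
def adaptive_pool(E_t, thresh1=0.2, thresh2=0.1, sizes=(3, 5, 7)):
    """Formulas (14)-(16): Otsu binarization -> contour-pixel proportion ->
    adaptive kernel size w_s -> non-overlapping w_s x w_s block means
    (window moving step equal to the window size)."""
    from skimage.filters import threshold_otsu       # maximum between-class variance
    prop = (E_t > threshold_otsu(E_t)).mean()        # formula (14)
    w1, w2, w3 = sizes
    w_s = w1 if prop >= thresh1 else (w2 if prop >= thresh2 else w3)  # formula (15); direction assumed
    H, W = E_t.shape
    Ep = np.pad(E_t, ((0, (-H) % w_s), (0, (-W) % w_s)), mode='symmetric')  # mirror-symmetric fill
    h, w = Ep.shape
    E_s = Ep.reshape(h // w_s, w_s, w // w_s, w_s).mean(axis=(1, 3))        # formula (16)
    return E_s, w_s
```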
Step (4): compute the feedback-corrected obstacle contour result T(x, y). The pooling map E_s(x, y) obtained in step (3) is fed back to the V1 visual cortex across visual areas, and the topological projection contour map E(x, y) obtained in step (1) is corrected in the form of an adjustment coefficient; pixel-by-pixel multiplication and fusion give the final obstacle contour result T(x, y), as shown in formula (17):
where size(·) denotes a bilinear interpolation operation; the pooling map enlarged by bilinear interpolation has the same dimensions as the topological projection contour map.
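The cross-area feedback of formula (17) then amounts to a bilinear upsampling of the pooling map followed by a pixel-wise product; `cv2.resize` performs the bilinear step here, which is an implementation choice rather than the patent's prescription.

```python
import cv2

def feedback_fuse(E, E_s):
    """Formula (17): enlarge E_s to the size of E by bilinear interpolation and
    apply it as a pixel-wise adjustment coefficient on the V1 contour map."""
    gain = cv2.resize(E_s, (E.shape[1], E.shape[0]),   # cv2 dsize is (width, height)
                      interpolation=cv2.INTER_LINEAR)
    return E * gain
```

Chained end to end, the sketch reads: `E_L, Th_L = brightness_pathway(I)`; `E_C, Th_C = color_pathway(R, G, B)`; `E, Th = v1_fusion(E_L, Th_L, E_C, Th_C)`; `E_t = refine(E, Th)`; `E_s, _ = adaptive_pool(E_t)`; `T = feedback_fuse(E, E_s)`.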
Claims (8)
1. A road obstacle detection method based on a visual information flow partition projection coding model, characterized by comprising the following steps:
step one: constructing a parallel visual pathway model in a V1 topological projection contour global perception unit, and extracting the brightness edges and color edges of a road traffic map in parallel to obtain a topological projection contour map E(x, y) representing the overall features of the obstacle and an optimal azimuth index matrix Θ(x, y);
step two: constructing a visual information difference enhancement model in a V4 sparse coding fine feature extraction unit, and using the position response difference and the azimuth response difference for texture suppression and contour enhancement of the topological projection contour map E(x, y), respectively, to obtain a refined contour map E_t(x, y);
step three: providing an adaptive-size sparse coding model that pools the refined contour map E_t(x, y), focusing on the obstacle contour features according to the image characteristics, to obtain a pooling map E_s(x, y) representing the local features of the obstacle;
step four: simulating the feedback regulation mechanisms of different brain regions of the visual cortex, correcting the topological projection contour map E(x, y) across visual areas using the pooling map E_s(x, y), and fusing to obtain the final obstacle contour result T(x, y);
wherein constructing the parallel visual pathway model in the V1 topological projection contour global perception unit and extracting the brightness edges and color edges of the road traffic map in parallel to obtain the topological projection contour map E(x, y) representing the overall features of the obstacle and the optimal azimuth index matrix Θ(x, y) specifically comprises:
for the road traffic map to be detected, decomposing it into a brightness component I(x, y) and red, green, and blue color components R(x, y), G(x, y), and B(x, y), each component having m rows and n columns; the parallel visual pathway model consists of a brightness pathway and a color pathway, which extract the brightness edges and color edges of the road traffic map, respectively;
step 1.1: the mathematical model of the brightness pathway is shown in formula (1):
where σ and γ denote the size and ellipticity of the classical receptive field; θ_i denotes the selective orientation, taking 8 equally spaced orientations, i.e. θ_i ∈ {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}; * denotes the convolution operation; and e(x, y; θ_i) denotes the brightness-component edge response at pixel coordinate (x, y) for selective orientation θ_i;
for each pixel, the maximum brightness-component edge response over all selective orientations is taken and linearly normalized as output, while the corresponding optimal orientation index is recorded, giving the contour response E_L(x, y) of the brightness pathway and the optimal azimuth index matrix Θ_L(x, y), as shown in formula (2):
where N(·) denotes the linear normalization operation;
step 1.2: the color pathway is modeled using the color antagonism mechanism and is divided into four types of antagonistic channels: R-on/G-off, G-on/R-off, B-on/Y-off, and Y-on/B-off, where the yellow component Y(x, y) = (R(x, y) + G(x, y))/2; take the R-on/G-off antagonistic channel as an example: first, the action of cone cells is simulated, the R(x, y) and G(x, y) components are each smoothed with a Gaussian filter, and the edge response of the single-antagonistic receptive field is then computed from the smoothed components and denoted S_RG(x, y), as shown in formula (3):
then the action of the double-antagonistic receptive field is simulated, and the color-component edge response d_RG(x, y; θ_i) at pixel coordinate (x, y) for selective orientation θ_i is computed, as shown in formula (4):
for each pixel, the maximum color-component edge response over all selective orientations is taken and linearly normalized as output, while the corresponding optimal orientation index is recorded, giving the edge response D_RG(x, y) of the R-on/G-off antagonistic channel and the optimal azimuth index matrix Θ_RG(x, y), as shown in formula (5):
the edge responses D_GR, D_BY, D_YB and optimal azimuth index matrices Θ_GR, Θ_BY, Θ_YB of the other three antagonistic channels are obtained by the same calculation as for the R-on/G-off antagonistic channel; for each pixel coordinate, the maximum edge response among the four antagonistic channels is taken as output, while the corresponding optimal azimuth index is recorded, giving the contour response E_C(x, y) of the color pathway and the optimal azimuth index matrix Θ_C(x, y), as shown in formula (6):
step 1.3: the contour responses and optimal azimuth index matrices of the brightness pathway and the color pathway are fused, simulating the front-stage characteristics of the visual cortex V1 functional layer, to obtain the topological projection contour map E(x, y) and the optimal azimuth index matrix Θ(x, y), as shown in formula (7):
2. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 1, wherein σ and γ are set to 2 and 0.5, respectively.
3. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 1, wherein step two, constructing the visual information difference enhancement model in the V4 sparse coding fine feature extraction unit and using the position response difference and the azimuth response difference for texture suppression and contour enhancement of the topological projection contour map E(x, y), respectively, to obtain the refined contour map E_t(x, y), specifically comprises:
first, for the topological projection contour map E(x, y) obtained in step one, a Gaussian function G(x, y; σ) and a positive difference-of-Gaussians function DoG⁺(x, y; σ) are each convolved with E(x, y) to obtain the classical receptive field visual input L_C(x, y; σ) and the non-classical receptive field visual input L_N(x, y; σ), as shown in formula (8):
where the kernels G(x, y; σ) and DoG⁺(x, y; σ) are defined in formula (9);
a difference operation on L_C(x, y; σ) and L_N(x, y; σ) in formula (8) gives the position response difference ΔL(x, y; σ), as shown in formula (10):
ΔL(x, y; σ) = max{L_C(x, y; σ) − L_N(x, y; σ), 0}   (10)
then the orthogonal azimuth index matrix Θ⁺(x, y) of the topological projection contour map E(x, y) is computed, as shown in formula (11):
Θ⁺(x, y) = (Θ(x, y) + 2) mod 8   (11)
where mod denotes the remainder operation; Θ(x, y), obtained in step one, stores the optimal azimuth at pixel coordinate (x, y) as an index with value range {0, 1, 2, 3, 4, 5, 6, 7}: {0, 4} denote the horizontal optimal azimuth, {2, 6} the vertical optimal azimuth, {1, 5} the forward-diagonal optimal azimuth, and {3, 7} the backward-diagonal optimal azimuth; Θ⁺(x, y) denotes the orthogonal azimuth complementary to the optimal azimuth Θ(x, y), where the horizontal azimuth is complementary to the vertical azimuth and the left-diagonal azimuth is complementary to the right-diagonal azimuth;
for each pixel coordinate (x, y) in the topological projection contour map E, the sum of the two neighboring pixel values along a given azimuth is taken as the response information; the optimal azimuth response O_Θ(x, y) and the orthogonal azimuth response O_Θ⁺(x, y) are computed, and their difference gives the azimuth response difference ΔO(x, y), as shown in formula (12):
finally, E(x, y) is contrast-enhanced to obtain the refined contour map E_t(x, y), as shown in formula (13):
E_t(x, y) = E(x, y) + E_enha(x, y) − E_inhi(x, y)   (13)
where E_enha(x, y) denotes the result of linearly normalizing the azimuth response difference ΔO(x, y), used to enhance contour pixels; E_inhi(x, y) denotes the result of exponentially normalizing the position response difference ΔL(x, y; σ), used to suppress background pixels.
4. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 3, wherein σ is set to 2.
5. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 3, wherein step three, providing the adaptive-size sparse coding model that pools the refined contour map E_t(x, y), focusing on the obstacle contour features according to the image characteristics, to obtain the pooling map E_s(x, y) representing the local features of the obstacle, specifically comprises:
for the refined contour map E_t(x, y) obtained in step two, computing an adaptive threshold thresh by the maximum between-class variance method, binarizing, and counting the proportion of contour pixels, as shown in formula (14):
where the count(·) operation counts the number of pixels with value 1 in the binarized refined contour map;
then the sparse kernel size of the sparse coding model is computed; to simplify calculation, the sparse kernel window is chosen to be square, and the window size is selected adaptively according to the contour pixel proportion, as shown in formula (15):
where w_1, w_2, w_3 denote sparse kernel window sizes of different scales; thresh1 and thresh2 are threshold parameters that measure the sparseness of the refined contour map E_t; the sparse kernel window size w_s is selected adaptively according to the relative magnitudes of the contour pixel proportion in E_t and the threshold parameters; before sparse coding, the boundary pixels of the refined contour map E_t are filled by mirror symmetry; for each pixel of E_t, the average of all pixels in the window centered on it is computed as the sparse coding output of that pixel, with the window size used as the window moving step, realizing a spatially sparse expression of the visual information, as shown in formula (16):
where w and h are the lateral and longitudinal offsets, respectively, and w_s is the sparse kernel window size and also the window moving step; floor(·) denotes the rounding-down function, and E_s(x, y) is the pooling map after sparsification.
6. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 5, wherein w_1 = 3, w_2 = 5, w_3 = 7.
7. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 5, wherein thresh1 = 0.2 and thresh2 = 0.1.
8. The road obstacle detection method based on the visual information flow partition projection coding model according to claim 5, wherein step four, simulating the feedback regulation mechanisms of different brain regions of the visual cortex, correcting the topological projection contour map E(x, y) across visual areas using the pooling map E_s(x, y), and fusing to obtain the final obstacle contour result T(x, y), specifically comprises:
feeding the pooling map E_s(x, y) obtained in step three back to the V1 visual cortex across visual areas, correcting the topological projection contour map E(x, y) obtained in step one in the form of an adjustment coefficient, and obtaining the final obstacle contour result T(x, y) through pixel-by-pixel multiplication and fusion, as shown in formula (17):
where size(·) denotes a bilinear interpolation operation; the pooling map enlarged by bilinear interpolation has the same dimensions as the topological projection contour map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011578651.0A CN112613427B (en) | 2020-12-28 | 2020-12-28 | Road obstacle detection method based on visual information flow partition projection coding model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011578651.0A CN112613427B (en) | 2020-12-28 | 2020-12-28 | Road obstacle detection method based on visual information flow partition projection coding model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112613427A CN112613427A (en) | 2021-04-06 |
CN112613427B true CN112613427B (en) | 2024-02-27 |
Family ID: 75248224
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011578651.0A Active CN112613427B (en) | 2020-12-28 | 2020-12-28 | Road obstacle detection method based on visual information flow partition projection coding model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112613427B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113435455B (en) * | 2021-05-12 | 2024-03-22 | 深圳灵图创新科技有限公司 | Image contour extraction method based on space-time pulse coding |
CN113391368B (en) * | 2021-06-30 | 2022-10-21 | 山东国瑞新能源有限公司 | Road exploration method and equipment based on virtual imaging technology |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104484667A (en) * | 2014-12-30 | 2015-04-01 | 华中科技大学 | Contour extraction method based on brightness characteristic and contour integrity |
CN109489576A (en) * | 2018-10-19 | 2019-03-19 | 杭州电子科技大学 | A kind of profile testing method based on primary vision access computation model |
CN111222518A (en) * | 2020-01-16 | 2020-06-02 | 杭州电子科技大学 | Contour feature extraction method based on frequency division visual mechanism |
CN111402285A (en) * | 2020-01-16 | 2020-07-10 | 杭州电子科技大学 | Contour detection method based on visual mechanism dark edge enhancement |
Also Published As
Publication number | Publication date |
---|---|
CN112613427A (en) | 2021-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Han et al. | Underwater image processing and object detection based on deep CNN method | |
CN111079685B (en) | 3D target detection method | |
Fritsch et al. | Monocular road terrain detection by combining visual and spatial information | |
CN103914699A (en) | Automatic lip gloss image enhancement method based on color space | |
CN103996198B (en) | The detection method of area-of-interest under Complex Natural Environment | |
CN106650640A (en) | Negative obstacle detection method based on local structure feature of laser radar point cloud | |
CN108665463A (en) | A kind of cervical cell image partition method generating network based on confrontation type | |
CN110619638A (en) | Multi-mode fusion significance detection method based on convolution block attention module | |
CN104408711B (en) | Multi-scale region fusion-based salient region detection method | |
CN107633513A (en) | The measure of 3D rendering quality based on deep learning | |
CN112613427B (en) | Road obstacle detection method based on visual information flow partition projection coding model | |
CN103699900B (en) | Building horizontal vector profile automatic batch extracting method in satellite image | |
CN104766096B (en) | A kind of image classification method based on multiple dimensioned global characteristics and local feature | |
CN111563447A (en) | Crowd density analysis and detection positioning method based on density map | |
CN110414385B (en) | Lane line detection method and system based on homography transformation and characteristic window | |
CN112488046B (en) | Lane line extraction method based on high-resolution images of unmanned aerial vehicle | |
CN104616308A (en) | Multiscale level set image segmenting method based on kernel fuzzy clustering | |
CN107315998A (en) | Vehicle class division method and system based on lane line | |
CN103996185A (en) | Image segmentation method based on attention TD-BU mechanism | |
CN102799646B (en) | A kind of semantic object segmentation method towards multi-view point video | |
CN103679718A (en) | Fast scenario analysis method based on saliency | |
CN111369617A (en) | 3D target detection method of monocular view based on convolutional neural network | |
CN103955945A (en) | Self-adaption color image segmentation method based on binocular parallax and movable outline | |
CN106529441B (en) | Depth motion figure Human bodys' response method based on smeared out boundary fragment | |
Khan et al. | Lrdnet: lightweight lidar aided cascaded feature pools for free road space detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||