CN112053379B - Biomimetic modeling method for biological visual nerve sensitivity

Info

Publication number: CN112053379B
Authority: CN (China)
Application number: CN202010848790.4A
Prior art keywords: depth, layer, motion, information, excitation
Legal status: Active (granted)
Original language: Chinese (zh)
Other versions: CN112053379A (application publication)
Inventors: 陈哲, 顾宇鹏, 王慧斌, 张丽丽, 沈洁, 刘海韵
Applicant and assignee: Hohai University (HHU)
Priority and filing date: 2020-08-21
Publication of application CN112053379A: 2020-12-08
Publication of grant CN112053379B: 2022-08-26

Classifications

    • G06T 7/20: Image analysis; analysis of motion
    • G06N 3/02, G06N 3/04, G06N 3/045: Computing arrangements based on biological models; neural networks; architecture; combinations of networks
    • G06T 2207/10016: Image acquisition modality; video; image sequence
    • G06T 2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/20221: Special algorithmic details; image fusion; image merging

Abstract

The invention discloses a biomimetic modeling method for biological visual nerve sensitivity. By modeling the lobula giant movement detector and the direction-sensitive neurons of the lobula plate in a biological visual system, a two-channel parallel information-processing structure consisting of a lobula giant movement detector model and a direction-sensitive neuron model is constructed, and the continuous correlation mechanism of biological visual motion perception is simulated to fuse the neural sensitivity information of the two channels. The method extracts both the depth and the direction of target motion and biomimetically integrates the two kinds of perceptual information to sense the scene; it requires neither training nor background modeling, has low computational cost, and runs in real time; it can quickly and effectively detect moving targets in complex dynamic scenes and thereby achieves effective scene perception.

Description

Biomimetic modeling method for biological visual nerve sensitivity
Technical Field
The invention relates to a biomimetic modeling method for biological visual nerve sensitivity, and in particular to the biomimetic perception of motion and depth information in a scene by modeling the sensitivity of biological visual neurons.
Background
Scene perception is an important technology in computer vision. Its main goal is to extract the moving foreground targets in a video; typical applications include video surveillance, target tracking, video editing, and autonomous driving. Most current perception methods are based on background subtraction or on deep learning. Background-subtraction methods generally require a reliable background model or an estimate of the background motion, and such models are often too simple and lack robustness. Deep-learning methods require considerable data and training cost, generalize poorly to entirely new scenes, and often run too slowly to be applied in real scenarios.
Disclosure of Invention
Purpose of the invention: in view of the problems in the prior art, the invention aims to provide a biomimetic modeling method for biological visual nerve sensitivity that can be used stably and reliably for target detection under a variety of complex dynamic scene conditions.
Technical scheme: a biomimetic modeling method for biological visual nerve sensitivity comprises the following steps:
(1) constructing, in simulation, a two-channel biological visual sensitivity model consisting of a lobula giant movement detector model and a direction-sensitive neuron model, and extracting the depth motion information and the direction motion information of targets in the scene;
(2) fusing and reinforcing the depth and direction motion information to highlight the target, suppress irrelevant background-noise excitation, and complete the detection of moving targets in the video.
Further, in step (1), the lobula giant movement detector model is constructed as follows:
following the mechanism of the lobula giant movement detector in the visual brain and its depth-motion perception process, a depth-motion perception model is built that takes the frame sequence of a video as its input; the model comprises a depth receptor layer, a depth excitation layer, a depth inhibition layer, a depth summation layer and a depth output layer;
the depth receptor layer senses the motion-change stimuli of the video images; it is modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_l(x, y, t) = g_s(x, y, t) · g_t(t)

g_s(x, y, t) = exp(−(x̂² + γ²·ŷ²)/(2σ²)) · cos((2π/λ)·(x̂ + υ_l·t) + φ)

g_t(t) = 1/(√(2π)·η) · exp(−(t − μ_t)²/(2η²)) · u(t)

where g_l(x, y, t) is the generated 3D Gabor filter kernel, x and y are spatial variables, and t is the time variable;

g_s is the spatial Gabor filter, in which γ is the spatial aspect ratio; σ is the spatial Gaussian standard deviation; x̂ = x̃·cosθ_l + ỹ·sinθ_l and ŷ = −x̃·sinθ_l + ỹ·cosθ_l are the rotation operations, with x̃ = x − υ_c·t·cosθ_l and ỹ = y − υ_c·t·sinθ_l; υ_c is the speed at which the spatial Gaussian envelope g_s moves; θ_l is the kernel direction; υ_l is the speed of movement of the kernel, taken as υ_l = υ_c; λ is the wavelength of the sinusoid; φ is the phase offset;

g_t is a Gaussian function in the time domain, in which μ_t is the mean of the Gaussian; η is the time-domain Gaussian standard deviation; u(t) is a unit step function that ensures the causal character of the filter;

the resulting stimulus response is:

C_l(x, y, t; φ) = Hw[L(x, y, t) ∗ g_l(x, y, t)]

where L(x, y, t) is the luminance distribution of the input video frame sequence; ∗ is convolution; Hw[δ] = max(δ, 0) denotes the half-wave rectification operation, consistent with biological visual mechanisms;

the output of the depth receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_l(x, y, t) = C_l(x, y, t; 0)² + C_l(x, y, t; −π/2)²
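In implementation terms, the receptor layer amounts to convolving the frame volume with a quadrature pair of spatio-temporal Gabor kernels and summing the rectified, squared responses. The following Python sketch illustrates this under the reconstruction above; the function names, default parameter values and kernel sizes are illustrative assumptions, not values given in the patent:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_3d(theta=0.0, v=1.0, lam=6.0, sigma=3.0, gamma=0.5,
             phi=0.0, mu_t=2.0, eta=1.0, size=11, frames=5):
    """Separable 3D Gabor kernel g_l(x, y, t) = g_s(x, y, t) * g_t(t)."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    kernel = np.empty((frames, size, size))
    for t in range(frames):
        # the spatial Gaussian envelope drifts with speed v along direction theta;
        # the carrier drifts with it, i.e. the kernel speed equals the envelope speed
        xs = x - v * t * np.cos(theta)
        ys = y - v * t * np.sin(theta)
        x_rot = xs * np.cos(theta) + ys * np.sin(theta)   # rotation operations
        y_rot = -xs * np.sin(theta) + ys * np.cos(theta)
        g_s = (np.exp(-(x_rot**2 + (gamma * y_rot)**2) / (2 * sigma**2))
               * np.cos(2 * np.pi * x_rot / lam + phi))
        g_t = np.exp(-(t - mu_t)**2 / (2 * eta**2))       # t >= 0 only, so u(t) holds
        kernel[t] = g_s * g_t
    return kernel

def receptor_layer(video, **gabor_params):
    """P(x, y, t): half-wave-rectified, squared quadrature-pair responses."""
    hw = lambda d: np.maximum(d, 0.0)                     # Hw[.] half-wave rectification
    c_0 = hw(convolve(video, gabor_3d(phi=0.0, **gabor_params)))
    c_q = hw(convolve(video, gabor_3d(phi=-np.pi / 2, **gabor_params)))
    return c_0**2 + c_q**2
```

Because the temporal Gaussian is evaluated only for t ≥ 0, the kernel is causal by construction, playing the role of the unit step u(t).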
the output of the depth receptor layer is directly sent to the depth excitation layer and the depth inhibition layer; the depth excitation layer continues to pass on to the depth summation layer with one-to-one pixels as shown in the following equation:
El(x,y,t)=Pl(x,y,t)
the depth suppression layer flows into the neighboring cells of the corresponding cell in the depth summation layer with a delay τ according to the side suppression principle, as shown in the following formula:
Figure BDA0002644040420000031
wherein, tau is time delay; omega I Performing local inhibition for the r multiplied by r side inhibition template matrix;
the depth summation layer sums the signals from the depth excitation and suppression layers using a side suppression mechanism as shown in the following equation:
Sl(x,y,t)=El(x,y,t)-Il(x,y,t)·W I
Figure BDA0002644040420000032
wherein, W I For global suppression weights, T l Is a threshold value;
the depth output layer adopts an excitation convergence processing mechanism of the biological optic nerve terminal to enhance the depth motion information of the target, and the depth motion information is shown as the following formula:
Figure BDA0002644040420000033
wherein w e Converging a template matrix for excitation; thereby obtaining the detection result of the leaflet giant motion detector.
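A minimal sketch of these remaining layers follows, assuming the delayed-convolution reading of the inhibition layer given above; the uniform 3 × 3 inhibition template (0.125 on the eight neighbours) and the 9 × 9 convergence template are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import convolve

def lgmd_layers(P, tau=1, W_I=0.3, T_l=0.1, r=3, a=9):
    """LGMD excitation/inhibition/summation/output over a (T, H, W) energy volume P."""
    # r x r lateral-inhibition template: inhibit the neighbours, not the cell itself
    w_i = np.full((r, r), 1.0 / (r * r - 1))
    w_i[r // 2, r // 2] = 0.0
    E = P                                          # excitation layer: E_l = P_l
    I = np.empty_like(P)
    for t in range(P.shape[0]):                    # inhibition: delayed by tau, spread by w_i
        I[t] = convolve(P[max(t - tau, 0)], w_i)
    S = E - W_I * I                                # summation layer with global weight W_I
    S = np.where(S >= T_l, S, 0.0)                 # threshold T_l filters weak excitation
    w_e = np.ones((1, a, a)) / (a * a)             # excitation-convergence template
    return convolve(S, w_e)                        # depth output f(x, y, t)
```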
Further, in step (1), the direction-sensitive neuron model is constructed as follows:

following the mechanism of the direction-sensitive neurons in the visual brain and their direction-motion perception process, a direction-motion perception model is built that takes the frame sequence of a video as its input; the model comprises a direction receptor layer, a direction excitation layer, a direction inhibition layer, a direction summation layer and a direction output layer;

the direction receptor layer is likewise modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_d(x, y, t) = g_s(x, y, t) · g_t(t)

with the same spatial Gabor g_s and temporal Gaussian g_t as above, where γ is the spatial aspect ratio; σ is the spatial Gaussian standard deviation; x̂ and ŷ are the rotation operations about the kernel direction θ_d; υ_c is the speed at which the spatial Gaussian envelope moves; υ_d is the speed of movement of the kernel, taken as υ_d = υ_c; λ is the sinusoid wavelength; φ is the phase offset; μ_t is the mean of the temporal Gaussian; η is the time-domain Gaussian standard deviation; u(t) is the unit step function ensuring the causal character of the filter;

the resulting stimulus response is:

C_d(x, y, t; φ) = Hw[L(x, y, t) ∗ g_d(x, y, t)]

where L(x, y, t) is the luminance distribution of the input video frame sequence; ∗ is convolution; Hw[·] denotes the half-wave rectification operation;

the output of the direction receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_d(x, y, t) = C_d(x, y, t; 0)² + C_d(x, y, t; −π/2)²

this output is sent directly to the direction excitation layer and the direction inhibition layer; the direction excitation layer delivers it pixel by pixel to the direction summation layer:

E_d(x, y, t) = P_d(x, y, t)

the direction inhibition layer is divided into inhibition in four directions (up, down, left and right), and the inhibition information in the four directions is:

I_L(x, y, t) = P_d(x, y, t − τ) ∗ ω_LI
I_R(x, y, t) = P_d(x, y, t − τ) ∗ ω_RI
I_U(x, y, t) = P_d(x, y, t − τ) ∗ ω_UI
I_D(x, y, t) = P_d(x, y, t − τ) ∗ ω_DI

where ω_LI, ω_RI, ω_UI and ω_DI are the q × s (q ≠ s) local-inhibition template matrices for the four directions;

the direction summation layer sums the excitation and inhibition information in the four directions:

S_L(x, y, t) = [E_d(x, y, t) − I_L(x, y, t) · W_LI]*
S_R(x, y, t) = [E_d(x, y, t) − I_R(x, y, t) · W_RI]*
S_U(x, y, t) = [E_d(x, y, t) − I_U(x, y, t) · W_UI]*
S_D(x, y, t) = [E_d(x, y, t) − I_D(x, y, t) · W_DI]*

where [x]* = x if x ≥ T_d and 0 otherwise; T_d is a threshold; W_LI, W_RI, W_UI and W_DI are the global inhibition weights for the four directions;

the direction output layer comprises the four directional outputs F_L, F_R, F_U and F_D obtained from the corresponding summation signals; averaging the information in the four directions extracts the final direction motion information of the target:

g(x, y, t) = (F_L(x, y, t) + F_R(x, y, t) + F_U(x, y, t) + F_D(x, y, t)) / 4
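The directional channel can be sketched the same way; here the one-sided 1 × 3 and 3 × 1 inhibition templates are illustrative assumptions standing in for the patent's q × s (q ≠ s) matrices, and the per-direction output step is folded into the averaging for brevity:

```python
import numpy as np
from scipy.ndimage import convolve

def dsn_layers(P_d, tau=1, W_dir=0.8, T_d=0.3):
    """Directional inhibition/summation/averaging over a (T, H, W) energy volume P_d."""
    # One-sided q x s templates (q != s): each draws inhibition from one side only.
    left = np.array([[1.0, 0.0, 0.0]])             # 1 x 3: inhibition from the left neighbour
    right = left[:, ::-1].copy()
    up = left.T.copy()                             # 3 x 1: inhibition from above
    down = up[::-1, :].copy()
    g = np.zeros_like(P_d)
    for w in (left, right, up, down):
        I = np.empty_like(P_d)
        for t in range(P_d.shape[0]):              # delayed, one-sided lateral inhibition
            I[t] = convolve(P_d[max(t - tau, 0)], w)
        S = P_d - W_dir * I                        # E_d = P_d enters one-to-one
        g += np.where(S >= T_d, S, 0.0)            # [.]* thresholding at T_d
    return g / 4.0                                 # average of the four directions
```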
further, in the step (2), based on a biological visual motion correlation mechanism, carrying out dual-channel neural sensitivity information fusion, and according to a continuous correlation mechanism of visual perception motion, considering the spatial-temporal regularity of motion perception so as to highlight a target and eliminate irrelevant background noise left by two neural network channels; the method specifically comprises the following steps:
firstly, the motion information of two aspects is simply and linearly integrated:
fg(x,y,t)=f(x,y,t)+g(x,y,t)
then, a continuous correlation mechanism of biological visual perception motion is applied to enhance a moving target and inhibit isolated noise excitation caused by a dynamic complex background:
M(x,y,t)=(fg(x,y,t-1))·fg(x,y,t)·ω -1
ω is calculated in each frame image by:
ω=Δc+max(abs[C(x,y,t)])·C ω -1
where Δ c is a real number; c ω Is a constant;
and finally, according to an excitation convergence mechanism of the biological visual nerve terminal, carrying out convergence operation on the motion information after fusion enhancement through a sliding window, thereby obtaining a target:
obj=M(x,y,t)*H
wherein H is an a x a matrix; is a convolution operation.
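A minimal sketch of this fusion stage follows, assuming C(x, y, t) denotes the current fused response fg(x, y, t); the values of Δc and C_ω are illustrative:

```python
import numpy as np
from scipy.ndimage import convolve

def fuse_channels(f, g, delta_c=0.01, C_omega=4.0, a=9):
    """Linear fusion, frame-to-frame correlation, and sliding-window convergence."""
    fg = f + g                                     # fg(x, y, t) = f + g
    M = np.zeros_like(fg)
    for t in range(1, fg.shape[0]):
        # omega is recomputed per frame; C(x, y, t) taken as the current fused response
        omega = delta_c + np.max(np.abs(fg[t])) / C_omega
        M[t] = fg[t - 1] * fg[t] / omega           # continuous correlation mechanism
    H = np.ones((1, a, a)) / (a * a)               # a x a sliding-window matrix H
    return convolve(M, H)                          # obj = M * H
```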
Compared with the prior art, the invention has the following beneficial effects: on the basis of biomimetically modeling the two sensitive neurons that dominate motion perception, namely the lobula giant movement detector and the direction-sensitive neuron, directional and depth motion information is collected and, exploiting visual spatio-temporal regularity, the continuous correlation mechanism and the convergence mechanism of motion perception are used to suppress background noise and highlight the target. This strategy dispenses with training and background modeling, detects the motion information at the current moment accurately, suppresses the motion noise caused by camera movement and complex backgrounds, and can quickly and effectively detect moving targets in complex dynamic scenes, achieving effective scene perception with notable real-time performance, effectiveness and accuracy.
Drawings
FIG. 1 is a simplified schematic diagram of the modeling of the two biological motion-perception neural networks;
FIG. 2 is a flow chart of moving-target detection in an embodiment of the invention: (a) the frame image at the current moment of the input video sequence, (b) the depth information detected by the lobula giant movement detector, (c) the direction information detected by the direction-sensitive neurons, and (d) the motion detection result for the frame at the current moment;
FIG. 3 is a schematic block diagram of the overall model in an embodiment of the invention;
FIG. 4 shows the final moving-target detection and localization results in an embodiment of the invention.
Detailed Description
The invention will be further elucidated with reference to the following specific examples.
Example: a biomimetic modeling method for biological visual nerve sensitivity, illustrated in fig. 2. The invention rests mainly on biological findings about two visual motion-perception neural networks and two visual-perception mechanisms: (1) the lobula giant movement detector detects the depth motion of targets in the scene well; (2) the direction-sensitive neurons detect the direction motion information of targets; (3) the information integration and convergence mechanism of the biological optic-nerve terminal fuses these two kinds of information; (4) the continuous correlation mechanism of visual motion perception suppresses irrelevant background noise and enhances the target. According to finding (1), the depth motion cues of targets can be detected well even in video scenes shot by a moving camera (fig. 2(b)). According to finding (2), the direction motion information of targets can be detected effectively (fig. 2(c)). According to mechanism (3), the depth and direction motion cues are fused to obtain the final moving-target region (fig. 2(d)). According to mechanism (4), after the two models are fused, irrelevant background-noise excitation can be suppressed and the target information enhanced.
First, the two-channel visual neural network is constructed following the depth-motion and direction-motion perception processes of the lobula giant movement detector and the direction-sensitive neurons in the visual brain, as shown in fig. 1. For the lobula giant movement detector, the first layer is the depth receptor layer. This layer senses the motion-change stimuli of the video images; as in equation (1), it is modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_l(x, y, t) = g_s(x, y, t) · g_t(t),
g_s(x, y, t) = exp(−(x̂² + γ²·ŷ²)/(2σ²)) · cos((2π/λ)·(x̂ + υ_l·t) + φ),
g_t(t) = 1/(√(2π)·η) · exp(−(t − μ_t)²/(2η²)) · u(t)    (1)

where g_l(x, y, t) is the generated 3D Gabor filter kernel, x and y are spatial variables, and t is the time variable. The first part of the formula is the spatial Gabor filter: γ is the spatial aspect ratio; σ is the Gaussian standard deviation; x̂ and ŷ are the rotation operations; υ_c is the speed at which the spatial Gaussian envelope moves; θ_l is the kernel direction; υ_l is the speed of movement of the kernel, taken as υ_l = υ_c; λ is the sinusoid wavelength; φ is the phase offset. The second part is a Gaussian function in the time domain: μ_t is the mean of the Gaussian; η is its standard deviation; u(t) is a unit step function that ensures the causal character of the filter. The stimulus response that then arises is:

C_l(x, y, t; φ) = Hw[L(x, y, t) ∗ g_l(x, y, t)]    (2)

where L(x, y, t) is the luminance distribution of the input video frame sequence, ∗ is convolution, and Hw[·] denotes the half-wave rectification operation, consistent with the biological visual mechanism. The resulting output of the depth receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_l(x, y, t) = C_l(x, y, t; 0)² + C_l(x, y, t; −π/2)²    (3)
the second and third layers are depth excitation layer and depth inhibition layer. The output of the depth receptor layer is directed to the depth excitation layer and the depth inhibition layer.
As shown in equations (4) and (5), the excitation layer directly continues to the depth summation layer with one-to-one pixels, and the depth suppression layer flows into the adjacent cells of the corresponding cells in the depth summation layer with a delay τ according to the side suppression principle:
El(x,y,t)=Pl(x,y,t) (4)
Figure BDA0002644040420000085
wherein, tau is time delay; omega I For the r × r side suppression template matrix, values were taken to be 0.125 and 0.25, and local suppression was performed.
The fourth layer is a depth summation layer, which sums the signals from the depth excitation and depth suppression layers using a side suppression mechanism, as shown in equation (6):
Sl(x,y,t)=El(x,y,t)-Il(x,y,t)·W I (6)
Figure BDA0002644040420000086
wherein, W I For global suppression weights, empirically set to 0.3, T l The threshold is set to 0.1, and part of noise interference can be primarily filtered.
The last layer is a depth output layer of the leaflet giant motion detector, and an excitation convergence processing mechanism of the biological optic nerve terminal is adopted in the last layer, as shown in a formula (8), the depth motion information of the target is enhanced:
Figure BDA0002644040420000091
wherein, w e Is an excitation convergence template matrix.
The final result is shown in fig. 2 (b), and the depth information is highlighted because the left vehicle comes to the lens; while the right vehicle moves mainly to the left and contains only a small amount of depth information.
As shown in fig. 1, the direction-sensitive neuron model differs from the lobula giant movement detector in its emphasis on direction selectivity: the lateral inhibition of its direction inhibition layer is directional, and it is therefore divided into inhibition in the four directions up, down, left and right.

The direction receptor layer is modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_d(x, y, t) = g_s(x, y, t) · g_t(t)    (9)

where g_d(x, y, t) is the generated 3D Gabor filter kernel, x and y are spatial variables, t is the time variable, γ is the spatial aspect ratio, σ is the Gaussian standard deviation, x̂ and ŷ are the rotation operations, υ_c is the speed at which the spatial Gaussian envelope moves, θ_d is the kernel direction, υ_d is the speed of movement of the kernel (taken as υ_d = υ_c), λ is the sinusoid wavelength, φ is the phase offset, μ_t is the mean of the temporal Gaussian, η is its standard deviation, and u(t) is the unit step function ensuring the causal character of the filter.

The resulting stimulus response is:

C_d(x, y, t; φ) = Hw[L(x, y, t) ∗ g_d(x, y, t)]    (10)

where L(x, y, t) is the luminance distribution of the input video frame sequence, ∗ is convolution, and Hw[·] denotes half-wave rectification.

The output of the direction receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_d(x, y, t) = C_d(x, y, t; 0)² + C_d(x, y, t; −π/2)²    (11)

This output is sent directly to the direction excitation layer and the direction inhibition layer; the direction excitation layer delivers it pixel by pixel to the direction summation layer:

E_d(x, y, t) = P_d(x, y, t)    (12)

The direction inhibition layer is divided into inhibition in the four directions up, down, left and right, and the inhibition information in the four directions is:

I_L(x, y, t) = P_d(x, y, t − τ) ∗ ω_LI,  I_R(x, y, t) = P_d(x, y, t − τ) ∗ ω_RI,
I_U(x, y, t) = P_d(x, y, t − τ) ∗ ω_UI,  I_D(x, y, t) = P_d(x, y, t − τ) ∗ ω_DI    (13)

where ω_LI, ω_RI, ω_UI and ω_DI are the q × s (q ≠ s) local-inhibition template matrices for the four directions; their elements take the values 0 and 1, arranged according to the directionality of the lateral inhibition.

The excitation and inhibition information in the four directions is summed by the direction summation layer:

S_L(x, y, t) = [E_d(x, y, t) − I_L(x, y, t) · W_LI]*,  S_R(x, y, t) = [E_d(x, y, t) − I_R(x, y, t) · W_RI]*,
S_U(x, y, t) = [E_d(x, y, t) − I_U(x, y, t) · W_UI]*,  S_D(x, y, t) = [E_d(x, y, t) − I_D(x, y, t) · W_DI]*    (14)

where [x]* = x if x ≥ T_d and 0 otherwise; the threshold T_d is set to 0.3, and the global inhibition weights W_LI, W_RI, W_UI and W_DI for the four directions are all set to 0.8.

Finally comes the direction output layer: the four directional outputs F_L, F_R, F_U and F_D are obtained from the corresponding summation signals (15), and the information in the four directions is then simply averaged to extract the final direction motion information of the target:

g(x, y, t) = (F_L(x, y, t) + F_R(x, y, t) + F_U(x, y, t) + F_D(x, y, t)) / 4    (16)

The model output is shown in fig. 2(c): the three vehicles detected as moving all carry direction information.

This completes the extraction of the depth motion information and the direction motion information of the targets in the scene.
Then, according to mechanisms (3) and (4), the motion information from the two channels is integrated by a simple linear sum:

fg(x, y, t) = f(x, y, t) + g(x, y, t)    (17)

after which the continuous correlation mechanism of biological visual motion perception is applied to enhance the moving target and suppress the isolated noise excitation caused by the dynamic, complex background:

M(x, y, t) = fg(x, y, t − 1) · fg(x, y, t) · ω⁻¹    (18)

where ω is computed in each frame image as:

ω = Δc + max(abs[C(x, y, t)]) · C_ω⁻¹    (19)

with Δc a small real number and C_ω a constant.

Finally, according to the excitation-convergence mechanism of the biological optic-nerve terminal, a convergence operation is applied to the fusion-enhanced motion information through a sliding window, yielding the target:

obj = M(x, y, t) ∗ H    (20)

where H is an a × a matrix with a = 9, and ∗ is the convolution operation. The final moving-target detection result is shown in fig. 2(d) and fig. 4.

This completes the fusion and enhancement of the outputs of the two motion-perception networks, and with it the detection of moving targets in video shot by a moving camera.
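Tying the stages together, the following sketch reuses the functions from the earlier sketches with the parameter values reported in this embodiment (W_I = 0.3, T_l = 0.1, T_d = 0.3, directional weights 0.8, a = 9); the Gabor orientation and speed settings are illustrative assumptions:

```python
import numpy as np

def detect_moving_targets(video):
    """video: grayscale frame volume of shape (T, H, W), values in [0, 1]."""
    # Depth channel (lobula giant movement detector)
    P_l = receptor_layer(video, theta=0.0, v=1.0)
    f = lgmd_layers(P_l, W_I=0.3, T_l=0.1)         # reported: W_I = 0.3, T_l = 0.1
    # Direction channel (direction-sensitive neurons)
    P_d = receptor_layer(video, theta=np.pi / 4, v=1.0)
    g = dsn_layers(P_d, W_dir=0.8, T_d=0.3)        # reported: weights 0.8, T_d = 0.3
    # Fusion, continuous correlation, and 9 x 9 excitation convergence (a = 9)
    return fuse_channels(f, g, a=9)
```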

Claims (4)

1. A biomimetic modeling method for biological visual nerve sensitivity, characterized by comprising the following steps:
(1) constructing, in simulation, a two-channel biological visual sensitivity model consisting of a lobula giant movement detector model and a direction-sensitive neuron model, and extracting the depth motion information and the direction motion information of targets in the scene; the lobula giant movement detector model is constructed as follows:
following the mechanism of the lobula giant movement detector in the visual brain and its depth-motion perception process, a depth-motion perception model is built that takes the frame sequence of a video as its input; the model comprises a depth receptor layer, a depth excitation layer, a depth inhibition layer, a depth summation layer and a depth output layer;
the depth receptor layer senses the motion-change stimuli of the video images; it is modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_l(x, y, t) = g_s(x, y, t) · g_t(t)
g_s(x, y, t) = exp(−(x̂² + γ²·ŷ²)/(2σ²)) · cos((2π/λ)·(x̂ + υ_l·t) + φ)
g_t(t) = 1/(√(2π)·η) · exp(−(t − μ_t)²/(2η²)) · u(t)

where g_l(x, y, t) is the generated 3D Gabor filter kernel, x and y are spatial variables, and t is the time variable;
g_s is the spatial Gabor filter, in which γ is the spatial aspect ratio; σ is the spatial Gaussian standard deviation; x̂ = x̃·cosθ_l + ỹ·sinθ_l and ŷ = −x̃·sinθ_l + ỹ·cosθ_l are the rotation operations, with x̃ = x − υ_c·t·cosθ_l and ỹ = y − υ_c·t·sinθ_l; υ_c is the speed at which the spatial Gaussian envelope g_s moves; θ_l is the kernel direction; υ_l is the speed of movement of the kernel, taken as υ_l = υ_c; λ is the sinusoid wavelength; φ is the phase offset;
g_t is a Gaussian function in the time domain, in which μ_t is the mean of the Gaussian; η is the time-domain Gaussian standard deviation; u(t) is a unit step function ensuring the causal character of the filter;
the resulting stimulus response is:

C_l(x, y, t; φ) = Hw[L(x, y, t) ∗ g_l(x, y, t)]

where L(x, y, t) is the luminance distribution of the input video frame sequence; ∗ is convolution; Hw[δ] = max(δ, 0) denotes the half-wave rectification operation applied to its input δ, consistent with biological visual mechanisms;
the output of the depth receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_l(x, y, t) = C_l(x, y, t; 0)² + C_l(x, y, t; −π/2)²

the output of the depth receptor layer is sent directly to the depth excitation layer and the depth inhibition layer; the depth excitation layer passes it on to the depth summation layer pixel by pixel:

E_l(x, y, t) = P_l(x, y, t)

the depth inhibition layer, following the lateral-inhibition principle, flows with a delay τ into the cells neighbouring the corresponding cell of the depth summation layer:

I_l(x, y, t) = P_l(x, y, t − τ) ∗ ω_I

where τ is the time delay and ω_I is the r × r lateral-inhibition template matrix performing the local inhibition;
the depth summation layer sums the signals from the depth excitation and inhibition layers using the lateral-inhibition mechanism:

S_l(x, y, t) = E_l(x, y, t) − I_l(x, y, t) · W_I
S̃_l(x, y, t) = S_l(x, y, t) if S_l(x, y, t) ≥ T_l, and 0 otherwise

where W_I is the global inhibition weight and T_l is a threshold;
the depth output layer adopts the excitation-convergence processing mechanism of the biological optic-nerve terminal to enhance the depth motion information of the target:

f(x, y, t) = S̃_l(x, y, t) ∗ w_e

where w_e is the excitation-convergence template matrix; the detection result of the lobula giant movement detector is thereby obtained;
(2) fusing and reinforcing the depth and direction motion information to highlight the target, suppress irrelevant background-noise excitation, and complete the detection of moving targets in the video.
2. The biomimetic modeling method for biological visual nerve sensitivity according to claim 1, characterized in that in step (1) the direction-sensitive neuron model is constructed as follows:
following the mechanism of the direction-sensitive neurons in the visual brain and their direction-motion perception process, a direction-motion perception model is built that takes the frame sequence of a video as its input; the model comprises a direction receptor layer, a direction excitation layer, a direction inhibition layer, a direction summation layer and a direction output layer;
the direction receptor layer is modeled as a 3D Gabor filter that simulates the receptive-field characteristics of biological vision while taking spatio-temporal variation into account:

g_d(x, y, t) = g_s(x, y, t) · g_t(t)
g_s(x, y, t) = exp(−(x̂² + γ²·ŷ²)/(2σ²)) · cos((2π/λ)·(x̂ + υ_d·t) + φ)
g_t(t) = 1/(√(2π)·η) · exp(−(t − μ_t)²/(2η²)) · u(t)

where g_d(x, y, t) is the generated 3D Gabor filter kernel, x and y are spatial variables, and t is the time variable;
g_s is the spatial Gabor filter, in which γ is the spatial aspect ratio; σ is the spatial Gaussian standard deviation; x̂ = x̃·cosθ_d + ỹ·sinθ_d and ŷ = −x̃·sinθ_d + ỹ·cosθ_d are the rotation operations, with x̃ = x − υ_c·t·cosθ_d and ỹ = y − υ_c·t·sinθ_d; υ_c is the speed at which the spatial Gaussian envelope g_s moves; θ_d is the kernel direction; υ_d is the speed of movement of the kernel, taken as υ_d = υ_c; λ is the sinusoid wavelength; φ is the phase offset;
g_t is a Gaussian function in the time domain, in which μ_t is the mean of the Gaussian; η is the time-domain Gaussian standard deviation; u(t) is a unit step function ensuring the causal character of the filter;
the resulting stimulus response is:

C_d(x, y, t; φ) = Hw[L(x, y, t) ∗ g_d(x, y, t)]

where L(x, y, t) is the luminance distribution of the input video frame sequence; ∗ is convolution; Hw[·] denotes the half-wave rectification operation;
the output of the direction receptor layer is the sum of squares of the stimulus responses obtained with phases 0 and −π/2:

P_d(x, y, t) = C_d(x, y, t; 0)² + C_d(x, y, t; −π/2)²

this output is sent directly to the direction excitation layer and the direction inhibition layer; the direction excitation layer delivers it pixel by pixel to the direction summation layer:

E_d(x, y, t) = P_d(x, y, t)

the direction inhibition layer is divided into inhibition in four directions (up, down, left and right), and the inhibition information in the four directions is:

I_L(x, y, t) = P_d(x, y, t − τ) ∗ ω_LI
I_R(x, y, t) = P_d(x, y, t − τ) ∗ ω_RI
I_U(x, y, t) = P_d(x, y, t − τ) ∗ ω_UI
I_D(x, y, t) = P_d(x, y, t − τ) ∗ ω_DI

where ω_LI, ω_RI, ω_UI and ω_DI are the local-inhibition template matrices for the four directions, each of size q × s with q ≠ s;
the direction summation layer sums the excitation and inhibition information in the four directions:

S_L(x, y, t) = [E_d(x, y, t) − I_L(x, y, t) · W_LI]*
S_R(x, y, t) = [E_d(x, y, t) − I_R(x, y, t) · W_RI]*
S_U(x, y, t) = [E_d(x, y, t) − I_U(x, y, t) · W_UI]*
S_D(x, y, t) = [E_d(x, y, t) − I_D(x, y, t) · W_DI]*

where [x]* = x if x ≥ T_d and 0 otherwise; T_d is a threshold; W_LI, W_RI, W_UI and W_DI are the global inhibition weights for the four directions;
the direction output layer comprises the four directional outputs F_L, F_R, F_U and F_D obtained from the corresponding summation signals; averaging the information in the four directions extracts the final direction motion information of the target:

g(x, y, t) = (F_L(x, y, t) + F_R(x, y, t) + F_U(x, y, t) + F_D(x, y, t)) / 4
3. The biomimetic modeling method for biological visual nerve sensitivity according to claim 1, characterized in that in step (2), two-channel neural-sensitivity information fusion is performed on the basis of the biological visual-motion correlation mechanism: following the continuous correlation mechanism of visually perceived motion, the spatio-temporal regularity of motion perception is exploited to highlight the target and remove the irrelevant background noise left by the two neural-network channels.
4. The biomimetic modeling method for biological visual nerve sensitivity according to claim 3, characterized in that step (2) specifically comprises:
first, integrating the motion information of the two channels by a simple linear sum:

fg(x, y, t) = f(x, y, t) + g(x, y, t)

then applying the continuous correlation mechanism of biological visual motion perception to enhance the moving target and suppress the isolated noise excitation caused by the dynamic, complex background:

M(x, y, t) = fg(x, y, t − 1) · fg(x, y, t) · ω⁻¹

where ω is computed in each frame image as:

ω = Δc + max(abs[C(x, y, t)]) · C_ω⁻¹

where Δc is a real number and C_ω is a constant;
and finally, according to the excitation-convergence mechanism of the biological optic-nerve terminal, applying a convergence operation to the fusion-enhanced motion information through a sliding window, thereby obtaining the target:

obj = M(x, y, t) ∗ H

where H is an a × a matrix and ∗ is the convolution operation.
Priority application

CN202010848790.4A, priority and filing date 2020-08-21, applicant Hohai University (HHU): Biomimetic modeling method for biological visual nerve sensitivity

Publications

CN112053379A (application), published 2020-12-08
CN112053379B (grant), published 2022-08-26

Family

ID=73600721, country CN, status Active



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant