CN104050685A - Moving target detection method based on particle filtering visual attention model - Google Patents
- Legal status: Granted
Abstract
The invention discloses a moving target detection method based on a particle filtering visual attention model. First, a particle filtering bidirectional fusion attention model is constructed according to the Bayesian estimation principle. Then, on the basis of this model, motion attention and target color attention serve as the B-U and T-D attention inputs respectively; the particle distribution state is changed by calculating particle weights, an attention saliency map is formed, and finally the position of the moving target is determined. The method fuses temporal and spatial attention, so motion attention is calculated more accurately; it fuses bottom-up and top-down attention, so the forming process of human visual attention is simulated simply and effectively; and for complex global motion scenes, the effectiveness and accuracy of moving target detection are improved.
Description
Technical Field
The invention belongs to the technical field of video image detection, and relates to a moving target detection method based on a particle filter visual attention model.
Background
Moving object detection is one of the important problems in machine vision and a precondition for object tracking and recognition; however, in complex motion scenes, existing moving object detection methods still have significant limitations. In recent years, visual perception research has gradually incorporated results from human physiology and psychology. The main idea is to use a computer to simulate local functions of human physiology to solve problems in the vision field; visual attention is typical of this research, and its results have substantially advanced visual problems such as target detection and segmentation.
Traditional moving target detection methods include the inter-frame difference method, the background difference method, and the global motion compensation method. The background difference and inter-frame difference methods are limited to locally moving scenes. The global motion compensation method has a wider application range, but its accuracy is affected by target size and motion intensity: when the target is large or the motion is strong, the accuracy of global motion estimation drops, effective global compensation cannot be performed, and moving target detection suffers large errors.
Human visual attention is generated by the combined action of Bottom-Up (B-U) and Top-Down (T-D) processes. In 2002, Itti and Koch established a B-U visual attention model based mainly on brightness, color, and orientation features, and subsequent research has derived many visual attention computation methods and applications. Current visual attention models can be roughly classified into bottom-up and bidirectional types: the bottom-up type describes B-U attention computation, while the bidirectional type describes attention computed from the combined action of B-U and T-D.
Attention is the initial response of the human visual system to external observations, and some scholars have conducted preliminary studies on target detection from the attention perspective, several of them using a bidirectional attention approach for static target detection. Sang-Woo Ban et al. performed self-organizing neural network learning on the color features of a specific static target to generate a weight matrix, which serves as a T-D influence factor adjusting the B-U attention calculation to form a target attention saliency map. Yuming Fang et al. extracted the orientation features of the target as T-D attention, fused them with B-U attention by proportional weighting, and finally determined the target position. Yuanlong Yu et al. established a target-feature Long Term Memory (LTM) unit, calculated a position probability distribution bias by comparison with low-level features, and performed bidirectional weighted fusion to determine the target position.
In addition, motion attention models have been established in the literature for detecting moving targets; the main idea is to define a motion attention model from motion contrast so that the motion-salient region approaches the target region. Yu-Fei Ma defined a motion attention model from the motion vector energy and the spatial and temporal correlation of the motion vector field obtained by decompressing the MPEG stream, from which a motion-salient region can be obtained. Junwei Han divided attention into static and dynamic attention: static attention is mainly attracted by image information such as brightness and color, while dynamic attention is defined by calculating the proportion of changed pixels in a region on the basis of global motion compensation; the attention model, obtained by fusing the two, is mainly applied to moving target detection.
In summary, studying target detection from the visual attention perspective has positive significance, but most current research targets static target detection, and research on moving target detection is lacking. Moreover, existing motion attention models are limited to bottom-up, data-driven modeling, and no bidirectional attention model fuses multiple features such as color and motion.
Disclosure of Invention
The invention aims to provide a moving target detection method based on a particle filter visual attention model, solving the problems that prior-art motion attention models are limited to bottom-up, data-driven modeling, do not fuse multiple features such as color and motion, cannot adapt to complex motion scenes, and make it difficult to detect moving targets effectively and accurately.
The invention adopts the technical scheme that a moving target detection method based on a particle filter visual attention model is characterized in that firstly, a particle filter bidirectional fusion attention model is constructed according to the Bayesian estimation principle; and then on the basis of a particle filter bidirectional fusion attention model framework, taking the motion attention and the target color attention as B-U and T-D attention inputs respectively, calculating and changing the distribution state of particles through particle weight values to form an attention saliency map, and finally determining the position of the motion target.
The invention has the beneficial effects that:
1) the method has the advantages that the video image is subjected to Gaussian multi-scale decomposition, the visual characteristics of human beings are better met, the motion vector field is subjected to superposition and filtering preprocessing, and the influence of estimation errors and noise is reduced.
2) Temporal and spatial attention are fused, so that the motion attention calculation is more accurate.
3) A particle filtering mechanism is introduced according to a Bayesian estimation principle, bottom-up and top-down attention is fused, a bidirectional fusion attention model is constructed, and the human visual attention forming process is simulated simply and effectively.
4) The moving target detection is carried out by using the bidirectional fusion attention model, and the effectiveness and the accuracy of the moving target detection are improved aiming at a complex global moving scene.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention;
FIG. 2 is a three-level Gaussian pyramid of a video image according to an embodiment of the present invention, where (a) is the original scale, (b) is the next coarser scale, and (c) is the coarsest scale;
fig. 3 shows the corresponding motion vector field and its pre-processing result in the present invention, wherein (a) is the motion vector field, (b) the motion vector field after the filtering is superimposed, and (c) the motion vector field after the median filtering is performed;
FIG. 4 is a B-U attention saliency map of an implementation of the present invention, wherein (a) is an attention saliency map and (B) is an attention saliency heat map;
FIG. 5 is a particle filter fusion two-way attention result of the present invention, wherein (a) is the importance sampling result and (b) is the re-sampled particle distribution map;
fig. 6 is a schematic diagram of particle attention saliency map and target localization implemented in accordance with the present invention, wherein (a) a saliency value is generated for the spatial distribution of particles, (b) a saliency map is obtained from resampled particles, and (c) a target localization result.
Fig. 7 shows experimental results of the "aircraft" video sequence in example 1 of the present invention, where the first to fourth rows are the 5th, 28th, 40th and 60th frames respectively, and where (a) is the original frame, (b) the Yu-Fei Ma motion attention model result, (c) the global motion compensation-based visual attention (GMC-VA) model result, (d) the Yuming Fang bidirectional weighted fusion visual attention result, (e) the bidirectional attention model result of the present invention, and (f) the moving object detection and positioning result of the method of the present invention;
fig. 8 shows the experimental results of the "horse" video sequence of embodiment 2 of the present invention, wherein the first to fourth rows are the 4 th, 15 th, 38 th and 100 th frames respectively.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a moving target detection method based on a particle filter visual attention model, which comprises the steps of firstly, constructing a particle filter bidirectional fusion attention model according to a Bayesian estimation principle; then based on a particle filter bidirectional fusion attention model frame, the motion attention and the target color attention are respectively used as B-U and T-D attention inputs, the particle distribution state is changed through particle weight calculation, an attention saliency map is formed (the attention saliency is calculated through the particle distribution state after filtering), and finally the position of the motion target is determined.
The method comprises the following specific implementation steps:
step 1, calculating the attention of the movement at the current t moment as B-U attention, and recording the significance asBy passingControlling the sampling of the initial importance of the particles;
carrying out Gaussian multi-scale decomposition on the video image; estimating motion vector fields of the motion vectors respectively, and preprocessing the motion vector fields; defining time and space attention factors, constructing motion attention, and obtaining by fusing multi-scale motion attentionBy passingControlling the sampling of particle importance. The method specifically comprises the following steps:
1.1) Gaussian multiscale decomposition of images
The multi-scale analysis adopts the Gaussian image pyramid method: a three-level scale image is obtained through Gaussian smoothing and down-sampling, as shown in fig. 2; this embodiment adopts a three-level Gaussian pyramid.
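The pyramid construction of step 1.1) can be sketched as follows. This is a minimal NumPy illustration; the 5-tap binomial blur kernel and factor-of-two down-sampling are common choices assumed here, not values taken from the patent text.

```python
import numpy as np

def gaussian_blur(img, kernel=(1, 4, 6, 4, 1)):
    """Separable 5-tap binomial (approximately Gaussian) blur with edge padding."""
    k = np.array(kernel, dtype=float)
    k /= k.sum()
    pad = len(k) // 2
    # blur rows, then columns
    tmp = np.apply_along_axis(
        lambda r: np.convolve(np.pad(r, pad, mode='edge'), k, mode='valid'),
        axis=1, arr=img.astype(float))
    out = np.apply_along_axis(
        lambda c: np.convolve(np.pad(c, pad, mode='edge'), k, mode='valid'),
        axis=0, arr=tmp)
    return out

def gaussian_pyramid(img, levels=3):
    """Three-level pyramid: blur, then drop every other row and column."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(gaussian_blur(pyr[-1])[::2, ::2])
    return pyr
```

In practice a library routine (e.g. an image pyramid from an image-processing package) would replace the hand-rolled blur, but the structure — smooth then subsample, once per level — is the same.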
1.2) estimating a motion vector field by adopting an optical flow method, and performing two preprocessing of superposition and filtering on the motion vector field
As shown in fig. 3(a), the motion vector field estimated by the optical flow method generally exhibits sparse, local, and chaotic motion features, because the motion between adjacent frames is not strong enough and the video signal contains a certain amount of noise, which hinders accurate calculation of motion attention. Applying the two preprocessing steps of superposition and filtering to the motion vector field effectively reduces the influence of estimation errors and noise.
The motion vector superposition process is as follows: let the current-frame motion vector field be MVF_t, let the macroblock center coordinate be (k, l), and let the corresponding motion vector be (v_x^{kl}, v_y^{kl}). It is superimposed with the motion vectors of the preceding and succeeding frames according to

$$(v_x^{kl},\, v_y^{kl}) = \sum_{i=n-c}^{i=n+c} \left( v_x^{kl}(i),\ v_y^{kl}(i) \right),$$

and the superposition result is shown in fig. 3(b).
After superposition the motion vectors are median-filtered: for each non-zero motion vector, the median of the neighboring motion vectors replaces its value. The median filtering result is shown in fig. 3(c).
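The two preprocessing operations of step 1.2) can be sketched as below. The (H, W, 2) array layout for per-block vectors and the 3×3 neighbourhood radius are assumptions made for illustration.

```python
import numpy as np

def superimpose_vectors(mvf_seq, n, c):
    """Accumulate the per-block motion vectors of frames n-c .. n+c
    (the superposition formula of step 1.2); mvf_seq is a list of
    (H, W, 2) arrays of (v_x, v_y) block vectors."""
    acc = np.zeros_like(mvf_seq[n], dtype=float)
    for i in range(n - c, n + c + 1):
        acc += mvf_seq[i]
    return acc

def median_filter_nonzero(mvf, radius=1):
    """Replace each non-zero vector by the component-wise median of its
    neighbourhood, leaving zero vectors untouched."""
    h, w, _ = mvf.shape
    out = mvf.copy()
    for y in range(h):
        for x in range(w):
            if not np.any(mvf[y, x]):
                continue
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = mvf[y0:y1, x0:x1].reshape(-1, 2)
            out[y, x] = np.median(patch, axis=0)  # per-component median
    return out
```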
1.3) calculating the attention of the movement as the attention of the B-U
The motion attention is calculated by defining temporal and spatial attention factors, with SM_t^T and SM_t^S denoting temporal and spatial attention respectively, where v_t^{ij} and v_{t-1}^{ij} denote the motion vector at coordinate position (i, j) in the motion vector fields at times t and t-1, and v̄_A denotes the mean of the motion vectors within the neighborhood A.
the motor attention is obtained by linear fusion of temporal and spatial attention, i.e. <math>
<mrow>
<msubsup>
<mi>SM</mi>
<mi>t</mi>
<mrow>
<mi>B</mi>
<mo>-</mo>
<mi>U</mi>
</mrow>
</msubsup>
<mo>=</mo>
<mi>α</mi>
<mo>·</mo>
<msubsup>
<mi>SM</mi>
<mi>t</mi>
<mi>T</mi>
</msubsup>
<mo>+</mo>
<mi>β</mi>
<mo>·</mo>
<msubsup>
<mi>SM</mi>
<mi>t</mi>
<mi>S</mi>
</msubsup>
<mo>,</mo>
</mrow>
</math>
In the formula, α and β are coefficients of positive values, and B-U attention is shown in FIG. 4.
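The fusion above can be sketched as follows. The explicit SM_t^T and SM_t^S formulas are not reproduced in this text, so frame-difference magnitude and neighbourhood-mean contrast are used here as assumed stand-ins consistent with the symbols v_t^{ij}, v_{t-1}^{ij} and v̄_A; treat them as placeholders, not the patent's exact definitions.

```python
import numpy as np

def motion_attention(v_t, v_tm1, alpha=0.5, beta=0.5, nb=1):
    """B-U motion attention as SM = alpha*SM_T + beta*SM_S.
    v_t, v_tm1: (H, W, 2) motion vector fields at times t and t-1.
    SM_T (assumed): magnitude of the frame-to-frame vector change.
    SM_S (assumed): contrast of each vector against its neighbourhood mean."""
    sm_t = np.linalg.norm(v_t - v_tm1, axis=-1)  # temporal contrast
    h, w, _ = v_t.shape
    sm_s = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - nb), min(h, y + nb + 1)
            x0, x1 = max(0, x - nb), min(w, x + nb + 1)
            mean_a = v_t[y0:y1, x0:x1].reshape(-1, 2).mean(axis=0)
            sm_s[y, x] = np.linalg.norm(v_t[y, x] - mean_a)  # spatial contrast
    return alpha * sm_t + beta * sm_s
```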
1.4) Controlling particle importance sampling by SM_t^{B-U}
The motion saliency feature adjusts the density of Gaussian random particle sampling, yielding a random sampling result that varies with the motion saliency. Gaussian random sampling gives the initial distribution state of the particles; the particle samples, indexed i = 1, 2, …, N, are independent and identically distributed, such that:
where μ_x, μ_y and the corresponding variances are respectively the mean and variance of the pseudo-random sequence; the above formula generates random Gaussian sampling results in a region centered at the coordinate (μ_x, μ_y), and the density of the sampled particles in that region is regulated by the motion saliency. Assuming SM_t(x, y) is the saliency value of the saliency map at time t at coordinate (x, y), the sampling density function is defined with mean u and variance

$$\delta_A^2 = \frac{\sum_{\{i,j \mid (i,j) \in SM_t\}} \left[ SM(i,j) - u \right]^2}{N},$$

where i and j represent the horizontal and vertical coordinates in the saliency map. The importance sampling result is shown in fig. 5(a).
At the initial time, particles are sampled according to this method to form the initial particle distribution state; otherwise, a portion of particles is sampled at positions where the current motion attention differs from the previous one, replacing an equal number of low-weight particles from the previous moment, to serve as the initial particle state at this time.
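Saliency-controlled importance sampling of step 1.4) can be sketched as follows: the saliency map, normalised into a probability mass, selects a cell, and a small Gaussian jitter spreads the particles within it. The jitter scale is an assumption; the patent's exact density function is given above in symbolic form.

```python
import numpy as np

def sample_particles(saliency, n, rng=None):
    """Draw n particle coordinates so that local particle density follows
    the motion-saliency map (a sketch of step 1.4)).  Returns (xs, ys)."""
    rng = np.random.default_rng(rng)
    h, w = saliency.shape
    p = saliency.ravel().astype(float)
    p = p / p.sum()                       # saliency as a probability mass
    idx = rng.choice(h * w, size=n, p=p)  # salient cells drawn more often
    ys, xs = np.divmod(idx, w)
    jitter = rng.normal(scale=0.5, size=(2, n))  # Gaussian spread (assumed)
    return (np.clip(xs + jitter[0], 0, w - 1),
            np.clip(ys + jitter[1], 0, h - 1))
```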
Step 2, calculating T-D color attention according to target characteristics
The T-D color attention saliency is recorded as SM_t^{T-D}; the magnitude of T-D attention is measured by the degree of similarity between the target feature and the image feature.
2.1) The color histogram is adopted as the quantized expression of the target feature, with m the number of components. The color distribution of the particle target region is defined as

$$\hat{p}^{(u)}(X_t) = C \cdot \sum_{i=1}^{M} K\!\left( \left\| \frac{x - x_i}{h} \right\| \right) \cdot \delta\!\left( b(x_i) - u \right),$$

where δ(·) is the Delta function, C is a normalization factor ensuring the components sum to one, and K(·) is an Epanechnikov kernel function defined as

$$K_E(\|r\|) = \begin{cases} 1 - \|r\|^2, & \|r\| < 1 \\ 0, & \|r\| \geq 1. \end{cases}$$
2.2) The T-D attention saliency calculation is defined in terms of ρ, the Bhattacharyya coefficient measuring the similarity between the particle and target color distributions.
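The kernel-weighted histogram and Bhattacharyya similarity of steps 2.1)–2.2) can be sketched as follows; pixel values are assumed already quantized to bin indices b(x_i) in 0..m-1.

```python
import numpy as np

def epanechnikov(r):
    """K_E(||r||) = 1 - ||r||^2 for ||r|| < 1, else 0."""
    r2 = np.asarray(r, dtype=float) ** 2
    return np.where(r2 < 1.0, 1.0 - r2, 0.0)

def color_distribution(bins, coords, center, h, m=16):
    """Kernel-weighted colour histogram p_hat of a particle region:
    bins are indices b(x_i), coords their pixel positions, center/h the
    region centre and bandwidth (the equation of step 2.1))."""
    r = np.linalg.norm(coords - center, axis=1) / h
    w = epanechnikov(r)                      # pixels near the centre count more
    hist = np.bincount(bins, weights=w, minlength=m)
    s = hist.sum()
    return hist / s if s > 0 else hist       # normalisation factor C

def bhattacharyya(p, q):
    """rho(p, q) = sum_u sqrt(p_u * q_u), the T-D similarity measure."""
    return float(np.sum(np.sqrt(p * q)))
```

For identical, normalised distributions ρ equals 1; it falls toward 0 as the particle region's colours diverge from the target model.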
Step 3, fusing bidirectional attention by adopting particle filtering, calculating the weight of the particles, and forming new particle distribution after resampling
Particle filtering is regarded as a Bayesian estimation process: given an observation, the conditional expectation of the current state is estimated. The process comprises three steps: importance sampling, weight calculation, and resampling.
3.1) importance sampling
The observation likelihood probabilities of B-U and T-D attention are defined separately, and the particle weights are updated by Bayesian fusion of the two observation likelihoods.
the importance sampling method adopts the motion significance characteristics to adjust the density of Gaussian random particle sampling as described in the step 1.4);
3.2) weight calculation
Resampling is used to eliminate particles with lower weight so that particles gather around those with higher weight. Let the particle states be X_{(i)}^T and the observation be Z^T; the posterior probability density function is then approximated as

$$p(X^T \mid Z^T) \approx \sum_{i=1}^{N} \lambda^{(i)} \cdot \delta\!\left( X^T - X_{(i)}^T \right),$$
where λ^{(i)}, i = 1, 2, …, N are the particle weights. Assuming the bidirectional fusion saliency map state is x_{0:t}, with B-U and T-D attention observations, the posterior probability P(X^T | Z^T) is expressed as follows:
Two assumptions are made: (i) the temporal dynamic process conforms to a Markov process; (ii) the observations at different times are mutually independent, and each observation depends only on the current state. The posterior probability of the bidirectional attention is then derived as follows:
according to the sampling theorem of importance, the weight of a particle is λ(i)Proportional ratioIs represented as follows:
then there are:
where p(Z_t^{B-U} | X_t^{(i)}) represents the conditional probability of the B-U attention observation under the current particle attention state, and p(Z_t^{T-D} | Z_t^{B-U}, X_t^{(i)}) is the conditional probability of the T-D attention observation under the B-U observation and the current particle attention state; these two directly determine the updated particle weight, defined as follows:
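The weight update of step 3.2) can be sketched as follows: each particle's new weight is proportional to the product of the two observation likelihoods, then normalised. The optional carry-over of the previous weight is an assumption for the sequential case.

```python
import numpy as np

def update_weights(lik_bu, lik_td, prev_w=None):
    """Bayesian fusion of the two observation likelihoods:
    lambda_i  proportional to  p(Z_BU | X_i) * p(Z_TD | Z_BU, X_i),
    normalised so the weights sum to one."""
    lik_bu = np.asarray(lik_bu, dtype=float)
    lik_td = np.asarray(lik_td, dtype=float)
    w = lik_bu * lik_td
    if prev_w is not None:                       # sequential update (assumed)
        w = w * np.asarray(prev_w, dtype=float)
    total = w.sum()
    if total > 0:
        return w / total
    return np.full_like(w, 1.0 / w.size)         # degenerate case: uniform
```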
3.3) resampling
After the particle weights are recalculated, a resampling step eliminates particles with lower weight so that the particles gather around those with high weight.
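The resampling of step 3.3) can be sketched with systematic resampling, one standard scheme (the patent does not name a specific resampling variant): low-weight particles are dropped and high-weight ones duplicated.

```python
import numpy as np

def systematic_resample(weights, rng=None):
    """Systematic resampling: returns the indices of surviving particles.
    Particles with large weight are duplicated; small-weight particles
    tend to be eliminated, concentrating the cloud on strong hypotheses."""
    rng = np.random.default_rng(rng)
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n  # one jittered comb of n points
    cumsum = np.cumsum(weights)
    cumsum[-1] = 1.0                               # guard against rounding error
    return np.searchsorted(cumsum, positions)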
Step 4, calculating an attention saliency map SM'_t according to the current particle distribution state, and determining the target position
4.1) Particle saliency map SM'_t
After resampling, the density of the particle distribution reflects the attention intensity: regions of dense particle distribution have strong attention saliency, and sparsely distributed regions weak saliency.
according to the distribution state of the particles, the attention saliency is defined as follows on a two-dimensional space:
where (x, y) is the spatial position of the particle distribution and n is the number of particles; the window function is a two-dimensional Gaussian window function of the given window width, and the above equation is then transformed:
A schematic diagram of the attention saliency calculation result is shown in fig. 6(a); the current attention saliency map, obtained by superimposing the calculated attention saliency with the motion attention and normalizing, is shown in fig. 6(b).
4.2) target location
After the above processing, the distribution state of all particles in each frame is obtained. The larger the particle swarm density, the higher the probability that a moving target exists; therefore, the target position information is estimated by calculating the mean of the particle swarm positions.
The final target location result is shown in fig. 6 (c).
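Step 4 can be sketched as follows: a Parzen-window (Gaussian-kernel) density over particle positions gives the saliency map, and the particle-population mean gives the target position. The window width value is an assumption.

```python
import numpy as np

def particle_saliency(xs, ys, shape, h=3.0):
    """Parzen-window attention saliency on a 2-D grid: each particle adds
    a Gaussian window of width h, so dense particle regions get high
    saliency (step 4.1))."""
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W]
    sal = np.zeros(shape)
    for x, y in zip(xs, ys):
        sal += np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2.0 * h * h))
    return sal / len(xs)

def locate_target(xs, ys):
    """Target position as the particle-population mean (step 4.2))."""
    return float(np.mean(xs)), float(np.mean(ys))
```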
Figs. 7 and 8 show the test results for the "aircraft" and "horse" sequences respectively. In the "aircraft" sequence, the target to be detected is a red-clad pilot gliding at high speed in the air; the background consists of distant ground objects with complex texture, the aircraft moves toward the top of the image, and the camera follows it. In the "horse" sequence, the target is a horse in vigorous motion; the background is trees and grassland with gently varying texture, the horse runs toward the left of the image, and the camera pans left rapidly. The test results show that the overall effect of the method is superior to the MA, GMC-VA, and Yuming Fang algorithms. The MA algorithm is affected by motion estimation accuracy, making its motion attention calculation inaccurate: when the motion is relatively simple it performs well, as in the "aircraft" sequence, but when the motion is strong the motion attention becomes cluttered and cannot effectively reflect the saliency of the moving target, as in the "horse" sequence. The GMC-VA algorithm integrates global motion compensation and static attention, but errors in global motion estimation generally lead to poor target saliency. The experimental results show that the attention map of the method of the invention, compared with those of the MA, GMC-VA, and Yuming Fang algorithms, greatly weakens background noise interference; owing to the particle filtering process, the particle distribution state changes with the motion state and target position, and the particles finally converge and gather in the moving target region, greatly improving attention accuracy.
Claims (6)
1. A moving target detection method based on a particle filter visual attention model is characterized in that firstly, a particle filter bidirectional fusion attention model is constructed according to a Bayesian estimation principle; and then on the basis of a particle filter bidirectional fusion attention model framework, taking the motion attention and the target color attention as B-U and T-D attention inputs respectively, calculating and changing the distribution state of particles through particle weight values to form an attention saliency map, and finally determining the position of the motion target.
2. The method of claim 1, comprising the steps of:
step 1, calculating the attention of the movement at the current t moment as B-U attention, and recording the significance asBy passingControlling the sampling of the initial importance of the particles;
step 2, calculating the T-D color attention according to the target characteristics;
step 3, adopting particle filtering to fuse bidirectional attention, calculating a particle weight, and forming new particle distribution after resampling;
step 4, calculating the attention saliency map according to the particle distribution state at the momentAnd determines the target location.
3. The method for detecting the moving object based on the particle filter visual attention model according to claim 2, wherein the step 1 is performed according to the following specific steps,
1.1) Gaussian multiscale decomposition of images
The multi-scale analysis adopts a Gaussian image pyramid method;
1.2) estimating a motion vector field by adopting an optical flow method, and performing two preprocessing of superposition and filtering on the motion vector field
The motion vector superposition process is as follows: let the current-frame motion vector field be MVF_t, let the macroblock center coordinate be (k, l), and let the corresponding motion vector be (v_x^{kl}, v_y^{kl}); it is superimposed with the motion vectors of the preceding and succeeding frames according to

$$(v_x^{kl},\, v_y^{kl}) = \sum_{i=n-c}^{i=n+c} \left( v_x^{kl}(i),\ v_y^{kl}(i) \right);$$
the motion vectors are processed by median filtering after superposition, namely, for each non-zero motion vector, the median of the adjacent motion vectors is used for replacing the value of the motion vector;
1.3) calculating the attention of the movement as the attention of the B-U
The motion attention is calculated by defining temporal and spatial attention factors, with SM_t^T and SM_t^S denoting temporal and spatial attention respectively, where v_t^{ij} and v_{t-1}^{ij} denote the motion vector at coordinate position (i, j) in the motion vector fields at times t and t-1, and v̄_A denotes the mean of the motion vectors within the neighborhood A.
the motor attention is obtained by linear fusion of temporal and spatial attention, i.e. <math>
<mrow>
<msubsup>
<mi>SM</mi>
<mi>t</mi>
<mrow>
<mi>B</mi>
<mo>-</mo>
<mi>U</mi>
</mrow>
</msubsup>
<mo>=</mo>
<mi>α</mi>
<mo>·</mo>
<msubsup>
<mi>SM</mi>
<mi>t</mi>
<mi>T</mi>
</msubsup>
<mo>+</mo>
<mi>β</mi>
<mo>·</mo>
<mi>S</mi>
<mo>,</mo>
</mrow>
</math>
In the formula, alpha and beta are coefficients with positive values;
1.4) Controlling particle importance sampling by SM_t^{B-U}
The motion saliency feature adjusts the density of Gaussian random particle sampling, yielding a random sampling result that varies with the motion saliency. Gaussian random sampling gives the initial distribution state of the particles; the particle samples, indexed i = 1, 2, …, N, are independent and identically distributed, such that:
wherein, mux、μy、Andare respectivelyMean and variance of the pseudo-random sequence, expressed in (μ) by the above formulax,μy) Generating random Gaussian sampling results in a region of the coordinate center, wherein the density of the sampled particles in the region is regulated and controlled by the motion significance, assumingIs the saliency value of the saliency map at time t in (x, y) coordinates, the sampling density function is defined as follows:
wherein i and j represent the horizontal and vertical coordinates in the saliency map, respectively; the mean is $u$ and the variance is
$\delta_A^2 = \dfrac{\sum_{\{i,j \mid (i,j) \in SM^t\}} \left[SM(i,j) - u\right]^2}{N}$,
At the initial time, particles are sampled by the above method to form the initial particle distribution state; otherwise, a portion of the particles are sampled at the positions where the current motion attention differs from the previous motion attention, replacing an equal number of low-weight particles from the previous instant, and this serves as the initial particle state at the current instant.
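One way to realize saliency-regulated sampling density is to draw particle positions directly from the normalized saliency map; this is a simplified stand-in for the claimed Gaussian sampling whose density is modulated by saliency, and the function and parameter names are hypothetical:

```python
import numpy as np

def saliency_guided_sampling(saliency, n_particles, rng=None):
    """Draw particle (x, y) positions with probability proportional to
    the motion saliency map, so denser regions receive more particles."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = saliency.shape
    p = saliency.ravel().astype(float)
    # fall back to uniform sampling when the map is all zero
    p = p / p.sum() if p.sum() > 0 else np.full(p.size, 1.0 / p.size)
    idx = rng.choice(p.size, size=n_particles, p=p)
    ys, xs = np.unravel_index(idx, (h, w))
    return np.stack([xs, ys], axis=1)
```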
4. The method for detecting the moving object based on the particle filter visual attention model according to claim 3, wherein the step 2 comprises the following specific steps:
The T-D color attention saliency is denoted accordingly; the magnitude of the T-D attention is measured by the degree of similarity between the target feature and the image feature,
2.1) The color histogram is taken as the quantized representation of the target feature, with m components; the color distribution of the particle target region is defined as
$\hat{p}^{(u)}(X_t) = C \cdot \sum_{i=1}^{M} K\!\left(\left\| \frac{x - x_i}{h} \right\|\right) \cdot \delta\bigl(b(x_i) - u\bigr)$,
where $\delta(\cdot)$ is the delta function, C is the normalization factor ensuring that the histogram components sum to 1, and $K(\cdot)$ is the Epanechnikov kernel function defined as
$K_E(\|r\|) = \begin{cases} 1 - \|r\|^2, & \|r\| < 1 \\ 0, & \|r\| \geq 1 \end{cases}$;
2.2) The T-D attention saliency is calculated as: wherein ρ is the Bhattacharyya coefficient.
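Steps 2.1) and 2.2) can be sketched as follows for a single-channel patch; a real implementation would quantize a full color space into m bins, and the helper names here are illustrative:

```python
import numpy as np

def epanechnikov(r):
    """K_E(||r||) = 1 - ||r||^2 for ||r|| < 1, else 0."""
    return np.where(np.abs(r) < 1, 1 - r ** 2, 0.0)

def color_distribution(patch, m=8):
    """Kernel-weighted histogram of a grey patch (m bins), normalised so
    the components sum to 1: pixels near the patch centre weigh more."""
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # pixel distance from the patch centre, scaled to roughly [0, 1]
    r = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2) / (max(h, w) / 2)
    k = epanechnikov(r)
    bins = (patch.astype(float) / 256 * m).astype(int).clip(0, m - 1)
    hist = np.bincount(bins.ravel(), weights=k.ravel(), minlength=m)
    return hist / hist.sum()

def bhattacharyya(p, q):
    """rho = sum_u sqrt(p_u * q_u); equals 1 for identical distributions."""
    return float(np.sum(np.sqrt(p * q)))
```

A patch compared with itself yields ρ = 1, while two patches with disjoint histograms yield ρ = 0, so ρ directly serves as the T-D similarity measure.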
5. The method according to claim 4, wherein the step 3 specifically comprises the following steps:
3.1) importance sampling
The observation likelihood probabilities of the B-U and T-D attention are defined separately, and the particle weights are updated by Bayesian fusion of the two likelihoods;
the importance sampling method adopts the motion significance characteristics to adjust the density of Gaussian random particle sampling as described in the step 1.4);
3.2) weight calculation
Resampling eliminates the low-weight particles so that the particles aggregate around the high-weight ones. Let the particle states be given and the observation be $Z_k$; the posterior probability density function at time k is then approximated as
$p(X^T \mid Z^T) \approx \sum_{i=1}^{N} \lambda^{(i)} \cdot \delta\bigl(X^T - X_{(i)}^T\bigr)$,
wherein $\lambda^{(i)}$, i = 1, 2, …, N, are the particle weights. Assuming the bidirectional fusion saliency map state is $x_{0:t}$ and the B-U and T-D attention observations are given, the posterior probability $P(X^T \mid Z^T)$ is expressed accordingly.
Two assumptions are made: (i) the temporal dynamic process is a Markov process; (ii) the observations at different times are mutually independent, each depending only on the current state. The posterior probability of the bidirectional attention is then derived as follows:
according to the importance sampling theorem, the particle weight $\lambda^{(i)}$ is proportional to the ratio of the posterior density to the importance density, represented as follows:
then:
wherein the first factor is the conditional probability of the B-U attention observation given the current particle attention state, and the second is the conditional probability of the T-D attention observation given the B-U observation and the current particle attention state; these two factors directly determine the updated particle weight, defined as follows:
3.3) resampling
After the particle weights are recalculated, resampling is applied to eliminate the low-weight particles, so that the particles gather around the high-weight ones.
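The weight update by Bayesian fusion and the subsequent resampling of steps 3.2) and 3.3) can be sketched as follows; `systematic_resample` is one common resampling scheme, used here as an assumed stand-in since the claim does not specify which variant is employed:

```python
import numpy as np

def update_weights(lik_bu, lik_td):
    """Bayesian fusion: each particle weight is proportional to the
    product of its B-U and T-D observation likelihoods, normalised."""
    w = np.asarray(lik_bu, dtype=float) * np.asarray(lik_td, dtype=float)
    return w / w.sum()

def systematic_resample(weights, rng=None):
    """Return particle indices after resampling: low-weight particles
    are dropped and high-weight particles are duplicated."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(weights)
    # evenly spaced positions with a single random offset
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)
```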
6. The method according to claim 5, wherein the step 4 specifically comprises the following steps:
4.1) particle saliency map SM't
According to the distribution state of the particles, the attention saliency over the two-dimensional space is defined as follows:
wherein (x, y) is the spatial position of the particle distribution, n is the number of particles, and the window width parameter sets the width of a two-dimensional Gaussian window function; the above equation is then transformed into:
the computed attention saliency values are superimposed with the motion attention and normalized, giving the current attention saliency value;
4.2) target location
After the above processing, the distribution state of all particles in each frame of the image is obtained, and the target position is estimated as the mean position of the particle swarm, computed as follows:
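The particle saliency map and position estimate of steps 4.1) and 4.2) can be sketched as follows, assuming particles are given as (x, y) pairs; accumulating a Gaussian window per particle is a kernel-density reading of the claimed window function, and `sigma` is an illustrative width:

```python
import numpy as np

def particle_saliency_map(particles, shape, sigma=3.0):
    """Accumulate a 2-D Gaussian window at each particle position to
    form the particle saliency map SM'_t, normalised to [0, 1]."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    sm = np.zeros(shape)
    for x, y in particles:
        sm += np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    return sm / sm.max()

def estimate_target(particles):
    """Target position as the mean (x, y) of the particle swarm."""
    return np.asarray(particles, dtype=float).mean(axis=0)
```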
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410255267.5A CN104050685B (en) | 2014-06-10 | 2014-06-10 | Moving target detecting method based on particle filter visual attention model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104050685A true CN104050685A (en) | 2014-09-17 |
CN104050685B CN104050685B (en) | 2017-05-31 |
Family
ID=51503468
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410255267.5A Expired - Fee Related CN104050685B (en) | 2014-06-10 | 2014-06-10 | Moving target detecting method based on particle filter visual attention model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104050685B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104732559A (en) * | 2015-02-02 | 2015-06-24 | 大连民族学院 | Multi-target detecting and tracking method based on RGB-D data |
CN104778713A (en) * | 2015-04-27 | 2015-07-15 | 清华大学深圳研究生院 | Image processing method |
CN106951870A (en) * | 2017-02-15 | 2017-07-14 | 重庆警察学院 | The notable event intelligent detecting prewarning method of monitor video that active vision notes |
CN108921051A (en) * | 2018-06-15 | 2018-11-30 | 清华大学 | Pedestrian's Attribute Recognition network and technology based on Recognition with Recurrent Neural Network attention model |
CN109478248A (en) * | 2016-05-20 | 2019-03-15 | 渊慧科技有限公司 | Classified using collection is compared to input sample |
CN109816100A (en) * | 2019-01-30 | 2019-05-28 | 中科人工智能创新技术研究院(青岛)有限公司 | A kind of conspicuousness object detecting method and device based on two-way fusion network |
CN109902763A (en) * | 2019-03-19 | 2019-06-18 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating characteristic pattern |
CN110623680A (en) * | 2019-08-30 | 2019-12-31 | 中国民用航空飞行学院 | Method for testing psychological health of civil aviation flight trainees |
CN111723829A (en) * | 2019-03-18 | 2020-09-29 | 四川大学 | Full-convolution target detection method based on attention mask fusion |
CN115346180A (en) * | 2022-10-18 | 2022-11-15 | 湖北车安达信息科技有限公司 | Road congestion condition detection method and device based on low-frequency data |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521844A (en) * | 2011-11-30 | 2012-06-27 | 湖南大学 | Particle filter target tracking improvement method based on vision attention mechanism |
Also Published As
Publication number | Publication date |
---|---|
CN104050685B (en) | 2017-05-31 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170531; Termination date: 20200610 |