CN103810691A - Video-based automatic teller machine monitoring scene detection method and apparatus - Google Patents

Video-based automatic teller machine monitoring scene detection method and apparatus

Info

Publication number
CN103810691A
Authority
CN
China
Prior art keywords
image
pixel
monitoring
value
monitoring image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210444071.1A
Other languages
Chinese (zh)
Other versions
CN103810691B (en)
Inventor
任烨
童俊艳
蔡巍伟
浦世亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201210444071.1A priority Critical patent/CN103810691B/en
Publication of CN103810691A publication Critical patent/CN103810691A/en
Application granted granted Critical
Publication of CN103810691B publication Critical patent/CN103810691B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention discloses a video-based automatic teller machine (ATM) monitoring scene detection method and apparatus. The method comprises the following steps: a background model of an ATM monitoring scene is established, wherein a background image and the predetermined parameters corresponding to each pixel in the background image are determined; and after modeling is complete, each time a frame of monitoring image X is obtained, the following steps are carried out: a binary foreground image of the monitoring image X is generated according to the background model; edge texture information of the monitoring image X and of the background image is obtained, and the edge similarity between the monitoring image X and the background image is determined according to the obtained edge texture information; and whether a person is present in the monitoring image X is determined according to the generated binary foreground image and the determined edge similarity.

Description

Video-based automatic teller machine (ATM) monitoring scene detection method and device
Technical field
The present invention relates to video technology, and in particular to a video-based automatic teller machine (ATM) monitoring scene detection method and device.
Background art
In the prior art, physical sensors, typically infrared emitters, are used to detect whether a person is present in an ATM monitoring scene. However, infrared sensing is susceptible to interference from foreign objects: once an interfering foreign object appears within the detection range, a persistent false alarm of a person being present is raised, which reduces the accuracy of the detection result.
Summary of the invention
In view of this, the present invention provides a video-based ATM monitoring scene detection method and device, which can improve the accuracy of detection results.
To achieve the above object, the technical solution of the present invention is realized as follows:
A video-based ATM monitoring scene detection method, comprising:
establishing a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image;
after modeling is complete, each time a frame of monitoring image X is obtained, performing the following processing:
generating a binary foreground image of monitoring image X according to the background model;
obtaining edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information;
determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
A video-based ATM monitoring scene detection device, comprising:
a modeling module, configured to establish a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image, and to send the established background model to a detection module;
the detection module, configured to, after modeling is complete, each time a frame of monitoring image X is obtained, perform the following processing: generating a binary foreground image of monitoring image X according to the background model; obtaining edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information; and determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
As can be seen, the solution of the present invention combines the luminance foreground and the edge texture information of the image to determine whether a person is present in the ATM monitoring scene, thereby improving the accuracy of the detection result. Moreover, the solution is applicable to a variety of ATM monitoring scenes, has broad applicability, and is easy to popularize.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of the video-based ATM monitoring scene detection method of the present invention.
Fig. 2 is a schematic diagram of the conventional Sobel operators.
Detailed description of the embodiments
To address the problems in the prior art, the present invention proposes a video-based ATM monitoring scene detection solution that can improve the accuracy of detection results.
The monitoring images in the solution of the present invention are captured by an ATM surveillance camera, which must cover the activity area of people depositing or withdrawing money.
To make the technical solution of the present invention clearer and easier to understand, the solution is described in further detail below with reference to the accompanying drawings and embodiments.
Fig. 1 is a flowchart of an embodiment of the video-based ATM monitoring scene detection method of the present invention. As shown in Fig. 1, the method comprises:
Step 11: establish a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image.
Since the environment of an ATM monitoring scene is relatively simple, single-Gaussian background modeling, which is suitable for backgrounds with a unimodal distribution, can be adopted.
The solution of the present invention models only the gray-scale value of each pixel; the preset parameters corresponding to each pixel include a mean μ and a variance σ.
A specific implementation of this step may comprise:
A. Obtain one frame of monitoring image and take it as the background image;
for each pixel in this background image, take the gray-scale value of the pixel as the mean corresponding to the pixel, and take the variance of the gray-scale value of the pixel as the variance corresponding to the pixel.
B. Determine whether the number of monitoring images obtained equals M, where M is a positive integer greater than 1. If so, take the most recently obtained background image as the final background image and complete the modeling; if not, obtain a new frame of monitoring image and perform step C.
C. Determine the updated background image B_new(x, y):
B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y);    (1)
where ρ denotes the update rate, whose value equals 1/N; N denotes the number of monitoring images obtained; I(x, y) denotes the most recently obtained monitoring image; and B_old(x, y) denotes the background image before the update.
For each pixel in B_new(x, y), take the gray-scale value of the pixel as the mean corresponding to the pixel, and take (1 - ρ)·σ_old + ρ·d as the variance σ_new corresponding to the pixel:
σ_new = (1 - ρ)·σ_old + ρ·d;    (2)
where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y).
Afterwards, repeat step B.
The specific value of M can be set according to actual requirements, for example 100.
An example:
Suppose the value of M is 100. For ease of description, number these 100 frames of monitoring images, in order of acquisition time from earliest to latest, as monitoring image 1 to monitoring image 100.
First, establish the initial background model from monitoring image 1: take monitoring image 1 as the background image, and determine the mean and variance corresponding to each pixel in this background image.
Then, according to formulas (1) and (2), use monitoring image 2 to update the most recently obtained background model, including determining the updated background image and the mean and variance corresponding to each pixel in the updated background image, where the value of ρ equals 1/2.
Next, according to formulas (1) and (2), use monitoring image 3 to update the most recently obtained background model in the same way, where the value of ρ equals 1/3.
The processing of monitoring images 4 to 99 proceeds in the same manner and is not repeated here.
Finally, according to formulas (1) and (2), use monitoring image 100 to update the most recently obtained background model, where the value of ρ equals 1/100, and take the resulting background image, together with the mean and variance corresponding to each pixel, as the final background model.
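For illustration, the following is a minimal sketch of this modeling procedure in Python with NumPy; the function name build_background_model, the zero initial variance, and taking d as an absolute difference are assumptions of this sketch, not details specified by the patent.

```python
import numpy as np

def build_background_model(frames):
    """Single-Gaussian background modeling over M gray-scale frames,
    following steps A-C: the first frame initializes the background,
    and each subsequent frame updates it with rate rho = 1/N,
    per formulas (1) and (2)."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    mean = frames[0].copy()        # step A: first frame is the background
    var = np.zeros_like(mean)      # initial per-pixel variance (assumed 0)
    for n, frame in enumerate(frames[1:], start=2):
        rho = 1.0 / n              # update rate rho = 1/N for the N-th frame
        d = np.abs(frame - mean)   # difference to current mean (assumed absolute)
        mean = (1.0 - rho) * mean + rho * frame   # formula (1)
        var = (1.0 - rho) * var + rho * d         # formula (2)
    return mean, var               # final background model
```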
Step 12: after modeling is complete, each time a frame of monitoring image X is obtained, perform the following processing: generate the binary foreground image of monitoring image X according to the background model; obtain the edge texture information of monitoring image X and of the background image, and determine the edge similarity between monitoring image X and the background image according to the obtained edge texture information; determine whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
For ease of description, in the solution of the present invention, monitoring image X denotes any monitoring image on which person-presence detection is to be performed.
In practical applications, since the ATM monitoring scene may change, the background model established in step 11 can be updated continuously to ensure the accuracy of subsequent person-presence detection. Specifically, each time after determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity, monitoring image X can be used to update the existing background model.
Correspondingly, for any monitoring image X, step 12 can be implemented as follows: generate the binary foreground image of monitoring image X according to the most recently obtained background model (i.e. the background model updated with the monitoring image obtained most recently before monitoring image X); obtain the edge texture information of monitoring image X and of the most recently obtained background image, and determine the edge similarity between them according to the obtained edge texture information; and determine whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
The above implementations are described in detail below.
One) Updating the existing background model with monitoring image X
A specific implementation may comprise:
Determine the updated background image B_new(x, y):
B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y);    (1)
where I(x, y) denotes monitoring image X, i.e. the most recently obtained monitoring image, and B_old(x, y) denotes the background image before the update.
For each pixel in B_new(x, y), take the gray-scale value of the pixel as the mean corresponding to the pixel, and take (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel; where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y).
Here ρ denotes the update rate, and its value can be set as follows:
1) When it is determined that no person is present in I(x, y), i.e. no person is present in monitoring image X, the value of ρ is set to 0.01, so that the background model is continuously updated to adapt to slow scene changes such as illumination changes;
2) When it is determined that a person is present in I(x, y), the value of ρ is set to 0, i.e. the background model update is suspended while a person is present in the ATM monitoring scene;
3) When it is determined that a person has been present in the ATM monitoring scene throughout the time period from T - t to T and the ATM monitoring scene has remained static throughout that period, the value of ρ is set to 1, where T denotes the moment at which I(x, y) is obtained and t > 0.
Rule 3) prevents a sudden change in the ATM monitoring scene (for example, the scene being rearranged) from causing a person to be reported indefinitely: when a person is present in the ATM monitoring scene and the scene has remained static for longer than a predetermined threshold, e.g. 2 minutes (i.e. the value of t is 2 minutes), ρ is set to 1, which resets the background by taking the current image I(x, y) as the background image.
Whether the ATM monitoring scene has remained static throughout the time period from T - t to T can be determined as follows:
For any two frames I_1(x, y) and I_2(x, y) obtained within the time period from T - t to T, perform the following processing:
Compute Dif(x, y) = I_1(x, y) - I_2(x, y);    (3)
where Dif(x, y) denotes the frame difference image, and I_1(x, y) is obtained earlier than I_2(x, y).
For each pixel in Dif(x, y), determine whether the gray-scale value of the pixel is greater than a predetermined threshold T1; if so, set the value of the pixel to 1, otherwise set it to 0, thereby obtaining the frame-difference binary image Dif_Fg(x, y) of Dif(x, y).
Count the number Dif_Num of pixels whose value is 1 in Dif_Fg(x, y), and determine whether Dif_Num is less than a predetermined threshold T2; if so, determine that the scene is static between I_1(x, y) and I_2(x, y).
If the scene is static between every two frames obtained within the time period from T - t to T, it can be determined that the ATM monitoring scene has remained static throughout that period.
The specific values of T1 and T2 can both be set according to actual requirements; for example, the value of T1 can be 10 and the value of T2 can be 50.
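A minimal sketch of the static check and of the update-rate selection described above; the function names, and taking the frame difference of formula (3) as an absolute value, are assumptions of this sketch:

```python
import numpy as np

def is_static_pair(frame1, frame2, t1=10, t2=50):
    """Frame-difference static check per formula (3): mark pixels whose
    (absolute) difference exceeds T1 in the binary image Dif_Fg, and
    call the pair static when fewer than T2 pixels are marked."""
    dif = np.abs(frame1.astype(np.int32) - frame2.astype(np.int32))
    dif_fg = dif > t1                  # frame-difference binary image Dif_Fg
    return int(dif_fg.sum()) < t2      # Dif_Num < T2

def choose_rho(person_in_frame, person_throughout_period, static_throughout_period):
    """Update-rate selection following rules 1)-3) above."""
    if person_throughout_period and static_throughout_period:
        return 1.0    # rule 3): reset the background to the current image
    if person_in_frame:
        return 0.0    # rule 2): suspend updating while a person is present
    return 0.01       # rule 1): slow adaptation to illumination changes
```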
Two) Generating the binary foreground image of monitoring image X according to the most recently obtained background model
For each pixel in monitoring image X, the following processing can be performed:
Compute the difference d between the gray-scale value of the pixel and the mean corresponding to the pixel at the same coordinate position in the most recently obtained background image.
Compute d²/σ², where σ denotes the variance corresponding to the pixel at the same coordinate position in the most recently obtained background image.
Determine whether the computed d²/σ² is greater than a predetermined threshold T0; if so, set the value of the pixel to 1, otherwise set it to 0, thereby generating the binary foreground image of monitoring image X.
The specific value of T0 can be set according to actual requirements, for example 9.
After generating the binary foreground image of monitoring image X, dilation and erosion operations can also be applied to it in sequence to remove isolated points caused by noise, thereby ensuring the accuracy of the subsequent person-presence detection.
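A minimal sketch of this foreground test with the morphological cleanup, assuming SciPy's binary morphology routines; the eps guard against zero variance is an implementation detail of this sketch, not part of the patent:

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def foreground_mask(frame, mean, var, t0=9.0, eps=1e-6):
    """Binary foreground per the single-Gaussian test d²/σ² > T0,
    followed by dilation then erosion to remove isolated noise points."""
    d = frame.astype(np.float64) - mean
    fg = (d * d) / (var + eps) > t0    # per-pixel foreground test
    fg = binary_dilation(fg)           # dilation: close small gaps
    fg = binary_erosion(fg)            # erosion: remove isolated points
    return fg.astype(np.uint8)
```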
Three) Obtaining the edge texture information of monitoring image X and of the most recently obtained background image, and determining the edge similarity between them according to the obtained edge texture information
A specific implementation may comprise:
1) Obtain the horizontal and vertical edge images of monitoring image X, and the horizontal and vertical edge images of the most recently obtained background image.
In practical applications, the Sobel operators can be used to obtain the horizontal and vertical edge images of monitoring image X and of the most recently obtained background image; how to do so is prior art.
Fig. 2 is a schematic diagram of the conventional Sobel operators. As shown in Fig. 2, the Sobel operator on the left can be used to obtain the horizontal edge images of monitoring image X and of the most recently obtained background image, and the Sobel operator on the right can be used to obtain their vertical edge images.
2) According to the horizontal and vertical edge images of monitoring image X, compute for each pixel in monitoring image X its gradient magnitude I_gxy:
I_gxy = |I_gx| + |I_gy|;    (4)
where I_gx denotes the horizontal gradient value of the pixel, I_gy denotes the vertical gradient value of the pixel, and |·| denotes taking the absolute value.
According to the horizontal and vertical edge images of the most recently obtained background image, compute for each pixel in that background image its gradient magnitude B_gxy:
B_gxy = |B_gx| + |B_gy|;    (5)
where B_gx denotes the horizontal gradient value of the pixel and B_gy denotes the vertical gradient value of the pixel.
3) Compute the edge similarity ESIM between monitoring image X and the most recently obtained background image:
ESIM = Σ(2·I_gxy·B_gxy) / Σ(I_gxy² + B_gxy²);    (6)
where the sums run over x from 1 to E and y from 1 to F; E denotes the number of pixels in the horizontal direction of monitoring image X, and F denotes the number of pixels in the vertical direction of monitoring image X.
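A minimal sketch of the edge-similarity computation, assuming SciPy's Sobel filter as the edge operator of Fig. 2 (the function names are illustrative):

```python
import numpy as np
from scipy.ndimage import sobel

def edge_similarity(frame, background):
    """Edge similarity ESIM per formulas (4)-(6): ESIM is close to 1
    when the gradient-magnitude maps of image and background agree,
    and drops when their edges differ (e.g. a person occludes the scene)."""
    def grad_mag(img):
        img = img.astype(np.float64)
        gx = sobel(img, axis=1)            # horizontal gradient
        gy = sobel(img, axis=0)            # vertical gradient
        return np.abs(gx) + np.abs(gy)     # formulas (4) and (5)

    i_g = grad_mag(frame)
    b_g = grad_mag(background)
    num = np.sum(2.0 * i_g * b_g)
    den = np.sum(i_g ** 2 + b_g ** 2)
    return num / den if den > 0 else 1.0   # formula (6)
```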
Four) Determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity
A specific implementation may comprise:
1) Count the number Fg_num of pixels whose value is 1 in the binary foreground image of monitoring image X.
2) Determine whether the following condition is met:
Flag = (Fg_num / Area > T3) ∩ (ESIM < T4);    (7)
where Area denotes the product of the number of pixels in the horizontal direction and the number of pixels in the vertical direction of monitoring image X, T3 and T4 both denote predetermined thresholds, and ∩ denotes logical AND.
If the above condition is met, the value of Flag is 1 and it is determined that a person is present in monitoring image X; otherwise, no person is present.
The specific values of T3 and T4 can both be set according to actual requirements; for example, the value of T3 can be 0.6 and the value of T4 can be 0.8.
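A minimal sketch of this decision rule (the function name person_present is illustrative):

```python
def person_present(fg_mask, esim, t3=0.6, t4=0.8):
    """Decision rule of formula (7): report a person when the foreground
    covers more than T3 of the image AND the edge similarity to the
    background is below T4."""
    area = fg_mask.shape[0] * fg_mask.shape[1]   # E * F pixels
    fg_num = int(fg_mask.sum())                  # pixels valued 1
    return (fg_num / area > t3) and (esim < t4)
```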
Since the luminance foreground is rather sensitive to interference such as illumination changes, relying on the luminance foreground alone for person-presence detection may cause misjudgments. The luminance foreground and the edge texture information are therefore combined to determine whether a person is present in monitoring image X, which improves the accuracy of the detection result.
This completes the description of the method embodiment of the present invention.
Based on the above description, the present invention further discloses a video-based ATM monitoring scene detection device, comprising:
a modeling module, configured to establish a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image, and to send the established background model to a detection module;
the detection module, configured to, after modeling is complete, each time a frame of monitoring image X is obtained, perform the following processing: generating the binary foreground image of monitoring image X according to the background model; obtaining the edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information; and determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
The modeling module may comprise:
a first processing unit, configured to obtain M frames of monitoring images in sequence, where M is a positive integer greater than 1, and to send each obtained frame of monitoring image to a second processing unit;
the second processing unit, configured to take the first received frame of monitoring image as the background image, and, for each pixel in this background image, to take the gray-scale value of the pixel as the mean corresponding to the pixel and the variance of the gray-scale value of the pixel as the variance corresponding to the pixel;
and afterwards, each time a frame of monitoring image is received, to perform the following processing:
Determine the updated background image B_new(x, y):
B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y);    (1)
where ρ denotes the update rate, whose value equals 1/N; N denotes the number of monitoring images received; I(x, y) denotes the most recently received monitoring image; and B_old(x, y) denotes the background image before the update.
For each pixel in B_new(x, y), take the gray-scale value of the pixel as the mean corresponding to the pixel, and take (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel; where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y).
The detection module may comprise:
a third processing unit, configured to obtain each frame of monitoring image in sequence and to send each obtained frame of monitoring image to a fourth processing unit;
the fourth processing unit, configured to, each time a frame of monitoring image X is received, perform the following processing: generating the binary foreground image of monitoring image X according to the background model; obtaining the edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information; and determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
The detection module may further comprise:
a fifth processing unit, configured to, each time after the fourth processing unit determines whether a person is present in monitoring image X, update the existing background model with monitoring image X.
Correspondingly, the fourth processing unit generates the binary foreground image of monitoring image X according to the most recently obtained background model, obtains the edge texture information of monitoring image X and of the most recently obtained background image, and determines the edge similarity between them according to the obtained edge texture information.
Specifically, the fifth processing unit determines the updated background image B_new(x, y):
B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y);    (1)
where I(x, y) denotes monitoring image X and B_old(x, y) denotes the background image before the update.
For each pixel in B_new(x, y), it takes the gray-scale value of the pixel as the mean corresponding to the pixel, and takes (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel; where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y).
Here ρ denotes the update rate:
when it is determined that no person is present in I(x, y), the value of ρ is set to 0.01;
when it is determined that a person is present in I(x, y), the value of ρ is set to 0;
when it is determined that a person has been present in the ATM monitoring scene throughout the time period from T - t to T and the ATM monitoring scene has remained static throughout that period, the value of ρ is set to 1, where T denotes the moment at which I(x, y) is obtained and t > 0.
For any two frames I_1(x, y) and I_2(x, y) obtained within the time period from T - t to T, the fifth processing unit performs the following processing:
Compute Dif(x, y) = I_1(x, y) - I_2(x, y);    (3)
where Dif(x, y) denotes the frame difference image, and I_1(x, y) is obtained earlier than I_2(x, y).
For each pixel in Dif(x, y), determine whether the gray-scale value of the pixel is greater than a predetermined threshold T1; if so, set the value of the pixel to 1, otherwise set it to 0, thereby obtaining the frame-difference binary image Dif_Fg(x, y) of Dif(x, y).
Count the number Dif_Num of pixels whose value is 1 in Dif_Fg(x, y), and determine whether Dif_Num is less than a predetermined threshold T2; if so, determine that the scene is static between I_1(x, y) and I_2(x, y).
If the scene is static between every two frames obtained within the time period from T - t to T, it is determined that the ATM monitoring scene has remained static throughout that period.
The fourth processing unit may specifically comprise:
a foreground detection sub-unit, configured to generate the binary foreground image of monitoring image X according to the most recently obtained background model and to send the generated binary foreground image to an analysis sub-unit;
an edge similarity determination sub-unit, configured to obtain the edge texture information of monitoring image X and of the most recently obtained background image, to determine the edge similarity between monitoring image X and the most recently obtained background image according to the obtained edge texture information, and to send the determined edge similarity to the analysis sub-unit;
the analysis sub-unit, configured to determine whether a person is present in monitoring image X according to the received binary foreground image and edge similarity.
The foreground detection sub-unit performs the following processing for each pixel in monitoring image X:
Compute the difference d between the gray-scale value of the pixel and the mean corresponding to the pixel at the same coordinate position in the most recently obtained background image.
Compute d²/σ², where σ denotes the variance corresponding to the pixel at the same coordinate position in the most recently obtained background image.
Determine whether the computed d²/σ² is greater than a predetermined threshold T0; if so, set the value of the pixel to 1, otherwise set it to 0.
The foreground detection sub-unit may be further configured to, after generating the binary foreground image of monitoring image X, apply dilation and erosion operations to the binary foreground image in sequence, and send the binary foreground image after the dilation and erosion operations to the analysis sub-unit.
The edge similarity determination sub-unit obtains the horizontal and vertical edge images of monitoring image X, and the horizontal and vertical edge images of the most recently obtained background image.
According to the horizontal and vertical edge images of monitoring image X, it computes for each pixel in monitoring image X its gradient magnitude I_gxy:
I_gxy = |I_gx| + |I_gy|;    (4)
where I_gx denotes the horizontal gradient value of the pixel, I_gy denotes the vertical gradient value of the pixel, and |·| denotes taking the absolute value.
According to the horizontal and vertical edge images of the most recently obtained background image, it computes for each pixel in that background image its gradient magnitude B_gxy:
B_gxy = |B_gx| + |B_gy|;    (5)
where B_gx denotes the horizontal gradient value of the pixel and B_gy denotes the vertical gradient value of the pixel.
It then computes the edge similarity ESIM between monitoring image X and the most recently obtained background image:
ESIM = Σ(2·I_gxy·B_gxy) / Σ(I_gxy² + B_gxy²);    (6)
where the sums run over x from 1 to E and y from 1 to F; E denotes the number of pixels in the horizontal direction of monitoring image X, and F denotes the number of pixels in the vertical direction of monitoring image X.
The analysis sub-unit counts the number Fg_num of pixels whose value is 1 in the binary foreground image of monitoring image X, and determines whether the following condition is met:
Flag = (Fg_num / Area > T3) ∩ (ESIM < T4);    (7)
where Area denotes the product of the number of pixels in the horizontal direction and the number of pixels in the vertical direction of monitoring image X, and T3 and T4 both denote predetermined thresholds.
If the above condition is met, it is determined that a person is present in monitoring image X; otherwise, no person is present.
For the specific working procedure of the above device embodiment, reference may be made to the corresponding description in the foregoing method embodiment, which is not repeated here.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (20)

1. A video-based automatic teller machine (ATM) monitoring scene detection method, characterized by comprising:
establishing a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image;
after modeling is complete, each time a frame of monitoring image X is obtained, performing the following processing:
generating a binary foreground image of monitoring image X according to the background model;
obtaining edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information;
determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
2. The method according to claim 1, characterized in that determining the background image and the preset parameters corresponding to each pixel in the background image comprises:
A. obtaining one frame of monitoring image and taking it as the background image;
for each pixel in this background image, taking the gray-scale value of the pixel as the mean corresponding to the pixel, and taking the variance of the gray-scale value of the pixel as the variance corresponding to the pixel;
B. determining whether the number of monitoring images obtained equals M, where M is a positive integer greater than 1; if so, taking the most recently obtained background image as the final background image and completing the modeling; if not, obtaining a new frame of monitoring image and performing step C;
C. determining the updated background image B_new(x, y): B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y), where ρ denotes the update rate, whose value equals 1/N, N denotes the number of monitoring images obtained, I(x, y) denotes the most recently obtained monitoring image, and B_old(x, y) denotes the background image before the update;
for each pixel in B_new(x, y), taking the gray-scale value of the pixel as the mean corresponding to the pixel, and taking (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel, where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y); and repeating step B.
3. The method according to claim 1, characterized in that:
after determining whether a person is present in monitoring image X, the method further comprises: updating the existing background model with monitoring image X;
generating the binary foreground image of monitoring image X according to the background model comprises: generating the binary foreground image of monitoring image X according to the most recently obtained background model;
obtaining the edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information comprises: obtaining the edge texture information of monitoring image X and of the most recently obtained background image, and determining the edge similarity between monitoring image X and the most recently obtained background image according to the obtained edge texture information.
4. The method according to claim 3, characterized in that updating the existing background model with monitoring image X comprises:
determining the updated background image B_new(x, y): B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y), where I(x, y) denotes monitoring image X and B_old(x, y) denotes the background image before the update;
for each pixel in B_new(x, y), taking the gray-scale value of the pixel as the mean corresponding to the pixel, and taking (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel, where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y);
where ρ denotes the update rate;
when it is determined that no person is present in I(x, y), the value of ρ is set to 0.01;
when it is determined that a person is present in I(x, y), the value of ρ is set to 0;
when it is determined that a person has been present in the ATM monitoring scene throughout the time period from T - t to T and the ATM monitoring scene has remained static throughout that period, the value of ρ is set to 1, where T denotes the moment at which I(x, y) is obtained and t > 0.
5. The method according to claim 4, characterized in that determining that the ATM monitoring scene has remained static throughout the time period from T - t to T comprises:
for any two frames I_1(x, y) and I_2(x, y) obtained within the time period from T - t to T, performing the following processing:
computing Dif(x, y) = I_1(x, y) - I_2(x, y), where Dif(x, y) denotes the frame difference image and I_1(x, y) is obtained earlier than I_2(x, y);
for each pixel in Dif(x, y), determining whether the gray-scale value of the pixel is greater than a predetermined threshold T1; if so, setting the value of the pixel to 1, otherwise setting it to 0, thereby obtaining the frame-difference binary image Dif_Fg(x, y) of Dif(x, y);
counting the number Dif_Num of pixels whose value is 1 in Dif_Fg(x, y), and determining whether Dif_Num is less than a predetermined threshold T2; if so, determining that the scene is static between I_1(x, y) and I_2(x, y);
if the scene is static between every two frames obtained within the time period from T - t to T, determining that the ATM monitoring scene has remained static throughout that period.
6. The method according to claim 3, 4 or 5, characterized in that generating the binary foreground image of monitoring image X according to the most recently obtained background model comprises:
for each pixel in monitoring image X, performing the following processing:
computing the difference d between the gray-scale value of the pixel and the mean corresponding to the pixel at the same coordinate position in the most recently obtained background image;
computing d²/σ², where σ denotes the variance corresponding to the pixel at the same coordinate position in the most recently obtained background image;
determining whether the computed d²/σ² is greater than a predetermined threshold T0; if so, setting the value of the pixel to 1, otherwise setting it to 0.
7. The method according to claim 3, 4 or 5, characterized in that, after generating the binary foreground image of monitoring image X, the method further comprises:
applying dilation and erosion operations to the binary foreground image of monitoring image X in sequence.
8. The method according to claim 6, characterized in that obtaining the edge texture information of monitoring image X and of the most recently obtained background image, and determining the edge similarity between monitoring image X and the most recently obtained background image according to the obtained edge texture information comprises:
obtaining the horizontal and vertical edge images of monitoring image X, and the horizontal and vertical edge images of the most recently obtained background image;
according to the horizontal and vertical edge images of monitoring image X, computing for each pixel in monitoring image X its gradient magnitude I_gxy: I_gxy = |I_gx| + |I_gy|, where I_gx denotes the horizontal gradient value of the pixel, I_gy denotes the vertical gradient value of the pixel, and |·| denotes taking the absolute value;
according to the horizontal and vertical edge images of the most recently obtained background image, computing for each pixel in that background image its gradient magnitude B_gxy: B_gxy = |B_gx| + |B_gy|, where B_gx denotes the horizontal gradient value of the pixel and B_gy denotes the vertical gradient value of the pixel;
computing the edge similarity ESIM between monitoring image X and the most recently obtained background image: ESIM = Σ(2·I_gxy·B_gxy) / Σ(I_gxy² + B_gxy²), where the sums run over x from 1 to E and y from 1 to F, E denotes the number of pixels in the horizontal direction of monitoring image X, and F denotes the number of pixels in the vertical direction of monitoring image X.
9. The method according to claim 8, characterized in that determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity comprises:
counting the number Fg_num of pixels whose value is 1 in the binary foreground image of monitoring image X;
determining whether the following condition is met: Flag = (Fg_num / Area > T3) ∩ (ESIM < T4), where Area denotes the product of the number of pixels in the horizontal direction and the number of pixels in the vertical direction of monitoring image X, and T3 and T4 both denote predetermined thresholds;
if the above condition is met, determining that a person is present in monitoring image X; otherwise, no person is present.
10. A video-based automatic teller machine (ATM) monitoring scene detection device, characterized by comprising:
a modeling module, configured to establish a background model of the ATM monitoring scene, including determining a background image and the preset parameters corresponding to each pixel in the background image, and to send the established background model to a detection module;
the detection module, configured to, after modeling is complete, each time a frame of monitoring image X is obtained, perform the following processing: generating a binary foreground image of monitoring image X according to the background model; obtaining edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information; and determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
11. The device according to claim 10, characterized in that the modeling module comprises:
a first processing unit, configured to obtain M frames of monitoring images in sequence, where M is a positive integer greater than 1, and to send each obtained frame of monitoring image to a second processing unit;
the second processing unit, configured to take the first received frame of monitoring image as the background image, and, for each pixel in this background image, to take the gray-scale value of the pixel as the mean corresponding to the pixel and the variance of the gray-scale value of the pixel as the variance corresponding to the pixel;
and afterwards, each time a frame of monitoring image is received, to perform the following processing:
determining the updated background image B_new(x, y): B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y), where ρ denotes the update rate, whose value equals 1/N, N denotes the number of monitoring images received, I(x, y) denotes the most recently received monitoring image, and B_old(x, y) denotes the background image before the update;
for each pixel in B_new(x, y), taking the gray-scale value of the pixel as the mean corresponding to the pixel, and taking (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel, where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y).
12. The device according to claim 10, characterized in that the detection module comprises:
a third processing unit, configured to obtain each frame of monitoring image in sequence and to send each obtained frame of monitoring image to a fourth processing unit;
the fourth processing unit, configured to, each time a frame of monitoring image X is received, perform the following processing: generating the binary foreground image of monitoring image X according to the background model; obtaining the edge texture information of monitoring image X and of the background image, and determining the edge similarity between monitoring image X and the background image according to the obtained edge texture information; and determining whether a person is present in monitoring image X according to the generated binary foreground image and the determined edge similarity.
13. The device according to claim 12, characterized in that the detection module further comprises:
a fifth processing unit, configured to, each time after the fourth processing unit determines whether a person is present in monitoring image X, update the existing background model with monitoring image X;
wherein the fourth processing unit generates the binary foreground image of monitoring image X according to the most recently obtained background model, obtains the edge texture information of monitoring image X and of the most recently obtained background image, and determines the edge similarity between monitoring image X and the most recently obtained background image according to the obtained edge texture information.
14. The device according to claim 13, characterized in that:
the fifth processing unit determines the updated background image B_new(x, y):
B_new(x, y) = (1 - ρ)·B_old(x, y) + ρ·I(x, y), where I(x, y) denotes monitoring image X and B_old(x, y) denotes the background image before the update;
for each pixel in B_new(x, y), takes the gray-scale value of the pixel as the mean corresponding to the pixel, and takes (1 - ρ)·σ_old + ρ·d as the variance corresponding to the pixel, where σ_old denotes the variance corresponding to the pixel at the same coordinate position in B_old(x, y), and d denotes the difference between the gray-scale value of the pixel at the same coordinate position in I(x, y) and the mean corresponding to the pixel at the same coordinate position in B_old(x, y);
where ρ denotes the update rate;
when it is determined that no person is present in I(x, y), the value of ρ is set to 0.01;
when it is determined that a person is present in I(x, y), the value of ρ is set to 0;
when it is determined that a person has been present in the ATM monitoring scene throughout the time period from T - t to T and the ATM monitoring scene has remained static throughout that period, the value of ρ is set to 1, where T denotes the moment at which I(x, y) is obtained and t > 0.
15. The device according to claim 14, characterized in that:
for any two frames I_1(x, y) and I_2(x, y) obtained within the time period from T - t to T, the fifth processing unit performs the following processing:
computing Dif(x, y) = I_1(x, y) - I_2(x, y), where Dif(x, y) denotes the frame difference image and I_1(x, y) is obtained earlier than I_2(x, y);
for each pixel in Dif(x, y), determining whether the gray-scale value of the pixel is greater than a predetermined threshold T1; if so, setting the value of the pixel to 1, otherwise setting it to 0, thereby obtaining the frame-difference binary image Dif_Fg(x, y) of Dif(x, y);
counting the number Dif_Num of pixels whose value is 1 in Dif_Fg(x, y), and determining whether Dif_Num is less than a predetermined threshold T2; if so, determining that the scene is static between I_1(x, y) and I_2(x, y);
if the scene is static between every two frames obtained within the time period from T - t to T, determining that the ATM monitoring scene has remained static throughout that period.
16. The device according to claim 13, 14 or 15, characterized in that the fourth processing unit comprises:
a foreground detection sub-unit, configured to generate the binary foreground image of monitoring image X according to the most recently obtained background model and to send the generated binary foreground image to an analysis sub-unit;
an edge similarity determination sub-unit, configured to obtain the edge texture information of monitoring image X and of the most recently obtained background image, to determine the edge similarity between monitoring image X and the most recently obtained background image according to the obtained edge texture information, and to send the determined edge similarity to the analysis sub-unit;
the analysis sub-unit, configured to determine whether a person is present in monitoring image X according to the received binary foreground image and edge similarity.
17. The device according to claim 16, characterized in that:
the foreground detection sub-unit performs the following processing for each pixel in monitoring image X:
computing the difference d between the gray-scale value of the pixel and the mean corresponding to the pixel at the same coordinate position in the most recently obtained background image;
computing d²/σ², where σ denotes the variance corresponding to the pixel at the same coordinate position in the most recently obtained background image;
determining whether the computed d²/σ² is greater than a predetermined threshold T0; if so, setting the value of the pixel to 1, otherwise setting it to 0.
18. The device according to claim 16, characterized in that:
the foreground detection sub-unit is further configured to, after generating the binary foreground image of monitoring image X, apply dilation and erosion operations to the binary foreground image in sequence, and send the binary foreground image after the dilation and erosion operations to the analysis sub-unit.
19. The device according to claim 17, characterized in that:
the edge similarity determination sub-unit obtains the horizontal and vertical edge images of monitoring image X, and the horizontal and vertical edge images of the most recently obtained background image;
according to the horizontal and vertical edge images of monitoring image X, computes for each pixel in monitoring image X its gradient magnitude I_gxy: I_gxy = |I_gx| + |I_gy|, where I_gx denotes the horizontal gradient value of the pixel, I_gy denotes the vertical gradient value of the pixel, and |·| denotes taking the absolute value;
according to the horizontal and vertical edge images of the most recently obtained background image, computes for each pixel in that background image its gradient magnitude B_gxy: B_gxy = |B_gx| + |B_gy|, where B_gx denotes the horizontal gradient value of the pixel and B_gy denotes the vertical gradient value of the pixel;
and computes the edge similarity ESIM between monitoring image X and the most recently obtained background image: ESIM = Σ(2·I_gxy·B_gxy) / Σ(I_gxy² + B_gxy²), where the sums run over x from 1 to E and y from 1 to F, E denotes the number of pixels in the horizontal direction of monitoring image X, and F denotes the number of pixels in the vertical direction of monitoring image X.
20. The device according to claim 19, characterized in that:
the analysis sub-unit counts the number Fg_num of pixels whose value is 1 in the binary foreground image of monitoring image X;
determines whether the following condition is met: Flag = (Fg_num / Area > T3) ∩ (ESIM < T4), where Area denotes the product of the number of pixels in the horizontal direction and the number of pixels in the vertical direction of monitoring image X, and T3 and T4 both denote predetermined thresholds;
and, if the above condition is met, determines that a person is present in monitoring image X; otherwise, no person is present.
CN201210444071.1A 2012-11-08 2012-11-08 Video-based automatic teller machine monitoring scene detection method and apparatus Active CN103810691B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210444071.1A CN103810691B (en) 2012-11-08 2012-11-08 Video-based automatic teller machine monitoring scene detection method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210444071.1A CN103810691B (en) 2012-11-08 2012-11-08 Video-based automatic teller machine monitoring scene detection method and apparatus

Publications (2)

Publication Number Publication Date
CN103810691A (en) 2014-05-21
CN103810691B CN103810691B (en) 2017-02-22

Family

ID=50707412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210444071.1A Active CN103810691B (en) 2012-11-08 2012-11-08 Video-based automatic teller machine monitoring scene detection method and apparatus

Country Status (1)

Country Link
CN (1) CN103810691B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657997A (en) * 2015-02-28 2015-05-27 北京格灵深瞳信息技术有限公司 Lens shifting detection methods and devices
CN107588857A (en) * 2016-07-06 2018-01-16 众智光电科技股份有限公司 Infrared ray position sensing apparatus
CN108090916A (en) * 2017-12-21 2018-05-29 百度在线网络技术(北京)有限公司 For tracking the method and apparatus of the targeted graphical in video

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000276602A (en) * 1999-03-23 2000-10-06 Nec Corp Device and method for detecting object and recording medium recording object detection program
WO2003009232A2 (en) * 2001-07-16 2003-01-30 Hewlett-Packard Company Method and apparatus for sub-pixel edge detection
CN101276499A (en) * 2008-04-18 2008-10-01 浙江工业大学 Intelligent monitoring apparatus of ATM equipment based on all-directional computer vision
CN101404060A (en) * 2008-11-10 2009-04-08 北京航空航天大学 Human face recognition method based on visible light and near-infrared Gabor information amalgamation
CN101950448A (en) * 2010-05-31 2011-01-19 北京智安邦科技有限公司 Detection method and system for masquerade and peep behaviors before ATM (Automatic Teller Machine)
CN102236902A (en) * 2011-06-21 2011-11-09 杭州海康威视软件有限公司 Method and device for detecting targets

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000276602A (en) * 1999-03-23 2000-10-06 Nec Corp Device and method for detecting object and recording medium recording object detection program
WO2003009232A2 (en) * 2001-07-16 2003-01-30 Hewlett-Packard Company Method and apparatus for sub-pixel edge detection
CN101276499A (en) * 2008-04-18 2008-10-01 浙江工业大学 Intelligent monitoring apparatus of ATM equipment based on all-directional computer vision
CN101404060A (en) * 2008-11-10 2009-04-08 北京航空航天大学 Human face recognition method based on visible light and near-infrared Gabor information amalgamation
CN101950448A (en) * 2010-05-31 2011-01-19 北京智安邦科技有限公司 Detection method and system for masquerade and peep behaviors before ATM (Automatic Teller Machine)
CN102236902A (en) * 2011-06-21 2011-11-09 杭州海康威视软件有限公司 Method and device for detecting targets

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
AMIT SATPATHY等: "Difference of Gaussian Edge-Texture Based Background Modeling for Dynamic Traffic Conditions", 《LECTURE NOTES IN COMPUTER SCIENCE》 *
TASKEED JABID等: "An Edge-Texture based Moving Object Detection for Video Content Based Application", 《PROCEEDINGS OF 14TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2011) 》 *
ZHUANG Xiaoli et al.: "Image quality assessment method based on gradient-magnitude structural similarity", Computer Applications and Software (计算机应用与软件) *
LI Bin et al.: "Texture-based moving object detection", Computer Engineering and Applications (计算机工程与应用) *
LI Haibo: "Human detection and tracking in video surveillance", China Masters' Theses Full-text Database, Information Science and Technology series (monthly) (中国优秀硕士学位论文全文数据库信息科技辑) *
YANG Tao et al.: "A foreground detection algorithm based on a multi-layer background model", Journal of Image and Graphics (中国图象图形学报) *
TANG Yiping et al.: "Application of dynamic image understanding technology in intelligent ATM monitoring", Computer Measurement & Control (计算机测量与控制) *
HUANG Xinjuan et al.: "Moving object detection method based on an adaptive Gaussian mixture background model", Journal of Computer Applications (计算机应用) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657997A (en) * 2015-02-28 2015-05-27 北京格灵深瞳信息技术有限公司 Lens shifting detection methods and devices
CN104657997B (en) * 2015-02-28 2018-01-09 北京格灵深瞳信息技术有限公司 A kind of lens shift detection method and device
CN107588857A (en) * 2016-07-06 2018-01-16 众智光电科技股份有限公司 Infrared ray position sensing apparatus
CN108090916A (en) * 2017-12-21 2018-05-29 百度在线网络技术(北京)有限公司 For tracking the method and apparatus of the targeted graphical in video
CN108090916B (en) * 2017-12-21 2019-05-07 百度在线网络技术(北京)有限公司 Method and apparatus for tracking the targeted graphical in video

Also Published As

Publication number Publication date
CN103810691B (en) 2017-02-22

Similar Documents

Publication Publication Date Title
CN106327520B (en) Moving target detection method and system
EP2858008B1 (en) Target detecting method and system
US7982774B2 (en) Image processing apparatus and image processing method
CN102201121A (en) System and method for detecting article in video scene
CN110232359B (en) Retentate detection method, device, equipment and computer storage medium
CN102867175B (en) Stereoscopic vision-based ATM (automatic teller machine) machine behavior analysis method
CN103886598A (en) Tunnel smoke detecting device and method based on video image processing
US12002195B2 (en) Computer vision-based anomaly detection method, device and electronic apparatus
CN102855459A (en) Method and system for detecting and verifying specific foreground objects
CN105574891A (en) Method and system for detecting moving object in image
CN102955940A (en) System and method for detecting power transmission line object
CN103945089A (en) Dynamic target detection method based on brightness flicker correction and IP camera
CN110114801B (en) Image foreground detection device and method and electronic equipment
CN106408563B (en) A kind of snow noise detection method based on the coefficient of variation
CN110795975B (en) Face false detection optimization method and device
CN108596032B (en) Detection method, device, equipment and medium for fighting behavior in video
CN108629254A (en) A kind of detection method and device of moving target
JP2010015469A (en) Still area detection method, and apparatus, program and recording medium therefor
CN103810691A (en) Video-based automatic teller machine monitoring scene detection method and apparatus
EP3376438A1 (en) A system and method for detecting change using ontology based saliency
CN105447863A (en) Residue detection method based on improved VIBE
CN104483712A (en) Method, device and system for detecting invasion of foreign objects in power transmission line
CN115272917A (en) Wire galloping early warning method, device, equipment and medium based on power transmission line
CN111062941A (en) Point light source lamp point fault detection device and method
CN112927178A (en) Occlusion detection method, occlusion detection device, electronic device, and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant