CN110427815A - Video processing method and device for capturing valid access-control video content - Google Patents

Video processing method and device for capturing valid access-control video content

Info

Publication number
CN110427815A
CN110427815A (application CN201910551347.8A)
Authority
CN
China
Prior art keywords
video pictures
pixel
target
human body
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910551347.8A
Other languages
Chinese (zh)
Other versions
CN110427815B (en)
Inventor
谢超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Terminus Beijing Technology Co Ltd
Original Assignee
Terminus Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Terminus Beijing Technology Co Ltd filed Critical Terminus Beijing Technology Co Ltd
Priority to CN201910551347.8A priority Critical patent/CN110427815B/en
Publication of CN110427815A publication Critical patent/CN110427815A/en
Application granted granted Critical
Publication of CN110427815B publication Critical patent/CN110427815B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/76Television signal recording
    • H04N5/91Television signal processing therefor
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/188Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20224Image subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video processing method for capturing valid access-control video content, comprising the following steps: first, moving-target detection is performed on the captured video pictures to identify the pictures that contain a moving-target region; then it is judged whether the moving target in those pictures is a human body; next it is detected whether the human-body region in those pictures contains a valid face image; and finally only the pictures that contain a valid face image are saved and uploaded. Because only pictures containing a valid face image are saved and uploaded, rather than every captured surveillance picture, the method lowers the storage-space requirement and storage cost on the storage side, and likewise reduces the amount of data transmitted over the network and the transmission cost on the upload side.

Description

Video processing method and device for capturing valid access-control video content
Technical field
The present invention relates to the field of image processing, and in particular to a video processing method and a video processing device for capturing valid access-control video content.
Background art
At present, more and more residential communities and office buildings are replacing traditional card-swipe access control with video access-control systems. A video access-control system uses a camera to capture video pictures of a certain space in front of the entrance, identifies the face region of a person in them, extracts facial features from that region, and compares the extracted features with the registered facial features stored in advance in a database. By identifying the person in this way, the system judges whether he or she has the right of passage: if so, the gate is opened and passage is allowed; otherwise passage is refused. For a person without the right of passage, the video picture or the face region can be transmitted, according to the target room number the person provides, to that room for manual verification.
Besides the basic function of access control, a video access-control system can also save the video pictures described above, or upload the picture data over a network to a background server for storage, as an entry record of the people entering the community or building, serving purposes such as admission records, security records, and later tracing.
However, for a video access-control system, most of the captured video pictures do not contain a valid face region. For example, pictures captured when no one is passing contain no person at all; and when someone is passing, the person may be too far away, or may not face the camera squarely (for example, with the back to the camera), so that the face cannot be recognised effectively. None of these pictures contains a valid face region, so they are not valid video pictures, and there is no need to keep them as records of the access-control system. Saving such pictures wastes storage capacity and raises the data-storage cost, while also increasing the amount of data transmitted over the network and the data-transmission cost.
Summary of the invention
(1) Object of the invention
To enable the surveillance systems of communities and buildings to recognise and retain records of the face images captured in the monitored region at lower storage and transmission cost, while guaranteeing that the retained records accurately reflect the face information and thereby reducing the operating cost of surveillance, the invention discloses the following technical solutions.
(2) Technical solution
As a first aspect of the present invention, a video processing method for capturing valid access-control video content is disclosed, comprising:
performing moving-target detection on captured video pictures to identify the pictures that contain a moving-target region;
judging whether the moving target in the pictures containing the moving-target region belongs to a human body;
detecting whether the human-body region in the pictures containing the human body contains a valid face image;
saving and/or uploading the pictures that contain a valid face image.
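The patent describes this four-step pipeline in prose only. The following minimal Python sketch (all function names and the callback-based structure are hypothetical illustrations, not from the patent) shows how the steps compose — a frame is retained only if it survives motion detection, the human-body check, and the valid-face check in turn:

```python
def process_frame(frame, detect_motion, is_human, find_valid_face, save):
    """Hypothetical pipeline: retain a frame only if it passes all checks.

    detect_motion returns a moving-target region or None; is_human and
    find_valid_face are predicates on that region; save persists/uploads.
    """
    region = detect_motion(frame)      # step 1: moving-target region
    if region is None:
        return False
    if not is_human(region):           # step 2: human-body judgment
        return False
    if not find_valid_face(region):    # step 3: valid-face detection
        return False
    save(frame)                        # step 4: save and/or upload
    return True
```

Frames that fail any check are simply dropped, which is the storage- and bandwidth-saving behaviour the abstract claims.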
In a possible embodiment, performing moving-target detection on the captured video pictures comprises:
identifying a foreground region in the captured pictures with a background-subtraction method, and taking the foreground region obtained as the moving-target region;
wherein the background model used by the background-subtraction method is a Gaussian mixture model or a pixel gray-mean model.
In a possible embodiment, identifying the foreground region of the captured pictures with a Gaussian mixture model as the background model comprises:
matching each pixel of a picture, in order, against the Gaussian components sorted from high to low priority, to determine the component that matches the pixel;
updating the parameters of the component matched to the pixel;
taking, among the updated components, the several highest-priority components whose cumulative weight exceeds a background weight threshold as the background;
matching each pixel, in order, against the background components sorted from high to low priority, to determine the pixels that belong to the foreground and thus obtain the foreground region.
In a possible embodiment, when determining the Gaussian component that matches a pixel, if no component matches the pixel, the component with the smallest weight is selected as the component matched to the pixel.
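The patent gives no code for this matching rule. A toy single-pixel sketch under stated assumptions (the 2.5σ matching threshold and the re-initialisation standard deviation are conventional choices in Gaussian-mixture background subtraction, not values taken from the patent):

```python
def match_gaussian(pixel, models, k=2.5):
    """Match a gray value against Gaussian components sorted by priority.

    models: list of dicts {"mean", "std", "weight"}, assumed pre-sorted
    from high to low priority. Returns (index, matched). If no component
    matches within k standard deviations, the lowest-weight component is
    re-initialised on the pixel, per the claimed fallback rule.
    """
    for i, m in enumerate(models):
        if abs(pixel - m["mean"]) <= k * m["std"]:
            return i, True
    # no match: replace the smallest-weight Gaussian with this pixel
    j = min(range(len(models)), key=lambda i: models[i]["weight"])
    models[j]["mean"], models[j]["std"] = float(pixel), 15.0  # assumed init std
    return j, False
```

A real implementation would run this per pixel per frame and also renormalise the weights after each update.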
In a possible embodiment, identifying the foreground region of the captured pictures with a pixel gray-mean model as the background model comprises:
taking the mean of corresponding pixels over training images converted to gray scale as the background pixel values, to obtain the background model;
updating the obtained background model with the current video frame, to obtain a new background model;
computing the gray difference between the picture to be detected, converted to gray scale, and the new background model, and deriving a foreground-pixel probability distribution from the gray difference;
obtaining the foreground region from the foreground-pixel probability distribution.
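The patent does not specify the update or probability formulas. A minimal sketch on flat gray-value lists, assuming a conventional running-mean update and a linear mapping of gray difference to probability (both are assumptions, not claimed formulas):

```python
def update_background(bg, frame, alpha=0.05):
    """Running-mean background update: bg <- (1-alpha)*bg + alpha*frame."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_probability(bg, frame, scale=50.0):
    """Map each pixel's absolute gray difference to a [0, 1] probability,
    saturating at `scale` gray levels (assumed normalisation)."""
    return [min(abs(f - b) / scale, 1.0) for b, f in zip(bg, frame)]
```

Pixels that differ strongly from the background mean get probability near 1 and are foreground candidates.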
In a possible embodiment, obtaining the foreground region from the foreground-pixel probability distribution comprises:
dividing the pixels into multiple grid cells according to the foreground-pixel probability distribution, and accumulating the foreground probabilities of the pixels within each cell;
judging, cell by cell, whether each cell belongs to the foreground according to the proportion of the accumulated probability to the cell area, and thereby obtaining the foreground region made up of the foreground cells.
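A sketch of this grid accumulation on a row-major probability map (the 0.5 decision threshold is an assumed value; the patent only says the proportion is compared against the cell area):

```python
def foreground_grids(prob, width, grid, threshold=0.5):
    """Split a row-major probability map into grid x grid cells, average
    the foreground probabilities in each cell, and mark cells whose mean
    meets the threshold as foreground cells."""
    height = len(prob) // width
    cells = {}
    for y in range(height):
        for x in range(width):
            key = (y // grid, x // grid)          # cell coordinates
            cells.setdefault(key, []).append(prob[y * width + x])
    return {k: sum(v) / len(v) >= threshold for k, v in cells.items()}
```

The union of the cells marked True is the foreground region.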
In a possible embodiment, judging whether the moving target in the pictures containing the moving-target region belongs to a human body comprises:
projecting the moving-target region onto a coordinate axis, to obtain the pixel count of the moving-target region at each pixel row index;
determining, from the pattern of the pixel counts across the row indices, the row indices of a corresponding number of at least three target parts of a human body;
judging whether the ratios between the distances, along the coordinate axis, of the rows corresponding to the human-body parts fall within the corresponding human-proportion ranges; if they fall within the preset human-proportion ranges, the moving target in the moving-target region is judged to belong to a human body.
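A sketch of the projection-and-proportion test (which rows correspond to which body parts, and the proportion ranges, are hypothetical — the patent names neither the parts nor the ranges):

```python
def row_projection(mask, width):
    """Count foreground pixels per row of a row-major binary mask —
    the projection of the moving-target region onto the vertical axis."""
    return [sum(mask[r * width:(r + 1) * width])
            for r in range(len(mask) // width)]

def ratios_in_range(part_rows, pairs, ranges):
    """Check human-like proportions. part_rows: row indices of candidate
    body parts (e.g. head top, shoulders, feet — assumed). Each entry of
    `pairs` is (a, b, c, d): compare distance(a,b)/distance(c,d) against
    the matching (lo, hi) range."""
    for (a, b, c, d), (lo, hi) in zip(pairs, ranges):
        ratio = abs(part_rows[a] - part_rows[b]) / abs(part_rows[c] - part_rows[d])
        if not lo <= ratio <= hi:
            return False
    return True
```

If every ratio falls inside its range, the region is accepted as a human body.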
In a possible embodiment, detecting whether the human-body region in the pictures containing the human-body region contains a valid face image comprises:
judging with a classifier whether the picture area covered by a search window in the pictures containing the moving-target region belongs to a face region;
traversing the pictures by moving the search window across the pictures containing the moving-target region;
determining the positions and sizes of the facial organs contained in the face region;
determining, from the positional relations of the facial organs, whether the face region constitutes a valid face image.
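The traversal can be sketched as a plain sliding window; the classifier itself (e.g. a cascade) is abstracted here as an opaque predicate, and the single fixed window size and step are simplifying assumptions — real detectors also scan over multiple scales:

```python
def sliding_windows(width, height, win, step):
    """Yield top-left corners of a square search window traversing a picture."""
    for y in range(0, height - win + 1, step):
        for x in range(0, width - win + 1, step):
            yield x, y

def find_face(width, height, win, step, is_face):
    """Return the first window the classifier predicate accepts, or None."""
    for x, y in sliding_windows(width, height, win, step):
        if is_face(x, y, win):
            return x, y
    return None
```

On an accepted window, the method would then locate the facial organs and check their geometry before declaring the face valid.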
In a possible embodiment, saving and/or uploading the pictures containing a valid face image comprises:
extracting, from the pictures containing a valid face image, a picture area that contains at least the face part, and saving and/or uploading it.
In a possible embodiment, when it is detected that the human-body region in the pictures containing the human body contains a valid face image, before the pictures containing the valid face image are saved and/or uploaded:
the valid face images contained in the current frame are compared with the previous frame, and when the previous frame already contains the face images of all the human bodies contained in the current frame, the saving and uploading of the current frame are cancelled.
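Reduced to sets of face identities (how identities are matched between frames is not specified by the patent; treating them as comparable labels is an assumption), the de-duplication rule is a subset test:

```python
def should_save(current_ids, previous_ids):
    """Skip saving when every face identity in the current frame already
    appeared in the previous frame; save if any face is new."""
    return not set(current_ids) <= set(previous_ids)
```

This keeps one representative frame per newly appearing face instead of every near-identical consecutive frame.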
As a second aspect of the present invention, a video processing device for capturing valid access-control video content is also disclosed, comprising:
a moving-target detection module, for performing moving-target detection on the video pictures captured by a video capture device, to identify the pictures that contain a moving-target region;
a human-body judgment module, for judging whether the moving target in the pictures containing the moving-target region identified by the moving-target detection module belongs to a human body;
a face-information detection module, for detecting whether the human-body region in the pictures containing the human body judged by the human-body judgment module contains a valid face image;
a video-picture retention module, for saving and/or uploading the pictures that the face-information detection module detects to contain a valid face image.
In a possible embodiment, the moving-target detection module identifies a foreground region in the captured pictures with a background-subtraction method, and takes the foreground region obtained as the moving-target region;
wherein the background model used by the background-subtraction method is a Gaussian mixture model or a pixel gray-mean model.
In a possible embodiment, the moving-target detection module comprises a first target-detection submodule, for identifying the foreground region of the captured pictures with a Gaussian mixture model as the background model;
the first target-detection submodule comprises:
a model-matching unit, for matching each pixel of a picture, in order, against the Gaussian components sorted from high to low priority, to determine the component that matches the pixel;
a parameter-updating unit, for updating the parameters of the component matched by the model-matching unit to the pixel;
a foreground-selection unit, for taking, among the components updated by the parameter-updating unit, the several highest-priority components whose cumulative weight exceeds a background weight threshold as the background;
a first foreground-acquisition unit, for matching each pixel, in order, against the background components selected by the foreground-selection unit, sorted from high to low priority, to determine the pixels that belong to the foreground and obtain the foreground region.
In a possible embodiment, when the model-matching unit determines the component matching a pixel and no component matches the pixel, the model-matching unit selects the component with the smallest weight as the component matched to the pixel.
In a possible embodiment, the moving-target detection module comprises a second target-detection submodule, for identifying the foreground region of the captured pictures with a pixel gray-mean model as the background model;
the second target-detection submodule comprises:
a background-acquisition unit, for taking the mean of corresponding pixels over training images converted to gray scale as the background pixel values, to obtain the background model;
a background-updating unit, for updating the background model obtained by the background-acquisition unit with the current video frame, to obtain a new background model;
a probability-calculation unit, for computing the gray difference between the picture to be detected, converted to gray scale, and the background model updated by the background-updating unit, and deriving a foreground-pixel probability distribution from the gray difference;
a second foreground-acquisition unit, for obtaining the foreground region from the foreground-pixel probability distribution calculated by the probability-calculation unit.
In a possible embodiment, the second foreground-acquisition unit comprises:
an accumulation-statistics subunit, for dividing the pixels into multiple grid cells according to the foreground-pixel probability distribution, and accumulating the foreground probabilities of the pixels within each cell;
a proportion-judgment subunit, for judging, cell by cell, whether each cell belongs to the foreground according to the proportion of the accumulated probability to the cell area, and thereby obtaining the foreground region made up of the foreground cells.
In a possible embodiment, the human-body judgment module comprises:
a row-pixel-count statistics unit, for projecting the moving-target region onto a coordinate axis, to obtain the pixel count of the moving-target region at each pixel row index;
a target-part matching unit, for determining, from the pattern of the pixel counts produced by the row-pixel-count statistics unit, the row indices of a corresponding number of at least three target parts of a human body;
a distance-ratio judgment unit, for judging whether the pairwise ratios between the distances, along the coordinate axis, of the rows corresponding to the human-body parts determined by the target-part matching unit fall within the corresponding human-proportion ranges; if they fall within the preset human-proportion ranges, the moving target in the moving-target region is judged to belong to a human body.
In a possible embodiment, the face-information detection module comprises:
a face-region search unit, for judging with a classifier whether the picture area covered by a search window in the pictures containing the moving-target region belongs to a face region;
a picture-search traversal unit, for traversing the pictures by moving the search window across the pictures containing the moving-target region;
an organ-feature acquisition unit, for determining the positions and sizes of the facial organs contained in the face region determined by the face-region search unit;
a valid-face judgment unit, for determining, from the positional relations of the facial organs determined by the organ-feature acquisition unit, whether the face region constitutes a valid face image.
In a possible embodiment, the video-picture retention module extracts, from the pictures containing a valid face image, a picture area that contains at least the face part, and saves and/or uploads it.
In a possible embodiment, the device further comprises:
a picture-retention judgment module, for, when the face-information detection module detects that the human-body region in the pictures containing the human body contains a valid face image, and before the video-picture retention module saves and/or uploads the pictures containing the valid face image: comparing the valid face images contained in the current frame with the previous frame, and, when the previous frame already contains the face images of all the human bodies contained in the current frame, cancelling the saving and uploading of the current frame by the video-picture retention module.
(3) Beneficial effects
The video processing method and device disclosed by the invention can prune, in real time, the surveillance pictures captured by an access-control system. When the captured surveillance pictures need to be kept on file as admission records, security records, and material for later tracing, only the pictures containing a valid face image are saved and uploaded, instead of all the captured surveillance pictures. This lowers the storage-space requirement and storage cost on the storage side, reduces the amount of data transmitted over the network and the transmission cost on the upload side, and, as a side benefit, also improves the efficiency of later person-oriented retrieval over the saved pictures.
Brief description of the drawings
The embodiments described below with reference to the drawings are exemplary; they are intended to explain and illustrate the present invention and shall not be construed as limiting its scope of protection.
Fig. 1 is a flow diagram of an embodiment of the video processing method disclosed by the invention.
Fig. 2 is a schematic diagram of one frame of video picture captured by an access-control system.
Fig. 3 is a structural block diagram of an embodiment of the video processing device disclosed by the invention.
Detailed description of the embodiments
To make the purposes, technical solutions, and advantages of the invention clearer, the technical solutions in the embodiments of the present invention are described below in more detail with reference to the accompanying drawings.
An embodiment of the video processing method for capturing valid access-control video content disclosed by the present invention is described in detail below with reference to Fig. 1. As shown in Fig. 1, the video processing method disclosed in this embodiment mainly comprises the following steps.
Step 100: perform moving-target detection on the captured video pictures to identify the pictures that contain a moving-target region. The access-control system captures, through a video capture device such as a camera, video of a certain coverage area containing the entrance, obtaining frames of video pictures of the entrance area; the moving-target detection module of the video processing device then performs moving-target detection on each frame the access-control system captures. A moving target is a target that actually moves.
A moving-target region is the picture area that represents a moving target in a video picture; it usually occupies only part of the whole frame, so the moving-target region is contained within the picture. Fig. 2 shows a schematic diagram of one captured frame, for an access-control system at the entrance of an office building. The person P1 inside the building is walking towards the access-control system; the symbol "⊙" indicates a target moving towards the observation position. The person P2 inside the building is leaving with the back to the access-control system; the symbol "⊕" indicates a target facing away from the observation position. The reception desk R inside the building is fixed, and the car C on the lane outside the building is moving to the left. Of the four targets in Fig. 2, P1, P2, and C are actual moving targets, whose contour areas belong to moving-target regions, while fixed structures such as the reception desk R, the ground, and the load-bearing walls are non-moving targets, whose contour areas belong to the background.
The moving-target detection module can perform moving-target detection on each frame by methods such as background subtraction, frame differencing, or optical flow, to identify the pictures that contain a moving-target region, hereinafter called moving-target video pictures. The picture shown in Fig. 2 would be identified by the module as such a picture. Among the three methods, background subtraction and frame differencing suit relatively simple background areas but are sensitive to changes in lighting; since most access-control systems are installed indoors under artificial lighting, these two methods are particularly suitable for the access-control system of this embodiment, as detailed below.
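Of the methods named here, frame differencing is the simplest to sketch. On flat gray-value lists, with an assumed difference threshold (the patent names the method but gives no threshold):

```python
def frame_difference(prev, curr, threshold=25):
    """Binary motion mask from two consecutive gray frames (flat lists):
    a pixel is marked moving when its gray value changed by more than
    the threshold between frames."""
    return [1 if abs(c - p) > threshold else 0 for p, c in zip(prev, curr)]
```

Connected runs of 1s in the mask would then be grouped into candidate moving-target regions.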
Step 200: judge whether the moving target in the pictures containing a moving-target region belongs to a human body. After the moving-target video pictures are obtained, the human-body judgment module of the video processing device examines the moving-target regions they contain; if the moving target a region represents is a human body, the picture is one whose moving target is a human body, hereinafter called a human-body video picture.
For example, the human-body judgment module examines the four targets contained in the moving-target video picture shown in Fig. 2 and finds that the regions of targets P1 and P2 are actual human-body regions, so the picture shown in Fig. 2 is a human-body video picture.
Step 300: detect whether the human-body region in the pictures containing a human body contains a valid face image. After the human-body video pictures are obtained, the face-information detection module of the video processing device examines the human-body regions they contain, to obtain the pictures from which a valid face image can be identified, hereinafter called face-information video pictures. A valid face image is one from which the identity of the person can be determined, so that the face-information video pictures can serve as data for the admission records, security records, and later tracing of people entering the community or building.
For example, the face-information detection module detects the human-body regions of the two targets P1 and P2 contained in the human-body video picture shown in Fig. 2. Since P1 is moving towards the access-control system, the image region of P1 contains P1's frontal face; P2, however, is moving away from the access-control system, is far from its camera, and has the frontal face turned away from the camera, so the image region of P2 contains no frontal face of P2 at all. Only the image region of P1 therefore contains a valid face image, while the image region of P2 does not. Because the picture shown in Fig. 2 contains a valid face image, it is a face-information video picture; a picture containing no valid face image is not.
Step 400: save and/or upload the pictures that contain a valid face image.
After the face-information video pictures are obtained, the video-picture retention module of the video processing device saves them locally, and may also upload them to the background server of the surveillance system for storage and display, so that they serve as admission records, security records, and material for later tracing of the persons whose valid face images they contain.
The video processing method provided in this embodiment can prune, in real time, the surveillance pictures captured by the access-control system. When those pictures need to be kept on file as admission records, security records, and material for later tracing, only the pictures containing a valid face image are saved and uploaded, instead of all the captured surveillance pictures. This lowers the storage-space requirement and storage cost, reduces the amount of data transmitted over the network and the transmission cost, and as a side benefit improves the efficiency of later person-oriented retrieval over the pictures.
In one embodiment, performing moving object detection on the collected video pictures in step 100 includes:
Step 110: identify the foreground region from the collected video pictures using a background subtraction method, and take the obtained foreground region as the moving target region.
Background subtraction approximates the pixel values of the background image with a pre-established background parameter model, and detects the moving target region by differencing the current frame against the background model. When the current frame is compared with the background image, pixel regions with a large difference are regarded as foreground, and pixel regions with a small difference are regarded as background. The region occupied by a moving target in the video picture is called the foreground, while regions in which no moving target (non-moving content) appears are the background. In Fig. 2, for example, targets P1, P2 and C form the actual foreground and region R forms the actual background, so the regions of P1, P2 and C constitute the moving target region. The background model used by the background subtraction method in step 110 is either a mixture-of-Gaussians model or a pixel gray-mean model.
In one embodiment, identifying the foreground region from the collected video pictures using a mixture-of-Gaussians model as the background model includes the following steps:
Step A1: match each pixel in the video picture, in order, against the Gaussian models sorted by priority from high to low, and determine the Gaussian model that matches the pixel.
The mixture-of-Gaussians model is denoted η(I<sub>t</sub>, μ<sub>i,t</sub>), i = 1, 2, …, K, where I<sub>t</sub> is the pixel value at time t (i.e. in frame t), μ<sub>i,t</sub> is the mean of the i-th Gaussian model at time t, and K is the number of Gaussian models, typically set between three and five. For each Gaussian model, ω<sub>i,t</sub> is the weight of the i-th Gaussian model on the current pixel at time t, and ω<sub>i,t</sub>/δ<sub>i,t</sub> is the priority of the i-th Gaussian model, where δ<sub>i,t</sub> is its standard deviation.
In step A1 a pixel is judged to match a Gaussian model according to formula (1):
|I<sub>t</sub> − μ<sub>i,t−1</sub>| ≤ D<sub>i</sub> · δ<sub>i,t−1</sub>    formula (1);
where μ<sub>i,t−1</sub> is the mean of the i-th Gaussian at frame t−1, δ<sub>i,t−1</sub> is the standard deviation of the i-th Gaussian at frame t−1, and D<sub>i</sub> is a constant.
It should be understood that the mixture-of-Gaussians model must first be trained in advance on multiple consecutive frames of video pictures.
When determining the Gaussian model matching a pixel in step A1, if no Gaussian model matches the pixel, the Gaussian model with the smallest weight is selected as the model matched to the pixel.
Step A2: update the parameters of the Gaussian model matched to the pixel. The parameters updated are the weight ω<sub>i,t</sub>, the variance δ²<sub>i,t</sub> and the mean μ<sub>i,t</sub> of the Gaussian model, updated according to formulas (2) to (4) respectively:
ω<sub>i,t</sub> = (1 − α) · ω<sub>i,t−1</sub> + α    formula (2);
μ<sub>i,t</sub> = (1 − ρ) · μ<sub>i,t−1</sub> + ρ · I<sub>t</sub>    formula (3);
δ²<sub>i,t</sub> = (1 − ρ) · δ²<sub>i,t−1</sub> + ρ · (I<sub>t</sub> − μ<sub>i,t</sub>)²    formula (4);
where α is a user-defined learning rate with 0 ≤ α ≤ 1; the size of α determines how fast the model is updated and is directly proportional to the update speed. ρ is the parameter learning rate, ρ ≈ α/ω<sub>i,t</sub>.
The Gaussian models not matched to the pixel keep their previous means and variances, while their weights decay according to ω<sub>i,t</sub> = (1 − α) · ω<sub>i,t−1</sub>.
If no Gaussian model matched the pixel in step A1, so that the model with the smallest weight was selected, then when updating the parameters of that smallest-weight model its mean is reset to I<sub>t</sub>, its standard deviation is reset to δ<sub>0</sub>, and its weight is updated as ω<sub>K,t</sub> = (1 − α) · ω<sub>K,t−1</sub> + α.
Step A3: take, from the updated Gaussian models, the Nb highest-priority models whose weight sum exceeds the background weight threshold as the background. After the parameters of the mixture model are updated, the Gaussian models are sorted from high to low using priority as the ordering criterion: the higher a model's priority, the earlier it appears in the order and the more likely it is to represent the background. A threshold T on the partial sum of background weights serves as the screening criterion: if T is less than the sum of the weights of the first Nb models, the first Nb models are taken as the background distribution.
Step A4: match each pixel, in order, against the Nb background Gaussian models sorted by priority from high to low, determine the pixels belonging to the foreground, and obtain the foreground region. Suppose the current pixel is I<sub>t</sub>; it is matched one by one, in priority order, against the Nb Gaussian models screened out in step A3. If formula (5) is satisfied, I<sub>t</sub> is judged to be a foreground point; otherwise I<sub>t</sub> is judged to be a background point. This yields the foreground region in the video picture, that is, the regions of targets P1, P2 and C in Fig. 2.
|I<sub>t</sub> − μ<sub>i,t</sub>| > D<sub>2</sub> · δ<sub>i,t</sub>,  i = 1, 2, …, Nb    formula (5);
where D<sub>2</sub> is a user-defined constant.
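The per-pixel cycle of steps A1–A4 can be sketched as follows. This is a minimal single-pixel illustration in Python, not the patented implementation; the concrete values of D<sub>i</sub>, D<sub>2</sub>, α, δ<sub>0</sub> and the background weight threshold are assumptions chosen only for the demo.

```python
import numpy as np

class PixelGMM:
    """Minimal per-pixel mixture-of-Gaussians sketch of steps A1-A4."""
    def __init__(self, k=3, alpha=0.05, d_match=2.5, d_fg=2.5,
                 delta0=15.0, bg_weight_thr=0.7):
        self.k, self.alpha = k, alpha
        self.d_match, self.d_fg = d_match, d_fg     # D_i and D_2 (assumed)
        self.delta0, self.bg_weight_thr = delta0, bg_weight_thr
        self.mu = np.zeros(k)                       # means mu_i
        self.delta = np.full(k, delta0)             # std deviations delta_i
        self.w = np.full(k, 1.0 / k)                # weights omega_i

    def _priority_order(self):
        # priority omega_i / delta_i, sorted high to low (steps A1/A3)
        return np.argsort(-(self.w / self.delta))

    def update(self, pixel):
        """Steps A1-A2: match the pixel, update per formulas (1)-(4)."""
        matched = None
        for i in self._priority_order():            # formula (1)
            if abs(pixel - self.mu[i]) <= self.d_match * self.delta[i]:
                matched = i
                break
        if matched is None:                         # no match: reset lightest
            matched = int(np.argmin(self.w))
            self.mu[matched], self.delta[matched] = pixel, self.delta0
        rho = self.alpha / max(self.w[matched], 1e-6)
        self.w *= (1.0 - self.alpha)                # unmatched weights decay
        self.w[matched] += self.alpha               # formula (2)
        self.mu[matched] = (1 - rho) * self.mu[matched] + rho * pixel  # (3)
        var = (1 - rho) * self.delta[matched] ** 2 \
              + rho * (pixel - self.mu[matched]) ** 2                  # (4)
        self.delta[matched] = np.sqrt(var)
        self.w /= self.w.sum()                      # keep weights normalized

    def is_foreground(self, pixel):
        """Steps A3-A4: test against the Nb background models, formula (5)."""
        cum, nb_models = 0.0, []
        for i in self._priority_order():
            nb_models.append(i)
            cum += self.w[i]
            if cum > self.bg_weight_thr:
                break
        return all(abs(pixel - self.mu[i]) > self.d_fg * self.delta[i]
                   for i in nb_models)

# train on a stable background value, then probe
gmm = PixelGMM()
for _ in range(50):
    gmm.update(100.0)
print(gmm.is_foreground(100.0))  # False: matches a background model
print(gmm.is_foreground(200.0))  # True: far from every background model
```

In a real detector one such mixture is maintained for every pixel position, and the per-frame foreground mask is the set of pixels for which `is_foreground` holds.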
In one embodiment, identifying the foreground region from the collected video pictures using a pixel gray-mean model as the background model includes:
Step B1: take the mean of corresponding pixels in the training images, converted to gray-scale images, as the background pixel values to obtain the background model. The training images are a video sequence used to train the background model: a portion of the frames preceding the current moment is selected, these training images are converted to gray-scale, and the mean of corresponding pixels across the training images is taken as the background pixel value, yielding the background model B, i.e. the background image.
The background model B is given by formula (6):
B(x) = (1/T) Σ<sub>t=1..T</sub> I<sub>t</sub>(x)    formula (6);
where I<sub>t</sub> is the gray-scale image at time t (frame t) and T is the total length of the video sequence used to train the background model, i.e. the "portion of training images preceding the current moment" mentioned above, so T also corresponds to a number of video frames. The larger T is, the more video pictures are used and the more accurate the resulting background model, but the longer the computation takes.
Step B2: update the obtained background model with the video image of the current frame to obtain a new background model. Since the background is not constant forever, the background model needs updating. For example, the pixel values of the current background model can be computed by formula (7), giving the updated background model:
u<sub>t</sub> = (1 − α) · u<sub>t−1</sub> + α · p<sub>t</sub>    formula (7);
where p<sub>t</sub> is the pixel value of the image at time t, u<sub>t−1</sub> is the pixel value of the corresponding current background model, and α is a user-defined learning rate with 0 ≤ α ≤ 1. The size of α determines how strongly the current frame adjusts the background model: the larger α is, the greater the influence of the current frame on the adjustment of the background model and the faster the model adapts to environmental change. When α = 1, the current frame simply replaces the original background model as the new background.
Step B3: compute the gray-scale difference between the video picture to be detected, converted to gray-scale, and the updated background model, and obtain the foreground pixel probability distribution from the gray-scale difference.
If a moving target appears in the video sequence, it can be detected from the difference between the current video picture and the background frame. The video picture to be detected is first converted to a gray-scale image, and the gray-scale difference dI<sub>t</sub>(x) between the frame to be detected and the background model is then computed by formula (8):
dI<sub>t</sub>(x) = |I<sub>t</sub>(x) − B(x)|    formula (8);
where I<sub>t</sub>(x) is the video picture to be detected and B(x) is the background image. The gray-scale difference expresses how much the gray value of the current frame has changed relative to the background gray level; if the change exceeds a constant threshold thr, the change is attributed to a moving target.
Whether a pixel belongs to the foreground is judged by formula (9), from which the foreground region is then obtained:
P<sub>t</sub>(x) = min(dI<sub>t</sub>(x)/thr, 1)    formula (9);
where P<sub>t</sub>(x) approximates the probability that the current pixel belongs to the foreground. If P<sub>t</sub>(x) is very small the pixel can be provisionally judged a background pixel; if P<sub>t</sub>(x) is very large, the pixel is a foreground pixel.
Step B4: obtain the foreground region from the foreground pixel probability distribution. Combining all pixels judged to be foreground pixels yields the foreground region. Specifically, in order to suppress background noise, so that the foreground region is not split into more regions than the actual foreground region count and approximates the actual foreground region as closely as possible, step B4 includes the following steps:
Step B41: divide the foreground pixels into multiple grid cells according to the foreground pixel probability distribution, and compute the cumulative sum of the foreground probabilities of the pixels in each cell. The cumulative sum A is computed as in formula (10):
A = Σ<sub>x∈L</sub> P(x)    formula (10);
where L is a local region, i.e. a grid cell, and x ranges over the pixels in region L.
Step B42: taking the grid cell as the unit, judge from the ratio of each cell's cumulative sum to the cell's area whether the cell belongs to the foreground, and thereby obtain a foreground region composed of foreground cells. Since 0 ≤ P(x) ≤ 1, the maximum value of A is exactly the area S of region L. The per-region decision condition for judging whether a region belongs to the foreground or the background can then use formula (11):
A > β · S    formula (11);
where β is a constant with 0 ≤ β ≤ 1. The larger β is, the more foreground pixels a region must contain to be judged foreground, and the harder it becomes for a region to be judged a foreground region.
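Steps B1–B4 can be sketched end to end as below. This is a toy illustration under stated assumptions, not the patented implementation: the values of thr, α, β and the grid size are invented for the demo, and the "frames" are tiny synthetic arrays.

```python
import numpy as np

def gray_mean_foreground(train_frames, frame, thr=30.0, alpha=0.05,
                         beta=0.5, grid=4):
    """Sketch of steps B1-B4: gray-mean background model plus
    grid-based foreground decision (formulas (6)-(11))."""
    frames = np.stack(train_frames).astype(float)
    bg = frames.mean(axis=0)                  # formula (6): B = per-pixel mean
    bg = (1 - alpha) * bg + alpha * frame     # formula (7): model update
    d = np.abs(frame - bg)                    # formula (8): |I_t - B|
    p = np.minimum(d / thr, 1.0)              # formula (9): probability in [0,1]
    h, w = p.shape
    mask = np.zeros_like(p, dtype=bool)
    for r in range(0, h, grid):               # step B41: grid cells
        for c in range(0, w, grid):
            cell = p[r:r + grid, c:c + grid]
            a = cell.sum()                    # formula (10): cumulative sum A
            s = cell.size                     # the maximum of A is the area S
            if a > beta * s:                  # formula (11): cell is foreground
                mask[r:r + grid, c:c + grid] = True
    return mask

# toy scene: flat background, bright square "target" in the current frame
train = [np.full((8, 8), 50.0) for _ in range(10)]
cur = np.full((8, 8), 50.0)
cur[0:4, 0:4] = 200.0                         # the moving target
fg = gray_mean_foreground(train, cur)
print(fg[0:4, 0:4].all(), fg[4:, 4:].any())   # True False
```

The grid decision of step B42 is what keeps isolated noisy pixels from fragmenting the mask: a whole cell is accepted or rejected at once.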
In one embodiment, judging in step 200 whether the moving targets in the video pictures containing the moving target region belong to human body targets includes the following steps:
Step 210: project the moving target region onto a coordinate axis to obtain the pixel count at each pixel row index. Taking Fig. 2 as an example, the moving target regions of targets P1, P2 and C are converted to binary images, so that pixels inside a moving target region take the value 1 and all other pixels take the value 0. The row-index/pixel-count chart of each moving target contour is then obtained by projection onto the coordinate axis: the X axis of the chart is the pixel row index (the moving target region spans multiple pixel rows), and the Y axis is the number of pixels whose value is 1. Since the shapes (i.e. contours) of moving target regions differ, the pixel count of each row differs as well: for a circle the per-row pixel count traces a curve that rises and then falls along the X axis, while for a triangle resting on its base the per-row pixel count rises linearly along the X axis.
Step 220: according to the characteristics of the per-row pixel counts, determine the corresponding number of pixel row indices for at least three target parts of the human body.
Differently shaped moving target regions have different per-row pixel-count characteristics, which appear on the row-index/pixel-count chart as differences in curve shape and rising/falling trend. Because the human body has certain characteristic shape features, the curve a typical human body produces on such a chart usually looks as follows: from the head to the neck along the X axis it rises and then falls to a lower pixel-count level, with the head and neck positions fairly close together on the X axis; from the neck onward it increases again, possibly with some rises and falls in the middle, and finally declines near the feet.
It follows that the most stable and invariant parts are the head, the neck and the feet, so these three are chosen as the target parts. The section between the neck and the feet varies with a person's build, but a rise-then-fall from head to neck, a rise after the neck and a decline before the feet are the characteristic per-row pixel-count pattern of a typical human body.
The head, neck and feet can therefore be taken as the target parts of the human body. Since the head and the feet are the two ends, their pixel counts are zero; the neck, being the thinnest part, corresponds to the lowest trough of the curve and lies relatively close to the head.
Accordingly, the head and feet of the target parts are located at the ends on the X axis, and the neck corresponds to a trough of the curve. After the charts for moving targets P1, P2 and C are obtained, the positions of the head, neck and feet on the X axis are determined in each of the three charts.
Step 230: judge whether the ratios of the distances, along the coordinate axis, between the pixel rows corresponding to the human body target parts fall within preset human-body distance-ratio ranges; if they do, the moving target in the moving target region is judged to belong to a human body target.
Since the ratio of the head-to-neck distance to the neck-to-feet distance of a typical human body follows a regular pattern, on the X axis the ratio of the head-to-neck distance HN to the neck-to-feet distance NF lies within a certain range, for example 0.1 ≤ HN/NF ≤ 0.15, making [0.1, 0.15] the human-body distance-ratio range. If a moving target's HN/NF lies within this range it is judged to be a human body target; otherwise it is judged non-human. Targets P1 and P2 are thus judged to be human body targets, while target C is judged non-human.
Similarly, the ratio of the head-to-neck distance HN to the head-to-feet distance HF can be chosen as the criterion for judging whether a moving target belongs to a human body target, in which case the applicable human-body distance-ratio range changes accordingly.
In one embodiment, detecting in step 300 whether the human body target regions contained in the video pictures containing human body targets include valid face images includes the following steps:
Step 310: judge, by means of a classifier, whether the picture region covered by a search window in the video picture containing the moving target region belongs to a face region. The search window is a window smaller than the video picture, used to delimit a region as the candidate face-recognition region; since a classifier can classify samples, the classifier can be applied to the region delimited by the search window to classify the image and so recognize faces, obtaining the face-image class.
Step 320: traverse the video picture by moving the search window within the video picture containing the moving target region. The search window can be moved across the video picture in steps smaller than the window's length/width so as to traverse the entire picture. In Fig. 2, for example, the face information detection module can detect, by traversal, the face region contained in the region of target P1; target P2 is not recognized as a face because its region contains no face.
Step 330: determine the positions and sizes of the facial organs contained in the face region. The organs visible differ with the angle of the face: a frontal face shows all of the eyebrows, eyes, nose and mouth, though possibly not the ears, while a profile shows only the eyebrow, eye and ear of one side together with the nose and mouth.
Step 340: determine, based on the positional relationships of the facial organs, whether the face region belongs to a valid face image. A frontal face close to the camera, with high organ clarity and a correct angle, belongs to a valid face image; for example, the region of target P1 in Fig. 2 contains a valid face image. A profile far from the camera, with low organ clarity and an oblique angle, does not belong to a valid face image.
In one embodiment, an Adaboost classifier judges in step 310 whether the picture region covered by the search window belongs to a face region. The Adaboost classifier is trained as follows:
Given a training sample set S, where X and Y correspond to face positive samples and non-face negative samples respectively, and T is the maximum number of training iterations. There are N training samples in total; each of the N samples is given the same weight, with the sample weights initialized to 1/N, which serves as the initial probability distribution over the training samples. A first weak classifier is obtained by training on the N samples under this distribution.
For misclassified samples the corresponding weights are increased, and for correctly classified samples the weights are decreased, so that the misclassified samples are highlighted and a new sample distribution is obtained. The samples are trained on again under the new distribution, yielding a second weak classifier.
Proceeding in this way through T iterations yields T weak classifiers, and superimposing these T weak classifiers with appropriate weights gives the desired strong classifier.
Once the strong classifier is trained, it can be used to correctly identify face region information in the video pictures. For example, the strong classifier can identify the face region P1f of target P1 in Fig. 2, so Fig. 2 belongs to the video pictures containing a valid face image: it contains the valid face P1f.
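The reweighting loop described above can be sketched with a generic discrete AdaBoost on toy one-dimensional data. This is not the patented Haar-feature face classifier: the weak learners here are simple threshold stumps, and the data is an invented stand-in where "face" samples are values above 5.

```python
import numpy as np

def adaboost(x, y, rounds=10):
    """Generic discrete AdaBoost sketch: reweight samples each round and
    combine T weak classifiers (1-D threshold stumps) into a strong one."""
    n = len(x)
    w = np.full(n, 1.0 / n)                  # initial weights 1/N
    stumps = []                              # (threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for thr in np.unique(x):             # pick the lowest-error stump
            for pol in (1, -1):
                pred = np.where(pol * (x - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, pol, pred)
        err, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        a = 0.5 * np.log((1 - err) / err)    # weak-classifier weight
        w *= np.exp(-a * y * pred)           # raise misclassified weights
        w /= w.sum()                         # the new sample distribution
        stumps.append((thr, pol, a))
    def strong(xs):                          # weighted vote of weak learners
        score = sum(a * np.where(pol * (xs - thr) >= 0, 1, -1)
                    for thr, pol, a in stumps)
        return np.where(score >= 0, 1, -1)
    return strong

x = np.array([1, 2, 3, 4, 6, 7, 8, 9], dtype=float)
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])   # +1 = "face", -1 = "non-face"
clf = adaboost(x, y)
print(clf(np.array([0.0, 10.0])))   # [-1  1]
```

In the patent's setting each weak learner would instead be a feature-based face/non-face test applied to the window contents, but the weight-update and weighted-vote structure is the same.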
After recognition, the video pictures containing recognizable face regions (valid video pictures) are obtained. But since the same person generates a large number of valid video pictures while passing through, the valid content these pictures contain is, as far as the record of that person is concerned, identical. For example, among M consecutive frames of video pictures collected over a period of time, suppose j frames (whether consecutive or not) are identified by this embodiment as containing valid face images; the complete image content of those j frames is then saved locally in the access control system or uploaded to the background server of the monitoring system as archived records.
However, since the background region and regions of low interest sometimes need not be saved, in one embodiment saving and/or uploading the video pictures containing valid face images in step 400 may specifically be: extracting from the video pictures containing valid face images a picture region that at least includes the face part, and saving and/or uploading that region. For the video picture of Fig. 2, for example, only the region of target P1 containing the face region P1f, i.e. the human body region of P1 in the video picture, may be saved or uploaded, or even only the face region P1f itself. This further reduces the amounts of data stored and uploaded.
Moreover, the valid faces contained in the j frames above may all belong to the same person, so when video pictures are saved and uploaded in step 400 the saved and uploaded content is heavily duplicated. It is unnecessary to preserve every video picture that contains valid face information: doing so wastes storage capacity and raises data storage cost, while also increasing the amount of network data transmitted and raising data transmission cost. Therefore, in one embodiment, when it is detected in step 300 that the human body target regions contained in the video pictures containing human body targets include valid face images, before the video pictures containing valid face images are saved and/or uploaded in step 400:
the valid face images contained in the current video frame are first compared with the previous video frame, and when the previous video frame contains face images of all the human body targets contained in the current video frame, the saving and uploading of the current video frame are cancelled.
Taking the current video frame shown in Fig. 2 as an example: Fig. 2 contains a valid face image, so it satisfies the condition for being saved and uploaded. But comparison with the previous frame reveals that the persons corresponding to the valid face images contained in the previous frame and the current frame are identical, both being target P1, so these two frames are identical in meaning and contribution as entry records and security records. Provided the previous frame, or an even earlier frame identical to the previous frame in meaning and contribution, has been saved and uploaded, the current frame need not be saved and uploaded, further reducing the amounts of data stored and uploaded.
Specifically, the access control system may cache the previous video frame and use it as the reference when judging whether the current frame needs to be saved and uploaded. Then, regardless of whether the previous frame itself was saved or uploaded, it is guaranteed that within a run of consecutive video frames, if every frame contains the same faces and thus the same meaning for entry and security records, only the first frame is saved and uploaded. Alternatively, each time a new video frame is examined, the quality of its valid face images may be compared, and a frame of the same meaning whose face-image quality is higher than that of an already saved frame replaces the previously stored frame.
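The cancel-save rule can be sketched as below. The identity labels ('P1', 'P2') are a hypothetical stand-in for whatever face-matching the system uses to decide that two frames show the same person; the stream contents are invented for the demo.

```python
def frames_to_keep(frames):
    """Sketch of the step-400 deduplication: cache the previous frame's
    face identities and skip the current frame when the previous frame
    already contained faces of all its human targets."""
    kept, prev_faces = [], set()
    for idx, faces in frames:
        # keep the frame only if it has a valid face the previous frame lacked
        if faces and not (set(faces) <= prev_faces):
            kept.append(idx)                 # save/upload this frame
        prev_faces = set(faces)              # cache regardless of saving
    return kept

# frames as (index, identities of valid faces): 'P1' walks through,
# then 'P2' appears alongside, then the scene empties
stream = [(0, ['P1']), (1, ['P1']), (2, ['P1']),
          (3, ['P1', 'P2']), (4, ['P1', 'P2']), (5, [])]
print(frames_to_keep(stream))   # [0, 3]
```

Because the cache holds only the immediately previous frame, a person who leaves and returns later produces a fresh saved frame, which is exactly the behaviour the entry-record function requires.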
It should be noted that, since the saved video pictures serve as entry records and security records, the saving of most frames may be skipped only within one continuous run of video frames containing the same valid face. Otherwise, if target P1 appeared once a month ago and was captured and uploaded to the monitoring system, and then appears and is captured again today, the frames of its new appearance, although a month apart from and discontinuous with the earlier ones, would be skipped as already-saved information and never saved, which would defeat the entry-record and security-record functions.
The embodiment of the video processing apparatus for realizing effective content interception for access control disclosed by the present invention is described in detail below with reference to Fig. 3. This embodiment is used to implement the aforementioned video processing method.
As shown in Fig. 3, the video processing apparatus disclosed in this embodiment comprises:
a moving object detection module, configured to perform moving object detection on the video pictures collected by the video collection device so as to identify video pictures containing a moving target region;
a human body target judgment module, configured to judge whether the moving targets in the video pictures containing the moving target region identified by the moving object detection module belong to human body targets;
a face information detection module, configured to detect whether the human body target regions contained in the video pictures containing human body targets, as judged by the human body target judgment module, include valid face images;
a video picture retention module, configured to save and/or upload the video pictures containing valid face images detected by the face information detection module.
In one embodiment, the moving object detection module identifies the foreground region from the collected video pictures using a background subtraction method, and takes the obtained foreground region as the moving target region;
wherein the background model used by the background subtraction method is a mixture-of-Gaussians model or a pixel gray-mean model.
In one embodiment, the moving object detection module includes a first target detection submodule for identifying the foreground region of the collected video pictures using a mixture-of-Gaussians model as the background model;
the first target detection submodule includes:
a model matching unit, configured to match each pixel in the video picture, in order, against the Gaussian models sorted by priority from high to low and determine the Gaussian model matching the pixel;
a parameter updating unit, configured to update the parameters of the Gaussian model matched to the pixel by the model matching unit;
a background selection unit, configured to take, from the Gaussian models whose parameters have been updated by the parameter updating unit, the highest-priority Gaussian models whose weight sum exceeds the background weight threshold as the background;
a first foreground acquisition unit, configured to match each pixel, in order, against the background Gaussian models selected by the background selection unit, sorted by priority from high to low, determine the pixels belonging to the foreground, and obtain the foreground region.
In one embodiment, when the model matching unit determines the Gaussian model matching a pixel and no Gaussian model matches the pixel, the model matching unit selects the Gaussian model with the smallest weight as the Gaussian model matched to the pixel.
In one embodiment, the moving object detection module includes a second target detection submodule for identifying the foreground region of the collected video pictures using a pixel gray-mean model as the background model;
the second target detection submodule includes:
a background acquisition unit, configured to take the mean of corresponding pixels in the training images converted to gray-scale images as the background pixel values and obtain the background model;
a background update unit, configured to update the background model obtained by the background acquisition unit with the video image of the current frame to obtain a new background model;
a probability calculation unit, configured to compute the gray-scale difference between the video picture to be detected, converted to gray-scale, and the background model updated by the background update unit, and to obtain the foreground pixel probability distribution from the gray-scale difference;
a second foreground acquisition unit, configured to obtain the foreground region from the foreground pixel probability distribution computed by the probability calculation unit.
In one embodiment, the second foreground acquisition unit includes:
a cumulative statistics subunit, configured to divide the foreground pixels into multiple grid cells according to the foreground pixel probability distribution and compute the cumulative sum of the foreground probabilities of the pixels in each cell;
a ratio judgment subunit, configured to judge, taking the grid cell as the unit, whether each cell belongs to the foreground from the ratio of the cell's cumulative sum to the cell's area, thereby obtaining a foreground region composed of foreground cells.
In one embodiment, the human body target judgment module includes:
a row pixel-count statistics unit, configured to project the moving target region onto a coordinate axis to obtain the pixel count of the moving target region at each pixel row index;
a target part matching unit, configured to determine, from the characteristics of the per-row pixel counts computed by the row pixel-count statistics unit, the corresponding number of pixel row indices for at least three target parts of the human body;
a distance-ratio judgment unit, configured to judge whether the pairwise ratios of the distances along the coordinate axis between the pixel rows corresponding to the human body target parts determined by the target part matching unit fall within the corresponding human-body distance-ratio ranges, and, if they fall within the preset ranges, to judge that the moving target in the moving target region belongs to a human body target.
In one embodiment, the face information detection module includes:
a face region search unit, configured to judge, by means of a classifier, whether the picture region covered by the search window in the video picture containing the moving target region belongs to a face region;
a picture search traversal unit, configured to traverse the video picture by moving the search window within the video picture containing the moving target region;
an organ feature acquisition unit, configured to determine the positions and sizes of the facial organs contained in the face region determined by the face region search unit;
a valid face judgment unit, configured to determine, based on the positional relationships of the facial organs determined by the organ feature acquisition unit, whether the face region belongs to a valid face image.
In one embodiment, the video picture retention module extracts from the video pictures containing valid face images a picture region that at least includes the face part, and saves and/or uploads that region.
In one embodiment, the apparatus further includes:
a picture retention judgment module, configured so that, when the face information detection module detects that the human body target regions contained in the video pictures containing human body targets include valid face images, before the retention module saves and/or uploads the video pictures containing valid face images: the valid face images contained in the current video frame are compared with the previous video frame, and when the previous video frame contains face images of all the human body targets contained in the current video frame, the saving and uploading of the current video frame by the video picture retention module are cancelled.
It should be understood that throughout the accompanying drawings the same or similar reference signs denote the same or similar elements, or elements having the same or similar functions. The described embodiments are some, not all, of the embodiments of the present invention, and in the absence of conflict the features of the embodiments of the present application may be combined with one another. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Herein, "first", "second" and the like are used only to distinguish elements from one another, and do not indicate their importance, order, or the like.
The division into modules, units or components herein is only a division of logical functions; in actual implementation other divisions are possible, for example multiple modules and/or units may be combined or integrated into another system. Modules, units or components described as separate parts may or may not be physically separate. A component shown as a unit may or may not be a physical unit; it may be located in one specific place or distributed over grid units. Some or all of the units may therefore be selected according to actual needs to realize the scheme of an embodiment.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution readily conceivable by persons familiar with the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Accordingly, the protection scope of the present invention shall be determined by the scope of the claims.

Claims (10)

1. A video processing method for realizing interception of effective contents of entrance guard, characterized by comprising:
performing moving-target detection on captured video pictures to identify video pictures that contain a moving-target region;
judging whether the moving target corresponding to the video pictures containing the moving-target region belongs to a human target;
detecting whether the human-target region contained in the video pictures containing the human target includes a valid face image; and
saving and/or uploading the video pictures containing the valid face image.
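Stripped of claim language, claim 1 describes a four-stage filtering pipeline: a captured picture must pass motion detection, a human-proportion check, and face validation before it is retained. A minimal Python sketch of that control flow follows; the stage callables `detect_motion`, `is_human_target`, `has_valid_face`, and `retain` are hypothetical stand-ins, not names from the patent:

```python
def process_frame(frame, detect_motion, is_human_target, has_valid_face, retain):
    """Run one video picture through the four-stage filter of claim 1.

    Each stage is a caller-supplied callable; a frame is retained only
    if it survives every stage.
    """
    region = detect_motion(frame)          # stage 1: moving-target region, or None
    if region is None:
        return False
    if not is_human_target(region):        # stage 2: human-body proportion check
        return False
    if not has_valid_face(frame, region):  # stage 3: valid face image present?
        return False
    retain(frame)                          # stage 4: save and/or upload
    return True

# Tiny demonstration with stub stages that always succeed.
kept = []
result = process_frame(
    "frame-0",
    detect_motion=lambda f: "region",
    is_human_target=lambda r: True,
    has_valid_face=lambda f, r: True,
    retain=kept.append,
)
```

Because each stage short-circuits, frames with no motion are rejected at the cheapest stage, which is the stated point of the method: only pictures containing a valid face reach storage or upload.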
2. The method according to claim 1, characterized in that performing moving-target detection on the captured video pictures comprises:
identifying a foreground region from the captured video pictures using a background subtraction method, and taking the obtained foreground region as the moving-target region;
wherein the background model used by the background subtraction method is a Gaussian mixture model or a pixel gray-level mean model.
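The second background model named in claim 2, the pixel gray-level mean model, can be sketched as a per-pixel running mean of past gray values: a pixel whose current value departs from that mean by more than a threshold is marked as foreground. The learning rate and threshold below are illustrative choices, not values from the patent:

```python
# Running-mean background subtraction on a one-dimensional toy "frame".
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-mean background."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]

def foreground_mask(background, frame, threshold=30):
    """1 where a pixel differs from the background mean, else 0."""
    return [1 if abs(f - b) > threshold else 0 for b, f in zip(background, frame)]

# A static scene at gray 100 with a bright moving object over pixels 2-3.
background = [100.0] * 6
frame = [100, 101, 220, 215, 99, 100]
mask = foreground_mask(background, frame)        # foreground at pixels 2 and 3
background = update_background(background, frame)
```

Updating the mean after classification lets a once-moving object that stops eventually be absorbed into the background, at a rate set by `alpha`.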
3. The method according to claim 2, characterized in that performing foreground-region identification on the captured video pictures using a Gaussian mixture model as the background model comprises:
matching each pixel in the video pictures in sequence against the Gaussian models sorted by priority from high to low, to determine the Gaussian model matching the pixel;
updating the parameters of the Gaussian model matching the pixel;
taking as the background those highest-priority Gaussian models, among the updated Gaussian models, whose weights sum to more than a background weight threshold; and
matching each pixel in sequence against the background Gaussian models sorted by priority from high to low, determining the pixels belonging to the foreground, and obtaining the foreground region.
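The four steps of claim 3 can be sketched for a single pixel as follows. Each Gaussian is a `(weight, mean, variance)` tuple and "priority" is the usual weight-over-standard-deviation ranking; the learning rate, the 2.5-sigma match test, and the background weight threshold are conventional mixture-of-Gaussians choices, not values fixed by the patent:

```python
import math

ALPHA = 0.05        # learning rate for parameter updates (illustrative)
MATCH_SIGMAS = 2.5  # a pixel matches a Gaussian within this many std devs
BG_WEIGHT_T = 0.7   # background weight threshold (illustrative)

def by_priority(models):
    """Sort Gaussians by weight/sigma, highest priority first."""
    return sorted(models, key=lambda m: m[0] / math.sqrt(m[2]), reverse=True)

def classify_and_update(models, x):
    """Run claim 3's steps for one pixel value x; return the updated
    mixture and whether x was classified as foreground."""
    models = by_priority(models)
    # Step 1: match x in sequence against the priority-sorted Gaussians.
    matched = next((i for i, (w, mu, var) in enumerate(models)
                    if abs(x - mu) <= MATCH_SIGMAS * math.sqrt(var)), None)
    # Step 2: update the matched Gaussian's parameters; every weight
    # decays except the matched one, which is reinforced.
    updated = []
    for i, (w, mu, var) in enumerate(models):
        if i == matched:
            mu = (1 - ALPHA) * mu + ALPHA * x
            var = (1 - ALPHA) * var + ALPHA * (x - mu) ** 2
            w = (1 - ALPHA) * w + ALPHA
        else:
            w = (1 - ALPHA) * w
        updated.append((w, mu, var))
    # Step 3: the highest-priority Gaussians whose weights sum past the
    # background weight threshold form the background model.
    updated = by_priority(updated)
    background, total = [], 0.0
    for g in updated:
        background.append(g)
        total += g[0]
        if total > BG_WEIGHT_T:
            break
    # Step 4: x is foreground if it matches no background Gaussian.
    is_foreground = all(abs(x - mu) > MATCH_SIGMAS * math.sqrt(var)
                        for _, mu, var in background)
    return updated, is_foreground

# A dominant background Gaussian near gray 100 and a weak one near 200.
mixture = [(0.8, 100.0, 25.0), (0.2, 200.0, 25.0)]
_, fg_near = classify_and_update(mixture, 102.0)  # close to the background
_, fg_far = classify_and_update(mixture, 250.0)   # matches nothing
```

In a full implementation this per-pixel routine runs for every pixel of every frame; pixels flagged as foreground are then grouped into the connected moving-target region that the later claims operate on.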
4. The method according to claim 1, characterized in that judging whether the moving target corresponding to the video pictures containing the moving-target region belongs to a human target comprises:
projecting the moving-target region onto a coordinate axis to obtain the pixel count of the moving-target region at each pixel row index;
determining, according to features of the pixel counts at the pixel row indices, a corresponding number of pixel row indices corresponding to at least three target parts of the human body; and
judging whether the pairwise ratios of the distances along the coordinate axis between the pixel row indices corresponding to the human-body target parts fall within the corresponding human-body distance-ratio ranges; and if they fall within the preset human-body distance-ratio ranges, determining that the moving target in the moving-target region belongs to a human target.
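The projection test of claim 4 can be sketched as: count the foreground pixels in every row of the target's binary mask, pick rows for three body parts from features of that profile, then test a distance ratio. The particular part choices (head top, shoulders, feet), the width-jump feature, and the accepted ratio range below are illustrative assumptions, not values taken from the patent:

```python
def row_profile(mask):
    """Foreground-pixel count per row of a binary mask (the projection
    of the moving-target region onto the vertical axis)."""
    return [sum(row) for row in mask]

def locate_parts(profile):
    """Rows for head top, shoulders, and feet (hypothetical features):
    head top = first non-empty row, shoulders = first row where the
    profile jumps to at least twice the head width, feet = last
    non-empty row."""
    nonzero = [i for i, c in enumerate(profile) if c > 0]
    top, bottom = nonzero[0], nonzero[-1]
    shoulders = next(i for i in nonzero[1:] if profile[i] >= 2 * profile[top])
    return top, shoulders, bottom

def looks_human(mask, ratio_range=(0.1, 0.25)):
    """Accept if (head-to-shoulder)/(head-to-foot) lies in ratio_range,
    a stand-in for the claim's human-body distance-ratio check."""
    top, shoulders, bottom = locate_parts(row_profile(mask))
    ratio = (shoulders - top) / (bottom - top)
    return ratio_range[0] <= ratio <= ratio_range[1]

# 10-row toy mask: a narrow head (rows 0-1) above a wide body (rows 2-9).
head = [0, 1, 1, 0]
body = [1, 1, 1, 1]
mask = [head, head] + [body] * 8
```

Because the check depends only on ratios of row distances, not absolute sizes, it is insensitive to how far the person stands from the camera, which is presumably why the claim phrases the thresholds as distance-ratio ranges.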
5. The method according to claim 1, characterized in that saving and/or uploading the video pictures containing the valid face image comprises:
extracting, from the video pictures containing the valid face image, a picture region that includes at least the face part, and saving and/or uploading the extracted region.
6. A video processing apparatus for realizing interception of effective contents of entrance guard, characterized by comprising:
a moving-target detection module, configured to perform moving-target detection on video pictures captured by a video capture device, to identify video pictures that contain a moving-target region;
a human-target judgment module, configured to judge whether the moving target corresponding to the video pictures containing the moving-target region identified by the moving-target detection module belongs to a human target;
a face-information detection module, configured to detect whether the human-target region contained in the video pictures containing the human target determined by the human-target judgment module includes a valid face image; and
a video-picture retention module, configured to save and/or upload the video pictures containing the valid face image detected by the face-information detection module.
7. The apparatus according to claim 6, characterized in that the moving-target detection module identifies a foreground region from the captured video pictures using a background subtraction method, and takes the obtained foreground region as the moving-target region;
wherein the background model used by the background subtraction method is a Gaussian mixture model or a pixel gray-level mean model.
8. The apparatus according to claim 7, characterized in that the moving-target detection module comprises a first target-detection submodule, configured to perform foreground-region identification on the captured video pictures using a Gaussian mixture model as the background model;
the first target-detection submodule comprising:
a model matching unit, configured to match each pixel in the video pictures in sequence against the Gaussian models sorted by priority from high to low, to determine the Gaussian model matching the pixel;
a parameter updating unit, configured to update the parameters of the Gaussian model that the model matching unit has matched to the pixel;
a background selection unit, configured to take as the background those highest-priority Gaussian models, among the Gaussian models updated by the parameter updating unit, whose weights sum to more than a background weight threshold; and
a first foreground acquiring unit, configured to match each pixel in sequence against the background Gaussian models selected by the background selection unit, sorted by priority from high to low, determine the pixels belonging to the foreground, and obtain the foreground region.
9. The apparatus according to claim 6, characterized in that the human-target judgment module comprises:
a row-pixel-count statistics unit, configured to project the moving-target region onto a coordinate axis to obtain the pixel count of the moving-target region at each pixel row index;
a target-part matching unit, configured to determine, according to features of the pixel counts at the pixel row indices counted by the row-pixel-count statistics unit, a corresponding number of pixel row indices corresponding to at least three target parts of the human body; and
a distance-ratio judgment unit, configured to judge whether the pairwise ratios of the distances along the coordinate axis between the pixel row indices corresponding to the human-body target parts determined by the target-part matching unit fall within the corresponding human-body distance-ratio ranges, and, if they fall within the preset human-body distance-ratio ranges, to determine that the moving target in the moving-target region belongs to a human target.
10. The apparatus according to claim 6, characterized in that the video-picture retention module extracts, from the video pictures containing the valid face image, a picture region that includes at least the face part, and saves and/or uploads the extracted region.
CN201910551347.8A 2019-06-24 2019-06-24 Video processing method and device for realizing interception of effective contents of entrance guard Active CN110427815B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910551347.8A CN110427815B (en) 2019-06-24 2019-06-24 Video processing method and device for realizing interception of effective contents of entrance guard


Publications (2)

Publication Number Publication Date
CN110427815A true CN110427815A (en) 2019-11-08
CN110427815B CN110427815B (en) 2020-07-10

Family

ID=68409468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910551347.8A Active CN110427815B (en) 2019-06-24 2019-06-24 Video processing method and device for realizing interception of effective contents of entrance guard

Country Status (1)

Country Link
CN (1) CN110427815B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101032405A (en) * 2007-03-21 2007-09-12 汤一平 Safe driving auxiliary device based on omnidirectional computer vision
CN102368301A (en) * 2011-09-07 2012-03-07 常州蓝城信息科技有限公司 Moving human body detection and tracking system based on video
CN102542271A (en) * 2010-12-23 2012-07-04 胡茂林 Video-based technology for informing people visiting and protecting privacy at house gates or entrances and exits of public places such as office buildings
CN104318202A (en) * 2014-09-12 2015-01-28 上海明穆电子科技有限公司 Method and system for recognizing facial feature points through face photograph
JP2016176816A (en) * 2015-03-20 2016-10-06 キヤノン株式会社 Image processor, image processing method, and program
CN106157329A (en) * 2015-04-20 2016-11-23 中兴通讯股份有限公司 A kind of adaptive target tracking method and device
CN106372576A (en) * 2016-08-23 2017-02-01 南京邮电大学 Deep learning-based intelligent indoor intrusion detection method and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU QIMING: "Research on Moving Target Detection Algorithms Based on Background Subtraction and the Frame Difference Method", China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology Series *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583485A (en) * 2020-04-16 2020-08-25 北京澎思科技有限公司 Community access control system, access control method and device, access control unit and medium
CN111784896A (en) * 2020-06-17 2020-10-16 深圳南亿科技股份有限公司 Access control monitoring image storage method, system and storage medium
CN111784896B (en) * 2020-06-17 2021-02-23 深圳南亿科技股份有限公司 Access control monitoring image storage method, system and storage medium
CN111881866A (en) * 2020-08-03 2020-11-03 杭州云栖智慧视通科技有限公司 Real-time face grabbing recommendation method and device and computer equipment
CN111881866B (en) * 2020-08-03 2024-01-19 杭州云栖智慧视通科技有限公司 Real-time face grabbing recommendation method and device and computer equipment
CN112637567A (en) * 2020-12-24 2021-04-09 中标慧安信息技术股份有限公司 Multi-node edge computing device-based cloud data uploading method and system

Also Published As

Publication number Publication date
CN110427815B (en) 2020-07-10

Similar Documents

Publication Publication Date Title
CN110427815A Video processing method and device for realizing interception of effective contents of entrance guard
CN106997629B (en) Access control method, apparatus and system
CN100568262C (en) Human face recognition detection device based on the multi-video camera information fusion
CN105069472B (en) A kind of vehicle checking method adaptive based on convolutional neural networks
CN109376637B (en) People counting system based on video monitoring image processing
CN104091176B (en) Portrait comparison application technology in video
CN100397410C (en) Method and device for distinguishing face expression based on video frequency
CN104303193B (en) Target classification based on cluster
CN104978567B (en) Vehicle checking method based on scene classification
CN111932583A (en) Space-time information integrated intelligent tracking method based on complex background
CN107239762A (en) Patronage statistical method in a kind of bus of view-based access control model
CN103605971B (en) Method and device for capturing face images
CN108205661A (en) A kind of ATM abnormal human face detection based on deep learning
Björklund et al. Automatic license plate recognition with convolutional neural networks trained on synthetic data
CN109063619A (en) A kind of traffic lights detection method and system based on adaptive background suppression filter and combinations of directions histogram of gradients
CN113592911B (en) Apparent enhanced depth target tracking method
CN111402298A (en) Grain depot video data compression method based on target detection and trajectory analysis
CN107483894A (en) Judge to realize the high ferro station video monitoring system of passenger transportation management based on scene
CN109918971A (en) Number detection method and device in monitor video
CN111368660A (en) Single-stage semi-supervised image human body target detection method
CN106874825A (en) The training method of Face datection, detection method and device
CN101950448A (en) Detection method and system for masquerade and peep behaviors before ATM (Automatic Teller Machine)
Ghidoni et al. Texture-based crowd detection and localisation
CN107977610A (en) A kind of face tracking methods based on massive video processing
CN103065163A (en) Rapid target detection and recognition system and method based on static picture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant