CN103325121B

CN103325121B - Method and system for estimating network topological relations of cameras in monitoring scenes

Info

Publication number: CN103325121B
Application number: CN201310270349.2A
Authority: CN
Inventors: 张红广; 崔建竹; 唐潮; 田飞; 王鹏; 邓娜娜; 蒋建彬; 马娜; 高会武; 徐尚鹏; 季益华; 马铁; 宋成国
Original assignee: SMART CITY INFORMATION TECHNOLOGY Co Ltd; Bianco Robot Co Ltd; Shanghai Advanced Research Institute of CAS
Current assignee: SMART CITY INFORMATION TECHNOLOGY Co Ltd; Bianco Robot Co Ltd; Shanghai Advanced Research Institute of CAS
Priority date: 2013-06-28
Filing date: 2013-06-28
Publication date: 2017-05-17
Anticipated expiration: 2033-06-28
Also published as: CN103325121A

Abstract

The invention relates to the technical field of security and provides a method and system for estimating the network topological relations of cameras in monitoring scenes. The method comprises: decomposing the monitoring scenes in the video streams captured in a monitoring network into grids; obtaining color histogram information of the optical flow of each grid in the monitoring scenes; clustering the grids in each monitoring scene according to that color histogram information to obtain a semantic region segmentation result for the scene; and determining the network topological relations among the cameras according to the semantic region segmentation results. The method and system solve the prior-art problem that, because the topological relations among cameras are all estimated by locating and tracking specific target activities, algorithm performance decreases sharply when occlusions exist in the monitoring environment or the monitoring image resolution is low.

Description

Method and system for estimating camera network topological relations in a monitoring scene

Technical field

The invention belongs to the technical field of security, and in particular relates to a method and system for estimating camera network topological relations in a monitoring scene.

Background technology

Topology estimation is a key issue in camera network deployment. Accurate topology estimation not only reveals the movement patterns of targets such as individuals and crowds in the monitored area, but can also be fed back to further optimize the deployment.

The prior art provides several methods for estimating camera network topology, including:

1. Based on person detection and tracking results obtained with image background removal, the correlation of crowd activity between multiple cameras is obtained, providing a basis for analyzing and establishing the target activity patterns of the whole scene.

2. Using person step information captured by multiple camera robots, the general pattern of person activity is obtained, and the camera deployment is readjusted according to this pattern, so that the monitoring objective is reached with better viewing angles and fewer cameras.

3. A mixture probability density estimator based on Parzen windows and Gaussian kernels is used to estimate the probability density function composed of the time interval, the positions at which targets enter and leave the observed field of view, and the movement velocity when entering and leaving the field of view. The whole estimation process is realized by learning from training set data.

4. In terms of time-domain constraints, a fuzzy time interval is used to represent the probability that an observed target appears in the next camera; this probability is estimated from the equation of motion.

5. Using a large amount of target observation data, the spatio-temporal topological relations between the cameras of a multi-camera monitoring network are established automatically by unsupervised learning. On this basis, the authors give a method for verifying algorithm performance and realize target tracking in the network.

6. A more general information-theoretic notion of statistical dependence is used, combining uncertain correspondence with Bayesian methods, which reduces the assumptions required and shows good performance.

7. All cameras are assumed to be potentially connected, and impossible connections are then removed through observation. Experiments show that this method is efficient and effective for learning the topological relations of large-scale camera networks, especially when few learning samples are available.

8. A large body of work uses the topological relations of multiple cameras for global activity analysis and person re-identification.

However, the above topology inference algorithms are essentially all based on locating and tracking specific target activities and place high demands on monitoring video quality. When occlusion exists in the monitoring environment or the monitoring image resolution is low, algorithm performance drops sharply.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a method and system for estimating camera network topological relations in a monitoring scene, so as to solve the prior-art problem that existing topology inference algorithms are essentially all based on locating and tracking specific target activities and place high demands on monitoring video quality, so that algorithm performance drops sharply when occlusion exists in the monitoring environment or the monitoring image resolution is low.

The embodiments of the present invention are realized as follows: a method for estimating camera network topological relations in a monitoring scene, the method comprising the steps of:

decomposing the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

for each monitoring scene, obtaining color histogram information of the optical flow of each grid in the monitoring scene;

for each monitoring scene, clustering the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

determining the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.

The purpose of another embodiment of the present invention is to provide a system for estimating camera network topological relations in a monitoring scene, the system comprising:

a decomposition unit, configured to decompose the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

an acquiring unit, configured to obtain, for each monitoring scene, color histogram information of the optical flow of each grid in the monitoring scene;

a clustering unit, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

a determining unit, configured to determine the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.

The embodiments of the present invention compute the color histogram feature of the optical flow of each grid with an optical flow algorithm and from it compute the topological relations between cameras, without needing to explicitly locate or track moving targets. This solves the prior-art problem that computing the topological relations between cameras is always based on locating and tracking specific target activities and places high demands on monitoring video quality, so that algorithm performance drops sharply when occlusion exists in the monitoring environment or the monitoring image resolution is low.

Description of the drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of the method for estimating camera network topological relations in a monitoring scene provided by one embodiment of the present invention;

Fig. 2 is a diagram of the camera topological relation estimation result provided by another embodiment of the present invention;

Fig. 3 is a sketch of the floor and the camera deployment provided by another embodiment of the present invention;

Fig. 4 is a module structure diagram of the system for estimating camera network topological relations in a monitoring scene provided by another embodiment of the present invention.

Detailed description of the embodiments

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.

One embodiment of the present invention provides a method for estimating camera network topological relations in a monitoring scene. As shown in Fig. 1, the method comprises the following steps:

In step S101, the monitoring scene in the video stream captured by each camera in the monitoring network is decomposed into grids.

In this embodiment, the monitoring network includes at least two cameras, and each camera captures a video stream. A video stream contains multiple frames, each of which is an image; among these frames, the images through which a moving target passes constitute the monitoring scene.

It should be noted that the grid size is typically 10*10 and may also be preset, but the grid size used to decompose the monitoring scenes in the video streams of all cameras must be the same.
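The decomposition in step S101 can be sketched as follows. This is a minimal, dependency-free illustration rather than the patented implementation: the function name `split_into_cells`, the list-of-rows frame representation, the reading of "10*10" as 10 by 10 pixels per grid, and the choice to drop trailing pixels that do not fill a whole grid are all assumptions made for the example.

```python
def split_into_cells(frame, cell=10):
    """Split a frame (a list of pixel rows) into non-overlapping cell x cell grids.

    Returns a dict mapping (grid_row, grid_col) -> list of `cell` pixel rows.
    ASSUMPTION: pixels beyond the last whole grid are simply dropped.
    """
    height, width = len(frame), len(frame[0])
    cells = {}
    for gr in range(height // cell):
        for gc in range(width // cell):
            cells[(gr, gc)] = [
                row[gc * cell:(gc + 1) * cell]
                for row in frame[gr * cell:(gr + 1) * cell]
            ]
    return cells


# Toy 40 x 60 "frame" whose pixel value encodes its own position.
frame = [[y * 60 + x for x in range(60)] for y in range(40)]
cells = split_into_cells(frame)
```

Using the same `cell` value for every camera keeps the grid size consistent across the network, as the note above requires.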

In step S102, for each monitoring scene, the color histogram information of the optical flow of each grid in the monitoring scene is obtained.

Specifically, the color histogram information of the optical flow of each grid in the monitoring scene is obtained as follows:

The video stream captured by a camera is defined as I_n(X), where X is the coordinate of a grid in the monitoring scene of the video stream, X = (x; y)^T, x is the abscissa of the grid, y is the ordinate of the grid, T denotes matrix transposition, and n is the index of a video frame contained in the video stream;

Define

$$W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \tag{1}$$

where W denotes the deformable template, p = (p_1, p_2, p_3, p_4, p_5, p_6)^T, p_1, p_2, p_3 and p_4 are 0, and p_5 and p_6 are the optical flow information of the grid;

Definition

where Δp denotes the difference between the values of p in two successive iterations, and T(x) denotes the grid obtained by decomposing the first frame of the video stream;

It should be noted that the grid obtained by decomposing the first frame of the video stream refers to the grid obtained after the first frame image in the video stream is decomposed.

Iterate according to (3), (4) and (5) until Δp is less than a preset threshold ε:

$$\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \tag{3}$$

$$H = \sum_{x} [\nabla I]^{T} [\nabla I] \tag{4}$$

$$\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} \left[ T(x) - I(W(x; p)) \right] \tag{5}$$

where I_x denotes the gradient map of the grid in the x-axis direction, I_y denotes the gradient map of the grid in the y-axis direction, and ∇I denotes the gradient map of the grid after transformation by the deformable template W(X; p);

Compute p_5 and p_6 at the point where Δp is less than the preset threshold ε;
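The iteration over (3), (4) and (5) (stated in full in claim 1) resembles a translation-only Lucas-Kanade update, since p_1 to p_4 are fixed at 0. The sketch below illustrates that reading for a single grid on a single image channel. It is an interpretation, not the patent's exact procedure: the function names, bilinear sampling, central-difference gradients, the stopping rule on the norm of Δp and the default values of `eps` and `max_iter` are all assumptions.

```python
import math


def bilinear(img, x, y):
    """Sample img (a list of rows) at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    ix, iy = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - ix, y - iy
    top = img[iy][ix] * (1 - fx) + img[iy][ix + 1] * fx
    bottom = img[iy + 1][ix] * (1 - fx) + img[iy + 1][ix + 1] * fx
    return top * (1 - fy) + bottom * fy


def estimate_cell_flow(template, image, x0, y0, eps=1e-3, max_iter=50):
    """Estimate (p5, p6) for one grid whose top-left corner is (x0, y0).

    template: the grid taken from the first frame, T(x).
    image:    the current full frame, I.
    """
    p5 = p6 = 0.0
    for _ in range(max_iter):
        hxx = hxy = hyy = bx = by = 0.0
        for j, row in enumerate(template):
            for i, t in enumerate(row):
                wx, wy = x0 + i + p5, y0 + j + p6  # W(x; p) with p1..p4 = 0
                # Gradient of the warped image (assumed reading of Equation (3)).
                gx = 0.5 * (bilinear(image, wx + 1, wy) - bilinear(image, wx - 1, wy))
                gy = 0.5 * (bilinear(image, wx, wy + 1) - bilinear(image, wx, wy - 1))
                err = t - bilinear(image, wx, wy)  # T(x) - I(W(x; p))
                hxx += gx * gx  # entries of the 2 x 2 matrix H, Equation (4)
                hxy += gx * gy
                hyy += gy * gy
                bx += gx * err
                by += gy * err
        det = hxx * hyy - hxy * hxy
        if abs(det) < 1e-12:  # textureless grid: the flow cannot be observed
            break
        dp5 = (hyy * bx - hxy * by) / det  # Equation (5), using the 2 x 2 inverse
        dp6 = (hxx * by - hxy * bx) / det
        p5 += dp5
        p6 += dp6
        if math.hypot(dp5, dp6) < eps:  # stop once the update is below epsilon
            break
    return p5, p6


# Synthetic check: a Gaussian blob, with the template cut out 2 px right and 1 px down.
img = [[math.exp(-((x - 16) ** 2 + (y - 16) ** 2) / 50.0) for x in range(32)]
       for y in range(32)]
tmpl = [[img[y + 1][x + 2] for x in range(8, 24)] for y in range(8, 24)]
p5, p6 = estimate_cell_flow(tmpl, img, 8, 8)
```

In the synthetic check the recovered `(p5, p6)` should land close to the known shift of (2, 1). For the color variant described next, the same routine would be run once per RGB channel.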

Optical flow is computed on the three RGB components of the optical flow information of the grid to obtain the color optical flow information of the grid;

According to the color optical flow information of the grid, the histogram information of the optical flow in 8 directions is calculated. This 8-direction histogram information is the color histogram feature of the optical flow of the grid, which includes the optical flow u'_b in the horizontal direction and the optical flow v'_b in the vertical direction.

It should be noted that the 8 directions are the directions spaced every 45 degrees.
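An 8-direction histogram over a set of flow vectors can be sketched as below. The source does not say whether each vector contributes a count or its magnitude, nor how the bins are aligned; this example assumes magnitude weighting and bins centered on 0, 45, 90, ... degrees, and the function name `direction_histogram` is hypothetical.

```python
import math


def direction_histogram(flows, bins=8):
    """Accumulate flow magnitudes into `bins` direction bins (45 degrees each for 8).

    flows: iterable of (u, v) pairs, u horizontal and v vertical.
    ASSUMPTIONS: magnitude weighting; bin k is centered on k * 360 / bins degrees.
    """
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    for u, v in flows:
        mag = math.hypot(u, v)
        if mag == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(v, u) % (2 * math.pi)
        idx = int((angle + width / 2) // width) % bins
        hist[idx] += mag
    return hist


h = direction_histogram([(1, 0), (0, 2), (-1, 0), (1, 1)])
```

For a color feature, one such histogram would be computed per RGB channel and the three concatenated, which is again an assumption about how the channels are combined.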

In step S103, for each monitoring scene, the grids in the monitoring scene are clustered according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain the semantic region segmentation result of the monitoring scene.

Specifically, step S103 is realized as follows:

$$u_n = \sum_{b \in r_n} u'_b \tag{6}$$

$$v_n = \sum_{b \in r_n} v'_b \tag{7}$$
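Equations (6) and (7), as given in claim 2, sum the per-grid flows u'_b and v'_b over the grids b belonging to r_n. The sketch below assumes r_n is the n-th semantic region and that a grid-to-region assignment is already available from the clustering; the clustering algorithm itself is not specified here and is not shown. The names `region_flow`, `cell_flow` and `cell_region` are hypothetical.

```python
from collections import defaultdict


def region_flow(cell_flow, cell_region):
    """Sum per-grid flows over each region, following Equations (6) and (7).

    cell_flow:   dict mapping grid id -> (u'_b, v'_b)
    cell_region: dict mapping grid id -> region label n (ASSUMED to be given)
    Returns a dict mapping n -> (u_n, v_n).
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for cell, (u, v) in cell_flow.items():
        n = cell_region[cell]
        totals[n][0] += u  # Equation (6)
        totals[n][1] += v  # Equation (7)
    return {n: (t[0], t[1]) for n, t in totals.items()}


flows = {(0, 0): (1.0, 0.0), (0, 1): (2.0, 1.0), (1, 0): (0.0, -1.0)}
labels = {(0, 0): 0, (0, 1): 0, (1, 0): 1}
regions = region_flow(flows, labels)
```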

In step S104, the network topological relations between the cameras in the monitoring network are determined according to the semantic region segmentation result of each monitoring scene.

Specifically, take two cameras as an example, a first camera and a second camera, where "first" and "second" do not imply any order and serve only to distinguish the cameras. The first camera obtains a first video stream and the second camera obtains a second video stream. Each video stream contains multiple frames, each of which is an image; the first video stream contains first images and the second video stream contains second images. Among the first images, the images through which a moving target passes constitute the first monitoring scene, where the moving target includes a person, an animal or another physical object; among the second images, the images through which a moving target passes constitute the second monitoring scene.

$$\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^2] \, E[c^2]}} \tag{8}$$

$$\hat{\tau}_{a_i, a_j} = \underset{\tau}{\arg\max} \, \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \tag{9}$$

$$\Psi_{i,j} = \rho_{a_i, a_j}(\tau) \left( 1 - \hat{\tau}_{a_i, a_j} \right) \tag{10}$$

where $a_i$ denotes the color histogram feature of the optical flow of a first grid, the first grid being obtained by decomposing the first monitoring scene, which is captured by the first camera; $a_j$ denotes the color histogram feature of the optical flow of a second grid, the second grid being obtained by decomposing the second monitoring scene, which is captured by the second camera; the first camera and the second camera are any two cameras in the monitoring network; $c$ denotes the second grid after time $\tau$; $\rho_{a_i, a_j}(\tau)$ denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; $\hat{\tau}_{a_i, a_j}$ denotes the time shift between the first grid and the second grid; and $\Psi_{i,j}$ denotes the topological relation estimation result for the first camera and the second camera.
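The quantities in Equations (8) to (10) (see claim 3) can be illustrated for one pair of scalar feature series. Several points are assumptions, because the source leaves them open: each feature is reduced to one number per frame, the sum over grid pairs and the constant Γ in Equation (9) collapse to a single term, expectations are replaced by sample means, only non-negative lags are searched, and the time shift is divided by the search window so that (1 - τ̂) stays between 0 and 1. All function names are hypothetical.

```python
import math


def correlation_at_lag(a_i, a_j, tau):
    """Equation (8) with sample means: correlate a_i(t) with c(t) = a_j(t + tau)."""
    n = len(a_i) - tau
    x, c = a_i[:n], a_j[tau:tau + n]
    num = sum(p * q for p, q in zip(x, c)) / n
    den = math.sqrt(sum(p * p for p in x) / n * sum(q * q for q in c) / n)
    return num / den if den > 0 else 0.0


def topology_score(a_i, a_j, max_lag):
    """Return (rho at the best lag, best lag in frames, Psi_ij)."""
    rhos = [correlation_at_lag(a_i, a_j, tau) for tau in range(max_lag + 1)]
    best = max(range(max_lag + 1), key=rhos.__getitem__)  # Equation (9), one pair
    tau_hat = best / (max_lag + 1)  # ASSUMPTION: time shift normalized to [0, 1)
    return rhos[best], best, rhos[best] * (1 - tau_hat)  # Equation (10)


# Toy series: activity at camera j repeats the activity at camera i two frames later.
a_i = [0, 1, 0, 0, 2, 0, 0, 1, 0, 0]
a_j = [0, 0] + a_i[:8]
rho, lag, psi = topology_score(a_i, a_j, max_lag=3)
```

With this normalization, a strong correlation found at a short delay scores higher than the same correlation found at a long delay, which matches the form of Equation (10).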

It should be noted that when Ψ_i,j is greater than 0.5, the first camera and the second camera are topologically related. In step S104, the topological relation estimation result must be computed for every pair of cameras.
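The pairwise thresholding can be sketched as follows. The camera labels and score values are made-up placeholders for illustration only, not experimental results, and `camera_edges` is a hypothetical helper name.

```python
from itertools import combinations


def camera_edges(cameras, score, threshold=0.5):
    """Return the camera pairs whose score Psi_ij exceeds the threshold.

    cameras: list of camera identifiers.
    score:   callable (i, j) -> Psi_ij, evaluated once per unordered pair.
    """
    return [(i, j) for i, j in combinations(cameras, 2) if score(i, j) > threshold]


# Placeholder scores (illustrative only).
toy_scores = {("A", "B"): 0.8, ("A", "C"): 0.2, ("B", "C"): 0.6}
edges = camera_edges(["A", "B", "C"], lambda i, j: toy_scores[(i, j)])
```

Each returned pair corresponds to one line drawn between two cameras in a topology diagram.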

Another embodiment of the present invention provides the camera topological relation estimation result shown in Fig. 2, which was obtained as follows:

In this embodiment, 7 cameras on the first floor of the administrative building of the Shanghai Advanced Research Institute, Chinese Academy of Sciences, were selected, and the video streams from 11 a.m. to 1 p.m. on one day were taken as samples to compute the camera topological relation estimation result.

In the experiment, the 7 cameras are deployed on the same floor. Cameras 1 and 3 are deployed at elevator entrances, and the remaining cameras are deployed at the entrances and exits of 5 passages. The floor and camera deployment sketch is shown in Fig. 3.

The circled numbers in Fig. 2 are camera numbers and correspond to the camera numbers in Fig. 3. In the experimental result, a solid line connecting two cameras indicates that the targets monitored by the two cameras are associated, that is, the same target appears in the fields of view of both cameras, which reflects the trend of target activity in terms of probability statistics. The absence of a solid line between two cameras indicates that they are not associated or that the association is very weak. For example, cameras 1, 6 and 7 are strongly associated: camera 7 is located at the main entrance of the whole administrative building, and after entering the floor a person must either go upstairs via the elevator at camera 1 or go to the canteen at the rear of the first floor via the passage at camera 6. Because the selected period is lunchtime, many people from the second floor and above take the elevator at camera 1 down to the first floor and then enter the canteen through the passage at camera 6; after lunch they return by the same route. There is no association (or only a very weak one) between cameras 2 and 3, because the elevator at camera 3 is a freight elevator used only by the canteen, and the area between the two cameras is the canteen's back kitchen, so there is no direct path.

Another embodiment of the present invention provides a system for estimating camera network topological relations in a monitoring scene. The module structure of the system is shown in Fig. 4 and specifically includes:

a decomposition unit 41, configured to decompose the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

an acquiring unit 42, configured to obtain, for each monitoring scene, color histogram information of the optical flow of each grid in the monitoring scene;

a clustering unit 43, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

a determining unit 44, configured to determine the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.
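One way to picture how the four units fit together is a thin pipeline in which each unit is supplied as a callable. The class name, method names and stub units below are hypothetical and only illustrate the data flow between units 41 to 44; they are not the patented implementations.

```python
class TopologyEstimator:
    """Chains the four units of Fig. 4; each unit is passed in as a callable."""

    def __init__(self, decompose, acquire, cluster, determine):
        self.decompose = decompose  # decomposition unit 41
        self.acquire = acquire      # acquiring unit 42
        self.cluster = cluster      # clustering unit 43
        self.determine = determine  # determining unit 44

    def run(self, streams):
        """streams: dict mapping camera id -> video stream (any object)."""
        segmentations = {}
        for camera, stream in streams.items():
            grids = self.decompose(stream)
            histograms = self.acquire(grids)
            segmentations[camera] = self.cluster(grids, histograms)
        return self.determine(segmentations)


# Stub units, used only to show the call sequence.
estimator = TopologyEstimator(
    decompose=lambda stream: list(stream),
    acquire=lambda grids: [len(g) for g in grids],
    cluster=lambda grids, hists: sorted(set(hists)),
    determine=lambda segs: sorted(segs),
)
result = estimator.run({"cam1": ["ab", "c"], "cam2": ["d"]})
```

Passing the units in as callables mirrors the remark later in the description that the module division is purely functional and other divisions are possible.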

Optionally, the acquiring unit 42 is specifically configured to:

Define

$$W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \tag{1}$$

Define

$$\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \tag{3}$$

$$H = \sum_{x} [\nabla I]^{T} [\nabla I] \tag{4}$$

$$\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} \left[ T(x) - I(W(x; p)) \right] \tag{5}$$

According to the color optical flow information of the grid, calculate the histogram information of the optical flow in 8 directions. This 8-direction histogram information is the color histogram feature of the optical flow of the grid, which includes the optical flow u'_b in the horizontal direction and the optical flow v'_b in the vertical direction.

Optionally, the clustering unit 43 is specifically configured to compute:

$$u_n = \sum_{b \in r_n} u'_b \tag{6}$$

$$v_n = \sum_{b \in r_n} v'_b \tag{7}$$

Optionally, the determining unit 44 is specifically configured to compute:

$$\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^2] \, E[c^2]}} \tag{8}$$

$$\hat{\tau}_{a_i, a_j} = \underset{\tau}{\arg\max} \, \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \tag{9}$$

$$\Psi_{i,j} = \rho_{a_i, a_j}(\tau) \left( 1 - \hat{\tau}_{a_i, a_j} \right) \tag{10}$$

where $a_i$ denotes the color histogram feature of the optical flow of a first grid, the first grid being obtained by decomposing the first monitoring scene, which is captured by the first camera; $a_j$ denotes the color histogram feature of the optical flow of a second grid, the second grid being obtained by decomposing the second monitoring scene, which is captured by the second camera; the first camera and the second camera are any two cameras in the monitoring network; $c$ denotes the second grid after time $\tau$; $\rho_{a_i, a_j}(\tau)$ denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; $\hat{\tau}_{a_i, a_j}$ denotes the time shift between the first grid and the second grid; and $\Psi_{i,j}$ denotes the topological relation estimation result for the first camera and the second camera.

Optionally, the 8 directions are the directions spaced every 45 degrees.

Those of ordinary skill in the art will understand that the modules in the above embodiments are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional modules serve only to distinguish them from one another and do not limit the protection scope of the present invention.

Those of ordinary skill in the art will further understand that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a readable storage medium, such as ROM/RAM.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims

1. A method for estimating camera network topological relations in a monitoring network, characterized in that the method comprises:

decomposing the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

for each monitoring scene, obtaining color histogram information of the optical flow of each grid in the monitoring scene;

for each monitoring scene, clustering the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

determining the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene;

wherein obtaining the color histogram information of the optical flow of each grid in the monitoring scene specifically comprises:

defining the video stream captured by a camera as I_n(X), where X is the coordinate of a grid in the monitoring scene of the video stream, X = (x; y)^T, x is the abscissa of the grid, y is the ordinate of the grid, T denotes matrix transposition, and n is the index of a video frame contained in the video stream;

Define

$$W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \tag{1}$$

where W denotes the deformable template, p = (p_1, p_2, p_3, p_4, p_5, p_6)^T, p_1, p_2, p_3 and p_4 are 0, and p_5 and p_6 are the optical flow information of the grid;

Definition

where Δp denotes the difference between the values of p in two successive iterations, and T(x) denotes the grid obtained by decomposing the first frame of the video stream;

iterating according to (3), (4) and (5) until Δp is less than a preset threshold ε;

$$\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \tag{3}$$

where I_x denotes the gradient map of the grid in the x-axis direction, I_y denotes the gradient map of the grid in the y-axis direction, and ∇I denotes the gradient map of the grid after transformation by the deformable template W(X; p);

$$H = \sum_{x} [\nabla I]^{T} [\nabla I] \tag{4}$$

$$\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} \left[ T(x) - I(W(x; p)) \right] \tag{5}$$

calculating, according to the color optical flow information of the grid, the histogram information of the optical flow in 8 directions, this 8-direction histogram information being the color histogram feature of the optical flow of the grid, which includes the optical flow u'_b in the horizontal direction and the optical flow v'_b in the vertical direction.

2. The method of claim 1, characterized in that, for each monitoring scene, clustering the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene to obtain the semantic region segmentation result of the monitoring scene is specifically:

$$u_n = \sum_{b \in r_n} u'_b \tag{6}$$

$$v_n = \sum_{b \in r_n} v'_b \tag{7}$$

3. The method of claim 2, characterized in that determining the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene is specifically:

$$\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^2] \, E[c^2]}} \tag{8}$$

$$\hat{\tau}_{a_i, a_j} = \underset{\tau}{\arg\max} \, \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \tag{9}$$

$$\Psi_{i,j} = \rho_{a_i, a_j}(\tau) \left( 1 - \hat{\tau}_{a_i, a_j} \right) \tag{10}$$

where $a_i$ denotes the color histogram feature of the optical flow of a first grid, the first grid being obtained by decomposing a first monitoring scene, which is captured by a first camera; $a_j$ denotes the color histogram feature of the optical flow of a second grid, the second grid being obtained by decomposing a second monitoring scene, which is captured by a second camera; the first camera and the second camera are any two cameras in the monitoring network; $c$ denotes the second grid after time $\tau$; $\rho_{a_i, a_j}(\tau)$ denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; $\hat{\tau}_{a_i, a_j}$ denotes the time shift between the first grid and the second grid; and $\Psi_{i,j}$ denotes the topological relation estimation result for the first camera and the second camera.

4. The method of claim 1, characterized in that the 8 directions are the directions spaced every 45 degrees.

5. A system for estimating camera network topological relations in a monitoring network, characterized in that the system comprises:

a decomposition unit, configured to decompose the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

an acquiring unit, configured to obtain, for each monitoring scene, color histogram information of the optical flow of each grid in the monitoring scene;

a clustering unit, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

a determining unit, configured to determine the network topological relations between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene;

wherein the acquiring unit is specifically configured to:

Define

$$W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \tag{1}$$

Define

$$\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \tag{3}$$

$$H = \sum_{x} [\nabla I]^{T} [\nabla I] \tag{4}$$

$$\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} \left[ T(x) - I(W(x; p)) \right] \tag{5}$$

6. The system of claim 5, characterized in that the clustering unit is specifically configured to compute:

$$u_n = \sum_{b \in r_n} u'_b \tag{6}$$

$$v_n = \sum_{b \in r_n} v'_b \tag{7}$$

7. The system of claim 6, characterized in that the determining unit is specifically configured to compute:

$$\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^2] \, E[c^2]}} \tag{8}$$

$$\hat{\tau}_{a_i, a_j} = \underset{\tau}{\arg\max} \, \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \tag{9}$$

$$\Psi_{i,j} = \rho_{a_i, a_j}(\tau) \left( 1 - \hat{\tau}_{a_i, a_j} \right) \tag{10}$$

where $a_i$ denotes the color histogram feature of the optical flow of a first grid, the first grid being obtained by decomposing a first monitoring scene, which is captured by a first camera; $a_j$ denotes the color histogram feature of the optical flow of a second grid, the second grid being obtained by decomposing a second monitoring scene, which is captured by a second camera; the first camera and the second camera are any two cameras in the monitoring network; $c$ denotes the second grid after time $\tau$; $\rho_{a_i, a_j}(\tau)$ denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; $\hat{\tau}_{a_i, a_j}$ denotes the time shift between the first grid and the second grid; and $\Psi_{i,j}$ denotes the topological relation estimation result for the first camera and the second camera.

8. The system of claim 6, characterized in that the 8 directions are the directions spaced every 45 degrees.