CN109359530B

CN109359530B - Intelligent video monitoring method and device

Info

Publication number: CN109359530B
Application number: CN201811061880.8A
Authority: CN
Inventors: 谢海斌; 李立伟; 庄东晔; 郑永斌; 徐婉莹; 白圣建; 李兴玮
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2018-09-12
Filing date: 2018-09-12
Publication date: 2022-01-25
Anticipated expiration: 2038-09-12
Also published as: CN109359530A

Abstract

The invention discloses an intelligent video monitoring method and device. The method comprises the following steps: S1, monitoring a target video in real time, detecting face images in the video images in real time, and outputting them; S2, encoding the face image detected in step S1 to obtain the code of the image to be detected; S3, matching and comparing the code of the image to be detected against database codes obtained in advance from the face images in a database, and outputting the identification result. The device comprises a real-time face monitoring module, an intelligent image coding module and a fast face retrieval module. The invention enables intelligent video monitoring, raises the intelligence level of video monitoring, and greatly reduces the space required to store and transmit video data.

Description

Intelligent video monitoring method and device

Technical Field

The invention relates to the technical field of video monitoring, in particular to an intelligent video monitoring method and device.

Background

As safety requirements increase, video monitoring systems play an increasingly prominent role. Traditional video monitoring is limited to video capture, storage and playback, and can hardly provide early warning or alarms. Real-time monitoring requires people to watch the video continuously, which wastes a large amount of manpower and time. Moreover, with the explosive growth of data volume, the traditional monitoring mode cannot meet the demands of storing and transmitting massive video data. In short, traditional video monitoring follows the old pattern of manual operation and after-the-fact evidence collection, has a low degree of intelligence, and its massive video data occupies a large amount of storage and transmission resources.

In view of these problems, intelligent video monitoring, also called video analysis technology, has been developed; it automatically understands and analyzes video by means of artificial intelligence and machine vision. Current video monitoring technology mainly relies on the following methods: 1. target detection based on background modeling; 2. target tracking with a single camera or multiple cameras; 3. target classification and identification; 4. behavior recognition algorithms. When a face is retrieved in video monitoring, the picture to be retrieved is currently compared one by one with all pictures in the database. This is time-consuming and inefficient in practice; for massive video data in particular, the amount of image data to be processed is huge, so such a face retrieval method cannot quickly identify the target face image from the monitored video. It also requires all database pictures to be stored, so it still needs large storage and transmission space.

Current fast retrieval methods mainly fall into the following categories: 1) retrieval based on the bag-of-words model; 2) retrieval based on KD trees; 3) retrieval based on vector clustering and quantization; 4) hash-based retrieval. These methods are usually complex to implement, and they usually target scene-oriented picture databases such as CIFAR-10 and INRIA Holidays rather than face images. Faces are highly variable: the extraction and description of facial features are easily affected by expression, illumination and the like, and in a large face database pose and expression change drastically, so it is difficult to extract a stable descriptor for fast retrieval. Applying the current fast retrieval methods directly to faces therefore cannot achieve accurate retrieval. It is thus highly desirable to provide an intelligent video monitoring method that enables fast face retrieval, raises the level of intelligent monitoring, and reduces the space occupied by video data storage and transmission.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: aiming at the technical problems in the prior art, the invention provides the intelligent video monitoring method and the intelligent video monitoring device which are simple in implementation method, high in intelligent degree, high in monitoring precision and small in storage and transmission occupied space.

In order to solve the technical problems, the technical scheme provided by the invention is as follows:

An intelligent video monitoring method comprises the following steps:

S1, real-time face monitoring: monitoring a target video in real time, and detecting and outputting face images from the video images in real time;

S2, intelligent image coding: encoding the face image detected in step S1 to obtain the code of the image to be detected;

S3, fast face retrieval: matching and comparing the code of the image to be detected against the database codes obtained in advance from the face images in the database, and obtaining and outputting the identification result.

As a further improvement of the method of the present invention, in step S1, the BGP face recognition method based on heuristic prior information is used for detecting the face image for the video image, and the method includes the steps of:

BGP extraction: performing BGP feature extraction on the image to be detected based on a BGP algorithm, dividing the BGP feature image into a plurality of sub-blocks, counting BGP histograms of the sub-blocks, and splicing the statistical histograms of the sub-blocks to obtain corresponding BGP feature vectors;

characteristic weighting: weighting the statistical histograms of partial sub-blocks in the BGP feature vector according to heuristic prior information containing face structure feature information to obtain a processed BGP feature vector;

face recognition: and identifying by using the processed BGP feature vector, and outputting a face identification result.

As a further improvement of the process of the invention: and the heuristic prior information comprises position range information of the facial features, and the statistical histogram of the target subblock corresponding to the positions of the facial features is found from the BGP feature vector for weighting according to the position range information of the facial features, so that the weight of the statistical histogram of the corresponding subblock is increased according to the importance degree of the facial structure feature information.

As a further improvement of the method of the invention: the specific steps of obtaining the processed BGP feature vector are as follows. In the BGP feature image, the ordinals of the sub-blocks lying within M/6 to M/2 longitudinally and N/5 to 2N/5 transversely are counted to obtain the BGP block ordinals occupied by the left eye; the ordinals of the sub-blocks within M/6 to M/2 longitudinally and 3N/5 to 4N/5 transversely are counted to obtain the BGP block ordinals occupied by the right eye; and the ordinals of the sub-blocks within M/3 to M longitudinally and 2N/5 to 3N/5 transversely are counted to obtain the BGP block ordinals occupied by the nose and mouth, where M × N is the number of blocks into which the BGP feature image is divided. After the BGP feature vector is obtained, the statistical histograms corresponding to the BGP block ordinals occupied by the facial features are found and weighted to obtain the processed BGP feature vector.

As a further improvement of the method of the present invention, the step of step S1 includes:

s11, video monitoring: monitoring video information of a target video in real time to obtain real-time monitored video information;

s12, face detection: detecting a human face image from the video information monitored in real time and outputting the human face image;

s13, face grabbing: recognizing and extracting a face part from the face image output in the step S12 to obtain a face target image and outputting the face target image;

s14, image enhancement: and performing enhancement and denoising processing on the human face target image, and outputting a final human face image.

As a further improvement of the method of the invention: in step S2, multi-stage BGP feature extraction is performed on the image to be detected based on the BGP algorithm, with the BGP feature image obtained by the previous stage of BGP feature extraction used as the input of the next stage. The feature vector obtained at each stage is encoded to give a multi-layer code, one layer per BGP stage, and the code length decreases layer by layer to form a pyramid coding structure, thereby yielding the code of the image to be detected.

As a further improvement of the method of the invention, step S3 includes: in advance, multi-stage BGP feature extraction is performed in sequence on each face image in the database based on the BGP algorithm, with the BGP feature image obtained by the previous stage of BGP feature extraction used as the input of the next stage; the feature vector obtained at each stage is encoded to give a multi-layer code, one layer per BGP stage, and the code length decreases layer by layer to form a pyramid coding structure, yielding the database image codes. During face retrieval, the code of the image to be detected is compared with the database image codes layer by layer, and the target face is identified from the comparison result.

An intelligent video monitoring device, comprising:

a real-time face monitoring module, used for monitoring a target video in real time and for detecting and outputting face images from the video images in real time;

an intelligent image coding module, used for encoding the face image detected by the real-time face monitoring module to obtain the code of the image to be detected;

and a fast face retrieval module, used for matching and comparing the code of the image to be detected against the database codes obtained in advance from the face images in the database, and for obtaining and outputting the identification result.

As a further improvement of the device of the invention: the real-time face monitoring module comprises a face recognition module and is used for detecting a face image by adopting a BGP face recognition method based on heuristic prior information on a video image, and the face recognition module comprises:

the BGP extraction unit is used for carrying out BGP feature extraction on the image to be detected based on a BGP algorithm, dividing the BGP feature image into a plurality of sub-blocks after obtaining the BGP feature image, counting BGP histograms of the sub-blocks, and splicing the obtained statistical histograms of the sub-blocks to obtain corresponding BGP feature vectors;

the characteristic weighting unit is used for weighting the statistical histogram of a part of sub-blocks in the BGP characteristic vector according to heuristic prior information containing face structure characteristic information to obtain a processed BGP characteristic vector;

and the face recognition unit is used for recognizing by using the processed BGP characteristic vector and outputting a face recognition result.

As a further improvement of the device of the invention: in the intelligent image coding module, multi-stage BGP feature extraction is performed on the image to be detected based on the BGP algorithm, with the BGP feature image obtained by the previous stage of BGP feature extraction used as the input of the next stage; the feature vector obtained at each stage is encoded to give a multi-layer code, one layer per BGP stage, and the code length decreases layer by layer to form a pyramid coding structure, yielding the code of the image to be detected;

in the fast face retrieval module, multi-stage BGP feature extraction is performed in advance, in sequence, on each face image in the database based on the BGP algorithm, with the BGP feature image obtained by the previous stage of BGP feature extraction used as the input of the next stage; the feature vector obtained at each stage is encoded to give a multi-layer code, one layer per BGP stage, and the code length decreases layer by layer to form a pyramid coding structure, yielding the database image codes. During face retrieval, the code of the image to be detected is compared with the database image codes layer by layer, and the target face is identified from the comparison result.

Compared with the prior art, the invention has the advantages that:

1. according to the intelligent video monitoring method and device, the face is automatically detected and intelligently monitored by sequentially carrying out real-time face monitoring, intelligent image coding and quick face retrieval, and meanwhile, quick retrieval can be realized by carrying out intelligent coding on the face image and comparing the image code to be detected with the database code, so that the face retrieval efficiency can be improved, the occupied space for storing and transmitting video data can be greatly reduced, the intelligent level and efficiency of video monitoring are effectively improved, and various real-time application requirements can be met.

2. The intelligent video monitoring method and device of the invention further take the structural characteristics of the face into account. On the basis of BGP (binary gradient pattern) face recognition, the statistical histograms of some sub-blocks in the BGP feature vector are weighted using heuristic prior information about the face, so that the weight of each such histogram is raised according to the importance of the corresponding facial feature information. The heuristic information and prior knowledge of the face are thus fully used to achieve intelligent face recognition, and combining the BGP algorithm with heuristic information improves the discriminability and robustness of the subsequent coding without increasing complexity, thereby raising the level of intelligent monitoring.

3. The intelligent video monitoring method and device of the invention further apply multi-stage BGP to the face image in a cascaded binary gradient pattern. Multi-stage BGP can fully mine deeper useful information in the image, such as edge, texture and gradient information, and form a more stable and robust feature representation, so that richer texture information is extracted and face recognition accuracy improves. At the same time, encoding the feature vectors of the multi-stage BGP yields a hierarchical, pyramid-shaped code. Retrieval proceeds layer by layer through the coding hierarchy, from fuzzy matching to precise matching and from coarse to fine, which improves retrieval efficiency and achieves fast face retrieval while maintaining retrieval accuracy.

Drawings

Fig. 1 is a schematic diagram of an implementation flow of the intelligent video monitoring method according to the embodiment.

Fig. 2 is a schematic diagram of an implementation flow of detecting a face image based on heuristic prior information in this embodiment.

Fig. 3 is a schematic flow chart of implementing BGP feature extraction in this embodiment.

Fig. 4 is a schematic diagram of an implementation principle of partitioning a BGP image based on heuristic prior information in this embodiment.

Fig. 5 is a schematic diagram illustrating the effect of the face image detected in the embodiment of the present invention.

Fig. 6 is a schematic flow chart of the implementation of the intelligent coding in this embodiment.

Fig. 7 is a schematic diagram of the pyramid coding constructed in the present embodiment.

Fig. 8 is a schematic structural diagram of an intelligent video monitoring apparatus used in an embodiment of the present invention.

Fig. 9 is a schematic diagram of an implementation principle of the intelligent video apparatus of the present embodiment.

Fig. 10 is a schematic diagram of an implementation principle of implementing intelligent monitoring in an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the drawings and specific preferred embodiments of the description, without thereby limiting the scope of protection of the invention.

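As shown in fig. 1, the intelligent video monitoring method of this embodiment includes steps S1 (real-time face monitoring), S2 (intelligent image coding) and S3 (fast face retrieval), as set out above.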

By performing real-time face monitoring, intelligent image coding and fast face retrieval in sequence, this embodiment detects faces automatically and achieves intelligent monitoring, turning "after-the-fact analysis" into "in-event analysis". It thereby addresses the manual operation and after-the-fact evidence collection of traditional video monitoring, as well as the large storage and transmission resources occupied by massive video data. In addition, by intelligently coding the face image and comparing the code of the image to be detected with the database codes, the target can be retrieved quickly, which improves face retrieval efficiency, greatly reduces the space occupied by video data storage and transmission, and effectively raises the intelligence level and efficiency of video monitoring.

As shown in fig. 2, in step S1 of this embodiment, a BGP face recognition method based on heuristic prior information is used to detect face images in the video image; its steps are BGP extraction, feature weighting and face recognition, as set out above.

The Binary Gradient Pattern (BGP) is built on image gradient orientation (IGO) and binary description. Its idea derives from the Gradientfaces descriptor, which describes the face by image gradient orientation instead of pixel intensity to achieve robustness to illumination change; that is, features extracted in the gradient domain are more discriminative and robust than those from the intensity domain. The BGP algorithm measures the relationships among local pixels in the image gradient domain and encodes the underlying local structure into a set of binary strings, which increases discrimination and greatly reduces computational complexity. To uncover the latent structure of the gradient domain, BGP computes image gradients in multiple directions and encodes them into a series of binary strings that can represent small boundary changes and texture information. The method is therefore highly discriminative and achieves good recognition accuracy even under occlusion, illumination and expression changes. Because each encoded value carries information about the relationships among neighbouring pixels rather than pixel intensity, a BGP-encoded image is more robust to environmental changes and, in particular, has strong illumination invariance.

Although BGP-based face recognition can achieve high accuracy, the BGP operator is a general-purpose operator: recognition is performed directly on the extracted BGP feature vector, without taking the particular properties of faces into account. Yet the approximate positions of facial features such as the eyes, nose and mouth are fixed in a face picture; that is, the face carries specific structural and heuristic information, and this prior knowledge can be applied as heuristic information to optimize face recognition performance.

In the embodiment, the structural characteristics of the face are considered, on the basis of realizing face recognition by BGP, the statistical histograms of partial sub-blocks in BGP feature vectors are weighted by combining face heuristic prior information, so that the weight of the statistical histograms of the corresponding sub-blocks is improved according to the importance degree of the face feature information, the face intelligent recognition can be realized by fully utilizing the heuristic information and the prior knowledge of the face, and the recognition accuracy, the recognition efficiency and the recognition robustness can be effectively improved by combining BGP algorithm and the heuristic information.

As shown in fig. 3, the BGP feature extraction performed in this embodiment includes the steps of:

performing feature extraction on input image data based on a BGP algorithm to obtain a BGP feature image;

dividing the BGP characteristic image into sub-blocks which are not overlapped with each other;

counting the BGP histogram of each sub-block;

and splicing all the obtained sub-block histograms in sequence to obtain a final BGP feature vector.
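The blocking, histogram and splicing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the BGP feature image is assumed to be already available as a 2-D list of integer labels in `range(n_bins)`, the helper name `block_histogram_vector` is hypothetical, and the handling of image sizes that do not divide evenly into blocks is an assumption, since the source does not specify it.

```python
def block_histogram_vector(feature_image, m_blocks, n_blocks, n_bins):
    """Concatenate per-block label histograms of a 2-D label image.

    feature_image: list of rows, each a list of ints in range(n_bins).
    Block boundaries use integer division (an assumption; the source does
    not say how uneven sizes are handled).
    """
    height = len(feature_image)
    width = len(feature_image[0])
    vector = []
    for bi in range(m_blocks):              # blocks from top to bottom
        r0 = bi * height // m_blocks
        r1 = (bi + 1) * height // m_blocks
        for bj in range(n_blocks):          # blocks from left to right
            c0 = bj * width // n_blocks
            c1 = (bj + 1) * width // n_blocks
            hist = [0] * n_bins
            for r in range(r0, r1):
                for c in range(c0, c1):
                    hist[feature_image[r][c]] += 1
            vector.extend(hist)             # splice histograms in block order
    return vector


# Toy 10 x 10 label image with 8 possible labels (placeholder data).
toy = [[(r * 3 + c) % 8 for c in range(10)] for r in range(10)]
feature_vector = block_histogram_vector(toy, 5, 5, 8)
print(len(feature_vector))  # 5 * 5 * 8 = 200
```

With 5 × 5 blocks and 8 bins per block, the spliced vector has 200 entries regardless of the image size.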

In a specific application embodiment, the neighborhood radius is R = 1, the number of BGP neighborhood pixels is P = 8, and the number of blocks is M × N (with M = N = 5). The 8 neighborhood pixels determine 8 directions (4 main and 4 auxiliary), so the BGP code of any central pixel is a four-bit binary number, which converts to a decimal value from 0 to 15. Of these 16 patterns, 8 are structural patterns, so the BGP histogram of each sub-block has dimension d = 8, and the histogram of the face image after single-stage BGP coding has dimension d₁ = 5 × 5 × 8 = 200.
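As a hedged illustration of how a four-bit code can arise from 8 neighbours at radius 1, the sketch below compares each of four neighbours with its diametrically opposite neighbour and packs the four comparison bits into a value from 0 to 15. The neighbour ordering, the bit order, the `>=` comparison and the function name `bgp_codes` are assumptions made for illustration; the source does not spell them out, and the mapping from the 16 codes to the 8 structural patterns is not reproduced here.

```python
# Offsets (row, col) of four neighbours at radius 1 and their diametrically
# opposite partners. The ordering is an assumption made for illustration.
PAIRS = [
    ((0, 1), (0, -1)),    # east vs west
    ((-1, 1), (1, -1)),   # north-east vs south-west
    ((-1, 0), (1, 0)),    # north vs south
    ((-1, -1), (1, 1)),   # north-west vs south-east
]


def bgp_codes(gray):
    """Return a 4-bit code (0..15) for every interior pixel of `gray`."""
    height, width = len(gray), len(gray[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            code = 0
            for bit, ((dr1, dc1), (dr2, dc2)) in enumerate(PAIRS):
                # One bit per direction pair: 1 if the first neighbour is
                # at least as bright as the opposite one.
                if gray[r + dr1][c + dc1] >= gray[r + dr2][c + dc2]:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


# Intensity increases to the right and downwards.
gray = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
print(bgp_codes(gray))  # [[1]]: only the east-vs-west comparison is true
```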

BGP feature extraction is performed on the grayscale image to obtain a BGP feature image. In this embodiment the BGP feature image is divided into 5 × 5 blocks, and the 25 sub-blocks are ordered from left to right and from top to bottom, as shown in Table 1.

Table 1: BGP feature image blocking order

A₁	A₂	A₃	A₄	A₅
A₆	A₇	A₈	A₉	A₁₀
A₁₁	A₁₂	A₁₃	A₁₄	A₁₅
A₁₆	A₁₇	A₁₈	A₁₉	A₂₀
A₂₁	A₂₂	A₂₃	A₂₄	A₂₅

The statistical histogram reflects the frequency of the 8 structural patterns in a sub-block. Let X_{A_k} denote the statistical histogram of sub-block A_k, i.e. the frequencies of occurrence of the 8 structural patterns, expressed as an 8-dimensional row vector (for example, X_{A_1} is the histogram of sub-block A₁). Splicing the histogram vectors of all sub-blocks gives the final vector P, which is the feature vector obtained by applying the binary gradient pattern to the original grayscale image:

$$P = [X_{A_1},\, X_{A_2},\, \ldots,\, X_{A_k},\, \ldots,\, X_{A_{25}}] \qquad (1)$$

After the feature vector P is obtained, heuristic prior information is used to weight the statistical histograms of some sub-blocks in P, so that the weight of each such histogram is raised according to the importance of the corresponding face structure information. In the processed feature vector P, the weights of the face structure items are thus increased or decreased according to their importance, so the facial features are represented more accurately and subsequent recognition based on this feature vector is more efficient.

In this embodiment, the heuristic prior information includes information of a position range where the facial features are located, and the statistical histogram of the target sub-block corresponding to the positions of the facial features is found from the BGP feature vector according to the information of the position range where the facial features are located, so as to increase the weight of the statistical histogram of the corresponding sub-block according to the importance degree of the facial structure feature information.

The position of the five sense organs in the face is basically fixed and is the key of face recognition, and the heuristic prior information of the embodiment specifically includes the position range information where the five sense organs of the face are located, so that the performance of face recognition is improved by fully utilizing the information of the five sense organs as the heuristic prior information. It will be appreciated that the heuristic prior information may also use or add other structural feature information to further improve the recognition performance, and the weighting is configured to increase or decrease the weighting according to the degree of importance. By using the prior knowledge and heuristic prior information that the facial features information is important and the positions of the facial features are relatively fixed when a human identifies the face, the position information and the structure information of the facial features can be fully utilized, and the BGP method is combined to add the facial features weight coefficient to the BGP block histogram characteristics of the positions of the facial features, so that the face characterization capability is enhanced, the intelligent level of subsequent coding can be effectively improved, and the discrimination and the robustness of the subsequent coding can be improved on the premise of not increasing the complexity.

Specifically, after the BGP feature image is divided into a plurality of sub-blocks according to M x N, an approximate area corresponding to the position of the five sense organ can be found in the BGP feature image by utilizing heuristic prior information, BGP block ordinal numbers corresponding to the position of the five sense organ can be obtained after BGP blocks occupied by the position of the five sense organ in the BGP feature image are counted, weighting processing is carried out on statistical histograms of the BGP blocks corresponding to the position of the five sense organ, and then processed BGP feature vectors can be obtained for subsequent identification.

After dividing the face image into three equal parts longitudinally and five equal parts transversely, the eyes usually lie in the middle third longitudinally and in the second and fourth fifths transversely; the nose and mouth lie in the second and third thirds longitudinally and in the middle fifth transversely. After the BGP feature image is obtained, the BGP blocks occupied by the facial features are determined from this empirical knowledge of their relative positions, combined with the blocking rule of the BGP method, so that the corresponding statistical histograms can be weighted.

Since the eyes lie relatively close to the first third in the longitudinal direction, this embodiment fine-tunes the position determination to make it more robust. As shown in fig. 4, with M = N = 5, i.e. when the BGP feature image is divided into 5 × 5 blocks, the image is divided into six equal parts longitudinally and five equal parts transversely: the eyes lie in the second and third sixths longitudinally and in the second and fourth fifths transversely, while the nose and mouth lie in the third to sixth sixths longitudinally and in the middle fifth transversely. This heuristic information determines the sub-blocks corresponding to the facial features.

In this embodiment, the specific steps of obtaining the processed BGP feature vector are as follows. In the BGP feature image, the ordinals of the sub-blocks lying within M/6 to M/2 longitudinally and N/5 to 2N/5 transversely are counted to obtain the BGP block ordinals occupied by the left eye; the ordinals of the sub-blocks within M/6 to M/2 longitudinally and 3N/5 to 4N/5 transversely are counted to obtain the BGP block ordinals occupied by the right eye; and the ordinals of the sub-blocks within M/3 to M longitudinally and 2N/5 to 3N/5 transversely are counted to obtain the BGP blocks occupied by the nose and mouth, where M × N is the number of blocks into which the BGP feature image is divided. After the BGP feature vector is obtained, the statistical histograms corresponding to the BGP block ordinals occupied by the facial features are found and weighted to obtain the processed BGP feature vector.
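The region rule above can be checked numerically. The sketch below assumes that the block in row i, column j (1-based) covers the square [i-1, i] × [j-1, j] in block units, and that a block counts as occupied when it overlaps the region with non-zero area. Both conventions, and the function names, are assumptions made for illustration; they were chosen because they yield the block ordinals that the embodiment reports for M = N = 5.

```python
from fractions import Fraction


def blocks_in_region(m, n, row_range, col_range):
    """1-based ordinals of blocks overlapping the region with non-zero area.

    Block (i, j), 1-based, is assumed to cover [i-1, i] x [j-1, j] in block
    units; ordinals run left to right, then top to bottom.
    """
    (r_lo, r_hi), (c_lo, c_hi) = row_range, col_range
    ordinals = []
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            rows_overlap = (i - 1) < r_hi and i > r_lo
            cols_overlap = (j - 1) < c_hi and j > c_lo
            if rows_overlap and cols_overlap:
                ordinals.append((i - 1) * n + j)
    return ordinals


def facial_feature_blocks(m, n):
    """Block ordinals for the left eye, right eye, and nose-and-mouth regions."""
    M, N = Fraction(m), Fraction(n)   # exact arithmetic for the boundaries
    left_eye = blocks_in_region(m, n, (M / 6, M / 2), (N / 5, 2 * N / 5))
    right_eye = blocks_in_region(m, n, (M / 6, M / 2), (3 * N / 5, 4 * N / 5))
    nose_mouth = blocks_in_region(m, n, (M / 3, M), (2 * N / 5, 3 * N / 5))
    return left_eye, right_eye, nose_mouth


print(facial_feature_blocks(5, 5))
# ([2, 7, 12], [4, 9, 14], [8, 13, 18, 23])
```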

In a specific application embodiment with M = N = 5, the BGP block ordinals occupied by the facial features, obtained according to the above steps, are A₂, A₇, A₁₂, A₄, A₉, A₁₄, A₈, A₁₃, A₁₈ and A₂₃. When the statistical histogram features of the BGP blocks are computed, the histogram features of the blocks occupied by the facial features are weighted; the weight coefficient expresses how important the computer considers the facial feature information relative to other positions of the face. The processed BGP feature vector is:

$$P = [X_{A_1},\, w X_{A_2},\, X_{A_3},\, \ldots,\, X_{A_6},\, w X_{A_7},\, \ldots,\, w X_{A_{12}},\, \ldots,\, w X_{A_{23}},\, X_{A_{24}},\, X_{A_{25}}] \qquad (2)$$

wherein w is a weight coefficient.
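The weighting of formula (2) can be sketched as below. This is an illustrative sketch only: the function name `weight_blocks` is hypothetical, the feature vector is a placeholder of ones, and the value w = 2.0 is an arbitrary assumption, since the source does not state a value for the weight coefficient.

```python
def weight_blocks(feature_vector, weighted_ordinals, w, bins_per_block=8):
    """Scale the histograms of the selected blocks (1-based ordinals) by w."""
    selected = set(weighted_ordinals)
    weighted = []
    for start in range(0, len(feature_vector), bins_per_block):
        ordinal = start // bins_per_block + 1       # block A_ordinal
        factor = w if ordinal in selected else 1
        block = feature_vector[start:start + bins_per_block]
        weighted.extend(x * factor for x in block)
    return weighted


# Block ordinals occupied by the facial features for M = N = 5 (from the embodiment).
FACIAL_BLOCKS = [2, 7, 12, 4, 9, 14, 8, 13, 18, 23]

p = [1] * (25 * 8)                    # placeholder feature vector of ones
p_weighted = weight_blocks(p, FACIAL_BLOCKS, w=2.0)  # w = 2.0 is an assumed value
print(p_weighted[0], p_weighted[8], sum(p_weighted))  # 1 2.0 280.0
```

Only the ten listed blocks are scaled; the other fifteen histograms pass through unchanged, so the vector length stays at 200.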

Through the above steps, this embodiment applies the prior knowledge and heuristic information that humans rely on when recognizing faces, namely that the facial features are particularly important and that their relative positions are fixed. The approximate positions of the facial features on the face are first determined from human experience; the BGP operator is then applied to extract features from the face; the result is divided into sub-blocks and the histogram features of each sub-block are computed; finally the features corresponding to the positions of the facial features are weighted. Using the weighted BGP feature vector for face recognition effectively improves recognition accuracy and robustness.

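In this embodiment, step S1 includes:

S11, video monitoring: monitoring the video information of a target video in real time to obtain real-time monitored video information;

S12, face detection: detecting face images from the real-time monitored video information and outputting them;

S13, face grabbing: recognizing and extracting the face part from the face image output in step S12 to obtain and output a face target image;

S14, image enhancement: performing enhancement and denoising on the face target image, and outputting the final face image.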

Clear face images can be detected in real time through the steps, and targets can be identified after the face images detected in real time are subjected to subsequent intelligent coding and quick retrieval. In a specific application embodiment, the face image detected by the method is shown in fig. 5, and it can be known from fig. 5 that the frontal face image of the face can be accurately detected by the method.

In this embodiment, in step S2, multi-stage BGP feature extraction is performed on the image to be detected based on the BGP algorithm: the BGP feature image obtained by the previous stage of BGP feature extraction is used as the input of the next stage of BGP feature extraction, and the feature vector obtained by each stage of BGP feature extraction is encoded to obtain multi-layer codes corresponding to the respective BGP stages. The lengths of the layer codes are shortened step by step to form a pyramid coding structure, which gives the code of the image to be detected.

In this embodiment, step S3 includes: performing multi-stage BGP feature extraction in advance on each face image in the database based on the BGP algorithm, where the BGP feature image obtained by the previous stage of BGP feature extraction is used as the input of the next stage of BGP feature extraction, and the feature vector obtained by each stage of BGP feature extraction is encoded to obtain multi-layer codes corresponding to the respective BGP stages, the lengths of the layer codes being shortened step by step to form a pyramid coding structure, which gives the database image codes; when face retrieval is carried out, the code of the image to be detected and the database image codes are compared layer by layer, and the target face is identified according to the comparison result.

The purpose of face detection is real-time, accurate detection of faces. For a face detection technology, real-time performance and accuracy are mutually coupled, and it is difficult to achieve an ideal result for both at once. In practical applications real-time performance is the first consideration, while accuracy is affected mainly by changes in face pose. In this embodiment only frontal faces are detected, which simplifies the detection procedure, and recognition is then performed in the cascaded BGP manner, so that higher accuracy can be obtained.

In this method, multi-stage BGP is applied to the face image in a cascaded binary gradient pattern manner. The multi-stage BGP approach can fully mine useful information such as deeper edge information, texture information and gradient information of the image and form a more stable and robust feature representation, so that richer texture information is extracted and face recognition accuracy is improved. At the same time, coding the feature vectors of the multi-stage BGP yields hierarchical codes that form a pyramid code. During retrieval, the search proceeds layer by layer through the coding hierarchy of the pyramid, going from fuzzy matching to accurate matching and from coarse to fine, which effectively improves retrieval efficiency while preserving retrieval accuracy and thus achieves rapid face retrieval.

In this embodiment, when the intelligent coding is performed in step S2, the BGP feature image obtained by the previous stage of BGP feature extraction is used as the input of the next stage of BGP feature extraction, and after the multi-stage BGP feature extraction a feature vector is obtained for each stage; each stage of BGP feature extraction is performed as described above. Specifically, BGP features are extracted from the original image to obtain a BGP feature map, which still carries gray information, texture information, gradient information and the like. Performing BGP feature extraction once more on this BGP feature map further extracts useful information such as deeper, more implicit and richer texture information and edge information. Repeating the extraction on each resulting feature image in this way yields the multi-stage BGP feature images; finally, the groups of feature vectors are coded separately to form the pyramid code.

Firstly, BGP feature extraction is carried out on each face image in the input database according to the above steps to obtain a level-1 BGP feature image and a level-1 BGP feature vector, and BGP feature extraction is then carried out on the level-(i-1) BGP feature image in turn to obtain the level-i feature image and the level-i BGP feature vector. In step S2, BGP feature extraction is likewise performed on the image to be detected according to the above steps to obtain a level-1 BGP feature image and a level-1 BGP feature vector, and BGP feature extraction is then performed on the level-(i-1) BGP feature image in turn to obtain the level-i feature image and the level-i BGP feature vector, where i = 2, 3, 4, …, n, and n is the number of BGP feature extraction levels to be executed.
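The cascade wiring described above, in which each stage consumes the feature image of the previous stage and contributes one feature vector, can be sketched as below. The single-stage extractor is passed in as a parameter; `toy_stage` is only a stand-in used to make the sketch runnable and is explicitly not the BGP operator. All names here are hypothetical.

```python
def cascade_features(image, extract_stage, n_stages):
    """Run n_stages of feature extraction, feeding each stage's feature image
    into the next stage, and collect one feature vector per stage."""
    feature_vectors = []
    current = image
    for _ in range(n_stages):
        current, vector = extract_stage(current)
        feature_vectors.append(vector)
    return feature_vectors


def toy_stage(image):
    """Stand-in single-stage extractor (NOT BGP): marks pixels that are not
    smaller than their left neighbour (wrapping at the row start) and returns
    the binary map together with its 2-bin histogram."""
    feature_image = [
        [1 if row[c] >= row[c - 1] else 0 for c in range(len(row))]
        for row in image
    ]
    ones = sum(sum(row) for row in feature_image)
    total = sum(len(row) for row in feature_image)
    return feature_image, [total - ones, ones]


# Two cascaded stages on a tiny 2 x 3 image.
vectors = cascade_features([[1, 2, 3], [3, 2, 1]], toy_stage, 2)
```

Replacing `toy_stage` with a real single-stage BGP extractor that returns a feature image and a feature vector would give the level-1 to level-n feature vectors referred to in the text.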

In this embodiment, when the feature vector obtained by extracting each stage of BGP features is encoded, the BGP encoding length of each layer is gradually shortened by adjusting the BGP neighborhood radius or the BGP partition number, so as to form a pyramid encoding structure. The purpose of constructing the coding pyramid is to form hierarchical codes with different lengths, so that the subsequent hierarchical codes based on the different lengths can realize quick retrieval from coarse to fine.

As shown in fig. 6, the coding pyramid in this embodiment is constructed as follows: each stage of BGP feature image is used as the input of a single-stage BGP algorithm according to the above steps to obtain a plurality of groups of feature vectors, the feature vectors are coded separately to obtain a plurality of groups of codes, and as the level increases the BGP coding length is shortened layer by layer by adjusting the BGP neighborhood radius or the number of blocks, thereby constructing the coding pyramid. Specifically, to form the pyramid coding structure, the number n of pyramid layers is determined in advance, and n groups of parameters {M_i, N_i, R_i} are selected according to the image size, where M_i × N_i is the number of BGP blocks and R_i is the BGP neighborhood radius; the n layers of codes P_i are then calculated from the selected n groups {M_i, N_i, R_i}, such that the code lengths of the layers satisfy D_1 > D_2 > D_3 > … > D_n, where i = 1, …, n.

In this embodiment, when the pyramid is constructed, the coding length decreases as the coding level increases. The coding length is calculated according to the formula D_i = M_i × N_i × d_i, where D_i is the coding length of the ith level, M_i × N_i is the number of BGP blocks in the ith-level coding, and d_i is the dimension of the statistical histogram of each sub-block in the ith-level coding. Since d_i depends only on R_i, the coding length depends on M_i, N_i and R_i, that is, on the number of BGP blocks and the neighborhood radius. In this embodiment, specifically, M_i = N_i and R_i = 1 or 2; in actual operation R_i is held at 1 and the values of M_i and N_i, still kept equal to each other, are varied so that codes of different levels are extracted, thereby completing the construction of the coding pyramid. The constructed pyramid is shown in fig. 7, where the width of each layer indicates the coding length of that layer; as can be seen from the figure, the higher the level, the smaller the BGP coding length.
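As a worked illustration of the coding-length relation D_i = M_i × N_i × d_i, the sketch below computes the layer lengths for a chosen parameter set and checks that they strictly decrease. The histogram dimensions in `HIST_DIM` and the block counts in `params` are placeholder values chosen for the example, not values stated in this document; the function names are hypothetical.

```python
def code_length(M, N, hist_dim):
    """D_i = M_i * N_i * d_i: blocks per layer times histogram bins per block."""
    return M * N * hist_dim


def pyramid_lengths(params, hist_dim_for_radius):
    """Return [D_1, ..., D_n] for params = [(M_1, N_1, R_1), ...] and verify
    the pyramid condition D_1 > D_2 > ... > D_n."""
    lengths = [
        code_length(M, N, hist_dim_for_radius[R]) for M, N, R in params
    ]
    if any(a <= b for a, b in zip(lengths, lengths[1:])):
        raise ValueError("code lengths must strictly decrease from layer 1 to layer n")
    return lengths


# Placeholder mapping from neighborhood radius R to histogram dimension d
# (assumed values for illustration only).
HIST_DIM = {1: 8, 2: 16}

# R held at 1, M = N varied per layer, as in the embodiment's actual operation.
params = [(8, 8, 1), (4, 4, 1), (2, 2, 1)]
lengths = pyramid_lengths(params, HIST_DIM)
```

With these placeholder values the three layers have lengths 512, 128 and 32, so halving M and N at each level shortens the code by a factor of four.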

After the construction of the coding pyramid is completed, quick retrieval can be performed based on the pyramid code. In step S3, the code of the image to be detected and the database image codes are compared layer by layer from the highest layer down to the first layer: at each layer, the current-layer code of the image to be detected is compared with the current-layer code of each image retrieved at the previous layer, and the best-matching images are retrieved and output. Because each layer's comparison result quickly narrows the retrieval range, background information is rapidly eliminated and more of the computation is spent on likely face candidates, which effectively improves retrieval efficiency.

The specific steps of step S3 in this embodiment include:

S31, comparing the nth-layer code P_n in the code of the image to be detected with the nth-layer code P_n in each database image code, and retrieving from the database, according to the comparison result, a plurality of images closest to the image to be detected, where n is the number of layers of the pyramid coding structure;

S32, comparing the (n-1)th-layer code in the code of the image to be detected with the (n-1)th-layer code of each image retrieved last time, retrieving from those images, according to the comparison result, a plurality of images closest to the image to be detected, and repeating step S32 for each lower layer until the comparison of the first-layer codes is completed and the most similar target images are retrieved;

S33, comparing the image to be detected with each of the closest target images to obtain the final identification result.

Through the steps, fuzzy to accurate matching can be achieved based on pyramid coding, the searching range is quickly reduced through matching coding, accurate matching is conducted in a determined small range, and quick and accurate face retrieval can be achieved.
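Steps S31 and S32 can be sketched as a coarse-to-fine candidate filter. The sketch makes several assumptions that are not stated in this document: codes are compared with an L1 distance, a fixed number of candidates is kept at each layer (`keep_per_layer`), and each image code is stored as a list `[P_1, ..., P_n]`. The final image-level comparison of step S33 is left out.

```python
def l1_distance(a, b):
    """Assumed code distance: sum of absolute element differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pyramid_search(query_code, database_codes, keep_per_layer):
    """Coarse-to-fine search over pyramid codes.

    query_code:      [P_1, ..., P_n] for the image to be detected
    database_codes:  {image_id: [P_1, ..., P_n]}
    keep_per_layer:  candidates kept after comparing layer n, n-1, ..., 1
    """
    n = len(query_code)
    if len(keep_per_layer) != n:
        raise ValueError("keep_per_layer needs one entry per pyramid layer")
    candidates = list(database_codes)
    # Start at the shortest (highest) layer and move down to layer 1.
    for layer, keep in zip(range(n - 1, -1, -1), keep_per_layer):
        candidates.sort(
            key=lambda image_id: l1_distance(
                query_code[layer], database_codes[image_id][layer]
            )
        )
        candidates = candidates[:keep]
    return candidates


# Tiny 3-layer example: layer lengths 4, 2 and 1.
database = {
    "a": [[1, 0, 0, 0], [1, 0], [1]],
    "b": [[0, 1, 0, 0], [1, 0], [1]],
    "c": [[0, 0, 1, 0], [0, 1], [1]],
    "d": [[0, 0, 0, 1], [0, 1], [5]],
}
query = [[0, 1, 0, 0], [1, 0], [1]]
best = pyramid_search(query, database, keep_per_layer=[3, 2, 1])
```

In this example the shortest code discards "d", the middle layer discards "c", and only the full-length first-layer comparison separates "a" from "b", so most database entries are never compared at full length.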

In this embodiment, the coding pyramids of the database images are constructed offline, and the resulting database image codes are stored. Online, a coding pyramid is constructed only for the image to be retrieved, and the stored database image codes are then called for matching to obtain the final identification. Because the pyramid coding of the database, which contains a large number of images, is performed offline, retrieval efficiency and real-time performance can be greatly improved.

The intelligent video monitoring device of this embodiment includes:

the image intelligent coding module is used for coding the face image monitored by the face real-time monitoring module to obtain the code of the image to be detected;

In this embodiment, the real-time face monitoring module includes:

the video monitoring module is used for monitoring the video information of the target video in real time to obtain the video information monitored in real time;

the face recognition module is used for recognizing face images from the video information monitored in real time and outputting the face images;

a face capturing module, configured to recognize and extract a face part from the face image output in step S12, so as to obtain a face target image and output the face target image;

an image enhancement module for performing processing including enhancement and denoising on the human face target image and outputting the final human face image.

In this embodiment, the face recognition module specifically detects a face image from the video image by using the BGP face recognition method based on heuristic prior information, and includes:

the characteristic weighting unit is used for weighting the statistical histograms of partial sub-blocks in the BGP characteristic vector according to heuristic prior information containing face structure characteristic information to obtain a processed BGP characteristic vector;

In this embodiment, in the image intelligent coding module, multi-stage BGP feature extraction is performed on the image to be detected based on the BGP algorithm: the BGP feature image obtained by the previous stage of BGP feature extraction is used as the input of the next stage of BGP feature extraction, and the feature vector obtained by each stage of BGP feature extraction is encoded to obtain multi-layer codes corresponding to the respective BGP stages. The lengths of the layer codes are shortened step by step to form a pyramid coding structure, which gives the code of the image to be detected.

In this embodiment, in the face quick retrieval module, multi-stage BGP feature extraction is performed in advance on each face image in the database based on the BGP algorithm: the BGP feature image obtained by the previous stage of BGP feature extraction is used as the input of the next stage of BGP feature extraction, and the feature vector obtained by each stage of BGP feature extraction is encoded to obtain multi-layer codes corresponding to the respective BGP stages, the lengths of the layer codes being shortened step by step to form a pyramid coding structure, which gives the database image codes; when face retrieval is carried out, the code of the image to be detected and the database image codes are compared layer by layer, and the target face is identified according to the comparison result.

The intelligent video monitoring device of this embodiment corresponds one-to-one to the intelligent video monitoring method described above and is not described again here.

The intelligent video device used to implement the above intelligent video monitoring method in a specific application embodiment of the invention is shown in fig. 8 and comprises a front end and a rear end. The front end comprises a human face real-time monitoring module, an image intelligent coding module and a human face rapid retrieval module, where the human face real-time monitoring module specifically comprises a video receiving module, a human face detection module, a human face capturing module and an image enhancement module. The rear end is a database management module. Data are transmitted between the front end and the rear end through a cloud network, through which a large amount of data can also be stored. The video receiving module collects video from a camera; the human face detection module rapidly and automatically detects human faces in the video; the human face capturing module extracts each detected human face separately; and the image enhancement module performs enhancement, denoising and similar processing on the human face target image. The image intelligent coding module codes the face image so as to solve the problem of the excessive hardware resource consumption of traditional approaches; the human face rapid retrieval module performs rapid comparison between the target face code and the database face codes.

Through the structure, the functions of automatic face detection, intelligent image coding, quick face retrieval and the like can be realized, the intellectualization of video monitoring is realized, the intelligent level of video monitoring can be improved, and hardware resources for storing and transmitting mass data are saved.

As shown in fig. 9, the working of the above-mentioned intelligent video apparatus of this embodiment includes front-end operation, background operation and cloud operation, and specifically is:

Front-end operation part: a large amount of video information is monitored in real time through widely distributed cameras, and the received video information is processed in real time in an intelligent chip. First, real-time face monitoring is carried out, and the face frames appearing in the video are extracted as the pictures to be queried; next, image enhancement is applied to each face picture by image processing methods such as interpolation and denoising; the resulting target picture is then intelligently coded to obtain a concise, robust and discriminative target code; finally, the target code is transmitted to the cloud through the cloud network.

Background operation part: a large number of face images of the objects of interest are stored in a face database, and all of these images undergo image enhancement, intelligent coding and other processing by the same method to obtain a coding database; all the codes in the coding database are transmitted to the cloud through the cloud network.

Cloud operation part: after receiving the face database codes and the target code, the cloud quickly retrieves the most similar object from the coding database according to the target code, thereby determining whether the target object belongs to the face database and achieving the purpose of early warning and alarming.

Specific applications of the device of the invention are shown in fig. 10 and comprise four typical application scenes. In fugitive pursuit and riot-suspect arrest applications, the front-end monitoring equipment performs image processing and intelligent coding on each detected face, compares the code with the background database of fugitive or riot-suspect faces, and raises an alarm promptly when a suspect is found. In hotel security and community security applications, the monitoring equipment automatically transmits the detected face codes to the cloud at any time, compares them with the face codes registered in the database, and generates an early warning when an unregistered person is found.

The foregoing describes only preferred embodiments of the invention and is not to be construed as limiting the invention in any way. Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Therefore, any simple modification, equivalent change or variation made to the above embodiments according to the technical spirit of the present invention, without departing from the content of the technical scheme of the present invention, shall fall within the protection scope of the technical scheme of the present invention.

Claims

1. An intelligent video monitoring method is characterized by comprising the following steps:

S3, face quick retrieval: respectively matching and comparing the codes of the images to be detected with database codes obtained from database face images in advance to obtain recognition results and output the recognition results;

in step S2, performing multistage BGP feature extraction on the image to be detected based on a BGP algorithm, using a BGP feature image obtained by previous stage BGP feature extraction as input for next stage BGP feature extraction, and encoding a feature vector obtained by each stage BGP feature extraction to obtain multilayer codes corresponding to each stage BGP, and sequentially reducing the length of each layer code step by step to form a pyramid coding structure to obtain an image code to be detected;

the step of step S3 includes: sequentially extracting multiple stages of BGP features from each face image in a database in advance based on a BGP algorithm, taking the BGP feature image obtained by extracting the BGP features from the previous stage as the input of the next stage of BGP feature extraction, and coding the feature vector obtained by extracting the BGP features from each stage to obtain multilayer codes corresponding to each stage of BGP, wherein the length of each layer of codes is sequentially shortened step by step to form a pyramid coding structure to obtain database image codes; when face retrieval is carried out, comparing the image code to be detected with the database image code layer by layer, and identifying a target face according to a comparison result;

when the feature vector extracted from each stage of BGP features is coded, the BGP coding length of each layer is gradually shortened by adjusting the BGP neighborhood radius or the BGP partition number, the coding length of each layer is calculated according to the formula D_i = M_i × N_i × d_i, wherein D_i is the coding length of the ith layer, M_i × N_i is the BGP partition number in the ith stage of coding, and d_i is the statistical histogram dimension of each sub-block of BGP in the ith stage of coding, so that the pyramid coding structure is formed.

2. The intelligent video monitoring method according to claim 1, wherein in step S1, the BGP face recognition method based on heuristic prior information is applied to the video image to detect the face image, and the method includes:

3. The intelligent video monitoring method according to claim 2, wherein the heuristic prior information includes information of the location range where the facial features are located, and the statistical histogram of the target sub-block corresponding to the positions of the facial features is found from the BGP feature vector for weighting according to the information of the location range where the facial features are located, so as to increase the weight of the statistical histogram of the corresponding sub-block according to the importance degree of the facial structure feature information.

4. The intelligent video monitoring method according to claim 3, wherein the specific steps of obtaining the processed BGP feature vector are: in the BGP feature image, counting sub-block ordinal numbers in longitudinal M/6 to M/2 and transverse N/5 to 2N/5 regions to obtain a BGP block ordinal number occupied by a left eye position, counting sub-block ordinal numbers in longitudinal M/6 to M/2 and transverse 3N/5 to 4N/5 regions to obtain a BGP block ordinal number occupied by a right eye position, counting sub-block ordinal numbers in longitudinal M/3 to M and transverse 2N/5 to 3N/5 regions to obtain a BGP block ordinal number occupied by a nose and a mouth position, wherein M × N is the block number of the BGP feature image during blocking, and after the BGP feature vector is obtained, finding out a statistical histogram corresponding to the BGP block ordinal number occupied by the five sense organs position to perform weighting processing to obtain a processed BGP feature vector.

5. The intelligent video monitoring method according to any one of claims 1 to 4, wherein the step S1 includes:

6. An intelligent video monitoring device, comprising:

the face quick retrieval module is used for respectively matching and comparing the codes of the images to be detected with database codes obtained by database face images in advance to obtain and output identification results;

in the image intelligent coding module, multi-stage BGP feature extraction is carried out on an image to be detected based on a BGP algorithm, a BGP feature image obtained by previous stage BGP feature extraction is used as the input of next stage BGP feature extraction, a feature vector obtained by each stage BGP feature extraction is coded to obtain multi-layer codes corresponding to each stage BGP, the length of each layer of codes is sequentially shortened step by step to form a pyramid coding structure, and the code of the image to be detected is obtained;

in the face quick retrieval module, sequentially extracting multiple stages of BGP features from each face image in a database in advance based on a BGP algorithm, taking the BGP feature image obtained by extracting the BGP features of the previous stage as the input of the BGP feature extraction of the next stage, and coding the feature vector obtained by extracting the BGP features of each stage to obtain multilayer codes corresponding to each stage of BGP, wherein the length of each layer of codes is sequentially shortened step by step to form a pyramid coding structure to obtain database image codes; when face retrieval is carried out, comparing the image code to be detected with the database image code layer by layer, and identifying a target face according to a comparison result;

when the intelligent image coding module and the rapid face retrieval module code the feature vectors extracted from each stage of BGP features, the BGP coding length of each layer is gradually shortened by adjusting the BGP neighborhood radius or the BGP block number, so as to form the pyramid coding structure.

7. The intelligent video monitoring device according to claim 6, wherein the real-time face monitoring module comprises a face recognition module for detecting a face image by a BGP face recognition method based on heuristic prior information for a video image, the face recognition module comprising: