CN111724426B - Background modeling method and camera for background modeling - Google Patents


Info

Publication number
CN111724426B
CN111724426B
Authority
CN
China
Prior art keywords
pixel
background
pixel point
statistical
values
Prior art date
Legal status
Active
Application number
CN201910209064.5A
Other languages
Chinese (zh)
Other versions
CN111724426A
Inventor
张彩红
刘刚
张昱升
曾峰
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910209064.5A
Publication of CN111724426A
Application granted
Publication of CN111724426B


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/40Analysis of texture
    • G06T7/41Analysis of texture based on statistical description of texture
    • G06T7/44Analysis of texture based on statistical description of texture using image operators, e.g. filters, edge density metrics or local histograms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Abstract

The embodiment of the invention provides a background modeling method and a camera for background modeling. A frame of image is acquired from a collected video image sequence; for each pixel point in the image, the pixel values of all pixel points within a preset neighborhood range, and the statistical number of each pixel value, are counted; a background sub-model corresponding to each pixel point is established from the counted pixel values and their statistical numbers; and a background model of the video image sequence is built from the background sub-models. By counting pixel values in this way, an accurate background model can be established without a complex modeling and calculation process, improving the speed and efficiency of background modeling.

Description

Background modeling method and camera for background modeling
Technical Field
The invention relates to the technical field of video monitoring, in particular to a background modeling method and a camera for background modeling.
Background
Detecting dynamic objects from a video sequence is the primary and fundamental task of video surveillance. Currently, many tracking systems rely on background extraction technology, that is, a currently input video frame is compared with a background model, and whether a pixel point is a target pixel or a background pixel is determined according to the deviation degree of the pixel point of the current video frame from the background model. Then, the pixels considered as targets are further processed to identify the targets and determine the positions of the targets, so that tracking is realized. Therefore, the establishment of the background model directly influences the accuracy of target tracking, and how to establish the accurate background model is a key for realizing the target tracking.
At present, a popular background modeling method is the Gaussian mixture modeling method, in which the background is represented by a weighted combination of K (generally 3-5) Gaussian models. If a pixel point in the current video frame matches one of its K models closely enough, it is considered background; otherwise it is considered foreground, the pixel point is taken as a new model, and the existing K models are updated. The model established by the Gaussian mixture modeling method is an intuitive probability density model and can adapt to illumination changes and multi-modal scenes.
However, since K Gaussian models need to be built for each pixel point, the model building process is complex and the calculation amount is large, resulting in slow speed and low efficiency of background modeling.
Disclosure of Invention
The embodiment of the invention aims to provide a background modeling method and a camera for background modeling, so as to improve the speed and efficiency of background modeling. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a background modeling method, where the method includes:
acquiring a frame of image from an acquired video image sequence;
respectively counting pixel values of all pixel points in a preset neighborhood range of each pixel point in the image and the counting number of each pixel value, wherein the pixel values indicate gray values and/or color values;
According to the pixel values counted for each pixel point and the counted number of each pixel value, establishing a background sub-model corresponding to each pixel point, wherein the background sub-model comprises the pixel values and the counted number of the pixel values;
and building a background model of the video image sequence by utilizing the background sub-model corresponding to each pixel point.
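As an informal illustration (not part of the claims), the four steps above might be sketched in Python as follows; the function and parameter names (`build_background_model`, `radius`, `n_samples`) are hypothetical, and clipping the neighborhood at image borders is an assumption the patent does not specify:

```python
from collections import Counter

def build_background_model(image, radius=1, n_samples=4):
    """Build one background sub-model per pixel point from the pixel-value
    counts in its (2*radius+1) x (2*radius+1) neighborhood."""
    h, w = len(image), len(image[0])
    model = {}
    for y in range(h):
        for x in range(w):
            # Count each pixel value and its statistical number in the
            # preset neighborhood range (clipped at the image borders).
            counts = Counter(
                image[ny][nx]
                for ny in range(max(0, y - radius), min(h, y + radius + 1))
                for nx in range(max(0, x - radius), min(w, x + radius + 1))
            )
            # Keep the n_samples most frequent (value, count) elements,
            # padding with zero-valued elements when there are fewer kinds.
            elems = counts.most_common(n_samples)
            elems += [(0, 0)] * (n_samples - len(elems))
            model[(x, y)] = elems
    return model
```

The background model of the sequence is then simply the collection of these per-pixel sub-models.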
Optionally, the color value is a value of a designated color channel or an array formed by values of all color channels, and the gray value is an average value of the values of all color channels.
Optionally, the establishing a background sub-model corresponding to each pixel point according to the pixel values counted for each pixel point and the counted number of each pixel value includes:
for each pixel point, the following operations are respectively executed:
counting the number of types of pixel values in a preset neighborhood range of a first pixel point, wherein the first pixel point is any one of the pixel points;
judging whether the number of the types is larger than the number of preset background samples or not;
if the number of types is larger than the preset background sample number, reading pixel values whose quantity equals the preset background sample number, together with the statistical number of each of those pixel values, to form the background sub-model corresponding to the first pixel point;
and if the number of types is not greater than the preset background sample number, reading all the counted pixel values and the statistical number of each pixel value, and forming the background sub-model corresponding to the first pixel point from those pixel values and statistical numbers together with zero-valued pixel values and zero statistical numbers, wherein the number of the zero-valued pixel values and zero statistical numbers equals the difference between the preset background sample number and the number of types.
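A minimal sketch of this branch, assuming the neighborhood statistics have already been collected into a mapping from pixel value to statistical number; `form_submodel` and `n_samples` are illustrative names, not taken from the patent:

```python
def form_submodel(counts, n_samples):
    """counts: dict mapping pixel value -> statistical number for one pixel
    point's neighborhood. Returns a sub-model of n_samples (value, count)
    elements, ordered by statistical number, largest first."""
    kinds = len(counts)  # number of distinct pixel value types
    ordered = sorted(counts.items(), key=lambda e: e[1], reverse=True)
    if kinds > n_samples:
        # More kinds than background samples: keep only the most frequent.
        return ordered[:n_samples]
    # Otherwise keep all elements and pad with zero-valued placeholders.
    return ordered + [(0, 0)] * (n_samples - kinds)
```

With the statistics of the Fig. 2 example and a preset background sample number of 4, the sub-model comes out as `[(25, 3), (30, 3), (50, 2), (20, 1)]`.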
Optionally, after the respectively counting the pixel values of all the pixel points in the preset neighborhood range of each pixel point in the image and the counted number of each pixel value, the method further includes:
arranging elements consisting of pixel values and the statistical number of the pixel values according to the sequence of the statistical number from large to small to obtain a pixel value set;
the reading of the plurality of pixel values with the same number as the preset background sample number and the statistical number of each pixel value in the plurality of pixel values form a background submodel corresponding to the first pixel point, and the reading comprises the following steps:
and reading a plurality of elements from the pixel value set according to the preset background sample number and the arrangement sequence from front to back to form a background submodel corresponding to the first pixel point.
Optionally, after the background sub-model corresponding to each pixel point is used to build the background model of the video image sequence, the method further includes:
acquiring pixel values of all pixel points in a current video frame;
for each pixel point in the current video frame, updating the corresponding background sub-model based on a judgment result of whether the pixel value of that pixel point is contained in its corresponding background sub-model;
and determining an updated background model according to the updated background sub-model corresponding to each pixel point in the current video frame.
Optionally, the updating, for each pixel point in the current video frame, of the corresponding background sub-model based on the judgment result of whether the pixel value of that pixel point is contained in its corresponding background sub-model includes:
for each pixel point in the current video frame, the following operations are respectively executed:
judging whether a pixel value of a second pixel point is contained in a background sub-model corresponding to the second pixel point, wherein the second pixel point is any pixel point in the current video frame;
if yes, determining the element in the background sub-model corresponding to the second pixel point whose pixel value is the same as the pixel value of the second pixel point, wherein the background sub-model comprises a plurality of elements and each element comprises a pixel value and the statistical number of that pixel value; adding 1 to the statistical number in that element, and subtracting 1 from the statistical number in every other element of the background sub-model corresponding to the second pixel point, wherein any statistical number that is already 0 is kept at 0;
if not, subtracting 1 from the statistical number of every element in the background sub-model corresponding to the second pixel point, wherein any statistical number that is already 0 is kept at 0; determining the sequence number of the element with the smallest non-zero statistical number in that background sub-model; judging whether the sequence number is smaller than the preset background sample number; if it is smaller, setting the statistical number in the element at the next sequence number to 1 and setting the pixel value in that element to the pixel value of the second pixel point; and if not, setting the statistical number in the last element of the background sub-model corresponding to the second pixel point to 1 and setting the pixel value in the last element to the pixel value of the second pixel point.
Optionally, before said adding 1 to the statistical number in the element, the method further comprises:
judging whether the statistical number in the element reaches a preset threshold value or not;
if so, keeping the statistical number in the element unchanged at the preset threshold value;
if not, performing the adding of 1 to the statistical number in the element.
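One reading of the claimed update rule, including the optional threshold cap, can be sketched as below. Representing a sub-model as a count-descending list of `(value, count)` pairs and the default `cap` value are assumptions; the patent only speaks of a "preset threshold":

```python
def update_submodel(submodel, pixel_value, n_samples, cap=255):
    """In-place update of one pixel point's sub-model (a list of
    (value, count) pairs) with the pixel value seen in the current frame."""
    values = [v for v, _ in submodel]
    if pixel_value in values:
        k = values.index(pixel_value)
        for i, (v, c) in enumerate(submodel):
            if i == k:
                # Matched element: increment its count unless the cap is reached.
                submodel[i] = (v, c if c >= cap else c + 1)
            else:
                # Every other count decays by 1, floored at 0.
                submodel[i] = (v, max(0, c - 1))
    else:
        # No match: decay every count, floored at 0.
        for i, (v, c) in enumerate(submodel):
            submodel[i] = (v, max(0, c - 1))
        # 1-based sequence number of the last element with a non-zero count
        # (the smallest non-zero count, since the list is count-descending).
        last_nonzero = 0
        for i, (v, c) in enumerate(submodel):
            if c > 0:
                last_nonzero = i + 1
        if last_nonzero < n_samples:
            # Room left: write the new value at the next sequence number.
            submodel[last_nonzero] = (pixel_value, 1)
        else:
            # Model full: replace the last element.
            submodel[-1] = (pixel_value, 1)
    return submodel
```

Matched pixel values thus gain evidence while stale values decay toward 0 and are eventually replaced.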
Optionally, after the updated background model is determined according to the updated background sub-model corresponding to each pixel point in the current video frame, the method further includes:
acquiring a pixel value of a third pixel point in the current video frame and a statistical number of the pixel values of the third pixel point, wherein the third pixel point is any pixel point in the current video frame;
summing the statistical numbers in the background sub-model corresponding to the third pixel point in the background model to obtain the sum of the statistical numbers corresponding to the third pixel point;
and calculating the foreground probability of the third pixel point according to the statistical number of the pixel values of the third pixel point and the sum of the statistical number.
Optionally, the calculating the foreground probability of the third pixel according to the statistical number of the pixel values of the third pixel and the sum of the statistical number includes:
according to the statistical number of the pixel value of the third pixel point and the sum of the statistical numbers, calculating the foreground probability of the third pixel point by using a foreground probability calculation formula:
FG(x, y) = 1 - sn_k / Σ_{i=1}^{N} sn_i
wherein FG(x, y) is the foreground probability of the third pixel point, (x, y) are the coordinates of the third pixel point, sn_k is the statistical number of the pixel value of the third pixel point, Σ_{i=1}^{N} sn_i is the sum of the statistical numbers, sn_i is the i-th statistical number in the background sub-model corresponding to the third pixel point, and N is the preset background sample number.
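The computation can be sketched as follows, assuming the foreground probability takes the form FG(x, y) = 1 - sn_k / Σ sn_i; the original equation is rendered as an image in the source, so this closed form is an inference from the variable definitions, not a quotation of the patent:

```python
def foreground_probability(submodel, pixel_value):
    """Foreground probability of a pixel point, given its background
    sub-model as a list of (value, count) pairs.
    Assumes FG = 1 - sn_k / sum(sn_i) over the N sub-model elements."""
    total = sum(count for _, count in submodel)  # sum of statistical numbers
    if total == 0:
        return 1.0  # no background evidence at all: treat as foreground
    # sn_k: statistical number of this pixel's value, 0 if not in the model.
    sn_k = next((count for value, count in submodel if value == pixel_value), 0)
    return 1.0 - sn_k / total
```

A value that dominates the sub-model's counts thus gets a probability near 0 (background), while an unseen value gets 1 (foreground).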
In a second aspect, an embodiment of the present invention provides a background modeling apparatus, including:
the acquisition module is used for acquiring a frame of image from the acquired video image sequence;
the statistics module is used for respectively counting pixel values of all pixel points in a preset neighborhood range of each pixel point in the image and the statistical number of each pixel value, wherein the pixel values indicate gray values and/or color values;
the establishing module is used for establishing a background sub-model corresponding to each pixel point according to the pixel value counted for each pixel point and the counted number of each pixel value, wherein the background sub-model comprises the pixel values and the counted number of the pixel values;
And the building module is used for building a background model of the video image sequence by utilizing the background sub-model corresponding to each pixel point.
Optionally, the color value is a value of a designated color channel or an array formed by values of all color channels, and the gray value is an average value of the values of all color channels.
Optionally, the establishing module is specifically configured to:
for each pixel point, the following operations are respectively executed:
counting the number of types of pixel values in a preset neighborhood range of a first pixel, wherein the first pixel is any pixel in the pixel points;
judging whether the number of the types is larger than the number of preset background samples or not;
if the number of types is larger than the preset background sample number, reading pixel values whose quantity equals the preset background sample number, together with the statistical number of each of those pixel values, to form the background sub-model corresponding to the first pixel point;
and if the number of types is not greater than the preset background sample number, reading all the counted pixel values and the statistical number of each pixel value, and forming the background sub-model corresponding to the first pixel point from those pixel values and statistical numbers together with zero-valued pixel values and zero statistical numbers, wherein the number of the zero-valued pixel values and zero statistical numbers equals the difference between the preset background sample number and the number of types.
Optionally, the apparatus further includes:
the arrangement module is used for arranging elements consisting of pixel values and the statistical number of the pixel values according to the sequence of the statistical number from large to small to obtain a pixel value set;
the establishing module is configured to, when implementing the reading of the plurality of pixel values having the same number as the preset background sample number and the statistical number of each pixel value in the plurality of pixel values to form the background submodel corresponding to the first pixel point, specifically:
and reading a plurality of elements from the pixel value set according to the preset background sample number and the arrangement sequence from front to back to form a background submodel corresponding to the first pixel point.
Optionally, the acquiring module is further configured to acquire a pixel value of each pixel point in the current video frame;
the apparatus further comprises:
the updating module is used for updating, for each pixel point in the current video frame, the corresponding background sub-model based on a judgment result of whether the pixel value of that pixel point is contained in its corresponding background sub-model;
And the determining module is used for determining an updated background model according to the updated background sub-model corresponding to each pixel point in the current video frame.
Optionally, the updating module is specifically configured to:
for each pixel point in the current video frame, the following operations are respectively executed:
judging whether a pixel value of a second pixel point is contained in a background sub-model corresponding to the second pixel point, wherein the second pixel point is any pixel point in the current video frame;
if yes, determining the element in the background sub-model corresponding to the second pixel point whose pixel value is the same as the pixel value of the second pixel point, wherein the background sub-model comprises a plurality of elements and each element comprises a pixel value and the statistical number of that pixel value; adding 1 to the statistical number in that element, and subtracting 1 from the statistical number in every other element of the background sub-model corresponding to the second pixel point, wherein any statistical number that is already 0 is kept at 0;
if not, subtracting 1 from the statistical number of every element in the background sub-model corresponding to the second pixel point, wherein any statistical number that is already 0 is kept at 0; determining the sequence number of the element with the smallest non-zero statistical number in that background sub-model; judging whether the sequence number is smaller than the preset background sample number; if it is smaller, setting the statistical number in the element at the next sequence number to 1 and setting the pixel value in that element to the pixel value of the second pixel point; and if not, setting the statistical number in the last element of the background sub-model corresponding to the second pixel point to 1 and setting the pixel value in the last element to the pixel value of the second pixel point.
Optionally, the apparatus further includes:
the judging module is used for judging whether the statistical number in the element reaches a preset threshold value or not;
the maintaining module is used for keeping the statistical number in the element unchanged at the preset threshold value if the judgment result of the judging module is that the threshold has been reached;
the updating module is specifically configured to perform the adding of 1 to the statistical number in the element if the judgment result of the judging module is that the threshold has not been reached.
Optionally, the obtaining module is further configured to obtain a pixel value of a third pixel in the current video frame and a statistical number of the pixel values of the third pixel, where the third pixel is any pixel in the current video frame;
the apparatus further comprises:
the calculation module is used for superposing the statistical numbers in the background sub-model corresponding to the third pixel point in the background model to obtain the sum of the statistical numbers corresponding to the third pixel point; and calculating the foreground probability of the third pixel point according to the statistical number of the pixel values of the third pixel point and the sum of the statistical number.
Optionally, the computing module is specifically configured to:
according to the statistical number of the pixel value of the third pixel point and the sum of the statistical numbers, calculating the foreground probability of the third pixel point by using a foreground probability calculation formula:
FG(x, y) = 1 - sn_k / Σ_{i=1}^{N} sn_i
wherein FG(x, y) is the foreground probability of the third pixel point, (x, y) are the coordinates of the third pixel point, sn_k is the statistical number of the pixel value of the third pixel point, Σ_{i=1}^{N} sn_i is the sum of the statistical numbers, sn_i is the i-th statistical number in the background sub-model corresponding to the third pixel point, and N is the preset background sample number.
In a third aspect, an embodiment of the present invention provides a camera for background modeling, the camera comprising a camera module, a processor, and a memory;
the camera module is used for collecting a video image sequence;
the memory is used for storing a computer program;
the processor is configured to implement the method steps described in the first aspect of the embodiment of the present invention when executing the computer program stored in the memory.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having a computer program stored therein, which when executed by a processor implements the method steps of the first aspect of the embodiments of the present invention.
According to the background modeling method and the camera for background modeling provided by the embodiments of the invention, a frame of image is acquired from a collected video image sequence; the pixel values of all pixel points within a preset neighborhood range of each pixel point in the image, and the statistical number of each pixel value, are counted; a background sub-model corresponding to each pixel point is established from the counted pixel values and statistical numbers; and a background model of the video image sequence is built from the background sub-models. Because the pixel values of background pixel points usually have larger statistical numbers, building each background sub-model from the pixel values and statistical numbers within the pixel point's neighborhood yields an accurate background model without a complex modeling and calculation process.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a background modeling method according to an embodiment of the present invention;
FIG. 2 is a schematic view of a neighborhood range of a background modeling method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a background model update of a background modeling method according to an embodiment of the present invention;
FIG. 4a is an image acquired by a camera according to an embodiment of the present invention;
FIG. 4b is a foreground effect image corresponding to FIG. 4a;
FIG. 4c is another image acquired by a camera according to an embodiment of the present invention;
FIG. 4d is a foreground effect image corresponding to FIG. 4c;
FIG. 4e is a still further image acquired by a camera according to an embodiment of the present invention;
FIG. 4f is a foreground effect image corresponding to FIG. 4e;
FIG. 4g is a still further image acquired by a camera according to an embodiment of the present invention;
FIG. 4h is a foreground effect image corresponding to FIG. 4g;
FIG. 4i is a still further image acquired by a camera according to an embodiment of the present invention;
FIG. 4j is a foreground effect image corresponding to FIG. 4i;
FIG. 4k is a still further image acquired by a camera according to an embodiment of the present invention;
FIG. 4l is a foreground effect image corresponding to FIG. 4k;
FIG. 4m is a still further image acquired by a camera according to an embodiment of the present invention;
FIG. 4n is a foreground effect image corresponding to FIG. 4m;
FIG. 5 is a flow chart of background modeling and foreground output of a background modeling method according to an embodiment of the present invention;
FIG. 6 is a functional block diagram of a background modeling apparatus according to an embodiment of the present invention;
FIG. 7 is a block diagram of a camera for background modeling according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order to improve the speed and efficiency of background modeling, the embodiment of the invention provides a background modeling method, a background modeling device, a camera for background modeling and a computer readable storage medium. In the following, a background modeling method provided by an embodiment of the present invention is first described.
The execution subject of the background modeling method provided by the embodiment of the invention may be an electronic device with an image processing function, such as a camera; the embodiment of the invention is mainly introduced with the camera as the execution subject. To improve the speed and efficiency of background modeling, the execution subject should at least include a processor carrying a core processing chip. The background modeling method provided by the embodiment of the invention may be implemented by at least one of software, a hardware circuit, and a logic circuit arranged in the execution subject.
As shown in fig. 1, a background modeling method provided by an embodiment of the present invention may include the following steps.
S101, acquiring a frame of image from an acquired video image sequence.
In this embodiment, the video image sequence V = {f_0, f_1, …, f_t, …} is collected by the camera module in the camera, and any frame may be acquired from it as the reference image for establishing the initial background model. Since the background model is continuously updated, the first frame image in the video image sequence is typically acquired as the reference image for establishing the initial background model.

Alternatively, the acquired image may be the first frame image in the video image sequence.
S102, respectively counting pixel values of all pixel points in a preset neighborhood range of each pixel point in the image and counting number of each pixel value, wherein the pixel values indicate gray values and/or color values.
The preset neighborhood range is an image region around a given pixel point; besides the pixel point for which the background sub-model is built, a neighborhood range comprises a plurality of surrounding pixel points. In general, the neighborhood range should be neither too large nor too small, and is usually set to 3×3 or 5×5. The pixel value refers to the value representing a pixel point, and may indicate a gray value and/or a color value, or an LBP (Local Binary Pattern) feature value, etc.
Alternatively, the color value may be a value of a designated color channel or an array of values of each color channel, and the gray value may be an average value of the values of each color channel.
In an RGB image, the color value may be the value of a designated channel among the R, G and B channels, or an array formed by the values of all three channels; the gray value is the average of the values of the three channels.
After the image is obtained, each pixel point in the image can be traversed. Taking the pixel point at (x, y) as an example, the pixel values within its preset neighborhood range and the statistical number of each pixel value are counted, and a pixel value set {(p_1, n_1), (p_2, n_2), …, (p_k, n_k)} for the pixel point can be formed from the counted values, where p_i is the i-th pixel value and n_i is the statistical number of the i-th pixel value. As shown in FIG. 2, for the pixel point at (x, y), within the region divided by the preset 3×3 neighborhood range the pixel value 20 has a statistical number of 1, the pixel value 25 has a statistical number of 3, the pixel value 30 has a statistical number of 3, and the pixel value 50 has a statistical number of 2, so the pixel value set {(20, 1), (25, 3), (50, 2), (30, 3)} can be formed.
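The counting step can be reproduced with a standard multiset counter; the literal neighborhood values below are taken from the FIG. 2 example (one 20, three 25s, three 30s, two 50s):

```python
from collections import Counter

# The nine pixel values of the 3x3 neighborhood around (x, y) in the
# FIG. 2 example.
neighborhood = [20, 25, 25, 25, 30, 30, 30, 50, 50]
counts = Counter(neighborhood)

# Each element of the pixel value set pairs a pixel value with its
# statistical number, e.g. (25, 3) means 25 was counted three times.
pixel_value_set = list(counts.items())
```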
Optionally, after S102 is performed, the method provided by the embodiment of the present invention may further include the following steps:
and arranging elements consisting of pixel values and the statistical number of the pixel values according to the sequence of the statistical number from large to small to obtain a pixel value set.
Because a larger statistical number means the pixel value is more likely to belong to the background, in order to better reflect whether a pixel point is a background pixel and to facilitate subsequent background updating, the elements obtained by statistics can be sorted in descending order of statistical number to obtain {(sp_1, sn_1), (sp_2, sn_2), …, (sp_k, sn_k)}. For example, the pixel value set obtained by sorting the statistics shown in FIG. 2 is {(25, 3), (30, 3), (50, 2), (20, 1)}.
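Sorting by statistical number, largest first, yields the ordered set of the FIG. 2 example; note that with a stable sort, values that tie on count keep their original relative order:

```python
# Elements counted for the FIG. 2 example, as (pixel value, statistical number).
elements = [(20, 1), (25, 3), (50, 2), (30, 3)]

# Sort in descending order of statistical number. Python's sort is stable,
# so (25, 3) stays ahead of (30, 3) despite the tied counts.
ordered = sorted(elements, key=lambda e: e[1], reverse=True)
```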
S103, according to the pixel values counted for each pixel point and the statistical number of each pixel value, a background sub-model corresponding to each pixel point is established, wherein the background sub-model comprises pixel values and the statistical numbers of the pixel values.
After the pixel values of all pixel points within the preset neighborhood range of each pixel point and the statistical number of each pixel value are obtained through statistics, a background sub-model corresponding to each pixel point can be established from these statistics. Specifically, the counted pixel values and the statistical number of each pixel value can be stored in a pixel value set, and the background sub-model corresponding to each pixel point can then be established from the elements of that set. The background sub-model is in fact composed of counted pixel values and their statistical numbers, which can be selected from the pixel value set. Specifically, the number of elements in the background sub-model can equal the preset number of background samples, so the background sub-model corresponding to the j-th pixel point is actually:
BG_j = {(val_1, count_1), (val_2, count_2), …, (val_N, count_N)}   (1)
Wherein val is a pixel value, count is a statistical number, and N is a preset background sample number.
Optionally, S103 may specifically include:
for each pixel point, the following operations are respectively executed:
counting the number of types of pixel values in the preset neighborhood range of a first pixel point, wherein the first pixel point is any one of the pixel points; judging whether the number of types is larger than the preset number of background samples; if it is larger, reading a number of pixel values equal to the preset number of background samples, together with the statistical number of each of those pixel values, to form the background sub-model corresponding to the first pixel point; if it is not larger, reading the pixel values of all pixel points in the preset neighborhood range and the statistical number of each pixel value, and forming the background sub-model corresponding to the first pixel point from these together with pixel values of zero and statistical numbers of zero, wherein the number of zero pixel values and zero statistical numbers is equal to the difference between the preset number of background samples and the number of types.
When the background sub-model is established, it can be judged whether the counted number of types of pixel values in the preset neighborhood range is larger than the preset number of background samples. If not, all of the counted information can be used as elements in the background sub-model and the remaining elements can be set to 0, ensuring that the total number of elements equals the preset number of background samples. If so, only part of the statistical information can serve as elements in the background sub-model, and in general the pixel values with larger statistical numbers are read as elements. Because the counted pixel values and numbers can be stored in the pixel value set in descending order of statistical number, optionally, the step of reading a number of pixel values equal to the preset number of background samples, together with the statistical number of each pixel value, to form the background sub-model corresponding to the first pixel point can specifically be:
And reading a plurality of elements from the pixel value set according to the preset background sample number and the arrangement sequence from front to back to form a background sub-model corresponding to the first pixel point.
Assuming the preset number of background samples is N, the first N elements can be selected from the pixel value set to form the background sub-model corresponding to the first pixel point. This ensures that the elements with larger statistical numbers are chosen from the pixel value set as elements in the background sub-model.
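This selection-and-padding step can be sketched as follows; the input is assumed to be the pixel value set already sorted by descending statistical number, and the helper name `build_submodel` is an illustrative assumption:

```python
def build_submodel(sorted_stats, n_samples):
    """Form a background sub-model from (pixel value, statistical number)
    pairs sorted by descending statistical number: keep the first
    n_samples elements (the k > N case), and pad with zero-valued
    elements when there are fewer types than samples (the k <= N case)."""
    model = list(sorted_stats[:n_samples])
    model += [(0, 0)] * (n_samples - len(model))  # zero padding
    return model
```

For instance, with a preset sample number of 2 the fig. 2 set keeps only its two largest elements, while a set with a single type is padded with (0, 0) entries up to the sample number.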
S104, building a background model of the video image sequence by utilizing the background sub-model corresponding to each pixel point.
After a background sub-model is established for each pixel point in the image, a background model of the video image sequence can be established based on the background sub-model corresponding to each pixel point, and the background model can be a set of background sub-models corresponding to each pixel point.
By applying this embodiment, a frame of image is obtained from an acquired video image sequence; the pixel values of all pixel points within the preset neighborhood range of each pixel point in the image, and the statistical number of each pixel value, are counted; a background sub-model corresponding to each pixel point is established from the pixel values and statistical numbers counted for that pixel point; and a background model of the video image sequence is built from the background sub-models. Because the background sub-model of a pixel point is built from the pixel values and statistical numbers of all pixel points in its preset neighborhood range, and the statistical number of a background pixel's value is usually larger, the constructed background model reflects the background well. Moreover, background modeling can be realized with the preset number of background samples set as low as 2, so the storage space of the model can be reduced to the greatest extent while still achieving a good effect.
The background model constructed in the embodiment shown in fig. 1 is actually an initialized background model, and during video monitoring, the background model needs to be updated continuously, and some pixels which are not changed frequently are updated to background pixels, so as to improve the accuracy of the background model, and therefore, after the background model of the video image sequence is constructed, the process of updating the background model as shown in fig. 3 needs to be executed.
S301, obtaining pixel values of all pixel points in the current video frame.
Let the current video frame be f_t. Updating is performed by comparing the pixel values in the current frame with the pixel values in the background model; therefore, the pixel value of each pixel point in the current video frame is obtained first.
S302, for each pixel point in the current video frame, updating the background sub-model corresponding to that pixel point based on the judgment result of whether the pixel value of that pixel point is contained in the corresponding background sub-model.
After the pixel values of the pixels in the current video frame are obtained, the background sub-model of each pixel can be updated by comparing the pixel values of each pixel in the current video frame with the corresponding background sub-model in the background model. Specifically, the comparison is performed by determining whether the pixel value is included in the background submodel.
Optionally, S302 may specifically perform, for each pixel point in the current video frame, the following operations respectively:
s3021, judging whether the pixel value of the second pixel is contained in the background sub-model corresponding to the second pixel, wherein the second pixel is any pixel in the current video frame, if yes, executing S3022 to S3023, otherwise executing S3024 to S3026.
Assume that the pixel value of the second pixel point is f_tq(x, y) and the background sub-model corresponding to the second pixel point is BG_q(x, y). Whether the pixel value is contained in the corresponding background sub-model can be determined by computing the intersection of f_tq(x, y) with the pixel values in BG_q(x, y) and judging whether the intersection result is empty. If it is empty, the background sub-model BG_q(x, y) does not contain the pixel value f_tq(x, y); if it is not empty, BG_q(x, y) contains f_tq(x, y).
S3022, determining elements with the same pixel value as the second pixel point in the background sub-model corresponding to the second pixel point, wherein the background sub-model comprises a plurality of elements, and each element comprises the pixel value and the statistical number of the pixel values.
S3023, adding 1 to the statistical number in the element with the same pixel value as the second pixel point, and subtracting 1 from the statistical number in the other elements except the element in the background sub-model corresponding to the second pixel point, wherein if any statistical number is 0, the statistical number is kept to be 0.
If the pixel value f_tq(x, y) is contained in the background sub-model BG_q(x, y), the element in BG_q(x, y) whose pixel value equals f_tq(x, y) is determined first. The appearance of f_tq(x, y) in the background model indicates that this pixel point may be a background pixel, i.e. a pixel point that remains unchanged over several consecutive frames of the monitored scene, so the confidence of this pixel value being background can be increased by increasing its statistical number in the corresponding background sub-model; for the other pixel values in the background sub-model, the confidence can be reduced by decreasing their statistical numbers. Let f_tq(x, y) = p_i, i.e. the pixel value equals the pixel value in the i-th element of BG_q(x, y); then n_i is increased by 1 and the other statistical numbers are decreased by 1, while ensuring n ≥ 0. That is, the background sub-model updated based on the second pixel point is:

{(p_1, n_1 − 1), …, (p_i, n_i + 1), …, (p_N, n_N − 1)}   (2)
s3024, subtracting 1 from the statistical number of all elements in the background submodel corresponding to the second pixel, wherein if any statistical number is 0, the statistical number is kept to be 0.
If the background sub-model BG_q(x, y) does not contain the pixel value f_tq(x, y), the pixel point is not a background pixel, so the statistical numbers of all pixel values in BG_q(x, y) can be decreased, reducing the confidence that the corresponding pixel values are background.
S3025, determining the sequence numbers of the elements with the non-zero and minimum statistical number in the background sub-model corresponding to the second pixel point.
S3026, judging whether the sequence number is smaller than the preset background sample number, if so, executing S3027, otherwise, executing S3028.
S3027, setting the statistical number in the element of the next sequence number in the background submodel corresponding to the second pixel point to 1, and setting the pixel value in the element of the next sequence number to the pixel value of the second pixel point.
S3028, setting the statistical number in the last element in the background submodel corresponding to the second pixel point to be 1, and setting the pixel value in the last element to be the pixel value of the second pixel point.
To cope with the situation that a new target appears in the monitored scene and may stop moving in subsequent frames, a corresponding pixel sample can be added to the background sub-model. Assuming the minimum non-zero n_i belongs to the i-th pixel sample: if i < N, set n_{i+1} = 1 and p_{i+1} = f_tq(x, y); if i ≥ N, set n_N = 1 and p_N = f_tq(x, y). Since the elements in the background sub-model are typically arranged in descending order of statistical number, the corresponding pixel sample and statistical number can be added by setting the element after the element with the non-zero minimum statistical number.
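The update logic of steps S3021–S3028 can be sketched in one function. The sub-model is represented, as in the text, as a list of (pixel value, statistical number) pairs arranged in descending order of statistical number; the function name and the Python list handling are illustrative assumptions:

```python
def update_submodel(model, pixel_value):
    """Update one background sub-model with a new frame's pixel value
    (steps S3021-S3028); model is a count-descending list of
    (pixel value, statistical number) pairs."""
    values = [v for v, _ in model]
    n = len(model)
    if pixel_value in values:                       # S3022/S3023: matched
        i = values.index(pixel_value)
        model = [(v, c + 1 if j == i else max(c - 1, 0))
                 for j, (v, c) in enumerate(model)]
    else:                                           # S3024: decrement all
        model = [(v, max(c - 1, 0)) for v, c in model]
        # S3025: last non-zero index, since counts are sorted descending
        nonzero = [j for j, (_, c) in enumerate(model) if c > 0]
        i = nonzero[-1] if nonzero else -1
        if i + 1 < n:                               # S3026/S3027
            model[i + 1] = (pixel_value, 1)
        else:                                       # S3028: overwrite last
            model[n - 1] = (pixel_value, 1)
    return model
```

For example, updating the sub-model [(25, 3), (30, 3), (50, 2), (20, 1)] with the value 30 increments its count and decrements the others (clamped at 0), while an unseen value such as 99 decrements all counts and replaces the element after the minimum non-zero counter.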
Optionally, before executing S3023, the method provided by the embodiment of the present invention may further execute the following steps:
judging whether the statistical number in the element reaches a preset threshold value or not; if so, keeping the statistical number in the element as a preset threshold value unchanged; if not, a step of adding 1 to the statistical number in the element in S3023 is performed.
If a pixel remains unchanged throughout the video sequence, then by the method above the statistical number of its corresponding pixel value would keep increasing; a very large statistical number would require a very large storage space, make the background model abnormal, and at the same time make the Ghost, or smear, of the foreground serious. To avoid the Ghost phenomenon, a threshold value, for example 255, can be set: before adding 1 to a statistical number, it is first judged whether the statistical number in the element has reached the threshold 255; if not, the operation of adding 1 can be performed, and if so, the statistical number is kept at the threshold 255. Detecting the image shown in fig. 4a gives the foreground effect image shown in fig. 4b, in which the dark area is the foreground and the Ghost area is not obvious.
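The thresholded increment can be sketched as follows, using the cap of 255 given as an example in the text:

```python
COUNT_CAP = 255  # example threshold from the text, to avoid the Ghost phenomenon

def capped_increment(count, cap=COUNT_CAP):
    """Add 1 to a statistical number only while it is below the cap,
    keeping it at the cap once reached."""
    return count if count >= cap else count + 1
```

This keeps every statistical number bounded, so the model's storage stays small and a long-stable pixel value cannot dominate the sub-model indefinitely.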
S303, determining an updated background model according to the updated background sub-model corresponding to each pixel point in the current video frame.
The background sub-model corresponding to each pixel point is updated based on each pixel point in the current frame, so that the purpose of updating the background model can be achieved, and for each video frame in the video image sequence, the background model is updated based on the pixel points in the video frame.
Optionally, after S303 is performed, the method provided by the embodiment of the present invention may further perform the following steps:
s3031, a pixel value of a third pixel point in the current video frame and the statistical number of the pixel values of the third pixel point are obtained, wherein the third pixel point is any pixel point in the current video frame.
S3032, the statistical numbers in the background sub-model corresponding to the third pixel point in the background model are overlapped to obtain the sum of the statistical numbers corresponding to the third pixel point.
S3033, calculating the foreground probability of the third pixel point according to the statistical number and the sum of the statistical numbers of the pixel values of the third pixel point.
When target recognition is performed with the updated background model, a third pixel point (any pixel point) in the current video frame can be analyzed: its pixel value and statistical number are obtained, the statistical number being read from the corresponding background sub-model. Comparing this statistical number with the sum of the statistical numbers gives the probability that the pixel point is background, from which the probability that the pixel point is foreground can be derived.
Optionally, S3033 may specifically be:
according to the statistical number of the pixel value of the third pixel point and the sum of the statistical numbers, calculating the foreground probability of the third pixel point by using a foreground probability calculation formula, wherein the foreground probability calculation formula is as follows:

FG(x, y) = 1 − sn_k / Σ_{i=1}^{N} sn_i

wherein FG(x, y) is the foreground probability of the third pixel point, (x, y) are the coordinates of the third pixel point, sn_k is the statistical number of the pixel value of the third pixel point, Σ_{i=1}^{N} sn_i is the sum of the statistical numbers, sn_i is the i-th statistical number in the background sub-model corresponding to the third pixel point, and N is the preset number of background samples.
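Based on the description above (the matched statistical number compared against the sum of statistical numbers gives the background probability, whose complement is the foreground probability), a sketch in Python follows; the function name and the sn_k = 0 convention for a pixel value absent from the sub-model are illustrative assumptions:

```python
def foreground_probability(model, pixel_value):
    """FG = 1 - sn_k / sum(sn_i): the ratio of the matched statistical
    number to the sum of statistical numbers is the background
    probability; its complement is the foreground probability.
    model is a list of (pixel value, statistical number) pairs."""
    total = sum(c for _, c in model)        # sum of statistical numbers
    if total == 0:
        return 1.0                          # empty model: treat as foreground
    sn_k = next((c for v, c in model if v == pixel_value), 0)
    return 1.0 - sn_k / total
```

With the sub-model [(25, 3), (30, 3), (50, 2), (20, 1)], the value 25 gets a foreground probability of 1 − 3/9 = 2/3, while a value not present in the sub-model gets probability 1.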
By applying this embodiment, the background model is updated based on each video frame in the video image sequence, making it more accurate. In the updating process only operations of adding 1 to or subtracting 1 from the statistical numbers are needed, so the calculation is simple; foreground residue can be reduced to a great extent, the background model can be updated rapidly, and, because the statistical numbers are limited by the threshold, the Ghost phenomenon caused by overlarge statistical numbers is avoided.
For ease of understanding, the background modeling method provided by the embodiments of the present invention is described below with reference to a complete example and specific recognition results. As shown in fig. 5, a flow chart of the background modeling process for this example is shown.
The processing of the video image sequence is divided into two major parts, the first part is the background modeling process, and the second part is the foreground output. In the background modeling process, the method mainly comprises the following steps:
first, initializing.
Assuming the video image size is W×H, the video sequence is V = {f_0, f_1, …, f_t, …}, and the number of background samples is N, the background model is:

BG = {(val_1, count_1), (val_2, count_2), …, (val_N, count_N)}   (4)
where val is the pixel value and count is the statistical number.
The initialization process is to set all val and count in the background model to zero.
And secondly, counting neighborhood values.
Set the size of the neighborhood window to Win and traverse the first frame image f_0. Taking the pixel at (x, y) as an example, count the pixel values within the preset neighborhood range and the statistical number of each pixel value (the statistical numbers can be obtained with a counter):

{(p_1, n_1), (p_2, n_2), …, (p_k, n_k)}   (5)

Sorting these counts in descending order of statistical number gives:

{(sp_1, sn_1), (sp_2, sn_2), …, (sp_k, sn_k)}   (6)
third, background model BG.
Judge the magnitudes of k and N: if k ≤ N, fill the pixel values and statistical numbers of formula (6) into the position corresponding to (x, y) in formula (1) (the background sub-model of the embodiment shown in fig. 1); if k > N, fill the first N pixel values and statistical numbers of formula (6) into the position corresponding to (x, y) in formula (1).
The whole image is traversed, and a background model BG can be built.
Fourth step: compare the current frame pixel value f_t(x, y) with the background model.

Assume the current frame is f_t; updating is performed by comparing the pixel values in the current frame with the pixel values in the background model.

Fifth step: judge whether f_t(x, y) belongs to BG(x, y); if so, execute the sixth step, otherwise execute the seventh step.
Sixth step: update the BG(x, y) pixel value counts.

If f_t(x, y) ∩ BG(x, y) ≠ ∅, i.e. f_t(x, y) belongs to BG(x, y), then, assuming f_t(x, y) = val_i, count_i (sn_i) is increased by 1 and the other counts are decreased by 1, while ensuring sn ≥ 0:

{(val_1, count_1 − 1), …, (val_i, count_i + 1), …, (val_N, count_N − 1)}
= {(sp_1, sn_1 − 1), …, (sp_i, sn_i + 1), …, (sp_N, sn_N − 1)}   (8)

Seventh step: replace the minimum non-zero counter of BG with the new pixel value's count.

If f_t(x, y) ∩ BG(x, y) = ∅, i.e. f_t(x, y) does not belong to BG(x, y), then all counts in BG(x, y) are decreased by 1, while ensuring sn ≥ 0:

{(val_1, count_1 − 1), (val_2, count_2 − 1), …, (val_N, count_N − 1)}
= {(sp_1, sn_1 − 1), (sp_2, sn_2 − 1), …, (sp_N, sn_N − 1)}   (7)

Then, assuming the minimum non-zero sn_i belongs to the i-th pixel sample:

if i < N, set sn_{i+1} = 1 and sp_{i+1} = f_t(x, y);

if i ≥ N, set sn_N = 1 and sp_N = f_t(x, y).
Eighth step: counter sorting.

Sort formula (7) or formula (8) by the count values sn, so that the background model of the current pixel point becomes:

{(sp_1, sn_1), (sp_2, sn_2), …, (sp_N, sn_N)}   (9)

wherein sn_1 ≥ sn_2 ≥ … ≥ sn_N.
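The eighth step's re-sorting can be sketched as follows, using the same (pixel value, statistical number) pair representation as above; the helper name is an illustrative assumption:

```python
def resort_submodel(model):
    """Re-sort a sub-model's (pixel value, statistical number) pairs in
    descending order of statistical number, restoring the invariant
    sn_1 >= sn_2 >= ... >= sn_N after an update."""
    return sorted(model, key=lambda e: e[1], reverse=True)
```

Python's sort is stable, so pairs with equal statistical numbers keep their relative order.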
Based on the above steps, an updated background model can be obtained.
The foreground output process is mainly as follows: distinguish the foreground and the background of the current frame by using formula (9), comparing the current frame f_t with the background model BG. Let the counter value corresponding to f_t(x, y) be sn_k; the foreground probability of the current pixel is:

FG(x, y) = 1 − sn_k / Σ_{i=1}^{N} sn_i   (10)

wherein FG(x, y) is the foreground probability of f_t(x, y), sn_k is the counter value, and Σ_{i=1}^{N} sn_i is the sum of the statistical numbers.
Taking the gray values of the images as the pixel values, a background model is built and formula (10) is applied to the video image sequence (fig. 4a, fig. 4c, fig. 4e, fig. 4g, fig. 4i, fig. 4k and fig. 4m): the foreground effect corresponding to fig. 4a is shown in fig. 4b, that of fig. 4c in fig. 4d, that of fig. 4e in fig. 4f, that of fig. 4g in fig. 4h, that of fig. 4i in fig. 4j, that of fig. 4k in fig. 4l, and that of fig. 4m in fig. 4n.
Corresponding to the above method embodiments, the embodiment of the present invention provides a background modeling apparatus, as shown in fig. 6, which may include:
an acquiring module 610, configured to acquire a frame of image from the acquired video image sequence;
a statistics module 620, configured to respectively count pixel values of all pixel points in a preset neighborhood range of each pixel point in the image and a statistical number of each pixel value, where the pixel values indicate a gray value and/or a color value;
The establishing module 630 is configured to establish a background sub-model corresponding to each pixel according to the pixel value and the statistical number of each pixel, where the pixel value and the statistical number of each pixel are counted separately for each pixel;
and the building module 640 is configured to build a background model of the video image sequence by using the background sub-model corresponding to each pixel point.
Optionally, the color value is a value of a designated color channel or an array formed by values of all color channels, and the gray value is an average value of the values of all color channels.
Optionally, the establishing module 630 may specifically be configured to:
for each pixel point, the following operations are respectively executed:
counting the number of types of pixel values in a preset neighborhood range of a first pixel, wherein the first pixel is any pixel in the pixel points;
judging whether the number of the types is larger than the number of preset background samples or not;
if the number of the background sub-model is larger than the preset number of the background samples, a plurality of pixel values with the same number as the preset number of the background samples and the statistical number of each pixel value in the plurality of pixel values are read, and the background sub-model corresponding to the first pixel point is formed;
and if it is not larger, reading the pixel values of all pixel points in the preset neighborhood range and the statistical number of each pixel value, and forming the background sub-model corresponding to the first pixel point from these together with pixel values of zero and statistical numbers of zero, wherein the number of zero pixel values and zero statistical numbers is equal to the difference between the preset number of background samples and the number of types.
Optionally, the apparatus may further include:
the arrangement module is used for arranging elements consisting of pixel values and the statistical number of the pixel values according to the sequence of the statistical number from large to small to obtain a pixel value set;
the establishing module 630, when configured to implement the reading of the plurality of pixel values having the same number as the preset background sample number and the statistical number of each pixel value in the plurality of pixel values to form the background submodel corresponding to the first pixel point, may be specifically configured to:
and reading a plurality of elements from the pixel value set according to the preset background sample number and the arrangement sequence from front to back to form a background submodel corresponding to the first pixel point.
Optionally, the obtaining module 610 may be further configured to obtain a pixel value of each pixel point in the current video frame;
The apparatus may further include:
the updating module is used for updating, for each pixel point in the current video frame, the background sub-model corresponding to that pixel point based on the judgment result of whether the pixel value of that pixel point is contained in the corresponding background sub-model;
and the determining module is used for determining an updated background model according to the updated background sub-model corresponding to each pixel point in the current video frame.
Optionally, the updating module may be specifically configured to:
for each pixel point in the current video frame, the following operations are respectively executed:
judging whether a pixel value of a second pixel point is contained in a background sub-model corresponding to the second pixel point, wherein the second pixel point is any pixel point in the current video frame;
if yes, determining elements with the same pixel value as the pixel value of the second pixel point in a background sub-model corresponding to the second pixel point, wherein the background sub-model comprises a plurality of elements, and each element comprises the pixel value and the statistical number of the pixel value; adding 1 to the statistical number in the element, and subtracting 1 from the statistical number in other elements except the element in the background submodel corresponding to the second pixel point, wherein if any statistical number is 0, keeping any statistical number as 0;
if not, subtracting 1 from the statistical numbers of all elements in the background sub-model corresponding to the second pixel point, wherein if any statistical number is 0, that statistical number is kept at 0; determining the sequence number of the element with the non-zero minimum statistical number in the background sub-model corresponding to the second pixel point; judging whether the sequence number is smaller than the preset number of background samples; if it is smaller, setting the statistical number in the element of the next sequence number to 1 and the pixel value in that element to the pixel value of the second pixel point; and if not, setting the statistical number in the last element of the background sub-model corresponding to the second pixel point to 1 and the pixel value in the last element to the pixel value of the second pixel point.
Optionally, the apparatus may further include:
the judging module is used for judging whether the statistical number in the element reaches a preset threshold value or not;
the maintaining module is used for maintaining the statistical number in the elements to be the preset threshold value unchanged if the judging result of the judging module is reached;
The updating module may be specifically configured to execute the adding 1 to the statistical number in the element if the determination result of the determining module is not reached.
Optionally, the obtaining module 610 may be further configured to obtain a pixel value of a third pixel in the current video frame and a statistical number of pixel values of the third pixel, where the third pixel is any pixel in the current video frame;
the apparatus may further include:
the calculation module is used for superposing the statistical numbers in the background sub-model corresponding to the third pixel point in the background model to obtain the sum of the statistical numbers corresponding to the third pixel point; and calculating the foreground probability of the third pixel point according to the statistical number of the pixel values of the third pixel point and the sum of the statistical number.
Optionally, the computing module may be specifically configured to:
according to the statistical number of the pixel values of the third pixel point and the sum of the statistical number, calculating the foreground probability of the third pixel point by using a foreground probability calculation formula, wherein the foreground probability calculation formula is as follows:
FG(x, y) = 1 − sn_k / Σ_{i=1}^{N} sn_i

wherein FG(x, y) is the foreground probability of the third pixel point, (x, y) are the coordinates of the third pixel point, sn_k is the statistical number of the pixel value of the third pixel point, Σ_{i=1}^{N} sn_i is the sum of the statistical numbers, sn_i is the i-th statistical number in the background sub-model corresponding to the third pixel point, and N is the preset number of background samples.
By applying this embodiment, a frame of image is obtained from an acquired video image sequence; the pixel values of all pixel points within the preset neighborhood range of each pixel point in the image, and the statistical number of each pixel value, are counted; a background sub-model corresponding to each pixel point is established from the pixel values and statistical numbers counted for that pixel point; and a background model of the video image sequence is built from the background sub-models. Because the background sub-model of a pixel point is built from the pixel values and statistical numbers of all pixel points in its preset neighborhood range, and the statistical number of a background pixel's value is usually larger, the constructed background model reflects the background well.
The embodiment of the invention also provides a camera for background modeling, as shown in fig. 7, comprising a camera 701, a processor 702 and a memory 703;
the camera 701 is configured to collect a video image sequence;
the memory 703 is used for storing a computer program;
the processor 702 is configured to implement all the steps of the background modeling method provided by the embodiment of the present invention when executing the computer program stored in the memory 703.
The camera 701, the processor 702 and the memory 703 may perform data transmission through a wired connection or a wireless connection, and the camera may communicate with other devices through a wired communication interface or a wireless communication interface. Fig. 7 shows only an example of data transmission through a bus, and is not limited to a specific connection method.
The memory may include RAM (Random Access Memory) or NVM (Non-Volatile Memory), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In this embodiment, by reading and running the computer program stored in the memory, the processor can implement: obtaining a frame of image from a captured video image sequence; counting, for each pixel point in the image, the pixel values of all pixel points within a preset neighborhood range of that pixel point and the statistical number of each pixel value; building the background sub-model corresponding to each pixel point from the pixel values counted for that pixel point and their statistical numbers; and constructing the background model of the video image sequence from the background sub-models corresponding to the pixel points. Because the statistical number of a pixel value belonging to the true background is usually larger, the background model built in this way reflects the actual background of the scene.
In addition, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements all the steps of the background modeling method provided by the embodiments of the present invention.
In this embodiment, the computer-readable storage medium stores a computer program that, when executed, performs the background modeling method provided by the embodiments of the present invention, and can thus realize: obtaining a frame of image from a captured video image sequence; counting, for each pixel point in the image, the pixel values of all pixel points within a preset neighborhood range of that pixel point and the statistical number of each pixel value; building the background sub-model corresponding to each pixel point from the pixel values counted for that pixel point and their statistical numbers; and constructing the background model of the video image sequence from the background sub-models corresponding to the pixel points. Because the statistical number of a pixel value belonging to the true background is usually larger, the background model built in this way reflects the actual background of the scene.
For the embodiments of the camera for background modeling and the computer-readable storage medium, since the methods involved are substantially similar to the method embodiments described above, the description is relatively brief; for relevant details, refer to the description of the method embodiments.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, for the embodiments of the apparatus, the camera for background modeling, and the computer-readable storage medium, since they are substantially similar to the method embodiments, the description is relatively brief; for relevant details, refer to the description of the method embodiments.
The foregoing description covers only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. A background modeling method, the method comprising:
acquiring a frame of image from an acquired video image sequence;
respectively counting the pixel values of all pixel points within a preset neighborhood range of each pixel point in the image and the statistical number of each pixel value, wherein the pixel values indicate gray values and/or color values;
for each pixel point, respectively performing the following operations: counting the number of kinds of pixel values within the preset neighborhood range of a first pixel point, wherein the first pixel point is any one of the pixel points; judging whether the number of kinds is greater than a preset background sample number; if greater, reading a plurality of pixel values equal in number to the preset background sample number and the statistical number of each of the plurality of pixel values to form the background sub-model corresponding to the first pixel point; if not greater, reading the pixel values of all pixel points within the preset neighborhood range and the statistical number of each pixel value, and forming the background sub-model corresponding to the first pixel point together with a plurality of zero-valued pixel values and zero-valued statistical numbers whose number equals the difference between the preset background sample number and the number of kinds, wherein the background sub-model comprises pixel values and the statistical numbers of the pixel values;
and constructing a background model of the video image sequence by using the background sub-model corresponding to each pixel point.
2. The method of claim 1, wherein the color value is a value of a specified color channel or an array of values of each color channel, and the gray value is an average value of the values of each color channel.
3. The method of claim 1, wherein after the respectively counting the pixel values of all pixel points within a preset neighborhood range of each pixel point in the image and the statistical number of each pixel value, the method further comprises:
arranging elements, each consisting of a pixel value and the statistical number of that pixel value, in descending order of statistical number to obtain a pixel value set;
the reading a plurality of pixel values equal in number to the preset background sample number and the statistical number of each of the plurality of pixel values to form the background sub-model corresponding to the first pixel point comprises:
reading a plurality of elements from the pixel value set, in order from front to back and in a number equal to the preset background sample number, to form the background sub-model corresponding to the first pixel point.
4. The method of claim 1, wherein after constructing the background model of the video image sequence using the background sub-model corresponding to each pixel, the method further comprises:
acquiring pixel values of all pixel points in a current video frame;
for each pixel point in the current video frame, respectively performing the following operations: judging whether the pixel value of a second pixel point is contained in the background sub-model corresponding to the second pixel point, wherein the second pixel point is any pixel point in the current video frame; if so, determining the element in the background sub-model corresponding to the second pixel point whose pixel value is the same as the pixel value of the second pixel point, wherein the background sub-model comprises a plurality of elements and each element comprises a pixel value and the statistical number of that pixel value; adding 1 to the statistical number in that element, and subtracting 1 from the statistical numbers in the other elements of the background sub-model corresponding to the second pixel point, wherein any statistical number that is 0 is kept at 0; if not, subtracting 1 from the statistical numbers of all elements in the background sub-model corresponding to the second pixel point, wherein any statistical number that is 0 is kept at 0; determining the sequence number of the element with the non-zero minimum statistical number in the background sub-model corresponding to the second pixel point; judging whether the sequence number is smaller than the preset background sample number; if smaller, setting the statistical number in the element of the next sequence number to 1 and setting the pixel value in that element to the pixel value of the second pixel point; if not, setting the statistical number in the last element of the background sub-model corresponding to the second pixel point to 1 and setting the pixel value in the last element to the pixel value of the second pixel point;
and determining an updated background model according to the updated background sub-model corresponding to each pixel point in the current video frame.
5. The method of claim 4, wherein prior to said adding 1 to the statistical number in the element, the method further comprises:
judging whether the statistical number in the element has reached a preset threshold value;
if so, keeping the statistical number in the element unchanged at the preset threshold value;
if not, performing the adding 1 to the statistical number in the element.
6. The method of claim 4, wherein after the determining the updated background model from the updated background sub-model corresponding to each pixel in the current video frame, the method further comprises:
acquiring a pixel value of a third pixel point in the current video frame and the statistical number of the pixel value of the third pixel point in a background sub-model corresponding to the third pixel point, wherein the third pixel point is any pixel point in the current video frame;
summing the statistical numbers in the background sub-model corresponding to the third pixel point in the background model to obtain the sum of the statistical numbers corresponding to the third pixel point;
and calculating the foreground probability of the third pixel point according to the statistical number of the pixel value of the third pixel point in the background sub-model corresponding to the third pixel point and the sum of the statistical numbers.
7. The method of claim 6, wherein the calculating the foreground probability of the third pixel point according to the statistical number of the pixel value of the third pixel point in the background sub-model corresponding to the third pixel point and the sum of the statistical numbers comprises:
calculating the foreground probability of the third pixel point by using a foreground probability calculation formula according to the statistical number of the pixel value of the third pixel point in the background sub-model corresponding to the third pixel point and the sum of the statistical numbers, wherein the foreground probability calculation formula is:

FG(x, y) = 1 - Sn_k / Σ_{i=1}^{N} Sn_i

wherein FG(x, y) is the foreground probability of the third pixel point, (x, y) is the coordinates of the third pixel point, Sn_k is the statistical number of the pixel value of the third pixel point in the background sub-model corresponding to the third pixel point, Σ_{i=1}^{N} Sn_i is the sum of the statistical numbers, Sn_i is the i-th statistical number in the background sub-model corresponding to the third pixel point, and N is the preset background sample number.
8. A camera for background modeling, the camera comprising a camera head, a processor, and a memory;
the camera is used for collecting video image sequences;
the memory is used for storing a computer program;
the processor being adapted to implement the method of any of claims 1-7 when executing a computer program stored on the memory.
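The sub-model update rule recited in claims 4 and 5 can be sketched in Python as follows. This sketch assumes, consistent with claim 3, that elements are kept in descending order of statistical number, so the last element with a non-zero count carries the non-zero minimum; the `cap` parameter plays the role of the preset threshold of claim 5. All names are illustrative, not the patented implementation:

```python
def update_submodel(submodel, pixel_value, cap=255):
    """Update one pixel's background sub-model, a list of
    (pixel value, statistical number) pairs sorted by count descending,
    with the pixel's value observed in the current frame."""
    n = len(submodel)
    values = [v for v, _ in submodel]
    if pixel_value in values:
        k = values.index(pixel_value)
        # Matched element: count + 1 (capped, claim 5); others: count - 1 (floored at 0).
        updated = [(v, min(c + 1, cap)) if i == k else (v, max(c - 1, 0))
                   for i, (v, c) in enumerate(submodel)]
    else:
        # No match: every count - 1 (floored at 0), then write the new value
        # into the element after the last one whose count is still non-zero,
        # or into the last element if none follows.
        updated = [(v, max(c - 1, 0)) for v, c in submodel]
        nonzero = [i for i, (_, c) in enumerate(updated) if c > 0]
        j = nonzero[-1] if nonzero else -1
        if j + 1 < n:
            updated[j + 1] = (pixel_value, 1)
        else:
            updated[-1] = (pixel_value, 1)
    return updated
```

For instance, observing value 10 against the sub-model [(10, 5), (20, 2), (0, 0)] strengthens the first element and weakens the second, while observing an unseen value 30 replaces the first zero-count slot with (30, 1).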
CN201910209064.5A 2019-03-19 2019-03-19 Background modeling method and camera for background modeling Active CN111724426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910209064.5A CN111724426B (en) 2019-03-19 2019-03-19 Background modeling method and camera for background modeling

Publications (2)

Publication Number Publication Date
CN111724426A CN111724426A (en) 2020-09-29
CN111724426B true CN111724426B (en) 2023-08-04

Family

ID=72562433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910209064.5A Active CN111724426B (en) 2019-03-19 2019-03-19 Background modeling method and camera for background modeling

Country Status (1)

Country Link
CN (1) CN111724426B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807328B (en) * 2021-11-18 2022-03-18 济南和普威视光电技术有限公司 Target detection method, device and medium based on algorithm fusion

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101489121A (en) * 2009-01-22 2009-07-22 北京中星微电子有限公司 Background model initializing and updating method based on video monitoring
CN103108159A (en) * 2013-01-17 2013-05-15 新疆电力公司乌鲁木齐电业局 Electric power intelligent video analyzing and monitoring system and method
EP2706503A2 (en) * 2012-09-11 2014-03-12 Thomson Licensing Method and apparatus for bilayer image segmentation
CN104166983A (en) * 2014-06-30 2014-11-26 中国传媒大学 Motion object real time extraction method of Vibe improvement algorithm based on combination of graph cut
CN105631862A (en) * 2015-12-21 2016-06-01 浙江大学 Background modeling method based on neighborhood characteristic and grayscale information
CN105868734A (en) * 2016-04-22 2016-08-17 江苏电力信息技术有限公司 Power transmission line large-scale construction vehicle recognition method based on BOW image representation model
CN106023259A (en) * 2016-05-26 2016-10-12 史方 Method and device for detecting moving target frequency
CN108629254A (en) * 2017-03-24 2018-10-09 杭州海康威视数字技术股份有限公司 A kind of detection method and device of moving target
CN109308711A (en) * 2017-07-26 2019-02-05 富士通株式会社 Object detection method, device and image processing equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7539351B2 (en) * 2005-06-20 2009-05-26 Xerox Corporation Model-based line width control


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fast background modeling method based on histogram gray-level classification; Sun Jianfen; Research and Exploration in Laboratory; 29 Sep 2015; Vol. 34, No. 6; pp. 15-19 *

Also Published As

Publication number Publication date
CN111724426A (en) 2020-09-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant