CN112669294B - Camera shielding detection method and device, electronic equipment and storage medium


Info

Publication number
CN112669294B
CN112669294B (application CN202011630191.1A)
Authority
CN
China
Prior art keywords
image
frame
motion mask
motion
mask
Prior art date
Legal status
Active
Application number
CN202011630191.1A
Other languages
Chinese (zh)
Other versions
CN112669294A (en)
Inventor
王杉杉
胡文泽
王孝宇
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202011630191.1A
Publication of CN112669294A
Application granted
Publication of CN112669294B


Landscapes

  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The embodiment of the invention provides a camera shielding detection method and device, an electronic device, and a storage medium. The method includes the following steps: acquiring a motion mask map of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask map comprises mask values; constructing a current background image according to the motion mask map and the frame images corresponding to the motion mask map; calculating the similarity between the current background image and a preset background image, and judging whether the similarity is smaller than a similarity threshold; and if the similarity is smaller than the similarity threshold, confirming that the camera is blocked. The detection accuracy for a blocked camera is high; moreover, no worker is required to inspect the camera for blocking, which improves the detection efficiency of camera occlusion detection and reduces the operation and maintenance cost of the camera.

Description

Camera shielding detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a method and apparatus for detecting camera occlusion, an electronic device, and a storage medium.
Background
With the deepening of artificial intelligence research, image recognition technologies such as traffic image recognition, access control image recognition, and factory pipeline image recognition are continuously being put into practical use. In the image recognition technology of each scene, a target area is photographed by a pre-installed camera so that images of the corresponding target area can be captured for recognition. As more and more cameras are deployed, their operation and maintenance cost increases: a large number of workers are needed to detect and maintain the cameras in time to ensure their normal operation. If there are not enough workers to detect and maintain the cameras, or the workers do not detect them in time, the images captured by the cameras cannot be used normally for image recognition. For example, a camera lens may be blocked for various reasons, such as road construction, leaf occlusion, or insect carcass adhesion; if the lens is blocked, the camera cannot capture a complete image of the target area, which affects the image recognition of the target area. Therefore, existing cameras require a large number of people and a long time for occlusion detection, the occlusion detection efficiency is low, and the operation and maintenance cost of the cameras is high.
Disclosure of Invention
The embodiment of the invention provides a camera shielding detection method, which can improve the detection efficiency of camera shielding and reduce the operation and maintenance cost of a camera.
In a first aspect, an embodiment of the present invention provides a method for detecting camera occlusion, where the method includes:
acquiring a motion mask image of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask image comprises mask values;
constructing a current background image according to the motion mask image and a frame image corresponding to the motion mask image;
calculating the similarity between the current background image and a preset background image, and judging whether the similarity is smaller than a similarity threshold value or not;
and if the similarity is smaller than a similarity threshold, confirming that the camera is blocked.
Optionally, the current background map includes a plurality of image areas, and the method further includes:
if the similarity is smaller than a similarity threshold, respectively calculating motion information of each image area according to the motion mask image and the corresponding frame image;
and judging whether the camera is blocked or not according to the motion information of each image area.
Optionally, the optical flow information includes horizontal optical flow information and vertical optical flow information, and the obtaining the motion mask map of the current frame image according to the optical flow information between the current frame image and the previous adjacent frame image in the preset frame image sequence includes:
calculating an optical flow velocity from the horizontal optical flow information and the vertical optical flow information;
according to the optical flow speed, calculating to obtain mask values corresponding to all pixel points in the current frame image;
and generating a motion mask map according to the mask values corresponding to the pixel points.
Optionally, the number of the motion mask images is n frames, and the number of the frame images corresponding to the motion mask images is also n frames, where n is a positive integer greater than 0, and the constructing the current background image according to the motion mask images and the frame images corresponding to the motion mask images includes:
extracting a mask value sequence of each pixel point in the motion mask map according to the n-frame motion mask map, wherein the dimension of the mask value sequence is n;
extracting a corresponding target pixel index according to a mask value sequence corresponding to a target pixel, wherein the target pixel is any one of all pixels in the motion mask map;
and constructing a current background image based on the frame image corresponding to the motion mask image and the target pixel index.
Optionally, the target pixel point index includes a target frame image position and a target pixel point position, and the constructing the current background image based on the frame image corresponding to the motion mask image and the target pixel point index includes:
determining a target frame image according to the target frame image position, wherein the target frame image is one frame of the frame images corresponding to the motion mask image;
according to the target pixel point position, extracting a pixel value corresponding to the pixel point position in the target frame image as a pixel value of the current background pixel point;
and constructing a current background image based on the pixel value of the current background pixel point.
Optionally, the number of frame images in the preset frame image sequence is k frames, k is a positive integer greater than 0, and the constructing the current background image according to the motion mask image and the frame image corresponding to the motion mask image includes:
sequentially sampling frame images in a preset frame image sequence, and adding the frame images to a first data set, wherein the first data set contains n frames of newly sampled frame images, n is a positive integer greater than 0, and k is greater than or equal to n;
sequentially acquiring motion mask images corresponding to frame images in a preset frame image sequence, and adding the motion mask images to a second data set, wherein the second data set contains n frames of latest motion mask images, and the latest motion mask images correspond to the latest sampled frame images;
and constructing a current background graph based on the first data set and the second data set.
Optionally, the constructing a current background map based on the first data set and the second data set includes:
judging whether a frame image sampled latest in the first data set is an mth frame or not, wherein m is a positive integer greater than 0;
if the latest frame image in the first data set is the mth frame, acquiring an h frame motion mask image to an h+a frame motion mask image, wherein h is greater than or equal to m, k is greater than or equal to h+a, and a is a positive integer greater than 0;
and when h+a is smaller than k, judging whether to stop the construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map.
Optionally, the mask value includes a motion mask value, and when h+a is smaller than k, determining whether to stop construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map includes:
calculating the motion mask value duty ratio of each frame of motion mask map in the h frame of motion mask map to the h+a frame of motion mask map according to the motion mask value;
if the motion mask value duty ratio of each frame of motion mask image is smaller than a preset duty ratio threshold value in the h frame of motion mask image to the h+a frame of motion mask image, stopping the construction of the background image in advance, and taking the latest constructed background image as the current background image;
And if the motion mask value duty ratio of at least one frame of motion mask map is larger than a preset duty ratio threshold value in the h frame of motion mask map to the h+a frame of motion mask map, taking the finally constructed background image as the current background image until the h+a frame of motion mask map is the k frame of motion mask map.
In a second aspect, an embodiment of the present invention further provides a camera occlusion detection device, including:
the acquisition module is used for acquiring a motion mask image of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask image comprises mask values;
the construction module is used for constructing a current background image according to the motion mask image and the frame image corresponding to the motion mask image;
the first calculation module is used for calculating the similarity between the current background image and a preset background image and judging whether the similarity is smaller than a similarity threshold value or not;
and the confirming module is used for confirming that the camera is blocked if the similarity is smaller than a similarity threshold value.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the camera occlusion detection method provided by the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the steps in the method for detecting camera occlusion provided in the embodiment of the present invention.
In the embodiment of the invention, a motion mask map of a current frame image is obtained according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask map comprises mask values; a current background image is constructed according to the motion mask map and the frame images corresponding to the motion mask map; the similarity between the current background image and a preset background image is calculated, and whether the similarity is smaller than a similarity threshold is judged; and if the similarity is smaller than the similarity threshold, it is confirmed that the camera is blocked. Background modeling is performed through optical flow information to obtain the current background image of the camera, and the current background image is compared with the preset background image for similarity, so that whether the camera is blocked can be judged; meanwhile, no worker is required to inspect the camera for blocking, which improves the detection efficiency of camera occlusion detection and reduces the operation and maintenance cost of the camera.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required in the embodiments or the description of the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flowchart of a method for detecting camera occlusion according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for constructing a current background diagram according to an embodiment of the present invention;
FIG. 3 is a flowchart of another method for constructing a current background diagram according to an embodiment of the present invention;
FIG. 4 is a flowchart of another method for detecting camera occlusion according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the division of image areas according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a camera occlusion detection device according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of another camera occlusion detection device according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an acquisition module according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a construction module according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a first building sub-module according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of another construction module according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a second construction sub-module according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of a second judging unit according to an embodiment of the present invention;
fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of a method for detecting camera occlusion according to an embodiment of the present invention. As shown in fig. 1, the camera is fixedly disposed at a designated position, and camera occlusion detection is performed periodically or in real time, including the following steps:
101. Acquiring a motion mask map of the current frame image according to the optical flow information between the current frame image and the previous adjacent frame image in the preset frame image sequence.
In the embodiment of the present invention, the preset frame image sequence may be an image frame sequence of a preset number of frames in the video stream captured by the current camera, where "sequence" refers to the temporal order of the frames, and the video stream may be a real-time video stream or a video stream of a certain period of time.
The preset frame image sequence may be obtained by sampling the video stream at an interval, where the interval is preset, for example, to an integer value such as 1, 2, 3, 4, or 5. In this way, a sufficient time span can be covered, making the background reconstruction more complete and accurate. For example, if one frame is extracted from the video stream every 5 frames and the preset number of frames in the preset frame image sequence is k, the actual coverage is 5k frames; if k=500, the actual coverage is 2500 frames.
In a possible embodiment, each frame image in the preset frame image sequence may be preprocessed by fixedly scaling it to a preset size, for example, 480x320; scaling the frame images in the preset frame image sequence can increase the calculation speed.
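As a concrete illustration of this sampling-and-scaling step, the sketch below builds the preset frame image sequence from a video stream. It is not from the patent text: the function and parameter names are my own, and OpenCV is assumed as the capture library.

```python
# Illustrative sketch, assuming OpenCV; names are hypothetical, not the patent's.
import cv2

def sample_frames(video_path, interval=5, k=500, size=(480, 320)):
    """Build the preset frame image sequence: every `interval`-th frame,
    resized to `size` (width, height), up to k frames."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while len(frames) < k:
        ok, frame = cap.read()
        if not ok:
            break                      # end of stream
        if idx % interval == 0:        # interval sampling
            frames.append(cv2.resize(frame, size))
        idx += 1
    cap.release()
    return frames
```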
The optical flow information refers to the movement speed and movement direction of each pixel point from the previous adjacent frame image to the current frame image. The optical flow information can be expressed in the form of an optical flow map, in which each pixel point has a displacement in the horizontal x direction and the vertical y direction; the movement speed and movement direction of the corresponding pixel point can be calculated from this displacement. The optical flow map has the same size as the current frame image and the same number of pixels.
In the embodiment of the invention, the corresponding motion mask map can be obtained through the movement speed (also called optical flow velocity) of each pixel point in the optical flow map. The motion mask map has the same size as the optical flow map, which can be understood as the same size as the current frame image. The motion mask map includes mask values, one for each pixel point. The mask values may form a binary mask, i.e., two different mask values indicate the different motion states of each pixel point, which may be moving or stationary.
Specifically, in the optical flow map, whether the motion state of each pixel point is moving or stationary may be determined according to the optical flow velocity of each pixel point. If the motion state of a pixel point is judged to be moving, it is assigned the first mask value b; if the motion state of a pixel point is judged to be stationary, it is assigned the second mask value c. After each pixel point of the optical flow map has been assigned, the corresponding motion mask map is obtained. Judging the motion state of a pixel point specifically means comparing its optical flow velocity with a preset optical flow velocity threshold: if the optical flow velocity of the pixel point is greater than the preset optical flow velocity threshold, its motion state is judged to be moving; if the optical flow velocity of the pixel point is smaller than the preset optical flow velocity threshold, its motion state is judged to be stationary. Specifically, this can be represented by the following formula:

motion_mask(i, j) = b, if v(i, j) > v_thr; motion_mask(i, j) = c, otherwise

wherein motion_mask(i, j) represents the mask value of pixel point (i, j), v(i, j) represents the optical flow velocity of pixel point (i, j), v_thr represents the preset optical flow velocity threshold, b represents the first mask value, and c represents the second mask value.
Alternatively, the motion mask map may be a binary mask map, where the mask value is 0 or 1: a mask value of 0 represents that the motion state of the corresponding pixel point is stationary, and a mask value of 1 represents that the motion state of the corresponding pixel point is moving. Correspondingly, this can be represented by the following formula:

motion_mask(i, j) = 1, if v(i, j) > v_thr; motion_mask(i, j) = 0, otherwise

wherein motion_mask(i, j) represents the mask value of pixel point (i, j), v(i, j) represents the optical flow velocity of pixel point (i, j), v_thr represents the preset optical flow velocity threshold, 1 represents the first mask value, and 0 represents the second mask value.
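A minimal sketch of this thresholding, assuming the per-pixel optical flow speeds are already available as a NumPy array; the function name and defaults are illustrative, not from the patent:

```python
# Minimal sketch, assuming `v` is an (H, W) array of per-pixel optical flow speeds.
import numpy as np

def motion_mask_from_speed(v, v_thr, b=1, c=0):
    # pixels faster than the threshold get the "moving" value b,
    # the rest get the "stationary" value c, as in the formula above
    return np.where(v > v_thr, b, c).astype(np.uint8)
```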
Optionally, the optical flow map may be predicted by an optical flow prediction neural network: by inputting the preset frame image sequence into the optical flow prediction neural network, the optical flow map between the current frame image and the previous adjacent frame image can be obtained. The optical flow map can be associated with the current frame image to represent how each pixel point changes from the previous adjacent frame image to the current frame image; the motion mask map obtained from the optical flow map can likewise be associated with the current frame image, which facilitates subsequent background construction.
The optical flow map can be a single-channel optical flow map or a two-channel optical flow map. The single-channel optical flow map comprises integrated optical flow information flow; the two-channel optical flow map comprises horizontal optical flow information flowx (i.e., the horizontal optical flow velocity of each pixel in the horizontal optical flow map) and vertical optical flow information flowy (i.e., the vertical optical flow velocity of each pixel in the vertical optical flow map). In this alternative embodiment, the normalized optical flow velocity of each pixel point in the optical flow map may be used to determine whether the motion state of each pixel point is moving or stationary. The normalized optical flow velocity is a relative velocity, so the motion state of each pixel point can be judged accurately.
Optionally, whether the corresponding motion mask map needs to be determined from the normalized optical flow velocity may be decided according to the ratio of the minimum optical flow velocity to the maximum optical flow velocity in the optical flow map. This avoids the problem that, when the optical flow velocities of the pixel points in the optical flow map are distributed closely together and the velocity values are all relatively small, normalizing to a relative velocity would increase the calculation error. The ratio of the minimum optical flow velocity to the maximum optical flow velocity can be expressed as:

h = min flow(i, j) / max flow(i, j), 0 ≤ i < img_height, 0 ≤ j < img_width

wherein h represents the ratio, flow(i, j) represents the optical flow information of pixel point (i, j) in the optical flow map, max flow(i, j) represents the maximum value of the optical flow information over all pixel points, min flow(i, j) represents the minimum value of the optical flow information over all pixel points, img_height represents the height of the optical flow map, and img_width represents the width of the optical flow map; for example, if the current image frame is 480x320 pixels, img_width is 480 pixels and img_height is 320 pixels. When h is greater than or equal to the preset velocity ratio, the absolute optical flow velocity is used to calculate the mask value; when h is smaller than the preset velocity ratio, the normalized optical flow velocity may be used to calculate the mask value. In addition, when the optical flow map is two-channel, the normalized optical flow velocity may be used to calculate the mask value when h of the horizontal optical flow map is smaller than the preset velocity ratio or h of the vertical optical flow map is smaller than the preset velocity ratio.
Specifically, for a single-channel optical flow map, the optical flow velocity of each pixel point can be calculated from the optical flow information corresponding to each pixel point in the optical flow map; the mask value corresponding to each pixel point in the current frame image is then calculated from the optical flow velocity of each pixel point, and the motion mask map is generated from the mask values corresponding to the pixel points. The normalization can be performed by the following formula:

normal_v(i, j) = flow(i, j) / sqrt(max flow(i, j)^2), 0 ≤ i < img_height, 0 ≤ j < img_width

wherein normal_v(i, j) represents the normalized optical flow velocity corresponding to pixel point (i, j) in the optical flow map, flow(i, j) represents the optical flow information of pixel point (i, j) in the optical flow map, max flow(i, j)^2 represents the maximum value of the squared optical flow information over all pixel points, img_height represents the height of the optical flow map, and img_width represents the width of the optical flow map; for example, if the current image frame is 480x320 pixels, the obtained optical flow map is also 480x320 pixels, img_width is 480 pixels, and img_height is 320 pixels.
Alternatively, the optical flow prediction neural network may be a neural network constructed based on coarse2fineNet, whose predicted optical flow result is a horizontal optical flow map and a vertical optical flow map. The optical flow velocity of each pixel point can be calculated from the horizontal optical flow information flowx of each pixel point in the horizontal optical flow map and the vertical optical flow information flowy of each pixel point in the vertical optical flow map; the mask value corresponding to each pixel point in the current frame image is then calculated from the optical flow velocity of each pixel point, and the motion mask map is generated from the mask values corresponding to the pixel points. The normalization can be performed by the following formula:

normal_v(i, j) = sqrt(flowx(i, j)^2 + flowy(i, j)^2) / sqrt(max(flowx(i, j)^2 + flowy(i, j)^2)), 0 ≤ i < img_height, 0 ≤ j < img_width

wherein normal_v(i, j) represents the normalized optical flow velocity corresponding to pixel point (i, j), flowx(i, j) represents the horizontal optical flow information of pixel point (i, j), flowy(i, j) represents the vertical optical flow information of pixel point (i, j), and max(flowx(i, j)^2 + flowy(i, j)^2) represents the maximum value of the sum of the squared horizontal and vertical optical flow information over all pixel points; for example, if the current image frame is 480x320 pixels, the obtained horizontal optical flow map and vertical optical flow map are both 480x320 pixels, img_width is 480 pixels, and img_height is 320 pixels.
The mask value corresponding to each pixel point in the current frame image is a binary mask value of 0 or 1, and may be calculated from the normalized optical flow velocity of each pixel point. Specifically, this can be represented by the following formula:

motion_mask(i, j) = 1, if normal_v(i, j) ≥ v_thr; motion_mask(i, j) = 0, otherwise

wherein normal_v(i, j) represents the normalized optical flow velocity corresponding to pixel point (i, j) in the optical flow map, and v_thr represents the preset normalized optical flow velocity threshold. It can be seen that when the normalized optical flow velocity is greater than or equal to the normalized optical flow velocity threshold, the corresponding mask value is 1, and when the normalized optical flow velocity is smaller than the normalized optical flow velocity threshold, the corresponding mask value is 0.
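The two-channel normalization and thresholding can be sketched as follows, assuming flowx and flowy are NumPy arrays and that at least one pixel is moving (so the maximum is non-zero); the names are illustrative:

```python
# Sketch of the two-channel case above; assumes flowx/flowy are (H, W) arrays
# and sq.max() > 0. Names are my own, not from the patent.
import numpy as np

def normalized_motion_mask(flowx, flowy, v_thr):
    sq = flowx ** 2 + flowy ** 2                  # squared speed per pixel
    normal_v = np.sqrt(sq / sq.max())             # normalized (relative) speed
    return (normal_v >= v_thr).astype(np.uint8)   # 1 = moving, 0 = stationary
```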
The motion mask map is generated from the mask value corresponding to each pixel point and comprises the first mask value and the second mask value; a binary motion mask map comprises 0 and 1, i.e., the mask value of each pixel point in the binary motion mask map is either 0 or 1. The motion mask map, the current frame image, and the optical flow map all have the same size.
102. Constructing a current background image according to the motion mask map and the frame images corresponding to the motion mask map.
In the embodiment of the present invention, the mask value in the motion mask map may represent a motion state of a corresponding pixel, for example, the mask value may be a binary mask, that is, two different mask values are used to represent different motion states of each pixel through a first mask value and a second mask value, and the motion states may be moving or stationary, which respectively correspond to the first mask value and the second mask value.
A pixel point whose motion state is stationary can be used directly as a background pixel point. For a pixel point whose motion state is moving, a frame image in which that pixel point is stationary is searched for in the preset frame image sequence, and the pixel point in that frame image is determined as the background pixel point.
Further, the number of motion mask maps is n frames, and the number of frame images corresponding to the motion mask maps is also n frames. The motion state of the corresponding pixel point in a frame image is judged through the pixel point in the motion mask map, so as to determine which pixel point in which frame image can be used as a background pixel point.
When the corresponding motion mask maps are acquired from the preset frame image sequence, a first data set may be maintained to hold the latest frame images sampled in sequence from the preset frame image sequence; for example, when the t+5-th frame is sampled, the t+5-th frame is added to the first data set. A second data set is maintained to hold the corresponding motion mask maps; for example, after the frame image of the t+5-th frame is added to the first data set, the motion mask map corresponding to that frame image is added to the second data set. The holding capacity of the first data set and the second data set is set to n frames; once n frames are exceeded, the frame image or motion mask map that was added first is removed. For example, when n is set to 19 and the number of frame images or motion mask maps in a data set exceeds 19, the one added first is removed, keeping the data set at no more than 19 entries. The removed frame image or motion mask map can be stored separately for data multiplexing, or deleted directly. The frame images of the first data set and the motion mask maps of the second data set are in one-to-one correspondence.
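One way to maintain the two rolling data sets is with fixed-length deques, which evict the oldest entry automatically once n frames are exceeded. This is an illustrative sketch, not the patent's prescribed data structure:

```python
# Sketch of the rolling first/second data sets; names are my own.
from collections import deque

n = 19
first_data_set = deque(maxlen=n)   # latest n sampled frame images
second_data_set = deque(maxlen=n)  # their motion mask maps, index-aligned

def on_new_sample(frame, motion_mask):
    # appending beyond maxlen silently drops the oldest entry
    first_data_set.append(frame)
    second_data_set.append(motion_mask)
```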
According to the motion state of each pixel point in the n frames of motion mask maps, pixel points whose motion state is stationary are selected from the corresponding n frame images as background pixel points. When a background image of the same size as the frame image or motion mask map is obtained, the construction of the current background image is complete.
103. Calculating the similarity between the current background image and the preset background image, and judging whether the similarity is smaller than a similarity threshold.
In the embodiment of the present invention, the similarity threshold is preset.
The preset background map may be constructed and stored according to the methods of step 101 and step 102 in the initial stage of camera installation, when there are few or no moving objects. The preset background map may also be updated periodically during the operation of the camera according to the methods of step 101 and step 102; during the update, the obtained background map may be sent to a staff member for confirmation to ensure its accuracy, so that after a new background portion appears, the preset background map can be updated in time, avoiding misjudgments caused by the background changing while the preset background map is not updated.
The similarity may be an angle similarity, a distance similarity, or the like; the specific similarity algorithm is not limited in the embodiment of the present invention, and the corresponding similarity threshold is set according to the chosen algorithm. In the embodiment of the invention, the similarity algorithm is preferably based on structural similarity, SSIM (Structural SIMilarity): similarity_score = ssim(back_group_updated, back_group_reference).
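A hedged sketch of this comparison using scikit-image's SSIM implementation; the patent names only ssim(), so the specific library and function names are assumptions, while the 0.85 threshold comes from the example in the next paragraph:

```python
# Sketch only: scikit-image is an assumed choice of SSIM implementation.
from skimage.metrics import structural_similarity as ssim

def is_occluded(current_bg, preset_bg, sim_thr=0.85):
    # channel_axis=-1 treats the last axis as color channels (skimage >= 0.19)
    score = ssim(current_bg, preset_bg, channel_axis=-1)
    return score < sim_thr  # below the threshold -> treat as occluded
```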
If the similarity is greater than or equal to the similarity threshold, the current background image is similar to the preset background image, the camera is not occluded, and the occlusion detection can end. For example, when the similarity is greater than or equal to 0.85, the current background image may be considered similar to the preset background image; when the similarity is less than 0.85, they may be considered dissimilar.
If the similarity is less than the similarity threshold, the process proceeds to step 104.
104. Confirming that the camera is occluded.
In the embodiment of the invention, if the similarity between the current background image and the preset background image is smaller than the similarity threshold, the camera can be confirmed to be blocked.
In one possible embodiment, if the similarity between the current background image and the preset background image is smaller than the similarity threshold, the similarity result may be sent to the corresponding staff for confirmation, where the confirmation result includes: the camera is occluded, a background has been newly added, or a temporary background has been newly added. If the camera is occluded, for example by leaves or insect carcasses, the corresponding field personnel can be notified to perform maintenance; for a newly added background, such as a newly added fixed dustbin or building, the preset background map can be updated; for a newly added temporary background, such as a temporary construction sign or banner, the occlusion alert can be ignored.
In the embodiment of the invention, a motion mask map of the current frame image is obtained according to optical flow information between the current frame image and the previous adjacent frame image in a preset frame image sequence, wherein the motion mask map comprises mask values; a current background image is constructed according to the motion mask map and the frame images corresponding to the motion mask map; the similarity between the current background image and a preset background image is calculated, and whether the similarity is smaller than a similarity threshold is judged; and if the similarity is smaller than the similarity threshold, it is confirmed that the camera is blocked. Background modeling is performed on the camera through optical flow information to obtain its current background image, which is compared for similarity with the preset background image to judge whether the camera is blocked. Because the current background image is constructed from optical flow information, it is closer to the real situation, so the similarity comparison between the current background image and the preset background image is more accurate, and the detection accuracy of camera occlusion is high. Meanwhile, no worker is required to inspect the camera for occlusion, which improves the detection efficiency of camera occlusion detection and reduces the operation and maintenance cost of the camera.
It should be noted that the method for detecting camera occlusion provided by the embodiment of the invention can be applied to devices such as mobile phones, monitors, computers, and servers that can perform camera occlusion detection.
Optionally, referring to fig. 2, fig. 2 is a flowchart of a method for constructing a current background image according to an embodiment of the present invention. In the embodiment of the present invention, the number of motion mask maps is n frames, and the number of frame images corresponding to the motion mask maps is also n frames. As shown in fig. 2, the method includes the following steps:
201. Extracting a mask value sequence for each pixel point in the motion mask maps according to the n frames of motion mask maps.
In the embodiment of the present invention, since the number of motion mask graphs is n, the dimension of the mask value sequence is n. The motion mask map is obtained according to a preset frame image sequence.
Optionally, when the corresponding motion mask map is acquired according to the preset frame image sequence, a first data set may be maintained, where the latest frame image sampled sequentially from the preset frame image sequence is accommodated, for example, when the t+5 frame is sampled, the t+5 frame is added to the first data set. And maintaining a second data set for accommodating the corresponding motion mask map, for example, after adding the frame image of the t+5 frame in the first data set, the motion mask map corresponding to the frame image of the t+5 frame is added in the second data set. The capacity of the first data set and the second data set is set to n frames, and after n frames are exceeded, the first added frame image or the motion mask image is removed. The frame image of the first data set and the motion mask image of the second data set are in one-to-one correspondence.
For n frames of motion mask maps, each pixel point (i, j) has an n-dimensional mask value sequence (m_1, m_2, ..., m_{n-1}, m_n)_{i,j}, where the mask value sequence represents the motion state of the pixel point.
202. Extracting a corresponding target pixel point index according to the mask value sequence corresponding to the target pixel point.
In the embodiment of the present invention, the target pixel point is any one of the pixel points in the motion mask map, and the target pixel point index includes a target frame image position and a target pixel point position. A target frame image is determined according to the target frame image position, wherein the target frame image is one of the frame images corresponding to the motion mask maps; according to the target pixel point position, the pixel value corresponding to that pixel position in the target frame image is extracted as the pixel value of the current background pixel point.
Specifically, for n frames of motion mask maps, there is an n-dimensional mask value sequence (m_1, m_2, ..., m_{n-1}, m_n)_{i,j}. The subscript (i, j) of the mask value sequence is the target pixel point position, and the index (1, 2, ..., n-1, n) of a specific mask value within the sequence is the target motion mask map position; the corresponding index is the target frame image position.
The mask value sequence may represent the motion state of the pixel point. For example, in a binary mask map (where the mask value is 0 or 1, 0 representing stationary and 1 representing moving), if all mask values in the mask value sequence are 0, the motion state of the pixel point can be regarded as stationary and the pixel point can be used as a background pixel point. If the mask values in the sequence are not all 0, the calculation continues with later frames: the latest frame image is added to the first data set and the corresponding motion mask map is added to the second data set. Further, among the n frames of motion mask maps, the frame image corresponding to one of the motion mask maps is selected as the target frame image, and the pixel value corresponding to the pixel point in the target frame image is used as the pixel value of the background pixel point. Of course, in some possible embodiments, when the number of mask values representing stationary in the mask value sequence is greater than a preset number, the motion state of the pixel point may be regarded as stationary; for example, with n=19, when the number of 0s is 17 or more, the motion state of the pixel point may be regarded as stationary.
203. Constructing the current background image based on the frame images corresponding to the motion mask maps and the target pixel point index.
In the embodiment of the present invention, the frame images corresponding to the motion mask maps are the frame images in the first data set. The target pixel point index includes a target frame image position and a target pixel point position. The target frame image is one of the frame images corresponding to the motion mask maps and can be understood as one frame in the first data set; according to the target pixel point position, the pixel value corresponding to that pixel position in the target frame image is extracted as the pixel value of the current background pixel point.
Specifically, the specific position of the target frame image in the first data set may be determined from the target frame image position. In the embodiment of the present invention, n is an odd number, and the target frame image position may take the median position of the mask value sequence. For example, with n=19, the median position is 10: after determining that a pixel point can be used as a background pixel point, the target pixel point index [(i, j), 10] corresponding to that pixel point is extracted, where the target motion mask map is the 10th frame in the second data set and the corresponding target frame image is the 10th frame in the first data set; then, in the frame image corresponding to the 10th frame, the target pixel point (i, j) is taken as the background pixel point, and the pixel value of the background pixel point is the pixel value of the target pixel point (i, j) in that frame image.
When all the background pixel points have been determined, the current background image is obtained; its size is the same as that of the frame images in the first data set and the motion mask maps in the second data set.
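Steps 201 to 203 can be sketched as follows, assuming n aligned frames and binary masks held as NumPy arrays, all-zero mask value sequences marking background pixels, and the median-position frame used as the target frame image; the names are my own:

```python
# Illustrative sketch of steps 201-203, not the patent's literal algorithm.
import numpy as np

def fill_background(frames, masks, background):
    """frames: list of n (H, W, C) images; masks: list of n (H, W) binary masks;
    background: (H, W, C) array being filled in, updated in place."""
    mask_stack = np.stack(masks)                 # (n, H, W) mask value sequences
    stationary = (mask_stack == 0).all(axis=0)   # all-zero sequence per pixel
    target = frames[len(frames) // 2]            # median-position frame (10th of 19)
    background[stationary] = target[stationary]  # copy stationary pixels over
    return background
```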
Optionally, referring to fig. 3, fig. 3 is a flowchart of another method for constructing a current background image according to an embodiment of the present invention. In the embodiment of the present invention, the number of frame images in the preset frame image sequence is k frames, where k is a positive integer greater than 0; the number of motion mask maps is n frames, and the number of frame images corresponding to the motion mask maps is also n frames. As shown in fig. 3, the method includes the following steps:
301. Sequentially sampling frame images in the preset frame image sequence and adding them to the first data set.
In the embodiment of the present invention, the first data set accommodates n frames of the frame images sampled most recently, n is a positive integer greater than 0, and k is greater than or equal to n. The latest sampled frame image refers to a frame image obtained by sampling a latest frame from a preset frame image sequence.
Alternatively, n may be an odd number, so as to obtain a median value, for example, n=19, where the median value is 10, and n=5, where the median value is 3.
Optionally, the value of n may be between 15 and 25. If n is too small, sampling is too sparse and the constructed background image is inaccurate; if n is too large, the calculation takes a long time and the background image is constructed slowly. In the embodiment of the invention, 19 is preferred.
Optionally, the preset frame image sequence may be obtained by sampling the video stream at intervals, where the number of the intervals is preset, for example, may be preset to integer values of 1, 2, 3, 4, 5, etc. In this way, a sufficient time span can be obtained, so that the background reconstruction is more complete and accurate, for example, in a video stream, a frame is extracted every 5 frames, if the preset number of frames of the preset frame image sequence is k, the actual coverage range is 5k frames, and if k=500, the actual coverage range is 2500 frames.
302. Sequentially acquiring the motion mask maps corresponding to the frame images in the preset frame image sequence, and adding them to the second data set.
In the embodiment of the present invention, the second data set accommodates n frames of the latest motion mask map, and the accommodation capacities of the first data set and the second data set are set to n frames, and after n frames are exceeded, the frame image or the motion mask map which is added first is removed. The frame images of the first data set and the motion mask images of the second data set are in one-to-one correspondence, and the latest motion mask image corresponds to the latest frame image.
The motion mask map corresponding to the frame images in the preset frame image sequence may be sequentially obtained by the calculation method of the motion mask map in step 101.
303. Constructing a current background map based on the first data set and the second data set.
In the embodiment of the present invention, the frame image of the first data set and the motion mask image of the second data set are in a one-to-one correspondence, and the current background image can be constructed by the proposed methods of step 201, step 202 and step 203.
Further, in order to provide enough coverage space for the background map calculation and obtain a more accurate current background map, k may be far greater than n, for example k=500 and n=19. The first data set can be regarded as a sliding sampling extraction window of size n with a sliding step of 1: sliding sampling is performed over the preset frame image sequence of length k, and at each slide the latest frame image is sampled and the frame image extracted first is removed. Starting from the first frame, the extraction window can slide k times. The calculation coverage space can be understood as the sliding space of the extraction window over the preset frame image sequence; when the sliding space is small, there may not be enough background pixel points to construct the current background map.
Optionally, while ensuring the accuracy of the current background image, construction efficiency may be improved by ending the background image construction in advance. Specifically, during background construction, it may be judged whether the most recently sampled frame image in the first data set is the m-th frame in the preset frame image sequence, where m is a positive integer greater than 0; for example, when m=80, it is judged whether the most recently sampled frame image in the first data set is the 80th frame in the preset frame image sequence.
If the frame image sampled most recently in the first data set is the mth frame, acquiring an h frame motion mask image to an h+a frame motion mask image, wherein h is greater than or equal to m, k is greater than or equal to h+a, a is a positive integer greater than 0, and a can be preset according to the requirement of a user. For example, assuming that m=80, h=80, and a=12, when it is determined that the latest frame image in the first data set is the 80 th frame in the preset frame image sequence, a subsequent 80 th frame motion mask map to 92 th frame motion mask map may be acquired in the second data set.
When h+a is smaller than k, judging whether to stop the construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map.
Specifically, the mask value includes a motion mask value and a still mask value, the motion mask value is used to indicate that the motion state of the pixel point is moving, and the still mask value is used to indicate that the motion state of the pixel point is still.
The motion mask value duty ratio of each frame of motion mask map, from the h-th frame motion mask map to the (h+a)-th frame motion mask map, may be calculated according to the motion mask value. The motion mask value duty ratio may be represented by the following equation:

motion_status = Σ motion_mask(i, j) / (img_width × img_height)

wherein motion_status is the motion mask value duty ratio, the sum counts the pixel points motion_mask(i, j) in the motion mask map whose mask value is the motion mask value, and img_width × img_height is the total number of pixel points in the motion mask map.
And if the motion mask value duty ratio of each frame of motion mask image is smaller than a preset first duty ratio threshold value in the h frame of motion mask image to the h+a frame of motion mask image, stopping the construction of the background image in advance, and taking the latest constructed background image as the current background image.
Alternatively, the still mask value duty ratio of each frame of the motion mask map from the h frame of the motion mask map to the h+a frame of the motion mask map may be calculated according to the still mask value. And if the static mask value duty ratio of each frame of motion mask image is larger than a preset second duty ratio threshold value in the h frame of motion mask image to the h+a frame of motion mask image, stopping the construction of the background image in advance, and taking the latest constructed background image as the current background image. By stopping the construction of the background image in advance, the computing resources and computing time can be reduced.
And if the motion mask value duty ratio of at least one frame of motion mask map is larger than a preset first duty ratio threshold value or the still mask value duty ratio of at least one frame of motion mask map is smaller than a preset second duty ratio threshold value in the h frame of motion mask map to the h+a frame of motion mask map, taking the finally constructed background image as the current background image until the h+a frame of motion mask map is the k frame of motion mask map.
For example, assume k=500, m=80, h=80, a=12, the first duty ratio threshold is 0.1, the second duty ratio threshold is 0.9, the motion mask value is 1, and the still mask value is 0. When it is determined that the most recently sampled frame image in the first data set is the 80th frame in the preset frame image sequence, the subsequent 80th to 92nd frame motion mask maps may be acquired from the second data set. If the motion mask value duty ratio of every frame of motion mask map from the 80th to the 92nd frame is smaller than 0.1, the construction of the background image is stopped in advance and the most recently constructed background image is used as the current background image. Alternatively, if the still mask value duty ratio of every frame of motion mask map from the 80th to the 92nd frame is greater than 0.9, the construction of the background image is stopped in advance and the most recently constructed background image is used as the current background image. If the motion mask value duty ratio of at least one frame is greater than 0.1, or the still mask value duty ratio of at least one frame is less than 0.9, the finally constructed background image is used as the current background image, unless a later early-stop condition is reached, until the (h+a)-th frame motion mask map is the 500th frame motion mask map.
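The early-stop test can be sketched as follows, assuming the motion mask value is 1 and recent_masks holds the h-th to (h+a)-th frame motion mask maps; the names and the threshold default are illustrative:

```python
# Sketch of the early-stop check; names are my own, not the patent's.
import numpy as np

def can_stop_early(recent_masks, first_duty_thr=0.1):
    for mask in recent_masks:
        # motion mask value duty ratio: moving pixels / total pixels
        duty = np.count_nonzero(mask == 1) / mask.size
        if duty >= first_duty_thr:
            return False   # some frame still has enough motion: keep building
    return True            # every recent frame is almost static: stop early
```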
Optionally, referring to fig. 4, fig. 4 is a flowchart of another method for detecting camera occlusion according to an embodiment of the present invention. In the embodiment of the present invention, the current background image includes a plurality of image areas and, further, the motion mask map includes a plurality of mask regions. As shown in fig. 4, the method includes the following steps:
401. Judging whether the similarity between the current background map and the preset background map is smaller than the similarity threshold.
In the embodiment of the present invention, the step 103 may be referred to for calculating the similarity between the current background image and the preset background image. If the similarity between the current background image and the preset background image is smaller than the similarity threshold, the camera is possibly blocked. For example, when the similarity between the current background image and the preset background image is smaller than 0.85, the current background image and the preset background image may be considered to be dissimilar.
Further, to obtain a more accurate determination result, step 402 may be performed.
402. Respectively calculating the motion information of each image area according to the motion mask maps and the corresponding frame images.
In the embodiment of the present invention, the motion mask map includes a mask value, the mask value includes a motion mask value and a static mask value, and the motion information may be motion state information obtained based on the motion mask value, where the motion state information includes a moving state and a static state.
The number of image areas may be r, and the second data set may be divided into r subsets, each subset accommodating a mask area corresponding to the image area. For example, if the image area is 4, the second data set may be divided into 4 subsets, each subset accommodating a mask area corresponding to the image area.
Alternatively, a third data set may be acquired and divided into r subsets, where the third data set is the full motion mask map sequence, with each frame of motion mask map corresponding to a frame in the preset frame image sequence; this combines the historical motion mask maps that have been removed from the second data set with the multiplexing of the second data set.
Alternatively, the image areas may be divided as a grid around the center point of the image, or by rows or columns, as shown in fig. 5. The division mode can be selected according to the actual situation of the camera deployment site: for example, in an area with many banners, division by rows may be used; in an area that frequently needs construction, grid division around the image center point may be used; in an area between two trees, division by columns may be used; and so on. The mask regions and the image areas use the same division. Adopting different area divisions for different camera deployment environments and making the judgment with the corresponding image areas allows the difference between the current background image and the preset background image to be judged further, improving the accuracy of judging whether the camera is occluded.
The motion information of each mask region may be calculated, and the motion information of a mask region may be used as the motion information of its corresponding image region. Specifically, an average motion mask value of each mask region may be calculated by the following equation:

roi_mean_motion(i) = (1 / t_frame) * Σ_{j=1}^{t_frame} roi_motion_count_list(i, j)

where roi_mean_motion(i) is the average motion mask value of the ith mask region, roi_motion_count_list(i, j) is the mask value of the jth pixel point in the ith mask region, and t_frame is the number of all pixel points in the ith mask region.
The motion information of an image area is the same as that of its mask area: when the motion information of a mask area indicates a moving state, the corresponding image area is also in a moving state.
Further, the average motion mask value of each mask region is compared with a preset region motion threshold, and if the average motion mask value of one mask region is greater than or equal to the region motion threshold, the mask region is indicated to be in a moving state, and the image region corresponding to the mask region is also indicated to be in a moving state. If the average motion mask value of one mask region is less than the region motion threshold, it is indicated that the mask region is in a stationary state, and the image region corresponding to the mask region is also in a stationary state.
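A sketch of this per-region judgment is given below for binary mask maps; the region motion threshold of 0.2 is an assumed value, and averaging over all pixels of a region across the whole window of mask maps follows the definition of t_frame above.

    import numpy as np

    def region_states(mask_maps, regions, region_motion_threshold=0.2):
        """mask_maps: array of shape (n, H, W) holding binary mask values;
        regions: list of (y0, y1, x0, x1) boxes sharing one division."""
        states = []
        for (y0, y1, x0, x1) in regions:
            roi = mask_maps[:, y0:y1, x0:x1]
            roi_mean_motion = float(np.mean(roi))  # sum of mask values / t_frame
            states.append("moving" if roi_mean_motion >= region_motion_threshold
                          else "static")
        return states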
403. Judging whether the camera is blocked according to the motion information of each image area.
In the embodiment of the invention, whether the camera is blocked can be judged according to the number of image areas in a static state. For example, if the current background image is dissimilar to the preset background image and more than 50% of the image areas are in a static state, that is, the current background image is static over a large area, it can be determined that the camera is blocked.
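Combining the two conditions, a sketch of the final decision in step 403 might read as follows; the 0.85 and 50% figures come from the examples above, and the list-of-states representation is an assumption.

    def is_camera_occluded(similarity, states,
                           similarity_threshold=0.85, static_fraction=0.5):
        """Occluded when the background is dissimilar and most areas are static."""
        static_ratio = states.count("static") / len(states)
        return similarity < similarity_threshold and static_ratio > static_fraction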
In one possible embodiment, if the current background image is dissimilar to the preset background image and a large area of the current background image is static, the calculation result may be sent to the corresponding staff for confirmation, where the confirmation result includes that the camera is blocked, that a background is newly added, or that a temporary background is newly added. If the camera is blocked, for example by leaves or insect corpses, the corresponding field personnel can be notified to perform maintenance; for a newly added background, such as a newly added fixed dustbin or a newly added building, the preset background map can be updated; for a newly added temporary background, such as a temporary construction sign or a temporary banner, the reminder that the camera is blocked can be ignored.
In a possible embodiment, the preset background map also includes a plurality of preset image areas, and which image area is occluded can be determined specifically by calculating the similarity between each preset image area in the preset background map and the corresponding image area in the current background map.
In the embodiment of the invention, whether the camera is shielded or not is judged by combining the motion information of each image area, so that the difference between the current background image and the preset background image can be further judged, and the judgment accuracy of whether the camera is shielded or not is further improved.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a camera occlusion detection device according to an embodiment of the present invention, as shown in fig. 6, the camera is fixedly disposed at a designated position, and the device includes:
an obtaining module 601, configured to obtain a motion mask map of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, where the motion mask map includes a mask value;
a construction module 602, configured to construct a current background map according to the motion mask map and a frame image corresponding to the motion mask map;
a first calculating module 603, configured to calculate a similarity between the current background image and a preset background image, and determine whether the similarity is smaller than a similarity threshold;
and a confirming module 604, configured to confirm that the camera is blocked if the similarity is less than a similarity threshold.
Optionally, as shown in fig. 7, the current background map includes a plurality of image areas, and the apparatus further includes:
a second calculating module 605, configured to calculate, if the similarity is smaller than a similarity threshold, motion information of each image area according to the motion mask map and the corresponding frame image;
and a judging module 606, configured to judge whether the camera is blocked according to the motion information of each image area.
Optionally, as shown in fig. 8, the optical flow information includes horizontal optical flow information and vertical optical flow information, and the acquiring module 601 includes:
a first calculation submodule 6011 for calculating an optical flow velocity according to the horizontal optical flow information and the vertical optical flow information;
a second calculation submodule 6012, configured to calculate a mask value corresponding to each pixel point in the current frame image according to the optical flow speed;
and the generation submodule 6013 is used for generating a motion mask diagram according to the mask values corresponding to the pixel points.
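As a sketch of what this acquiring module computes, the following uses dense Farneback optical flow from OpenCV; the specific flow algorithm and the speed threshold are assumptions, since the embodiment only fixes the structure (horizontal and vertical optical flow, per-pixel speed, then mask values).

    import cv2
    import numpy as np

    def motion_mask(prev_gray, curr_gray, speed_threshold=1.0):
        """Binary motion mask map between two adjacent grayscale frames."""
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        u, v = flow[..., 0], flow[..., 1]   # horizontal / vertical optical flow
        speed = np.sqrt(u ** 2 + v ** 2)    # optical flow speed per pixel
        return (speed >= speed_threshold).astype(np.uint8)  # 1 motion, 0 still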
Optionally, as shown in fig. 9, the number of the motion mask maps is n frames, and the number of frame images corresponding to the motion mask maps is also n frames, where n is a positive integer greater than 0, and the building module 602 includes:
a first extraction submodule 6021, configured to extract a mask value sequence of each pixel point in the motion mask map according to the n-frame motion mask map, where a dimension of the mask value sequence is n;
the second extraction submodule 6022 is configured to extract a corresponding target pixel index according to a mask value sequence corresponding to a target pixel, where the target pixel is any one of the pixels in the motion mask map;
a first construction submodule 6023 is configured to construct a current background image based on the frame image corresponding to the motion mask image and the target pixel index.
Optionally, as shown in fig. 10, the target pixel index includes a target frame image position and a target pixel position, and the first building sub-module 6023 includes:
a determining unit 60231, configured to determine a target frame image according to the target frame image position, where the target frame image is one frame of n frames of frame images corresponding to the motion mask image;
an extracting unit 60232, configured to extract, according to the target pixel position, a pixel value of a corresponding pixel position in the target frame image as a pixel value of a current background pixel;
and a construction unit 60233, configured to construct a current background map based on the pixel values of the current background pixel points.
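A sketch of this per-pixel construction is given below. Choosing, for each pixel, the most recent frame in which that pixel was still is an assumed selection rule; the embodiment only requires that a target frame image position be derived from each pixel's mask value sequence.

    import numpy as np

    def build_background(frames, masks, still_value=0):
        """frames: list of n HxWx3 images; masks: list of n HxW mask maps."""
        frames = np.stack(frames)        # shape (n, H, W, 3)
        masks = np.stack(masks)          # per-pixel mask value sequences
        n = masks.shape[0]
        still = (masks == still_value)
        # index of the last still occurrence per pixel (falls back to the
        # newest frame if a pixel is never still in the window)
        idx = (n - 1) - np.argmax(still[::-1], axis=0)
        ys, xs = np.mgrid[0:idx.shape[0], 0:idx.shape[1]]
        return frames[idx, ys, xs]       # gather pixel values per target frame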
Optionally, as shown in fig. 11, the number of frame images in the preset frame image sequence is k frames, where k is a positive integer greater than 0, and the building module 602 includes:
the first processing submodule 6024 is used for sequentially sampling frame images in a preset frame image sequence and adding the frame images to a first data set, wherein the first data set contains n frames of newly sampled frame images, n is a positive integer greater than 0, and k is greater than or equal to n;
a second processing sub-module 6025, configured to sequentially obtain motion mask maps corresponding to frame images in a preset frame image sequence, and add the motion mask maps to a second data set, where the second data set accommodates n frames of latest motion mask maps, and the latest motion mask maps correspond to the latest sampled frame images;
a second construction submodule 6026 for constructing a current background map based on the first data set and the second data set.
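Both data sets behave as sliding windows over the stream, which maps naturally onto bounded deques; the window size below is an assumed value.

    from collections import deque

    n = 10                                # assumed window size
    first_data_set = deque(maxlen=n)      # n most recently sampled frame images
    second_data_set = deque(maxlen=n)     # n most recent motion mask maps

    def add_sample(frame, mask_map):
        """Appending past maxlen drops the oldest item, matching the
        'accommodates the n latest' behaviour described above."""
        first_data_set.append(frame)
        second_data_set.append(mask_map)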
Optionally, as shown in fig. 12, the second building sub-module 6026 includes:
a first judging unit 60261, configured to judge whether the latest frame image in the first data set is an mth frame, where m is a positive integer greater than 0;
an obtaining unit 60262, configured to obtain an h frame motion mask map to an h+a frame motion mask map if the frame image sampled most recently in the first dataset is an m-th frame, where h is greater than or equal to m, k is greater than or equal to h+a, and a is a positive integer greater than 0;
and a second judging unit 60263, configured to judge whether to stop the construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map when h+a is smaller than k.
Optionally, as shown in fig. 13, the mask value includes a motion mask value, and the second judging unit 60263 includes:
a calculating subunit 602631, configured to calculate, according to the motion mask value, a motion mask value duty ratio of each frame of motion mask map from the h frame of motion mask map to the h+a frame of motion mask map;
a first stopping subunit 602632, configured to stop construction of the background image in advance if the motion mask value duty ratio of each frame of motion mask map is smaller than a preset duty ratio threshold value in the h frame of motion mask map to the h+a frame of motion mask map, and take the latest constructed background image as the current background image;
and the second stopping subunit 602633 is configured to, if the motion mask value duty ratio of at least one frame of motion mask map is greater than the preset duty ratio threshold in the h frame of motion mask map to the h+a frame of motion mask map, take the finally constructed background image as the current background image until the h+a frame of motion mask map is the k frame of motion mask map.
It should be noted that the camera occlusion detection device provided by the embodiment of the invention can be applied to devices such as a mobile phone, a monitor, a computer, a server and the like which can perform camera occlusion detection.
The camera shielding detection device provided by the embodiment of the invention can realize each process realized by the camera shielding detection method in the embodiment of the method, and can achieve the same beneficial effects. In order to avoid repetition, a description thereof is omitted.
Referring to fig. 14, fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 14, including: memory 1402, processor 1401, and a computer program stored on the memory 1402 and executable on the processor 1401, wherein:
the processor 1401 is configured to call a computer program stored in the memory 1402, and execute the following steps:
acquiring a motion mask image of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask image comprises mask values;
constructing a current background image according to the motion mask image and a frame image corresponding to the motion mask image;
calculating the similarity between the current background image and a preset background image, and judging whether the similarity is smaller than a similarity threshold value or not;
And if the similarity is smaller than a similarity threshold, confirming that the camera is blocked.
Optionally, the current background map includes a plurality of image areas, and the processor 1401 further executes the steps including:
if the similarity is smaller than a similarity threshold, respectively calculating motion information of each image area according to the motion mask image and the corresponding frame image;
and judging whether the camera is blocked or not according to the motion information of each image area.
Optionally, the optical flow information includes horizontal optical flow information and vertical optical flow information, and the acquiring, by the processor 1401, the motion mask map of the current frame image according to optical flow information between the current frame image and a previous adjacent frame image in the preset frame image sequence includes:
calculating an optical flow velocity from the horizontal optical flow information and the vertical optical flow information;
according to the optical flow speed, calculating to obtain mask values corresponding to all pixel points in the current frame image;
and generating a motion mask map according to the mask values corresponding to the pixel points.
Optionally, the number of the motion mask images is n frames, and the number of frame images corresponding to the motion mask images is also n frames, where n is a positive integer greater than 0, and the constructing, by the processor 1401, a current background image according to the motion mask images and the frame images corresponding to the motion mask images includes:
Extracting a mask value sequence of each pixel point in the motion mask map according to the n-frame motion mask map, wherein the dimension of the mask value sequence is n;
extracting a corresponding target pixel index according to a mask value sequence corresponding to a target pixel, wherein the target pixel is any one of all pixels in the motion mask map;
and constructing a current background image based on the frame image corresponding to the motion mask image and the target pixel index.
Optionally, the target pixel point index includes a target frame image position and a target pixel point position, and the constructing, by the processor 1401, the current background image based on the frame image corresponding to the motion mask image and the target pixel point index includes:
determining a target frame image according to the target frame image position, wherein the target frame image is one frame of frame images corresponding to the motion mask image;
according to the target pixel point position, extracting a pixel value corresponding to the pixel point position in the target frame image as a pixel value of the current background pixel point;
and constructing a current background image based on the pixel value of the current background pixel point.
Optionally, the number of frame images in the preset frame image sequence is k frames, k is a positive integer greater than 0, and the constructing, by the processor 1401, a current background image according to the motion mask image and the frame images corresponding to the motion mask image includes:
sequentially sampling frame images in a preset frame image sequence, and adding the frame images to a first data set, wherein the first data set contains n frames of newly sampled frame images, n is a positive integer greater than 0, and k is greater than or equal to n;
sequentially acquiring motion mask images corresponding to frame images in a preset frame image sequence, and adding the motion mask images to a second data set, wherein the second data set contains n frames of latest motion mask images, and the latest motion mask images correspond to the latest sampled frame images;
and constructing a current background graph based on the first data set and the second data set.
Optionally, the constructing, by the processor 1401, a current context map based on the first data set and the second data set includes:
judging whether a frame image sampled latest in the first data set is an mth frame or not, wherein m is a positive integer greater than 0;
if the latest frame image in the first data set is the mth frame, acquiring an h frame motion mask image to an h+a frame motion mask image, wherein h is greater than or equal to m, k is greater than or equal to h+a, and a is a positive integer greater than 0;
And when h+a is smaller than k, judging whether to stop the construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map.
Optionally, the mask value includes a motion mask value, when h+a is smaller than k, the determining, according to the h frame motion mask map to the h+a frame motion mask map, whether to stop construction of the current background image in advance includes:
calculating the motion mask value duty ratio of each frame of motion mask map in the h frame of motion mask map to the h+a frame of motion mask map according to the motion mask value;
if the motion mask value duty ratio of each frame of motion mask image is smaller than a preset duty ratio threshold value in the h frame of motion mask image to the h+a frame of motion mask image, stopping the construction of the background image in advance, and taking the latest constructed background image as the current background image;
and if the motion mask value duty ratio of at least one frame of motion mask map is larger than a preset duty ratio threshold value in the h frame of motion mask map to the h+a frame of motion mask map, taking the finally constructed background image as the current background image until the h+a frame of motion mask map is the k frame of motion mask map.
The electronic device may be a mobile phone, a monitor, a computer, a server, or the like, which can be used for detecting camera occlusion.
The electronic device provided by the embodiment of the invention can realize each process realized by the camera shielding detection method in the embodiment of the method, and can achieve the same beneficial effects, and in order to avoid repetition, the description is omitted.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements each process of the camera occlusion detection method provided by the embodiment of the invention, and can achieve the same technical effect, so that repetition is avoided, and no further description is provided herein.
Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by a computer program stored on a computer readable storage medium; when executed, the program may include the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims (10)

1. The camera shielding detection method is characterized by comprising the following steps of:
acquiring a motion mask image of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask image comprises mask values;
constructing a current background image according to the motion mask image and a frame image corresponding to the motion mask image, wherein the current background image comprises a plurality of image areas;
calculating the similarity between the current background image and a preset background image, and judging whether the similarity is smaller than a similarity threshold value or not;
if the similarity is smaller than a similarity threshold, respectively calculating motion information of each image area according to the motion mask image and the corresponding frame image;
and judging whether the camera is blocked or not according to the motion information of each image area.
2. The method of claim 1, wherein the optical flow information includes horizontal optical flow information and vertical optical flow information, and the obtaining the motion mask map of the current frame image according to the optical flow information between the current frame image and the previous adjacent frame image in the preset frame image sequence includes:
Calculating an optical flow velocity from the horizontal optical flow information and the vertical optical flow information;
according to the optical flow speed, calculating to obtain mask values corresponding to all pixel points in the current frame image;
and generating a motion mask map according to the mask values corresponding to the pixel points.
3. The method of claim 1, wherein the number of the motion mask maps is n frames, and the number of frame images corresponding to the motion mask maps is also n frames, wherein n is a positive integer greater than 0, and the constructing the current background map from the motion mask maps and the frame images corresponding to the motion mask maps comprises:
extracting a mask value sequence of each pixel point in the motion mask map according to the n-frame motion mask map, wherein the dimension of the mask value sequence is n;
extracting a corresponding target pixel index according to a mask value sequence corresponding to a target pixel, wherein the target pixel is any one of all pixels in the motion mask map;
and constructing a current background image based on the frame image corresponding to the motion mask image and the target pixel index.
4. The method of claim 3, wherein the target pixel point index includes a target frame image location and a target pixel point location, the constructing a current background map based on the frame image corresponding to the motion mask map and the target pixel point index comprising:
Determining a target frame image according to the target frame image position, wherein the target frame image is one frame of frame images corresponding to the motion mask image;
according to the target pixel point position, extracting a pixel value corresponding to the pixel point position in the target frame image as a pixel value of the current background pixel point;
and constructing a current background image based on the pixel value of the current background pixel point.
5. The method according to claim 1, wherein the number of frame images in the preset sequence of frame images is k frames, k is a positive integer greater than 0, and the constructing a current background image according to the motion mask image and the frame images corresponding to the motion mask image includes:
sequentially sampling frame images in a preset frame image sequence, and adding the frame images to a first data set, wherein the first data set contains n frames of newly sampled frame images, n is a positive integer greater than 0, and k is greater than or equal to n;
sequentially acquiring motion mask images corresponding to frame images in a preset frame image sequence, and adding the motion mask images to a second data set, wherein the second data set contains n frames of latest motion mask images, and the latest motion mask images correspond to the latest sampled frame images;
And constructing a current background graph based on the first data set and the second data set.
6. The method of claim 5, wherein the constructing a current context map based on the first dataset and the second dataset comprises:
judging whether a frame image sampled latest in the first data set is an mth frame or not, wherein m is a positive integer greater than 0;
if the latest frame image in the first data set is the mth frame, acquiring an h frame motion mask image to an h+a frame motion mask image, wherein h is greater than or equal to m, k is greater than or equal to h+a, and a is a positive integer greater than 0;
and when h+a is smaller than k, judging whether to stop the construction of the current background image in advance according to the h frame motion mask map to the h+a frame motion mask map.
7. The method of claim 6, wherein the mask value includes a motion mask value, and wherein when h+a is less than k, determining whether to stop construction of the current background image in advance according to the h-th frame motion mask map to the h+a-th frame motion mask map includes:
calculating the motion mask value duty ratio of each frame of motion mask map in the h frame of motion mask map to the h+a frame of motion mask map according to the motion mask value;
If the motion mask value duty ratio of each frame of motion mask image is smaller than a preset duty ratio threshold value in the h frame of motion mask image to the h+a frame of motion mask image, stopping the construction of the background image in advance, and taking the latest constructed background image as the current background image;
and if the motion mask value duty ratio of at least one frame of motion mask map is larger than a preset duty ratio threshold value in the h frame of motion mask map to the h+a frame of motion mask map, taking the finally constructed background image as the current background image until the h+a frame of motion mask map is the k frame of motion mask map.
8. A camera occlusion detection device, the device comprising:
the acquisition module is used for acquiring a motion mask image of a current frame image according to optical flow information between the current frame image and a previous adjacent frame image in a preset frame image sequence, wherein the motion mask image comprises mask values;
the construction module is used for constructing a current background image according to the motion mask image and the frame image corresponding to the motion mask image, wherein the current background image comprises a plurality of image areas;
the first calculation module is used for calculating the similarity between the current background image and a preset background image and judging whether the similarity is smaller than a similarity threshold value or not;
The confirming module is used for respectively calculating the motion information of each image area according to the motion mask image and the corresponding frame image if the similarity is smaller than a similarity threshold value; and judging whether the camera is blocked or not according to the motion information of each image area.
9. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the camera occlusion detection method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps in the camera occlusion detection method of any of claims 1 to 7.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant