CN107705334B - Camera abnormality detection method and device - Google Patents

Camera abnormality detection method and device

Info

Publication number
CN107705334B
CN107705334B (application CN201710742191.2A)
Authority
CN
China
Prior art keywords
image
abnormal
semantic segmentation
camera
segmentation model
Prior art date
Legal status
Active
Application number
CN201710742191.2A
Other languages
Chinese (zh)
Other versions
CN107705334A (en)
Inventor
王乃岩
黄秀坤
Current Assignee
Beijing Tusimple Technology Co Ltd
Original Assignee
Beijing Tusimple Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Tusimple Technology Co Ltd filed Critical Beijing Tusimple Technology Co Ltd
Priority to CN201710742191.2A priority Critical patent/CN107705334B/en
Publication of CN107705334A publication Critical patent/CN107705334A/en
Application granted granted Critical
Publication of CN107705334B publication Critical patent/CN107705334B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Abstract

The invention discloses a camera abnormality detection method and device, which detect in real time, by a computer vision method, whether a camera is abnormal and give an alarm when an abnormality is detected. The method comprises the following steps: training a neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type to which each pixel in an image belongs, the types comprising abnormal types and normal types; inputting an image captured by the camera into the semantic segmentation model to obtain feature information corresponding to the image; and judging whether the camera is abnormal according to the feature information corresponding to the image, and giving an alarm when an abnormality is determined.

Description

Camera abnormality detection method and device
Technical Field
The invention relates to the field of computer vision, and in particular to a camera abnormality detection method and device.
Background
In vision-based application scenarios such as unmanned vehicles and robots, a camera mounted on an autonomous vehicle, an advanced driver-assistance vehicle, a robot, or the like captures the environment around it, and a decision control unit makes decisions based on the images captured by the camera and controls the autonomous vehicle, advanced driver-assistance vehicle, robot, or the like. The quality of the images captured by the camera therefore plays an important role in the decisions of the decision control unit.
In practical applications, however, certain abnormal conditions can degrade the quality of the images captured by the camera and thereby reduce the accuracy of the decisions made by the decision control unit, for example insect remains, sticky liquid, or raindrops on the camera lens, or overexposure or underexposure caused by a problem with the lens's exposure device. As shown in figs. 1A, 1B, 1C, 1D, 1E, and 1F, the abnormal images captured by a camera are, respectively, an opaque stain image, a raindrop image, a transparent stain image, an insect-fluid stain image, an overexposed image, and an underexposed image. How to detect a camera abnormality in real time and give an alarm has therefore become a technical problem that urgently needs to be solved.
Disclosure of Invention
In view of the above problems, the present invention provides a camera abnormality detection method and device, which detect in real time, by a computer vision method, whether an abnormality occurs in a camera and give an alarm when it does.
In one aspect, an embodiment of the present invention provides a camera abnormality detection method, the method comprising:
training a neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type to which each pixel in an image belongs, the types comprising abnormal types and normal types;
inputting an image captured by a camera into the semantic segmentation model to obtain feature information corresponding to the image;
and judging whether the camera is abnormal according to the feature information corresponding to the image, and giving an alarm when an abnormality is determined.
An embodiment of the present invention further provides a camera abnormality detection device, comprising:
a training unit, configured to train a neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type to which each pixel in an image belongs, the types comprising abnormal types and normal types;
an information acquisition unit, configured to input an image captured by a camera into the semantic segmentation model to obtain feature information corresponding to the image;
and an abnormality detection unit, configured to judge whether the camera is abnormal according to the feature information corresponding to the image and give an alarm when an abnormality is determined.
According to the embodiments of the invention, on the one hand, whether the camera is abnormal can be detected in real time and an alarm given when it is, so that an operator can learn of the abnormality and handle it in time; on the other hand, training the neural network with both the abnormal image set and the normal image set improves the robustness and accuracy of the trained semantic segmentation model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
FIGS. 1A to 1F are abnormal images captured by a camera;
FIG. 2 is a flowchart of a camera abnormality detection method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a semantic segmentation model obtained by training a neural network according to an embodiment of the present invention;
FIG. 4 is a second schematic diagram of a semantic segmentation model obtained by training a neural network according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating an embodiment of determining whether a camera is abnormal according to feature information corresponding to consecutive multi-frame images;
FIGS. 6A-6C are process diagrams of an embodiment of the method of FIG. 5;
FIGS. 7A-7C are process diagrams of another embodiment of the method of FIG. 5;
FIG. 8 is a second schematic diagram illustrating a determination of whether a camera is abnormal according to feature information corresponding to consecutive multi-frame images according to an embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating an embodiment of determining whether a camera is abnormal according to feature information corresponding to a frame of image;
FIG. 10 is a second schematic diagram illustrating a method for determining whether a camera is abnormal according to feature information corresponding to a frame of image according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a camera abnormality detection device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The camera abnormality detection method provided by the embodiments of the invention is mainly applicable to artificial-intelligence devices such as autonomous vehicles, advanced driver-assistance vehicles, and robots.
Example one
Referring to fig. 2, a flowchart of a camera anomaly detection method according to an embodiment of the present invention is shown, where the method includes:
Step 201, training the neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type to which each pixel in the image belongs, where the types include abnormal types and normal types.
In a first example, the abnormal image set includes transparent stain images and opaque stain images, the normal type is no stain, and the abnormal type is stain.
In a second example, the abnormal image set includes transparent stain images and opaque stain images, the normal type is no stain, and the abnormal types include transparent stain and opaque stain.
In a third example, the abnormal image set includes overexposed images and underexposed images, the abnormal types include overexposure and underexposure, and the normal type is normal exposure.
In a fourth example, the abnormal image set includes transparent stain images, opaque stain images, overexposed images, and underexposed images; the normal types include normal exposure and no stain, and the abnormal types include stain, overexposure, and underexposure.
In a fifth example, the abnormal image set includes transparent stain images, opaque stain images, overexposed images, and underexposed images; the normal types include normal exposure and no stain, and the abnormal types include transparent stain, opaque stain, overexposure, and underexposure.
In the embodiments of the present invention, the abnormal image set, the normal image set, the abnormal types, and the normal types are not limited to the first to fifth examples, which are not an exhaustive list.
In the embodiments of the invention, the abnormal image set may be collected in advance using cameras of the same model as the camera to be monitored (i.e., the same model as the camera whose abnormality is to be judged) whose lenses carry different contaminants. The normal image set may be collected in advance using cameras of the same model whose lenses carry no contaminants. Details are not described in this application.
The neural network used to train the semantic segmentation model may be any neural network model with a semantic segmentation capability; this application imposes no strict limitation.
Step 202, inputting the image captured by the camera into the semantic segmentation model to obtain the feature information corresponding to the image.
Step 203, judging whether the camera is abnormal according to the feature information corresponding to the image, and giving an alarm when an abnormality is determined.
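As an illustration only (not part of the claimed method), the following minimal sketch shows how steps 201 to 203 could be wired together, assuming a PyTorch-style segmentation network; the class name AnomalySegNet, the toy architecture, and the pixel threshold are placeholders introduced here and do not come from the patent.

```python
# Illustrative sketch only: wires steps 201-203 together under assumed
# PyTorch-style interfaces; the architecture and threshold are placeholders.
import torch
import torch.nn as nn

class AnomalySegNet(nn.Module):
    """Toy stand-in for any semantic segmentation network (step 201)."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_classes, 1),            # per-pixel class scores
        )

    def forward(self, x):
        return self.backbone(x)                       # (N, C, H, W) logits

def detect_anomaly(model, image, pixel_threshold):
    """Steps 202-203: per-pixel types from the softmax layer, then a count test."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))             # (1, C, H, W)
        types = logits.softmax(dim=1).argmax(dim=1)    # type to which each pixel belongs
    abnormal_pixels = (types != 0).sum().item()        # convention here: class 0 = normal
    return abnormal_pixels >= pixel_threshold          # True -> give an alarm

if __name__ == "__main__":
    net = AnomalySegNet(num_classes=2)
    frame = torch.rand(3, 64, 64)                      # stand-in for a captured frame
    print("camera abnormal:", detect_anomaly(net, frame, pixel_threshold=500))
```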
In one embodiment, the foregoing step 201 may be implemented specifically by the process shown in fig. 3, which includes:
step 201A, training the neural network by adopting an abnormal image set to obtain a primary semantic segmentation model.
Step 201B, inputting the normal images in the normal image set into the primary semantic segmentation model, outputting the semantic segmentation results corresponding to the normal images, and adding the normal images with wrong semantic segmentation results into the abnormal image set to obtain a new abnormal image set.
Compared with abnormal images, normal images are easier to collect; therefore, to improve the training efficiency of the semantic segmentation model, only the normal images that are mis-segmented by the primary semantic segmentation model are added to the abnormal image set.
Step 201C, training the primary semantic segmentation model by adopting a new abnormal image set to obtain a semantic segmentation model.
In the embodiment of the invention, after the primary semantic segmentation model is obtained by training on the abnormal image set, it is tested with the normal image set; the mis-detected normal images (i.e., positive samples) are selected, each pixel of these images is labeled with a normal-type label, and the images are added to the abnormal image set. The primary semantic segmentation model is then trained on the new abnormal image set to obtain the semantic segmentation model. The purpose of this training is to improve the robustness of the semantic segmentation model and reduce its false-detection rate by adding mis-detected positive samples.
Preferably, in order to further improve the robustness and accuracy of the trained semantic segmentation model, a person skilled in the art may, based on the teaching of the flow shown in fig. 3, also implement step 201 by the alternative shown in fig. 4, which includes:
Step 201A', training the neural network with the abnormal image set to obtain a primary semantic segmentation model.
Step 201B', inputting the normal images in the normal image set into the primary semantic segmentation model and outputting the semantic segmentation results corresponding to the normal images.
Step 201C', determining the number of normal images whose semantic segmentation is wrong, and judging from this number whether a preset iteration-stopping condition is met; if not, executing step 201D', and if so, executing step 201F'.
In the embodiment of the invention, the iteration-stopping condition may be that this number is smaller than a preset number threshold, or that the ratio of this number to the total number of normal images input into the primary semantic segmentation model is smaller than a preset proportion threshold. This application imposes no strict limitation; a person skilled in the art can set the condition flexibly according to actual requirements.
Step 201D', adding the mis-segmented normal images to the abnormal image set to obtain a new abnormal image set, and training the primary semantic segmentation model on the new abnormal image set to obtain an intermediate semantic segmentation model.
Step 201E', taking the intermediate semantic segmentation model as the primary semantic segmentation model, and returning to step 201B'.
Step 201F', determining the primary semantic segmentation model as the semantic segmentation model.
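To make the iterative flow of fig. 4 easier to follow, here is a brief sketch under stated assumptions: train_model, segment, and is_wrong are hypothetical helper functions (train or fine-tune a network on an image set, run inference on one image, and check a result against the all-normal ground truth), and the stop ratio is a placeholder; none of these names come from the patent.

```python
# Sketch of the iterative retraining loop of Fig. 4 (steps 201A' to 201F').
# train_model, segment and is_wrong are assumed helpers, not part of the patent.
def refine_segmentation_model(neural_net, abnormal_set, normal_set,
                              train_model, segment, is_wrong, max_ratio=0.01):
    # Step 201A': train on the abnormal image set to obtain a primary model.
    model = train_model(neural_net, abnormal_set)
    while True:
        # Step 201B': run every normal image through the current model.
        wrong = [img for img in normal_set if is_wrong(segment(model, img))]
        # Step 201C': stop when the error ratio meets the preset condition.
        if len(wrong) / max(len(normal_set), 1) < max_ratio:
            return model                               # step 201F'
        # Step 201D': fold the mis-segmented positives back into the training set.
        abnormal_set = abnormal_set + wrong
        # Step 201E': retrain and loop with the intermediate model as the new primary model.
        model = train_model(model, abnormal_set)
```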
In the embodiment of the present invention, steps 202 to 203 can be implemented in, but not limited to, either of the following two modes (mode 1 and mode 2):
mode 1
Step 202, inputting consecutive multi-frame images captured by the camera into the semantic segmentation model to obtain the feature information corresponding to each frame of image;
Step 203, judging whether the camera is abnormal according to the feature information corresponding to the multiple frames of images, and giving an alarm when an abnormality is determined.
In practical applications, if the camera is abnormal, all images captured over a period of time will be abnormal; therefore, mode 1 combines the feature information of consecutive multi-frame images, which makes the judgment of whether the camera is abnormal more accurate.
In one embodiment, the feature information is the type to which each pixel belongs as output by the softmax layer of the semantic segmentation model; that is, after an image is input into the semantic segmentation model, the softmax layer outputs the type to which each pixel belongs as the feature information. In mode 1, judging in step 203 whether the camera is abnormal according to the feature information corresponding to the multiple frames of images may be implemented as follows: the pixels at the same position in the multiple frames are merged into one virtual pixel, the type of the virtual pixel is determined from the types of those pixels, and the virtual pixel is taken as the pixel at the corresponding position in a semantic segmentation result map, so as to obtain the semantic segmentation result map; when the number of pixels of abnormal type contained in the semantic segmentation result map is greater than or equal to a preset number threshold, the camera is determined to be abnormal. The number threshold is determined from empirical values, for example according to the contaminant causing the abnormal type: in experiments, contaminants of different sizes produce stained areas of different sizes on the camera lens, the degree to which images captured under each stained area affect the decisions of the decision control unit is evaluated, and the number threshold is determined from the smallest stained area whose influence reaches a preset degree. As shown in fig. 5, after N consecutive frames P1, P2, …, PN are input into the semantic segmentation model, the corresponding N semantic segmentation results G1, G2, …, GN are output from the softmax layer of the semantic segmentation model, and the semantic segmentation result map is obtained from the semantic segmentation results G1, G2, …, GN. Two specific examples are described below.
Example 1: assume the abnormal image set includes transparent stain images and opaque stain images, the normal type is no stain, and the abnormal type is stain; the semantic segmentation model assigns a category label to each pixel of the input image, for example label 0 if the pixel's type is no stain and label 1 if its type is stain. Take four 4 × 4 images captured consecutively by the camera, denoted P1, P2, P3, and P4, as an example; as shown in fig. 6A, the pixels in P1, P2, P3, and P4 are numbered x1, x2, …, x16 from left to right and top to bottom. P1, P2, P3, and P4 are input into the semantic segmentation model to obtain the semantic segmentation results shown in fig. 6B, and the pixels at the same position (i.e., with the same number) in P1, P2, P3, and P4 are merged into one virtual pixel, giving virtual pixels x1', x2', …, x16'. For each virtual pixel, its type is determined from the types of the four pixels merged into it. In a first manner, the number N1 of pixels of type 0 and the number N2 of pixels of type 1 among the four pixels are counted, and the type whose count exceeds a preset value is taken as the type of the virtual pixel (the preset value is generally set to 1/2 of N for N consecutive frames; for example, with 10 frames, when more than 5 of the 10 x1 pixels are of type 0, the corresponding virtual pixel is determined to be of type 0). In a second manner, the type with the larger of N1 and N2 is taken as the type of the virtual pixel. Taking pixel x1' as an example, the four pixels corresponding to x1' belong to types 0, 0, 0, and 1, respectively, so pixel x1' belongs to type 0 (i.e., no stain). The semantic segmentation result map shown in fig. 6C is thus obtained.
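The per-pixel vote of Example 1 can be written compactly; the NumPy sketch below assumes the per-frame label maps have already been produced from the softmax layer, and it also covers the multi-class vote of Example 2. The random inputs and the pixel threshold are placeholders.

```python
import numpy as np

def fuse_label_maps(label_maps, num_classes):
    """label_maps: N arrays of shape (H, W) with per-pixel class labels for
    N consecutive frames. Returns an (H, W) semantic segmentation result map
    where each virtual pixel takes the most frequent class at that position."""
    stacked = np.stack(label_maps)                           # (N, H, W)
    one_hot = np.eye(num_classes, dtype=np.int64)[stacked]   # (N, H, W, C)
    votes = one_hot.sum(axis=0)                              # (H, W, C) per-class counts
    return votes.argmax(axis=-1)                             # majority class wins

def camera_abnormal(result_map, pixel_threshold):
    """Alarm rule: enough pixels of abnormal (non-zero) type in the fused map."""
    return int((result_map != 0).sum()) >= pixel_threshold

# Example: four 4x4 binary label maps, analogous to Fig. 6A-6C (values are made up).
maps = [np.random.randint(0, 2, (4, 4)) for _ in range(4)]
fused = fuse_label_maps(maps, num_classes=2)
print(fused)
print("camera abnormal:", camera_abnormal(fused, pixel_threshold=4))
```

On a tie, argmax falls back to the lowest label, i.e. the normal class when label 0 denotes no stain, which is consistent with the "more than half of N" rule of the first manner.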
Example 2: assume the abnormal image set includes transparent stain images, opaque stain images, underexposed images, and overexposed images, the normal types are no stain and normal exposure, and the abnormal types are stain, overexposure, and underexposure; the semantic segmentation model assigns a category label to each pixel of the input image, for example label 0 for no stain, label 1 for stain, label 2 for overexposure, and label 3 for underexposure. Take six 4 × 4 images captured consecutively by the camera, denoted P1, P2, P3, P4, P5, and P6, as an example; as shown in fig. 7A, the pixels in P1 to P6 are numbered x1, x2, …, x16 from left to right and top to bottom. P1 to P6 are input into the semantic segmentation model to obtain the semantic segmentation results shown in fig. 7B, and the pixels at the same position (i.e., with the same number) in P1 to P6 are merged into one virtual pixel, giving virtual pixels x1', x2', …, x16'. For each virtual pixel, its type is determined from the types of the six pixels merged into it: for example, the number N1 of pixels of type 0, the number N2 of type 1, the number N3 of type 2, and the number N4 of type 3 among the six pixels are counted, and the type with the largest of N1, N2, N3, and N4 is taken as the type of the virtual pixel. Taking pixel x1' as an example, the six pixels corresponding to x1' take types among 0, 1, 2, and 3 as shown in fig. 7B, with type 1 occurring most often, so pixel x1' belongs to type 1 (i.e., stain). The semantic segmentation result map shown in fig. 7C is thus obtained.
In another embodiment, the feature information is the feature map output by the neural network convolution layer immediately before the softmax layer in the semantic segmentation model; that is, after an image is input into the semantic segmentation model, the feature map output by the convolution layer one layer before the softmax layer is used as the feature information. In mode 1, judging in step 203 whether the camera is abnormal according to the feature information corresponding to the multiple frames of images may be implemented as follows: the feature maps corresponding to the multiple frames are merged, and the merged feature maps are input into a preset first post-processing model to obtain the result of whether the camera is abnormal. In the embodiment of the invention, after the semantic segmentation model is obtained by training on the abnormal image set and the normal image set, a neural network model is trained on sample images (which may or may not be images from the abnormal image set; this application imposes no strict limitation) together with the trained semantic segmentation model to obtain the first post-processing model; that is, the semantic segmentation model remains unchanged during this training. Alternatively, the neural network model used to train the semantic segmentation model and the neural network model used to train the first post-processing model may be trained jointly on the abnormal image set and the normal image set to obtain the semantic segmentation model and the first post-processing model. The training method may be an existing neural network training method, such as gradient descent, and is not limited by this application. The neural network model used to train the post-processing model may be a fully connected network or a convolutional network; this application imposes no strict limitation. Specifically, as shown in fig. 8, after the multiple frames P1, P2, …, PN are input into the semantic segmentation model, the N feature maps T1, T2, …, TN corresponding to the frames are output from the convolution layer one layer before the softmax layer, merged into one three-dimensional block of feature information in a preset order, and input into the first post-processing model to obtain the result of whether the camera is abnormal. Of course, a person skilled in the art may also use the feature map output by any convolution layer before the softmax layer in the semantic segmentation model as the feature information in the embodiment of the invention and obtain the result in the manner shown in fig. 8; this application imposes no limitation.
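As a rough illustration of the fig. 8 path (feature maps merged in capture order and passed to the first post-processing model), one possible PyTorch sketch follows; the channel count, frame count, and the small post-processing network are assumptions chosen only to make the example self-contained, not details from the patent.

```python
# Sketch of the Fig. 8 path: feature maps from the layer before softmax are
# concatenated in a preset (temporal) order and fed to a small post-processing
# network that outputs a normal/abnormal decision. All sizes are placeholders.
import torch
import torch.nn as nn

class FirstPostProcessor(nn.Module):
    def __init__(self, in_channels, num_frames):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels * num_frames, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2),                  # logits: [normal, abnormal]
        )

    def forward(self, feature_maps):
        # feature_maps: list of (C, H, W) tensors, one per frame, in capture order.
        merged = torch.cat(feature_maps, dim=0).unsqueeze(0)   # (1, N*C, H, W)
        return self.net(merged)

post = FirstPostProcessor(in_channels=64, num_frames=5)
feats = [torch.rand(64, 32, 32) for _ in range(5)]             # T1..TN stand-ins
decision = post(feats).softmax(dim=1).argmax(dim=1)            # 0 normal, 1 abnormal
print("camera abnormal:", bool(decision.item()))
```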
The preset order may be the chronological order in which the multiple frames of images were captured; it is the same order as the one used when the post-processing model was trained in advance.
Mode 2
Step 202, inputting an image captured by the camera into the semantic segmentation model to obtain the feature information corresponding to the image;
Step 203, judging whether the camera is abnormal according to the feature information corresponding to the image.
Mode 2 performs a camera abnormality judgment each time a single frame is captured, whereas mode 1 performs the judgment only after consecutive multiple frames have been captured; compared with mode 1, mode 2 therefore offers better real-time performance. A person skilled in the art can choose mode 1 or mode 2 according to actual requirements; this application imposes no strict limitation.
In one embodiment, the feature information is the type to which each pixel belongs as output by the softmax layer of the semantic segmentation model; that is, after an image is input into the semantic segmentation model, the softmax layer outputs the type to which each pixel belongs as the feature information. In mode 2, judging in step 203 whether the camera is abnormal according to the feature information corresponding to the image includes: judging whether the number of pixels of abnormal type in the image is greater than or equal to a preset number threshold; if so, the camera is determined to be abnormal, otherwise the camera is determined to be normal. The number threshold is determined empirically, for example according to the contaminant causing the abnormal type: in experiments, contaminants of different sizes produce stained areas of different sizes on the camera lens, the degree to which images captured under each stained area affect the decisions of the decision control unit is evaluated, and the number threshold is determined from the smallest stained area whose influence reaches a preset degree. As shown in fig. 9, after an image P is input into the semantic segmentation model, the semantic segmentation result G corresponding to P is output from the softmax layer of the semantic segmentation model, and the result of whether the camera is abnormal is obtained from the semantic segmentation result G.
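A small sketch of this single-frame rule follows; the stained-area fraction used to derive the pixel threshold is a placeholder standing in for the empirically determined value described above, not a number given in the patent.

```python
# Tiny sketch of the mode 2 single-frame rule; the threshold is assumed to be
# derived offline from the smallest stained area that noticeably affects the
# decision control unit (the fraction below is a placeholder).
import numpy as np

def pixel_threshold(height, width, min_stain_fraction=0.02):
    return int(min_stain_fraction * height * width)

def frame_abnormal(label_map, threshold):
    """label_map: (H, W) per-pixel types from the softmax layer; 0 = normal."""
    return int((label_map != 0).sum()) >= threshold

label_map = np.zeros((480, 640), dtype=np.int64)
label_map[:40, :200] = 1                        # pretend a stained patch was segmented
print("camera abnormal:", frame_abnormal(label_map, pixel_threshold(480, 640)))
```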
In another embodiment, the feature information is the feature map output by the neural network convolution layer immediately before the softmax layer in the semantic segmentation model; that is, after an image is input into the semantic segmentation model, the feature map output by the convolution layer one layer before the softmax layer is used as the feature information. In mode 2, judging in step 203 whether the camera is abnormal according to the feature information corresponding to the image specifically includes: inputting the feature map corresponding to the image into a preset second post-processing model to obtain the result of whether the camera is abnormal. In the embodiment of the invention, after the semantic segmentation model is obtained by training on the abnormal image set and the normal image set, a neural network model is trained on sample images (which may or may not be images from the abnormal image set; this application imposes no strict limitation) together with the trained semantic segmentation model to obtain the second post-processing model; that is, the semantic segmentation model remains unchanged during this training. Alternatively, the neural network model used to train the semantic segmentation model and the neural network model used to train the second post-processing model may be trained jointly on the abnormal image set and the normal image set to obtain the semantic segmentation model and the second post-processing model. The training method may be an existing neural network training method, such as gradient descent, and is not limited by this application. The neural network model used to train the post-processing model may be a fully connected network or a convolutional network; this application imposes no strict limitation. Specifically, as shown in fig. 10, after an image P is input into the semantic segmentation model, the corresponding feature map T is output from the convolution layer one layer before the softmax layer and input into the second post-processing model to obtain the result of whether the camera is abnormal. Of course, a person skilled in the art may also use the feature map output by any convolution layer before the softmax layer in the semantic segmentation model as the feature information in the embodiment of the invention; this application imposes no limitation.
On the one hand, the camera abnormality detection method provided by the embodiment of the invention can detect in real time whether the camera is abnormal and give an alarm when it is, so that an operator can learn of the abnormality and handle it in time; on the other hand, training the neural network with both the abnormal image set and the normal image set improves the robustness and accuracy of the trained semantic segmentation model.
Example two
Based on the same concept as the camera abnormality detection method of the first embodiment, the second embodiment of the present invention provides a camera abnormality detection device whose structure is shown in fig. 11; the device comprises a training unit 11, an information acquisition unit 12, and an abnormality detection unit 13, wherein:
the training unit 11 is configured to train the neural network according to a preset abnormal image set and a preset normal image set, to obtain a semantic segmentation model for determining a type to which each pixel in the image belongs, where the type includes an abnormal type and a normal type.
The specific content of the abnormal image set, the normal image set, the abnormal types, and the normal types may be as in the first to fifth examples of the first embodiment and is not repeated here.
An information obtaining unit 12, configured to input an image captured by a camera into the semantic segmentation model, so as to obtain feature information corresponding to the image.
And the abnormality detection unit 13 is used for judging whether the camera is abnormal according to the characteristic information corresponding to the image and giving an alarm when the abnormality is determined.
The foregoing information acquisition unit 12 and the abnormality detection unit 13 can be implemented in, but not limited to, the following two ways:
mode 1
The information obtaining unit 12 is specifically configured to: inputting continuous multi-frame images shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to each frame of image;
the abnormality detection unit 13 is specifically configured to: and judging whether the camera is abnormal or not according to the characteristic information corresponding to the multi-frame images, and giving an alarm when the abnormality is determined.
In one example, the feature information is the type to which each pixel belongs as output by the softmax layer in the semantic segmentation model; the abnormality detection unit 13 judging whether the camera is abnormal according to the feature information corresponding to the multiple frames of images specifically includes: merging the pixels at the same position in the multiple frames into one virtual pixel, determining the type of the virtual pixel according to the types of those pixels, and taking the virtual pixel as the pixel at the corresponding position in a semantic segmentation result map, so as to obtain the semantic segmentation result map; and determining that the camera is abnormal when the number of pixels of abnormal type contained in the semantic segmentation result map is greater than or equal to a preset number threshold.
In another example, the feature information is the feature map output by the neural network convolution layer immediately before the softmax layer in the semantic segmentation model; the abnormality detection unit 13 judging whether the camera is abnormal according to the feature information corresponding to the multiple frames of images specifically includes: merging the feature maps corresponding to the multiple frames and inputting the merged feature maps into a preset first post-processing model to obtain the result of whether the camera is abnormal; the first post-processing model is obtained by training a neural network model in advance using sample images and the semantic segmentation model.
Mode 2
The information obtaining unit 12 is specifically configured to: inputting an image shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to the image;
the abnormality detection unit 13 is specifically configured to: and judging whether the camera is abnormal according to the characteristic information corresponding to the image, and giving an alarm when the abnormality is determined.
In one example, the feature information is a type to which each pixel output by a softmax layer in the semantic segmentation model belongs; the abnormality detecting unit 13 determines whether the camera is abnormal according to the feature information corresponding to the image, and specifically includes: and judging whether the number of the pixels of which the types are abnormal in the image is greater than or equal to a preset number threshold, if so, determining that the camera is abnormal, and otherwise, determining that the camera is normal.
In another example, the feature information is a feature map output by a neural network convolution layer previous to the softmax layer in the semantic segmentation model; the abnormality detecting unit 13 determines whether the camera is abnormal according to the feature information corresponding to the image, and specifically includes: inputting the characteristic diagram corresponding to the image into a preset second post-processing model to obtain a result of whether the camera is abnormal or not; and the second post-processing model is obtained by adopting a sample image and the semantic segmentation model to train a neural network model in advance.
The training unit 11 is specifically configured to: training the neural network by adopting an abnormal image set to obtain a primary semantic segmentation model; inputting normal images in a normal image set into the primary semantic segmentation model, outputting semantic segmentation results corresponding to the normal images, and adding normal images with wrong semantic segmentation results into the abnormal image set to obtain a new abnormal image set; and training the primary semantic segmentation model by adopting a new abnormal image set to obtain a semantic segmentation model.
The foregoing training unit 11 may also be implemented by the process shown in fig. 4, which is not described herein again.
The technical solution of the invention provides the following technical effects:
the method has the technical effects that after the neural network model is trained through the preset abnormal image set to obtain the initial semantic segmentation model, the initial semantic segmentation model is tested through the normal image set, the normal image (namely, the positive sample) with the detection error is added into the abnormal image set to obtain a new abnormal image set, and then the initial semantic segmentation model is trained based on the new abnormal image set. According to the embodiment of the invention, the normal image with error detection in the normal image set is automatically added into the abnormal image set, and the normal image with error detection has typicality reflecting the detection error of the initial semantic segmentation model, so that the normal image with error detection can form a good training positive sample, the training positive sample is added into the abnormal image set, so that the abnormal image set can be automatically expanded, the labor cost is reduced, and the semantic segmentation model trained on the basis of the new abnormal image set has stronger robustness and lower error detection rate.
Technical effect 2: in practical applications, if the camera is abnormal, all images captured over a period of time will be abnormal; judging whether the camera is abnormal by combining the feature information of consecutive multi-frame images is therefore more accurate.
The foregoing is the core idea of the present invention, and in order to make the technical solutions in the embodiments of the present invention better understood and make the above objects, features and advantages of the embodiments of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention are further described in detail with reference to the accompanying drawings.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (18)

1. A camera abnormality detection method characterized by comprising:
training a neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type of each pixel in an image, wherein the type comprises an abnormal type and a normal type;
inputting an image shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to the image;
and judging whether the camera is abnormal according to the characteristic information corresponding to the image, and giving an alarm when the abnormality is determined.
2. The method according to claim 1, wherein the image captured by the camera is input into the semantic segmentation model to obtain feature information corresponding to the image, specifically: inputting continuous multi-frame images shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to each frame of image;
judging whether the camera is abnormal according to the characteristic information corresponding to the image, specifically: and judging whether the camera is abnormal or not according to the characteristic information corresponding to the multi-frame image.
3. The method according to claim 2, wherein the feature information is a type to which each pixel output by a softmax layer in the semantic segmentation model belongs;
judging whether the camera is abnormal according to the characteristic information corresponding to the multi-frame images, specifically comprising:
merging the pixels at the same position in the multiple frames of images into one virtual pixel, determining the type of the virtual pixel according to the types of the plurality of pixels, and taking the virtual pixel as the pixel at the corresponding position in a semantic segmentation result map, so as to obtain the semantic segmentation result map;
and when the number of pixels of abnormal type contained in the semantic segmentation result map is greater than or equal to a preset number threshold, determining that the camera is abnormal.
4. The method according to claim 2, wherein the feature information is a feature map output by a neural network convolution layer previous to a softmax layer in the semantic segmentation model;
judging whether the camera is abnormal according to the characteristic information corresponding to the multi-frame images, specifically comprising:
merging the characteristic graphs corresponding to the multiple frames of images, and inputting the merged characteristic graphs into a preset first post-processing model to obtain a result of whether the camera is abnormal; the first post-processing model is obtained by adopting a sample image and the semantic segmentation model to train a neural network model in advance.
5. The method according to claim 1, wherein the image captured by the camera is input into the semantic segmentation model to obtain feature information corresponding to the image, specifically: inputting an image shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to the image;
judging whether the camera is abnormal according to the characteristic information corresponding to the image, specifically: and judging whether the camera is abnormal or not according to the characteristic information corresponding to the image.
6. The method according to claim 5, wherein the feature information is a type to which each pixel output by a softmax layer in the semantic segmentation model belongs;
judging whether the camera is abnormal according to the characteristic information corresponding to the image, which specifically comprises the following steps:
and judging whether the number of the pixels of which the types are abnormal in the image is greater than or equal to a preset number threshold, if so, determining that the camera is abnormal, and otherwise, determining that the camera is normal.
7. The method according to claim 5, wherein the feature information is a feature map output by a neural network convolution layer previous to a softmax layer in the semantic segmentation model;
judging whether the camera is abnormal according to the characteristic information corresponding to the image, which specifically comprises the following steps:
inputting the characteristic diagram corresponding to the image into a preset second post-processing model to obtain a result of whether the camera is abnormal or not; and the second post-processing model is obtained by adopting a sample image and the semantic segmentation model to train a neural network model in advance.
8. The method according to any one of claims 1 to 7, wherein the training of the neural network according to the preset abnormal image set and the preset normal image set to obtain a semantic segmentation model for determining the type to which each pixel in the image belongs specifically comprises:
training the neural network by adopting an abnormal image set to obtain a primary semantic segmentation model;
inputting normal images in a normal image set into the primary semantic segmentation model, outputting semantic segmentation results corresponding to the normal images, and adding normal images with wrong semantic segmentation results into the abnormal image set to obtain a new abnormal image set;
and training the primary semantic segmentation model by adopting a new abnormal image set to obtain a semantic segmentation model.
9. The method of any of claims 1 to 7, wherein the abnormal image set comprises transparent stain images and opaque stain images, the normal type is no stain, and the abnormal type is stain, or the abnormal types comprise transparent stain and opaque stain;
or the abnormal image set comprises overexposed images and underexposed images, the abnormal types comprise overexposure and underexposure, and the normal type is normal exposure;
or the abnormal image set comprises transparent stain images, opaque stain images, overexposed images and underexposed images, the normal types comprise normal exposure and no stain, and the abnormal types comprise stain, overexposure and underexposure; alternatively, the abnormal types comprise transparent stain, opaque stain, overexposure and underexposure.
10. A camera abnormality detection device characterized by comprising:
the training unit is used for training the neural network according to a preset abnormal image set and a preset normal image set to obtain a semantic segmentation model for determining the type of each pixel in the image, wherein the type comprises an abnormal type and a normal type;
the information acquisition unit is used for inputting the image shot by the camera into the semantic segmentation model to obtain the characteristic information corresponding to the image;
and the abnormality detection unit is used for judging whether the camera is abnormal according to the characteristic information corresponding to the image and giving an alarm when the abnormality is determined.
11. The apparatus according to claim 10, wherein the information obtaining unit is specifically configured to: inputting continuous multi-frame images shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to each frame of image;
the abnormality detection unit is specifically configured to: and judging whether the camera is abnormal or not according to the characteristic information corresponding to the multi-frame images, and giving an alarm when the abnormality is determined.
12. The apparatus according to claim 11, wherein the feature information is a type to which each pixel output by a softmax layer in the semantic segmentation model belongs;
the abnormality detection unit judges whether the camera is abnormal according to the feature information corresponding to the multi-frame image, and specifically includes:
merging the pixels at the same position in the multiple frames of images into one virtual pixel, determining the type of the virtual pixel according to the types of the plurality of pixels, and taking the virtual pixel as the pixel at the corresponding position in a semantic segmentation result map, so as to obtain the semantic segmentation result map;
and when the number of pixels of abnormal type contained in the semantic segmentation result map is greater than or equal to a preset number threshold, determining that the camera is abnormal.
13. The apparatus according to claim 11, wherein the feature information is a feature map output by a neural network convolution layer previous to a softmax layer in the semantic segmentation model;
the abnormality detection unit judges whether the camera is abnormal according to the feature information corresponding to the multi-frame image, and specifically includes:
merging the characteristic graphs corresponding to the multiple frames of images, and inputting the merged characteristic graphs into a preset first post-processing model to obtain a result of whether the camera is abnormal; the first post-processing model is obtained by adopting a sample image and the semantic segmentation model to train a neural network model in advance.
14. The apparatus according to claim 10, wherein the information obtaining unit is specifically configured to: inputting an image shot by a camera into the semantic segmentation model to obtain characteristic information corresponding to the image;
the abnormality detection unit is specifically configured to: and judging whether the camera is abnormal according to the characteristic information corresponding to the image, and giving an alarm when the abnormality is determined.
15. The apparatus according to claim 14, wherein the feature information is a type to which each pixel output by a softmax layer in the semantic segmentation model belongs;
the abnormality detection unit judges whether the camera is abnormal according to the feature information corresponding to the image, and specifically includes:
and judging whether the number of the pixels of which the types are abnormal in the image is greater than or equal to a preset number threshold, if so, determining that the camera is abnormal, and otherwise, determining that the camera is normal.
16. The apparatus according to claim 14, wherein the feature information is a feature map output by a neural network convolution layer previous to a softmax layer in the semantic segmentation model;
the abnormality detection unit judges whether the camera is abnormal according to the feature information corresponding to the image, and specifically includes:
inputting the characteristic diagram corresponding to the image into a preset second post-processing model to obtain a result of whether the camera is abnormal or not; and the second post-processing model is obtained by adopting a sample image and the semantic segmentation model to train a neural network model in advance.
17. The device according to any one of claims 10 to 16, wherein the training unit is specifically configured to:
training the neural network by adopting an abnormal image set to obtain a primary semantic segmentation model;
inputting normal images in a normal image set into the primary semantic segmentation model, outputting semantic segmentation results corresponding to the normal images, and adding normal images with wrong semantic segmentation results into the abnormal image set to obtain a new abnormal image set;
and training the primary semantic segmentation model by adopting a new abnormal image set to obtain a semantic segmentation model.
18. The apparatus according to any one of claims 10 to 16, wherein the abnormal image set comprises transparent stain images and opaque stain images, the normal type is no stain, and the abnormal type is stain, or the abnormal types comprise transparent stain and opaque stain;
or the abnormal image set comprises overexposed images and underexposed images, the abnormal types comprise overexposure and underexposure, and the normal type is normal exposure;
or the abnormal image set comprises transparent stain images, opaque stain images, overexposed images and underexposed images, the normal types comprise normal exposure and no stain, and the abnormal types comprise stain, overexposure and underexposure; alternatively, the abnormal types comprise transparent stain, opaque stain, overexposure and underexposure.
CN201710742191.2A 2017-08-25 2017-08-25 Camera abnormality detection method and device Active CN107705334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710742191.2A CN107705334B (en) Camera abnormality detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710742191.2A CN107705334B (en) Camera abnormality detection method and device

Publications (2)

Publication Number Publication Date
CN107705334A CN107705334A (en) 2018-02-16
CN107705334B true CN107705334B (en) 2020-08-25

Family

ID=61170393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710742191.2A Active CN107705334B (en) Camera abnormality detection method and device

Country Status (1)

Country Link
CN (1) CN107705334B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110198471A (en) * 2018-02-27 2019-09-03 北京猎户星空科技有限公司 Abnormality recognition method, device, smart machine and storage medium
CN108629180B (en) * 2018-03-29 2020-12-11 腾讯科技(深圳)有限公司 Abnormal operation determination method and device, storage medium and electronic device
CN109308449B (en) * 2018-08-06 2021-06-18 瑞芯微电子股份有限公司 Foreign matter filtering video coding chip and method based on deep learning
CN109167998A (en) * 2018-11-19 2019-01-08 深兰科技(上海)有限公司 Detect method and device, the electronic equipment, storage medium of camera status
CN111291778B (en) * 2018-12-07 2021-07-06 马上消费金融股份有限公司 Training method of depth classification model, exposure anomaly detection method and device
CN109685131A (en) * 2018-12-20 2019-04-26 斑马网络技术有限公司 Automobile vehicle device system exception recognition methods and device
CN110855976B (en) * 2019-10-08 2022-03-11 南京云计趟信息技术有限公司 Camera abnormity detection method and device and terminal equipment
CN111325715A (en) * 2020-01-21 2020-06-23 上海悦易网络信息技术有限公司 Camera color spot detection method and device
CN114757947B (en) * 2022-06-14 2022-09-27 苏州魔视智能科技有限公司 Method, device and system for detecting fouling of camera lens

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10290106B2 (en) * 2016-02-04 2019-05-14 Nec Corporation Video monitoring using semantic segmentation based on global optimization

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102176758A (en) * 2011-03-07 2011-09-07 北京文安科技发展有限公司 Video quality diagnosis system and realization method thereof
CN103152601A (en) * 2013-03-15 2013-06-12 广州市澳视光电子技术有限公司 Intelligent failure-reporting camera and network management client system thereof
CN104378629A (en) * 2014-12-01 2015-02-25 广州市浩云安防科技股份有限公司 Camera fault detection method
CN105975956A (en) * 2016-05-30 2016-09-28 重庆大学 Infrared-panorama-pick-up-head-based abnormal behavior identification method of elderly people living alone
CN107025457A (en) * 2017-03-29 2017-08-08 腾讯科技(深圳)有限公司 A kind of image processing method and device
CN106851263A (en) * 2017-03-30 2017-06-13 安徽四创电子股份有限公司 Video quality diagnosing method and system based on timing self-learning module

Also Published As

Publication number Publication date
CN107705334A (en) 2018-02-16

Similar Documents

Publication Publication Date Title
CN107705334B (en) Camera abnormality detection method and device
CN108460362B (en) System and method for detecting human body part
CN100373423C (en) Method and system for evaluating moving image quality of displays
CN110650292B (en) Method and device for assisting user in shooting vehicle video
CN107491790B (en) Neural network training method and device
CN109284674A (en) A kind of method and device of determining lane line
US11748894B2 (en) Video stabilization method and apparatus and non-transitory computer-readable medium
CN104867128B (en) Image blurring detection method and device
Yang et al. A robotic system towards concrete structure spalling and crack database
CN109253722A (en) Merge monocular range-measurement system, method, equipment and the storage medium of semantic segmentation
CN114091620B (en) Template matching detection method, computer equipment and storage medium
CN103096117B (en) Video noise detection method and device
Nodado et al. Intelligent traffic light system using computer vision with android monitoring and control
CN112991374A (en) Canny algorithm-based edge enhancement method, device, equipment and storage medium
WO2024061194A1 (en) Sample label acquisition method and lens failure detection model training method
CN116403162B (en) Airport scene target behavior recognition method and system and electronic equipment
CN109960990B (en) Method for evaluating reliability of obstacle detection
CN114219758A (en) Defect detection method, system, electronic device and computer readable storage medium
CN112052883B (en) Clothes detection method, device and storage medium
CN113128499A (en) Vibration testing method of visual imaging device, computer device and storage medium
Pooyoi et al. Snow scene segmentation using cnn-based approach with transfer learning
CN111815705A (en) Laser tracker light filtering protective lens pollution identification method and device and electronic equipment
CN114757947B (en) Method, device and system for detecting fouling of camera lens
CN117079085B (en) Training method of raindrop detection model, vehicle control method, device and medium
CN117152707B (en) Calculation method and device for offset distance of vehicle and processing equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200325

Address after: 101300, No. two, 1 road, Shunyi Park, Zhongguancun science and Technology Park, Beijing, Shunyi District

Applicant after: BEIJING TUSENZHITU TECHNOLOGY Co.,Ltd.

Address before: 101300, No. two, 1 road, Shunyi Park, Zhongguancun science and Technology Park, Beijing, Shunyi District

Applicant before: TuSimple

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant