CN115965899B - Video segmentation-based unmanned sweeping robot anomaly detection method and system - Google Patents

Info

Publication number
CN115965899B
Authority
CN
China
Legal status: Active
Application number
CN202310252874.5A
Other languages
Chinese (zh)
Other versions
CN115965899A (en)
Inventor
徐龙生
孙振行
庞世玺
杨纪冲
Current Assignee
Shandong Kailin Environmental Protection Equipment Co ltd
Original Assignee
Shandong Kailin Environmental Protection Equipment Co ltd
Priority date
Filing date
Publication date
Application filed by Shandong Kailin Environmental Protection Equipment Co ltd filed Critical Shandong Kailin Environmental Protection Equipment Co ltd
Priority to CN202310252874.5A priority Critical patent/CN115965899B/en
Publication of CN115965899A publication Critical patent/CN115965899A/en
Application granted granted Critical
Publication of CN115965899B publication Critical patent/CN115965899B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a video segmentation-based anomaly detection method and system for an unmanned sweeping robot, belonging to the technical field of artificial intelligence and comprising the following steps: acquiring a monitoring video of the unmanned sweeping robot; dividing the monitoring video into frames to obtain a current-time video frame and historical-time video frames; obtaining a current-time instance mask and historical-time instance masks according to the current-time video frame, the historical-time video frames, and a trained video instance segmentation model; obtaining a current-time predicted instance mask according to the historical-time instance masks and a trained generation model; calculating the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time predicted instance mask and the current-time instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the current-time video frame; and judging whether an abnormal event occurs at the current time according to the specific gravity of the current-time video frame. The accuracy of abnormal-event detection is thereby improved.

Description

Video segmentation-based unmanned sweeping robot anomaly detection method and system
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a video segmentation-based anomaly detection method and system for an unmanned sweeping robot.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
During the sweeping process of an unmanned sweeping robot, unexpected conditions inevitably arise around the machine, such as object intrusion and machine stagnation. When the unmanned sweeping robot cannot accurately identify such unexpected abnormal events, the response judgment of the robot is affected, causing accidents such as collisions and road blockage. In existing video anomaly detection technology, a camera is generally used to acquire video around the unmanned sweeping robot, identify moving objects in the surrounding video, and judge whether an anomaly occurs. However, not all moving objects in the video are objects whose anomalies need to be detected; for example, a fountain serving as background or leaves blown by the wind may move without being anomalous. When all moving objects are judged to be abnormal events, the error rate of abnormal-event detection therefore increases.
Disclosure of Invention
In order to solve the above problems, the invention provides a video segmentation-based anomaly detection method and system for an unmanned sweeping robot. A current-time instance mask and a predicted instance mask are acquired; the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time predicted instance mask and the current-time instance mask are calculated; the specific gravity of the current-time video frame is computed from the duty ratio and the intersection ratio; and whether an abnormal event occurs at the current time is judged using the specific gravity of the current-time video frame. In this way, both the distance of an instance object and the abnormal behavior of the instance object can be judged, improving the accuracy of abnormal-event judgment.
In order to achieve the above purpose, the invention adopts the following technical scheme:
in a first aspect, a method for detecting an anomaly of an unmanned sweeping robot based on video segmentation is provided, including:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a current time prediction instance mask according to the historical time instance mask and the trained generation model;
calculating the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time predicted instance mask and the current-time instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the current-time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
Further, the video instance segmentation model detects instance targets from a current time video frame and a historical time video frame, a target detection set comprising a current time video frame target matrix and a historical time video frame target matrix is obtained, the similarity of the two adjacent time video frame target matrices is calculated, an affinity matrix set is obtained, the initial positions of the instance targets in the time video frames are defined through the target detection set, the initial positions of the instance targets are corrected through the affinity matrix set, and pixel areas, in which the similarity of the two adjacent time video frames in the initial positions of the instance targets is smaller than or equal to a set threshold value, are eliminated, so that instance masks of all the time instances are obtained.
Further, the convolutional neural network model is adopted as the generating model.
Further, the penalty functions employed by the generative model include an active range penalty function of the instance, an instance mask prediction error penalty function, and an optical flow penalty function of the instance mask.
Further, the specific gravity of the video frame at the current time is compared with an abnormal-event specific gravity threshold; when the specific gravity of the video frame at the current time is greater than or equal to the abnormal-event specific gravity threshold, it is judged that an abnormal event occurs at the current time; when the specific gravity of the video frame at the current time is smaller than the abnormal-event specific gravity threshold, it is judged that no abnormal event occurs at the current time.
Further, the duty ratio and the intersection ratio are weighted by the following formula to obtain the specific gravity of the video frame at the current time:

P = λ · S + (1 − λ) · (1 − IoU)

where P is the weighted specific gravity, S is the duty ratio of the current-time instance mask in the current-time video frame, IoU is the intersection ratio of the current-time predicted instance mask and the current-time instance mask, and λ is the weight.
Further, when it is judged that an abnormal event occurs at the current time, the abnormal instance is highlighted in the video frame at the current time, and a braking operation is performed on the unmanned sweeping robot.
In a second aspect, an anomaly detection system for an unmanned sweeping robot based on video segmentation is provided, including:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current-time instance mask and a historical-time instance mask according to the current-time video frame, the historical-time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask of the current moment according to the historical moment instance mask and the trained generation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
In a third aspect, an electronic device is provided, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the computer instructions, when executed by the processor, perform the steps of the video segmentation-based anomaly detection method for an unmanned sweeping robot.
In a fourth aspect, a computer-readable storage medium is provided for storing computer instructions that, when executed by a processor, perform the steps of the video segmentation-based anomaly detection method for an unmanned sweeping robot.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention obtains the current-time instance mask and the predicted instance mask, calculates the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time predicted instance mask and the current-time instance mask, computes the specific gravity of the current-time video frame from the duty ratio and the intersection ratio, and judges whether an abnormal event occurs at the current time using the specific gravity of the current-time video frame, thereby judging both the distance of an instance object and its abnormal behavior and improving the accuracy of abnormal-event judgment.
2. the built video instance segmentation model is not trained with background objects such as fountains or wind-blown leaves as instances, so that when the monitoring video is instance-segmented by the trained video instance segmentation model, such background objects are not segmented, reducing the error rate of abnormal-event judgment.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application.
FIG. 1 is a flow chart of the method disclosed in example 1;
fig. 2 is a flowchart of example object segmentation learning of the method disclosed in embodiment 1.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
In this embodiment, a method for detecting an anomaly of an unmanned sweeping robot based on video segmentation is disclosed, as shown in fig. 1 and fig. 2, including:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a current time prediction instance mask according to the historical time instance mask and the trained generation model;
calculating the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time predicted instance mask and the current-time instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the current-time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
The historical moment video frame is a video frame positioned before the current moment and comprises a plurality of video frames.
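The frame-division step above can be sketched as a sliding window over the video, taking each group of five consecutive frames as four historical-time frames plus one current-time frame. This is a minimal illustration; the function name and window handling are assumptions, not from the patent.

```python
# Hypothetical sketch of frame division: a sliding window of 5 consecutive
# frames, the last being the "current-time" frame and the preceding four
# the "historical-time" frames.

def split_frames(frames, window=5):
    """Yield (historical_frames, current_frame) pairs over a frame list."""
    for i in range(window - 1, len(frames)):
        yield frames[i - window + 1:i], frames[i]

video = list(range(7))            # stand-in frame ids 0..6
pairs = list(split_frames(video))
print(pairs[0])                   # → ([0, 1, 2, 3], 4)
```

Each yielded pair feeds one round of instance segmentation and prediction in the method described above.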
The video instance segmentation model detects instance targets from a current moment video frame and a historical moment video frame, obtains a target detection set comprising a current moment video frame target matrix and a historical moment video frame target matrix, calculates the similarity of two adjacent moment video frame target matrices, obtains an affinity matrix set, marks the initial positions of the instance targets in the moment video frames through the target detection set, corrects the initial positions of the instance targets through the affinity matrix set, and excludes pixel areas with the similarity of two adjacent moment video frames in the initial positions of the instance targets being smaller than or equal to a set threshold value, so as to obtain instance masks of each moment.
The video instance segmentation model comprises a target detector and a mask generator. The target detector detects instance targets from the current-time video frame and the historical-time video frames, obtaining a target detection set containing the current-time video frame target matrix and the historical-time video frame target matrices; the similarity of the target matrices of each pair of adjacent-time video frames is calculated to obtain an affinity matrix set. The affinity matrix set and the target detection set are applied to the video frame set simultaneously, and all three are input to the mask generator: the target detection set delineates the initial position of each instance target in the corresponding video frame, and the initial positions are then corrected by the affinity matrices of adjacent-time video frames, excluding pixel areas within the initial positions whose similarity across the two adjacent-time video frames is less than or equal to a set threshold. This yields the instance masks at each time, including the current-time and historical-time instance masks.
The process of obtaining the trained video instance segmentation model is as follows:
i-a) acquiring a training video set comprisingNA bar video; dividing each video into
Figure SMS_6
Zhang Shipin frames, constituting a video frame set->
Figure SMS_7
When (when)K5->
Figure SMS_8
I.e. each video frame set contains 5 consecutive video frames +.>
Figure SMS_9
Is->
Figure SMS_10
Time video frames.
I-b) inputting all video frame sets V into the constructed video instance segmentation model to obtain an instance mask for the video frame at each time, specifically:

Target detection is performed on the five video frames of the video frame set V by the target detector, giving the target detection set B = {B_{t-4}, B_{t-3}, B_{t-2}, B_{t-1}, B_t}, where B_t is the target matrix of the video frame at time t. The similarity of the target matrices of each pair of adjacent-time video frames is calculated to obtain the affinity matrices of adjacent-time video frames, forming the affinity matrix set A = {A_{t-3}, A_{t-2}, A_{t-1}, A_t}, where A_t is the similarity matrix between B_{t-1} and B_t. The affinity matrix set A and the target detection set B are applied to the video frame set V simultaneously, and all three are input to the mask generator. The target detection set B delineates the initial position of each instance target in the corresponding video frame; the initial position is then corrected by the affinity matrix of the two adjacent-time video frames, excluding the pixel areas within the initial position whose similarity across the two adjacent-time video frames is less than or equal to a set threshold (e.g. setting those pixel areas to 0), which gives the exact position of the instance target. The instance targets are output in mask form, giving the instance mask at each time and forming the video instance mask set M = {M_{t-4}, M_{t-3}, M_{t-2}, M_{t-1}, M_t}, where M_t is the instance mask of the video frame at time t. The mask generator is composed of an autoencoder.
In the embodiment, after the initial position of the instance target in the video frame is detected through the target detection set, the initial position of the instance target is corrected through the affinity matrix, and the pixel area which does not belong to the instance target is excluded through the similarity of the video frames at adjacent moments, so that the accuracy of the obtained instance mask is ensured.
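The detection-plus-correction idea described above can be illustrated with a toy sketch. All names are assumptions, and the per-pixel similarity measure (a normalized absolute grayscale difference) is a stand-in, since the patent does not specify how similarity is computed: within an instance's detected bounding box, only pixels that stay sufficiently similar across two adjacent frames are kept in the mask.

```python
# Illustrative sketch (all names assumed) of the mask-correction step:
# inside the detected bounding box, keep only pixels whose appearance is
# sufficiently similar across two adjacent frames; dissimilar pixels are
# excluded, leaving the instance mask. Frames are 2D lists of grayscale
# values in [0, 255].

def corrected_mask(prev_frame, cur_frame, box, sim_threshold=0.8):
    """Return the set of (row, col) pixels inside `box` judged to belong
    to the instance, i.e. whose adjacent-frame similarity exceeds the
    threshold. box = (r0, c0, r1, c1), half-open ranges."""
    r0, c0, r1, c1 = box
    mask = set()
    for r in range(r0, r1):
        for c in range(c0, c1):
            diff = abs(cur_frame[r][c] - prev_frame[r][c]) / 255.0
            if 1.0 - diff > sim_threshold:   # similar enough: keep pixel
                mask.add((r, c))
    return mask

prev = [[10, 10], [200, 10]]
cur  = [[12, 10], [90, 10]]                  # pixel (1, 0) changed a lot
print(corrected_mask(prev, cur, (0, 0, 2, 2)))
```

Pixel (1, 0) falls below the similarity threshold and is excluded from the mask, mirroring the exclusion of low-similarity pixel areas described above.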
I-c) calculating the loss function: loss is calculated between the generated video frame instance masks and the real instance labels, constraining the instance object segmentation model to obtain a more accurate mask.
The loss functions include the size loss function of the instance mask and the error loss function of the instance mask prediction.
For the size of the instance mask in video: the segmentation mask of an instance object in a video frame should lie within the range of its target detection bounding box. The size loss function L_size of the instance mask calculates the size loss of the video instance mask, constraining the size of the instance mask to lie within the target detection range:

L_size = (1/n) Σ_{i=1}^{n} N(Ψ(M_t^i, B_t^i)) / N(M_t^i)

where M_t is the instance mask at time t, B_t is the target matrix of the video frame at time t, Ψ(·,·) is the function computing the matrix of pixels of M_t spilling outside B_t, N(·) is the pixel-counting function of an instance mask, and n is the number of instance target objects in the video frame.
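A hedged sketch of the size-loss idea follows: it counts mask pixels that spill outside the detection bounding box, normalized by mask size and averaged over instances. The exact weighting in the patent is not recoverable from the text, so this illustrates the constraint rather than the published formula; all names are illustrative.

```python
# Sketch of the size loss: the fraction of each instance mask's pixels
# that spill outside its detection bounding box, averaged over instances.
# Masks are sets of (row, col) pixels; boxes are (r0, c0, r1, c1) half-open.

def size_loss(masks, boxes):
    total = 0.0
    for mask, (r0, c0, r1, c1) in zip(masks, boxes):
        spill = {(r, c) for (r, c) in mask
                 if not (r0 <= r < r1 and c0 <= c < c1)}   # pixels outside box
        total += len(spill) / max(len(mask), 1)
    return total / max(len(masks), 1)

m = {(0, 0), (0, 1), (5, 5), (5, 6)}       # 2 of 4 pixels outside the box
print(size_loss([m], [(0, 0, 2, 2)]))      # → 0.5
```

A mask fully inside its box contributes zero loss, which is the constraint the text describes.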
The error loss function L_err of the instance mask prediction calculates the error of the instance mask prediction:

L_err = (1/n) Σ_{j=1}^{n} (1 − N(B_t^j ∩ M_t^j) / N(M_t^j))

where N(·) is the pixel-counting function of an instance mask, n is the number of instance target objects in the video frame, B_t^j is the mask matrix of the j-th instance object in the video frame target matrix B_t at time t, M_t^j is the mask matrix of the j-th instance object in the instance mask M_t at time t, and B_t^j ∩ M_t^j is the intersection matrix of B_t^j and M_t^j.
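The intersection-based error term can be sketched as follows. This is a reconstructed form, not verbatim from the patent: the fraction of each instance's mask pixels not confirmed by the reference mask, averaged over instances; all names are illustrative.

```python
# Sketch of the mask error loss: 1 minus the fraction of generated-mask
# pixels that intersect the reference mask, averaged over instances.
# Masks are sets of (row, col) pixels.

def mask_error_loss(ref_masks, gen_masks):
    """ref_masks, gen_masks: lists of pixel sets, one per instance object."""
    n = max(len(gen_masks), 1)
    loss = 0.0
    for ref, gen in zip(ref_masks, gen_masks):
        inter = len(ref & gen)                   # N(B ∩ M)
        loss += 1.0 - inter / max(len(gen), 1)   # 1 - N(B ∩ M) / N(M)
    return loss / n

ref = [{(0, 0), (0, 1), (1, 1)}]
gen = [{(0, 1), (1, 1), (2, 2)}]                 # 2 of 3 pixels match
print(mask_error_loss(ref, gen))
```

The loss is zero when every generated mask pixel lies inside the reference mask, matching the constraint the text describes.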
I-d) optimizing the video instance segmentation model with the loss function L = L_size + L_err to obtain an optimized instance object segmentation model.
I-e) repeating steps I-b) to I-d) for a set number of training iterations to obtain the iterated model, which is the trained video instance segmentation model; in practice the number of training iterations may be 3000.
I-f) inputting all videos of the training video set into the model iterated in step I-e) to obtain a sequence set of video instance masks.
The generating model is constructed by adopting a convolutional neural network.
The specific process for obtaining the trained generation model comprises the following steps:
II-a) The sequence set of video instance masks obtained by training the video instance segmentation model is used as the training set of the generation model. The sequence set contains the same number of videos as used in training the video instance segmentation model, N pieces of video data in total; from each video, sets of 5 consecutive video instance masks are taken one by one, M = {M_{t-4}, M_{t-3}, M_{t-2}, M_{t-1}, M_t}, where M_t is the instance mask of the video frame at time t.
II-b) The first four frames {M_{t-4}, M_{t-3}, M_{t-2}, M_{t-1}} of the instance mask sequence are input to the generation model to predict the instance mask of the fifth frame, giving the predicted instance mask M̂_t. Specifically, the generation model consists of an autoencoder whose convolutional neural network learns the effective information of normal-event video masks, comprising the appearance information of the instance object mask, i.e. whether the scale of the mask in the video frame changes drastically, and the action information, i.e. whether the shape of the mask in the video frame shifts drastically. The future fifth frame is predicted from these two features of the four input video instance object mask frames. A multi-branch generator is adopted, composed of a generating branch and a combining branch: the generating branch predicts the different instance objects with a U-Net, and the combining branch cascades and adds the generated per-instance prediction results to obtain the predicted instance mask of the video frame.
II-c) calculating the loss function: the generated prediction mask needs to be compared with the mask of the real video frame to calculate the loss, thereby optimizing the generation model. The loss consists of several constraints, so as to continually generate a more accurate prediction mask. Furthermore, the activity range of the instance object mask, i.e. the activity range of the detected object, should also be constrained.
The penalty functions employed by the generation model of the present embodiment include the active range penalty function of the instance, the instance mask prediction error penalty function, and the optical flow penalty function of the instance mask.
The loss function calculation comprises the following steps:
II-c-1) In real-time video, not the whole video frame contains targets whose anomalies need to be detected. In the absence of anomalies, the behavior of a target remains within a predictable spatial range, and correspondingly the index variation of the instance object mask is also within a predictable range; the activity-range constraint of the instance object is therefore added to improve video anomaly detection performance. The activity range loss function of the instance calculates the range loss L_range. An instance object exists in the form of pixels in the video frame, and those pixels lie within a certain area, namely the range of the instance object mask and the scale of the image matrix in which it lies, so predicting the instance object within this range is more effective than predicting it over the whole video frame. The activity range loss function of the instance is:

L_range = ‖ M̂_t − M_t ‖_1

where M̂_t is the predicted instance mask at time t and M_t is the instance mask at time t.
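The activity-range constraint can be sketched as an L1 distance between the predicted and real instance masks; this is a plausible reading, since the exact loss form in the patent is not recoverable. With binary masks, the L1 distance reduces to the size of the symmetric difference of their pixel sets.

```python
# Sketch of the range loss as an L1 distance between binary masks:
# the number of pixels where exactly one of the two masks is set.
# Masks are sets of (row, col) pixels.

def range_loss(pred_mask, real_mask):
    """pred_mask, real_mask: sets of (row, col) pixels."""
    return len(pred_mask ^ real_mask)    # symmetric difference = L1 distance

pred = {(0, 0), (0, 1), (1, 1)}
real = {(0, 1), (1, 1), (1, 2)}
print(range_loss(pred, real))            # → 2
```

The loss is zero only when the predicted mask occupies exactly the same pixel range as the real mask.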
II-c-2) Abnormal events need to be detected separately for each instance object. A video frame contains several groups of instance object masks, and different instance object masks are processed with different mask values, so the loss between M_t and M̂_t can be calculated per instance by difference. The mask prediction error loss function calculates the error L_pred of the instance mask prediction:

L_pred = (1/n) Σ_{j=1}^{n} (1 − N(M_t^j ∩ M̂_t^j) / N(M_t^j))

where N(·) is the pixel-counting function of an instance object mask, n is the number of instance objects in the video frame, M_t^j is the mask matrix of the j-th instance object in the mask frame M_t at time t, M̂_t^j is the prediction mask matrix of the j-th instance object at time t, and M_t^j ∩ M̂_t^j is their intersection matrix.
II-c-3) The optical flow loss function of the instance mask calculates the optical flow loss L_flow of the instance object mask:

L_flow = (1/n) Σ_{j=1}^{n} ‖ f(M_{t-1}^j, M_t^j) − f(M_{t-1}^j, M̂_t^j) ‖_1

where f(·,·) is the function calculating the optical flow of a 2D image, n is the number of instance objects in the video frame, M_{t-1}^j is the mask matrix of the j-th instance object in the mask frame at time t−1, M_t^j is the mask matrix of the j-th instance object in the mask frame at time t, M̂_t^j is the prediction mask matrix of the j-th instance object at time t, and the mask frames are the instance masks obtained by the video instance segmentation model.
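The optical-flow term can be sketched with a toy stand-in for f(·,·): a real implementation would use a dense 2D optical-flow estimator, but here the flow of an instance mask is approximated by its centroid displacement between frames, which preserves the idea of penalizing predicted motion that disagrees with observed motion. All names and the centroid approximation are assumptions.

```python
# Toy stand-in for the optical-flow loss: approximate per-instance flow
# by the centroid displacement of a mask between frames, then compare the
# observed motion against the motion implied by the predicted mask.
# Masks are non-empty sets of (row, col) pixels.

def centroid(mask):
    rs = sum(r for r, _ in mask) / len(mask)
    cs = sum(c for _, c in mask) / len(mask)
    return rs, cs

def flow(prev_mask, cur_mask):
    pr, pc = centroid(prev_mask)
    cr, cc = centroid(cur_mask)
    return cr - pr, cc - pc              # per-instance motion vector

def flow_loss(prev_masks, real_masks, pred_masks):
    n = max(len(real_masks), 1)
    loss = 0.0
    for p, r, q in zip(prev_masks, real_masks, pred_masks):
        fr = flow(p, r)                  # observed motion
        fp = flow(p, q)                  # motion implied by the prediction
        loss += abs(fr[0] - fp[0]) + abs(fr[1] - fp[1])
    return loss / n

prev = [{(0, 0)}]
real = [{(1, 0)}]                        # moved down by 1
pred = [{(0, 1)}]                        # predicted to move right by 1
print(flow_loss(prev, real, pred))       # → 2.0
```

When the predicted mask moves the same way as the real mask, the loss vanishes, which is the behavior the flow constraint enforces.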
II-d) optimizing the generation model with the loss function L = L_range + L_pred + L_flow to obtain an optimized generation model.
II-e) repeating steps II-b) to II-d) for a set number of times to obtain the iterated generation model, which is the trained generation model; in practice the set number of times may be 300.
And inputting the historical moment instance mask into a trained generation model to obtain the current moment prediction instance mask.
The duty ratio S of the current-time instance mask in the current-time video frame and the intersection ratio IoU of the current-time predicted instance mask and the current-time instance mask are calculated, and the duty ratio and the intersection ratio are weighted to obtain the specific gravity P of the current-time video frame.

According to the near-large, far-small principle, the larger the size of an instance object in the video frame, the closer the instance object is to the unmanned sweeping robot, and vice versa. The intersection ratio IoU compares the predicted instance mask M̂_t with the actual instance mask M_t; the smaller its value, the more the predicted instance mask deviates from the instance object that actually occurred, i.e. the more likely the instance object exhibits abnormal behavior. When calculating P, because the duty ratio S and the intersection ratio IoU do not vary in the same direction with the abnormality of the instance object, the duty ratio S and the intersection ratio IoU are weighted by the following formula to obtain the specific gravity P of the current video frame:

P = λ · S + (1 − λ) · (1 − IoU)

where λ is the weight for calculating the specific gravity, with value range (0, 1). Weighting the duty ratio and the intersection ratio gives the specific gravity P of the current video frame; by P, both the distance of the instance object and the abnormal behavior of the instance object can be judged.
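The weighted specific-gravity score can be sketched as follows, assuming the weighted form P = λ·S + (1−λ)·(1−IoU) implied by the text (S and IoU vary in opposite directions with abnormality); the value of λ and the representation of masks as pixel sets are illustrative assumptions.

```python
# Sketch of the specific-gravity score: duty ratio S of the mask in the
# frame, intersection ratio IoU of predicted vs. real mask, and the
# weighted combination P. Masks are sets of (row, col) pixels.

def duty_ratio(mask, frame_height, frame_width):
    return len(mask) / (frame_height * frame_width)

def intersection_ratio(pred_mask, real_mask):
    union = pred_mask | real_mask
    return len(pred_mask & real_mask) / len(union) if union else 1.0

def specific_gravity(real_mask, pred_mask, h, w, lam=0.5):
    s = duty_ratio(real_mask, h, w)                  # larger = object closer
    iou = intersection_ratio(pred_mask, real_mask)   # smaller = more abnormal
    return lam * s + (1.0 - lam) * (1.0 - iou)

real = {(r, c) for r in range(2) for c in range(2)}  # 4-pixel mask
pred = {(0, 0), (0, 1)}                              # half overlaps
print(specific_gravity(real, pred, 10, 10, lam=0.5))
```

With a 10×10 frame, S = 0.04 and IoU = 0.5, so P = 0.5·0.04 + 0.5·0.5 = 0.27 under these assumptions.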
Judging whether an abnormal event occurs at the current time according to the specific gravity of the current-time video frame is specifically:

The specific gravity P of the current-time video frame is compared with the abnormal-event specific gravity threshold P_0. When the specific gravity of the current-time video frame is greater than or equal to the abnormal-event specific gravity threshold, i.e. P ≥ P_0, it is judged that an abnormal event occurs at the current time; when the specific gravity of the current-time video frame is smaller than the abnormal-event specific gravity threshold, i.e. P < P_0, it is judged that no abnormal event occurs at the current time.
When it is judged that an abnormal event occurs at the current moment, the abnormal instance is highlighted in the video frame at the current moment and a braking operation is performed on the unmanned sweeping robot; when it is judged that no abnormal event occurs, no braking operation is performed. The camera of the unmanned sweeping robot remains in working mode throughout; after the abnormal event ends, that is, once abnormal-event detection on the real-time video returns to the state in which the specific gravity of the video frame is below the abnormal-event specific gravity threshold, the unmanned sweeping robot resumes its working mode. When the road is congested and the crowd is dense, the instance object masks occupy a larger share of the video frame, so the unmanned sweeping robot responds by pausing its work; when the road is no longer congested and the crowd is sparse, real-time detection returns to the below-threshold state and the robot reverts to working mode.
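The brake/resume behavior described above amounts to a small state machine driven by the per-frame specific gravity. The sketch below is illustrative (class and method names are hypothetical, not from the patent):

```python
class SweeperController:
    """Minimal sketch of the pause/resume logic around the specific-gravity threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.working = True  # the camera stays on regardless of this flag

    def update(self, specific_gravity):
        """Feed one frame's specific gravity; returns True when the robot may work."""
        if specific_gravity >= self.threshold:
            # abnormal event (e.g. crowded road): brake and pause the sweeping work
            self.working = False
        else:
            # detection is back below the threshold: resume working mode
            self.working = True
        return self.working
```

Because the camera keeps running while work is paused, resumption is automatic: the first frame whose specific gravity drops below the threshold flips the controller back to working mode.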
The video-segmentation-based anomaly detection method for the unmanned sweeping robot disclosed in this embodiment identifies the objects to be detected by fusing instance segmentation of the real-time video and performs abnormal-event detection on each instance in the video separately, which improves the accuracy of abnormal-event detection. In addition, the robot can decide on its own to work when the road is clear and to pause when the road is crowded, without affecting its other normal operations. Using the specific gravity of the video frame at the current moment to judge whether an abnormal event occurs allows both the near/far distance of an instance object and its abnormal behavior to be judged, improving the accuracy of abnormal-event judgement. Moreover, the video instance segmentation model is not trained to treat background-like objects, such as fountains or leaves blown by the wind, as instances; consequently, when the trained model performs instance segmentation on the surveillance video, such background objects are not segmented out, which reduces the error rate of abnormal-event judgement.
Example 2
In this embodiment, an anomaly detection system of an unmanned sweeping robot based on video segmentation is disclosed, comprising:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current instance mask and a historical instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask of the current moment according to the historical moment instance mask and the trained generation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
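The five modules enumerated above form a per-frame pipeline. A hypothetical sketch of the wiring is given below; the function names are illustrative, and the segmentation and generation models are stand-ins passed in as callables:

```python
def detect_anomaly(frames, seg_model, gen_model, specific_gravity_fn, threshold):
    """Run the module chain of the anomaly detection system on framed video.

    frames:              list of video frames, frames[-1] is the current-time frame
    seg_model(frame):    returns the instance mask for one frame
    gen_model(history):  returns the predicted current-time mask from history masks
    specific_gravity_fn: combines observed and predicted masks into one score
    """
    *history, current = frames                         # framing module output
    history_masks = [seg_model(f) for f in history]    # historical-time instance masks
    current_mask = seg_model(current)                  # current-time instance mask
    predicted_mask = gen_model(history_masks)          # predicted instance mask
    w = specific_gravity_fn(current_mask, predicted_mask)
    return w >= threshold                              # abnormal-event judgement
```

The judging module reduces to a single threshold comparison at the end; everything before it is mask production and scoring.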
Example 3
In this embodiment, an electronic device is disclosed that includes a memory, a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the method for detecting anomalies in an unmanned sweeping robot based on video segmentation disclosed in embodiment 1.
Example 4
In this embodiment, a computer readable storage medium is disclosed for storing computer instructions that, when executed by a processor, perform the steps of the method for detecting anomalies in an unmanned sweeping vehicle based on video segmentation disclosed in embodiment 1.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.

Claims (8)

1. The method for detecting the abnormality of the unmanned sweeping robot based on video segmentation is characterized by comprising the following steps of:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a predicted instance mask at the current moment according to a historical moment instance mask and a trained generation model, wherein the generation model adopts a convolutional neural network model, a loss function of the generation model comprises an active range loss function of an instance, an instance mask prediction error loss function and an optical flow loss function of the instance mask, and the active range loss function of the instance is as follows:
[active-range loss formula, given as an image in the original]

wherein the terms denote, respectively: the active range loss; the predicted instance mask at time t; and the instance mask at time t;
the example mask prediction error loss function is:
[instance-mask prediction-error formula, given as an image in the original]

wherein the terms denote, respectively: the instance mask prediction error; a pixel-point counting function applied to an instance object mask; the number of instance objects in a video frame; the mask matrix of the i-th instance object in the mask frame at time t; the prediction mask matrix of the i-th instance object at time t; and the intersection matrix of these two;
the optical flow loss function of the instance mask is:
[instance-mask optical-flow loss formula, given as an image in the original]

wherein the terms denote, respectively: the optical flow loss of the instance object masks; a function computing the optical flow of a 2D image; the number of instance objects in a video frame; the mask matrices of the i-th instance object in the mask frames at two adjacent times; and the prediction mask matrix of the i-th instance object, the mask frame being an instance mask obtained by the video instance segmentation model;
calculating the duty ratio of the current-time instance mask in the current-time video frame and the intersection ratio of the current-time prediction instance mask and the current-time instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the current-time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
2. The method for detecting anomalies of the unmanned sweeping robot based on video segmentation according to claim 1, wherein the video instance segmentation model detects instance targets from the current-time video frame and the historical-time video frames to obtain a target detection set comprising a current-time video frame target matrix and historical-time video frame target matrices; the similarity of the target matrices of each two adjacent-time video frames is calculated to obtain an affinity matrix set; the initial position of each instance target in each time video frame is delineated through the target detection set; and the initial positions of the instance targets are corrected through the affinity matrix set, excluding those pixel areas in the initial positions whose similarity across two adjacent-time video frames is smaller than or equal to a set threshold, so as to obtain the instance mask at each time.
3. The method for detecting the anomaly of the unmanned sweeping robot based on video segmentation according to claim 1, wherein the specific gravity of the video frame at the current moment is compared with an anomaly event specific gravity threshold value, and when the specific gravity of the video frame at the current moment is greater than or equal to the anomaly event specific gravity threshold value, the anomaly event is judged to occur at the current moment; and when the proportion of the video frame at the current moment is smaller than the proportion threshold value of the abnormal event, judging that the abnormal event does not occur at the current moment.
4. The method for detecting the abnormality of the unmanned sweeping robot based on the video segmentation according to claim 1, wherein when the occurrence of the abnormal event at the current moment is judged, the abnormal instance is highlighted in the video frame at the current moment, and the unmanned sweeping robot is braked.
5. The method for detecting anomalies of the unmanned sweeping robot based on video segmentation according to claim 1, wherein the duty ratio and the intersection ratio are weighted by the following formula to obtain the specific gravity of the video frame at the current moment:

[weighting formula, given as an image in the original]

wherein the terms denote, respectively: the weighted specific gravity; the duty ratio of the current-time instance mask in the current-time video frame; and the intersection ratio of the current-time prediction instance mask and the current-time instance mask.
6. An unmanned robot car anomaly detection system that sweeps floor based on video segmentation, characterized in that includes:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current instance mask and a historical instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask at the current moment according to the historical moment instance mask and a trained generating model, the generating model adopts a convolutional neural network model, a loss function of the generating model comprises an active range loss function of an instance, an instance mask prediction error loss function and an optical flow loss function of the instance mask, and the active range loss function of the instance is:
[active-range loss formula, given as an image in the original]

wherein the terms denote, respectively: the active range loss; the predicted instance mask at time t; and the instance mask at time t;
the example mask prediction error loss function is:
[instance-mask prediction-error formula, given as an image in the original]

wherein the terms denote, respectively: the instance mask prediction error; a pixel-point counting function applied to an instance object mask; the number of instance objects in a video frame; the mask matrix of the i-th instance object in the mask frame at time t; the prediction mask matrix of the i-th instance object at time t; and the intersection matrix of these two;
the optical flow loss function of the instance mask is:
[instance-mask optical-flow loss formula, given as an image in the original]

wherein the terms denote, respectively: the optical flow loss of the instance object masks; a function computing the optical flow of a 2D image; the number of instance objects in a video frame; the mask matrices of the i-th instance object in the mask frames at two adjacent times; and the prediction mask matrix of the i-th instance object, the mask frame being an instance mask obtained by the video instance segmentation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
7. An electronic device comprising a memory and a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of a method for detecting anomalies in an unmanned sweeping robot based on video segmentation as defined in any one of claims 1 to 6.
8. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of a method of unmanned robot cleaner anomaly detection based on video segmentation of any one of claims 1 to 6.
CN202310252874.5A 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system Active CN115965899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310252874.5A CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310252874.5A CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Publications (2)

Publication Number Publication Date
CN115965899A CN115965899A (en) 2023-04-14
CN115965899B true CN115965899B (en) 2023-06-06

Family

ID=85889850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310252874.5A Active CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Country Status (1)

Country Link
CN (1) CN115965899B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597864A (en) * 2020-12-16 2021-04-02 佳都新太科技股份有限公司 Monitoring video abnormity detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268563B (en) * 2014-09-15 2017-05-17 合肥工业大学 Video abstraction method based on abnormal behavior detection
US20160381320A1 (en) * 2015-06-25 2016-12-29 Nokia Technologies Oy Method, apparatus, and computer program product for predictive customizations in self and neighborhood videos
CN112989942A (en) * 2021-02-09 2021-06-18 四川警察学院 Target instance segmentation method based on traffic monitoring video
CN114067251B (en) * 2021-11-18 2023-09-15 西安交通大学 Method for detecting anomaly of unsupervised monitoring video prediction frame
CN115035432A (en) * 2022-03-10 2022-09-09 云从科技集团股份有限公司 Abnormal video detection method, device, medium and equipment
CN114724060A (en) * 2022-03-14 2022-07-08 中国人民解放军国防科技大学 Method and device for unsupervised video anomaly detection based on mask self-encoder

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597864A (en) * 2020-12-16 2021-04-02 佳都新太科技股份有限公司 Monitoring video abnormity detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Video foreground-background separation based on a spatio-temporal aware cascaded neural network; Yang Jingyu; Shi Wen; Li Kun; Song Xiaolin; Yue Huanjing; Journal of Tianjin University (Science and Technology) (06); pp. 87-94 *

Also Published As

Publication number Publication date
CN115965899A (en) 2023-04-14

Similar Documents

Publication Publication Date Title
Zhang et al. Deep convolutional neural networks for forest fire detection
CN110070074B (en) Method for constructing pedestrian detection model
CN110084165B (en) Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation
CN104574439A (en) Kalman filtering and TLD (tracking-learning-detection) algorithm integrated target tracking method
CN103246896A (en) Robust real-time vehicle detection and tracking method
CN103488993A (en) Crowd abnormal behavior identification method based on FAST
CN113592905B (en) Vehicle driving track prediction method based on monocular camera
CN111031266B (en) Method, system and medium for filtering background activity noise of dynamic visual sensor based on hash function
Toyungyernsub et al. Double-prong convlstm for spatiotemporal occupancy prediction in dynamic environments
CN114202803A (en) Multi-stage human body abnormal action detection method based on residual error network
CN115760921A (en) Pedestrian trajectory prediction method and system based on multi-target tracking
Wang et al. Multi-agent trajectory prediction with spatio-temporal sequence fusion
Kanu-Asiegbu et al. Leveraging trajectory prediction for pedestrian video anomaly detection
Mann et al. Predicting future occupancy grids in dynamic environment with spatio-temporal learning
CN115965899B (en) Video segmentation-based unmanned sweeping robot anomaly detection method and system
CN113392817A (en) Vehicle density estimation method and device based on multi-row convolutional neural network
Srilekha et al. A novel approach for detection and tracking of vehicles using Kalman filter
Katariya et al. A pov-based highway vehicle trajectory dataset and prediction architecture
CN112255141B (en) Thermal imaging gas monitoring system
CN117237676B (en) Method for processing small target drop track of nuclear power plant based on event camera
Sun et al. Visual perception based situation analysis of traffic scenes for autonomous driving applications
Lo Optical Flow Based Motion Detection for Autonomous Driving
CN116246492B (en) Vehicle lane change collision risk prediction method based on space-time attention LSTM and super-threshold model
Yugendar et al. Analysis of crowd flow parameters using artificial neural network
Liu et al. Region-Based Illumination-Temperature Awareness and Cross-Modality Enhancement for Multispectral Pedestrian Detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20230414

Assignee: SHANDONG JUXIANG MACHINERY CO.,LTD.

Assignor: SHANDONG KAILIN ENVIRONMENTAL PROTECTION EQUIPMENT Co.,Ltd.

Contract record no.: X2023980047848

Denomination of invention: A Video Segmentation Based Anomaly Detection Method and System for Unmanned Sweeping Machine Vehicles

Granted publication date: 20230606

License type: Common License

Record date: 20231123

EE01 Entry into force of recordation of patent licensing contract