CN115965899A

CN115965899A - Unmanned sweeping robot vehicle abnormality detection method and system based on video segmentation

Info

Publication number: CN115965899A
Application number: CN202310252874.5A
Authority: CN
Inventors: 徐龙生; 孙振行; 庞世玺; 杨纪冲
Original assignee: Shandong Kailin Environmental Protection Equipment Co ltd
Current assignee: Shandong Kailin Environmental Protection Equipment Co ltd
Priority date: 2023-03-16
Filing date: 2023-03-16
Publication date: 2023-04-14
Anticipated expiration: 2043-03-16
Also published as: CN115965899B

Abstract

The invention discloses an unmanned sweeping robot vehicle abnormity detection method and system based on video segmentation, which belong to the technical field of artificial intelligence and comprise the following steps: acquiring a monitoring video of the unmanned sweeping robot vehicle; performing frame division on a monitoring video to obtain a current moment video frame and a historical moment video frame; obtaining a current moment instance mask and a historical moment instance mask according to a current moment video frame, a historical moment video frame and a trained video instance segmentation model; obtaining a prediction instance mask at the current moment according to the historical moment instance mask and the trained generation model; calculating and obtaining the proportion of the current moment instance mask in the current moment video frame and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the proportion and the intersection ratio to obtain the proportion of the current moment video frame; and judging whether an abnormal event occurs at the current moment or not according to the proportion of the video frame at the current moment. The accuracy of abnormal event detection is improved.

Description

Unmanned sweeping robot vehicle anomaly detection method and system based on video segmentation

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to an unmanned sweeping robot vehicle abnormity detection method and system based on video segmentation.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

In the process of sweeping the floor by the unmanned sweeping robot vehicle, sudden conditions such as object intrusion and machine stagnation are inevitably generated around the robot vehicle, and when the unmanned sweeping robot vehicle cannot accurately identify abnormal events such as the sudden conditions, the response judgment of the automatic sweeping robot vehicle is influenced, so that accidents such as collision, road blockage and the like are caused. In the existing video anomaly detection technology, videos around the unmanned sweeping robot vehicle are usually acquired through a camera, moving objects in the surrounding videos are identified, whether an anomaly occurs or not is judged, however, since the moving objects in the videos are not completely abnormal objects to be detected, such as a fountain which can be used as a background, leaves blown by wind, and the like, when all the moving objects are judged to be the anomaly, the error rate of anomaly detection is increased.

Disclosure of Invention

The invention aims to solve the problems and provides an unmanned sweeping robot vehicle abnormity detection method and system based on video segmentation, a current time instance mask and a prediction instance mask are obtained, the proportion of the current time instance mask in a current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask are calculated, the proportion of the current time video frame is obtained according to the proportion of the current time video frame, and whether an abnormal event occurs at the current time is judged by utilizing the proportion of the current time video frame.

In order to achieve the purpose, the invention adopts the following technical scheme:

in a first aspect, a method for detecting an abnormality of an unmanned sweeping robot vehicle based on video segmentation is provided, which includes:

acquiring a monitoring video of the unmanned sweeping robot vehicle;

performing frame division on a monitoring video to obtain a current moment video frame and a historical moment video frame;

obtaining a current moment instance mask and a historical moment instance mask according to a current moment video frame, a historical moment video frame and a trained video instance segmentation model;

obtaining a prediction instance mask at the current moment according to the historical moment instance mask and the trained generation model;

calculating and obtaining the proportion of the current moment instance mask in the current moment video frame and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the proportion and the intersection ratio to obtain the proportion of the current moment video frame;

and judging whether an abnormal event occurs at the current moment according to the proportion of the video frame at the current moment.

Further, the video instance segmentation model detects instance targets from the current-time video frame and the historical-time video frame, obtains a target detection set containing a current-time video frame target matrix and a historical-time video frame target matrix, calculates similarity of the two adjacent time video frame target matrices, obtains an affinity matrix set, defines initial positions of the instance targets in the video frames at all times through the target detection set, corrects the initial positions of the instance targets through the affinity matrix set, excludes pixel regions with the similarity of the two adjacent time video frames in the initial positions of the instance targets being smaller than or equal to a set threshold, and obtains instance masks at all times.

Further, the generation model adopts a convolution neural network model.

Further, the loss function used by the generative model includes an instance active range loss function, an instance mask prediction error loss function, and an instance mask optical flow loss function.

Further, comparing the proportion of the video frame at the current moment with an abnormal event proportion threshold, and judging that an abnormal event occurs at the current moment when the proportion of the video frame at the current moment is greater than or equal to the abnormal event proportion threshold; and when the proportion of the video frame at the current moment is smaller than the proportion threshold of the abnormal event, judging that the abnormal event does not occur at the current moment.

Further, the proportion and the intersection ratio are weighted by adopting the following formula to obtain the proportion of the video frame at the current moment

：

wherein ,

calculating a specific gravity for the weighting; />

Masking the occupation ratio of the video frame at the current moment for the current moment instance; />

And predicting the intersection ratio of the instance mask and the instance mask at the current moment.

Further, when it is judged that an abnormal event occurs at the current moment, the abnormal instance is highlighted in the video frame at the current moment, and the unmanned sweeping robot vehicle is braked.

In a second aspect, a system for detecting an abnormality of an unmanned sweeping robot based on video segmentation is provided, which includes:

the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot vehicle;

the frame dividing module is used for carrying out frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;

the real instance mask acquiring module is used for acquiring a current instance mask and a historical instance mask according to the current video frame, the historical video frame and the trained video instance segmentation model;

the prediction instance mask obtaining module is used for obtaining a prediction instance mask at the current moment according to the historical moment instance mask and the trained generation model;

the proportion obtaining module of the video frame is used for calculating and obtaining the proportion of the current moment instance mask in the current moment video frame and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, weighting the proportion and the intersection ratio and obtaining the proportion of the current moment video frame;

and the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the proportion of the video frame at the current moment.

In a third aspect, an electronic device is provided, which includes a memory, a processor, and computer instructions stored in the memory and executed on the processor, where the computer instructions, when executed by the processor, perform the steps of the method for detecting an abnormality of an unmanned sweeping robot based on video segmentation.

In a fourth aspect, a computer-readable storage medium is provided for storing computer instructions, which when executed by a processor, perform the steps of a method for detecting abnormality of an unmanned sweeping robot based on video segmentation.

Compared with the prior art, the invention has the following beneficial effects:

1. the method obtains the current time instance mask and the prediction instance mask, calculates the proportion of the current time instance mask in the current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask, calculates and obtains the proportion of the current time video frame according to the proportion and the intersection ratio, and judges whether an abnormal event occurs at the current time by using the proportion of the current time video frame, thereby not only judging the distance of the instance object, but also judging the abnormal behavior of the instance object and improving the accuracy of judging the abnormal event.

2. The constructed video instance segmentation model is not trained by taking the fountain and the wind-blown leaves which can be used as the background as examples, so that the fountain and the wind-blown leaves which can be used as the background can not be segmented when the example segmentation is carried out on the monitoring video through the trained video instance segmentation model, and the error rate of abnormal event judgment is reduced.

Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, are included to provide a further understanding of the application, and the description of the exemplary embodiments and illustrations of the application are intended to explain the application and are not intended to limit the application.

FIG. 1 is a flow chart of the method disclosed in example 1;

fig. 2 is a flow chart of an example object segmentation learning process of the method disclosed in embodiment 1.

Detailed Description

The invention is further described with reference to the following figures and examples.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.

Example 1

In this embodiment, a method for detecting an abnormality of an unmanned sweeping robot based on video segmentation is disclosed, as shown in fig. 1 and 2, the method includes:

acquiring a monitoring video of the unmanned sweeping robot vehicle;

The historical moment video frame is a video frame before the current moment and comprises a plurality of video frames.

The video instance segmentation model detects instance targets from a current-time video frame and a historical-time video frame, obtains a target detection set comprising a current-time video frame target matrix and a historical-time video frame target matrix, calculates the similarity of the target matrixes of the two adjacent time video frames to obtain an affinity matrix set, defines the initial position of the instance target in each time video frame through the target detection set, corrects the initial position of the instance target through the affinity matrix set, excludes pixel regions with the similarity of the two adjacent time video frames in the initial position of the instance target being less than or equal to a set threshold value, and obtains an instance mask at each time.

The video instance segmentation model comprises a target detector and a mask generator, wherein the target detector detects instance targets from a current-time video frame and a historical-time video frame, a target detection set comprising a current-time video frame target matrix and a historical-time video frame target matrix is obtained, the similarity of the target matrices of two adjacent time video frames is calculated, an affinity matrix set is obtained, the affinity matrix set and the target detection set are simultaneously acted on the video frame set and are input into the mask generator, the initial positions of the instance targets in the corresponding time video frames are defined through the target detection set, the initial positions of the instance targets are corrected through the affinity matrices of the two adjacent time video frames, pixel regions with the similarity of the two adjacent time video frames being smaller than or equal to a set threshold value in the initial positions of the instance targets are excluded, and instance masks at all times are obtained and comprise the current-time instance masks and the historical-time instance masks.

The process of obtaining the trained video instance segmentation model comprises the following steps:

i-a) obtaining a training video set, the video set comprisingNA bar video; dividing each video into

Opens a video frame, constitutes a set of video frames>

When is coming into contact withKIs 5, is selected>

I.e. each set of video frames comprises 5 consecutive video frames, <' > or>

Is->

Temporal video frames.

I-b) aggregating all video frames

Inputting constructed video instance segmentation model->

In the above, obtaining an instance mask of a video frame at each time specifically includes:

assembling video frames using object detectors

The five video frames in the video frame are subjected to target detection to obtain a target detection set

, wherein />

，/>

Is->

A target matrix of the video frames at the moment, the similarity of the target matrices of the video frames at two adjacent moments is calculated, an affinity matrix of the video frames at two adjacent moments is obtained, and an affinity matrix set is formed>

，/>

，/>

Is->

And/or>

A similarity matrix between them. Assembling the affinity matrix->

Target detection set>

Acting on a set of video frames simultaneously->

And inputting the three into a mask generator to detect the collection->

Defining initial positions of example targets in each corresponding moment video frame, correcting the initial positions of the example targets through affinity matrixes of two adjacent moment video frames, excluding pixel regions with similarity of two adjacent moment video frames being less than or equal to a set threshold value in the initial positions of the example targets, if the pixel regions of the portions are set to be 0, obtaining specific positions of the example targets, and performing image processing in the form of masksLine output, obtaining the mask of each time instance to form a set of mask of video instances>

。/>

,

Is->

Temporal video frame instance mask wherein the mask generator is comprised of an auto-encoder.

In the embodiment, after the initial position of the example target in the video frame is detected through the target detection set, the initial position of the example target is corrected through the affinity matrix, and the pixel regions which do not belong to the example target are excluded through the similarity of the video frames at adjacent moments, so that the accuracy of the acquired example mask is ensured.

I-c) calculating a loss function, and performing loss calculation on the generated video frame instance mask and a real instance label so as to restrict the instance object segmentation model to obtain a more accurate mask.

Wherein the penalty function includes a size penalty function for the instance mask and an error penalty function for the instance mask prediction.

For the size of the instance mask in the video, the segmentation mask of the instance object in the video frame should be within the range of the target detection bounding box, passing the size penalty function of the instance mask

The size loss of its video instance mask is calculated, thereby constraining the size of the instance mask to be within the range of target detection: />

wherein ,

is->

Time instance mask, <' > or>

Is->

The time instant video frame target matrix, < > or>

For calculating ^ in a video frame>

Overflows and/or is greater than>

A function of the outer pixel matrix->

For calculating a pixel count function for an instance mask>

The number of example target objects in the video frame.

Error loss function predicted by instance mask

The error of the example mask prediction is calculated.

wherein ,

to calculate a pixel count function for an instance mask, <' >>

For the number of example target objects in a video frame, <' >>

Is->

Moment video frame target matrix->

In a fifth or fifth sun>

Mask matrix for instance object based on>

Is->

Time instance mask->

In a fifth or fifth sun>

Mask matrix for instance object, <' >>

Is->

and />

The intersection matrix of (a).

I-d) optimization of video instance segmentation models using loss functions

And obtaining an optimized example object segmentation model.

I-e) repeating the steps I-b) to I-d) and setting training times to obtain an iterated model

In specific implementation, the training times may be set to 3000 times for the trained video instance segmentation model.

Ⅰ-f) Inputting all videos in the training video set into the model after iteration in the step I-e)

Get the sequence set of the video instance mask->

。

The generation model is constructed by adopting a convolution neural network.

The specific process of obtaining the trained generative model is as follows:

II-a) sequence set composed of video example masks obtained during training of video example segmentation model

As a training set for generating the model. Sequence set>

The number of the medium videos is the same as that of the videos used in the training of the video instance segmentation model, and the medium videos and the videos share the same number

Strip video data, each video taking 5 consecutive video instance mask sets ≥ one by one>

,

,/>

Is->

Temporal video frame instance masks.

II-b) masking sequences of instances

Is greater than or equal to the first four frames>

Input to generative model

To predict an instance mask for the fifth frame, to obtain a predicted instance mask for the fifth frame->

. Specific generation model>

The convolutional neural network can learn the effective information of the normal event video mask, including the action information and the appearance information of the example object mask, wherein the appearance information is whether the scale size of the mask in the video frame is changed drastically, and the action information is whether the shape of the mask in the video frame is deviated drastically. The future fifth frame is predicted by two features of four two virtual video instance object mask frames. And a multi-branch generator is adopted, the multi-branch generator consists of a generating branch and a merging branch, the generating branch predicts different instance objects by using U-Net, and the merging branch performs cascade addition on the generated analysis prediction results to obtain a prediction instance mask of a video frame. />

II-c) calculating a loss function, the generated prediction mask needs to be compared with the mask of the real video frame to calculate the loss, and therefore the generation model is optimized. The penalty consists of multiple constraints to generate more accurate prediction masks on an ongoing basis. In addition, the active range of the instance object mask should also be constrained, i.e., the range of detected object activity.

Therefore, the loss function adopted by the generated model of the embodiment includes an example active range loss function, an example mask prediction error loss function and an example mask optical flow loss function.

The process of obtaining a trained generative model comprises the following steps:

II-c-1) in real-time video, not the whole video frame has a target needing to detect abnormality, and under the condition of no abnormality, the behavior of the target can be in a predictable spatial range, and correspondingly, the index of an example object mask is also in a predictable changeWithin range, the active range constraint of the instance object is added to improve the performance of video anomaly detection. Computing range loss by an example Range loss function

The example object exists in the form of pixel points in the video frame, the pixel points are in a certain area, the area is the range of the example object mask, the image matrix scale of the example object mask is located, and the effect of predicting the example object in a range is better than that of predicting the example object in the whole video frame. Wherein the range loss function for an instance is:

wherein ,

is->

Time of day prediction instance mask, <' >>

Is->

The time instance mask.

II-c-2) each example object needs to be independently detected for abnormal events, a plurality of groups of example object masks are arranged in one video frame, different example object masks have different mask values to be processed, and calculation can be carried out in a difference mode

And

in between the loss of the mask. Calculating an example mask predicted error by a mask prediction error loss function>

Mask prediction errorThe difference loss function is: />

。

wherein ,

to calculate a pixel count function of an instance object mask, <' >>

For the number of instance objects in a video frame, <' >>

Is->

The ^ th or greater in the time mask frame>

Mask matrix for instance object, <' >>

Is->

The ^ th or greater in the time of day prediction mask frame>

A prediction mask matrix for an individual instance object, <' >>

Is->

and />

The intersection matrix of (c).

II-c-3) calculating the optical flow loss of an example object mask by means of the optical flow loss function of the example mask

The optical flow loss function of the example mask is:

wherein ,

calculating a function of the optical flow for the 2D image, wherein->

For the number of instance objects in a video frame,

is->

The ^ th or greater in the time mask frame>

Mask matrix for instance object, <' >>

Is->

The ^ th or greater in the time mask frame>

Mask matrix for instance object, <' >>

Is->

The ^ th or greater in the time of day prediction mask frame>

And predicting a mask matrix of the example object, wherein a mask frame is an example mask obtained by a video example segmentation model. />

II-d) optimization by loss function

To obtain the bestAnd generating a model after the transformation.

II-e) repeating the steps II-b) to II-d) for a set number of times to obtain an iterated generated model, wherein the set number of times can be 300 times in specific implementation.

And inputting the historical moment instance mask into a trained generation model to obtain the current moment prediction instance mask.

Calculating and obtaining the occupation ratio of the example mask at the current moment in the video frame at the current moment

And the intersection ratio of the current time instance mask and the current time instance mask>

Weighting the proportion and the intersection ratio to obtain the proportion of the video frame at the current moment>

。

According to the principle of large and small in the near and far, when the proportion is larger, the example object is closer to the automatic unmanned sweeping robot vehicle; and vice versa>

Masking for prediction instances>

And actual instance mask>

The smaller the intersection ratio of (a) and (b), the more the predicted instance mask deviates from the instance object which actually occurs, i.e. the instance object is more likely to have abnormal behavior. Is counted and->

Due to the ratio ofSCross-over ratioIoUIn contrast to the exampleWhether an abnormality of a picture is not synchronously correlated, then the ratio is based upon>

And the cross-over ratio->

Weighting to obtain the specific gravity of the current video frame>

：

，

wherein ,

for the weighted calculation of the specific gravity, the value range is->

Weighting the ratio and the cross-over ratio to obtain the specific gravity of the current video frame>

Passing through>

The distance of the example object can be judged, and the abnormal behavior of the example object can be judged.

Judging whether an abnormal event occurs at the current moment according to the proportion of the video frame at the current moment, specifically:

the specific weight of the video frame at the current moment is calculated

And an exceptional event specific gravity threshold>

Comparing, when the specific gravity of the video frame at the current moment is more than or equal to the abnormal event specific gravity threshold value, namely

Judging that an abnormal event occurs at the current moment; when the specific gravity of the video frame at the current moment is less than the abnormal event specific gravity threshold value, namely->

And judging that no abnormal event occurs at the current moment.

When the abnormal event happens at the current moment is judged, the abnormal instance is highlighted in the video frame at the current moment, the braking operation is carried out on the unmanned sweeping robot car, when the abnormal event does not happen, the braking operation is not carried out on the unmanned sweeping robot car, as the camera of the unmanned sweeping robot car is always in a working mode, after the abnormal event is finished, the abnormal event detection of the real-time video is recovered to

And after the state, the automatic unmanned sweeping robot vehicle resumes the working mode. When the road is crowded and crowds are dense, the automatic unmanned sweeping robot vehicle can make a response of work suspension due to the fact that the proportion of the example object mask is large, and the real-time detection of the automatic unmanned sweeping robot vehicle is recovered to the area/area ratio until the road is not crowded and the crowds are sparse any more>

The state is restored to the operating mode.

According to the method for detecting the abnormity of the unmanned sweeping robot vehicle based on video segmentation, the objects to be detected are identified by fusing real-time video instance segmentation, and abnormal event detection is carried out on the instances in the video independently, so that the accuracy of the abnormal event detection is improved. In addition, the self-judgment of work in a free road and suspension in a crowded road can be realized, and other normal work cannot be influenced. Whether an abnormal event occurs at the current moment is judged by utilizing the proportion of the video frame at the current moment, so that the distance of the example object can be judged, the abnormal behavior of the example object can be judged, and the accuracy of judging the abnormal event is improved; fountain and wind-blown leaves which can be used as backgrounds are not used as examples for training the constructed video example segmentation model, so that objects which can be used as backgrounds, such as the fountain and the wind-blown leaves, are not segmented when the trained video example segmentation model is used for carrying out example segmentation on the monitoring video, and the error rate of abnormal event judgment is reduced.

Example 2

In this embodiment, a system for detecting abnormality of an unmanned sweeping robot based on video segmentation is disclosed, comprising:

Example 3

In this embodiment, an electronic device is disclosed, which includes a memory, a processor, and a computer instruction stored in the memory and executed on the processor, where when the computer instruction is executed by the processor, the steps of the method for detecting an abnormality of an unmanned sweeping robot based on video segmentation disclosed in embodiment 1 are completed.

Example 4

In this embodiment, a computer-readable storage medium is disclosed, which is used for storing computer instructions, and when the computer instructions are executed by a processor, the steps of the method for detecting the abnormality of the unmanned sweeping robot vehicle based on video segmentation disclosed in embodiment 1 are completed.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims

1. An unmanned sweeping robot vehicle abnormity detection method based on video segmentation is characterized by comprising the following steps:

acquiring a monitoring video of the unmanned sweeping robot vehicle;

obtaining a current moment instance mask and a historical moment instance mask according to the current moment video frame, the historical moment video frame and the trained video instance segmentation model;

and judging whether an abnormal event occurs at the current moment or not according to the proportion of the video frame at the current moment.

2. The method as claimed in claim 1, wherein the video instance segmentation model detects instance objects from a current-time video frame and a historical-time video frame, obtains a target detection set including a current-time video frame target matrix and a historical-time video frame target matrix, calculates similarity of two adjacent-time video frame target matrices, obtains an affinity matrix set, defines initial positions of the instance objects in the video frames at each time through the target detection set, corrects the initial positions of the instance objects through the affinity matrix set, excludes pixel regions of two adjacent-time video frames in the initial positions of the instance objects, the similarity of which is less than or equal to a set threshold value, and obtains instance masks at each time.

3. The method for detecting the abnormality of the unmanned sweeping robot vehicle based on the video segmentation as claimed in claim 1, wherein the generation model is a convolutional neural network model.

4. The method as claimed in claim 1, wherein the loss function of the generated model includes an instance range of motion loss function, an instance mask prediction error loss function, and an instance mask optical flow loss function.

5. The method for detecting the abnormality of the unmanned sweeping robot vehicle based on the video segmentation as claimed in claim 1, wherein the specific gravity of the video frame at the current moment is compared with an abnormal event specific gravity threshold, and when the specific gravity of the video frame at the current moment is greater than or equal to the abnormal event specific gravity threshold, it is determined that an abnormal event occurs at the current moment; and when the proportion of the video frame at the current moment is smaller than the proportion threshold of the abnormal event, judging that the abnormal event does not occur at the current moment.

6. The method for detecting the abnormality of the unmanned sweeping robot vehicle based on the video segmentation as claimed in claim 1, wherein when it is determined that the abnormal event occurs at the current time, the abnormal instance is highlighted in the video frame at the current time, and the unmanned sweeping robot vehicle is braked.

7. The method as claimed in claim 1, wherein the unmanned sweeping robot vehicle abnormality detection method based on video segmentation,the method is characterized in that the proportion and the cross-over ratio are weighted by adopting the following formula to obtain the proportion of the video frame at the current moment

：

wherein ,

calculating a specific gravity for the weighting; />

And predicting the intersection ratio of the instance mask and the current instance mask for the current time.

8. The utility model provides an unmanned robot vehicle anomaly detection system of sweeping floor based on video is cut apart which characterized in that includes:

9. An electronic device, comprising a memory and a processor, and computer instructions stored in the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method for detecting abnormality of an unmanned sweeping robot based on video segmentation according to any one of claims 1-7.

10. A computer readable storage medium for storing computer instructions, which when executed by a processor, perform the steps of the method for detecting abnormality of unmanned sweeping robot vehicle based on video segmentation according to any one of claims 1-7.