CN115965899B - Video segmentation-based unmanned sweeping robot anomaly detection method and system - Google Patents

Video segmentation-based unmanned sweeping robot anomaly detection method and system Download PDF

Info

Publication number
CN115965899B
CN115965899B CN202310252874.5A CN202310252874A CN115965899B CN 115965899 B CN115965899 B CN 115965899B CN 202310252874 A CN202310252874 A CN 202310252874A CN 115965899 B CN115965899 B CN 115965899B
Authority
CN
China
Prior art keywords
instance
mask
video frame
video
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310252874.5A
Other languages
Chinese (zh)
Other versions
CN115965899A (en
Inventor
徐龙生
孙振行
庞世玺
杨纪冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Kailin Environmental Protection Equipment Co ltd
Original Assignee
Shandong Kailin Environmental Protection Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Kailin Environmental Protection Equipment Co ltd filed Critical Shandong Kailin Environmental Protection Equipment Co ltd
Priority to CN202310252874.5A priority Critical patent/CN115965899B/en
Publication of CN115965899A publication Critical patent/CN115965899A/en
Application granted granted Critical
Publication of CN115965899B publication Critical patent/CN115965899B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an anomaly detection method and system of an unmanned sweeping robot based on video segmentation, belonging to the technical field of artificial intelligence, and comprising the following steps: acquiring a monitoring video of the unmanned sweeping robot; performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame; obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model; obtaining a current time prediction instance mask according to the historical time instance mask and the trained generation model; calculating to obtain the duty ratio of the current time instance mask in the current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask, and weighting the duty ratio and the intersection ratio to obtain the proportion of the current time video frame; and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment. The accuracy of abnormal event detection is improved.

Description

Video segmentation-based unmanned sweeping robot anomaly detection method and system
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an unmanned sweeping robot car abnormality detection method and system based on video segmentation.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In the sweeping process of the unmanned sweeping machine, unexpected conditions, such as object intrusion and machine stagnation, can be inevitably caused around the machine, and when the unmanned sweeping machine cannot accurately identify unexpected abnormal events, response judgment of the automatic sweeping machine can be affected to cause accidents, such as collision, road blockage and the like. In the existing video anomaly detection technology, a camera is generally used for acquiring a video around an unmanned sweeping vehicle, identifying moving objects in the surrounding video and judging whether an anomaly occurs, but because the moving objects in the video are not completely objects needing to be detected for anomaly, such as a fountain which can be used as a background, leaves blown by wind and the like, when all the moving objects are judged to be anomaly events, the error rate of anomaly event detection is increased.
Disclosure of Invention
In order to solve the problems, the invention provides an unmanned sweeping machine vehicle anomaly detection method and system based on video segmentation, which acquire a current moment instance mask and a predicted instance mask, calculate the duty ratio of the current moment instance mask in a current moment video frame and the intersection ratio of the current moment predicted instance mask and the current moment instance mask, calculate the specific gravity of the current moment video frame according to the duty ratio and the intersection ratio, and judge whether an anomaly event occurs at the current moment by utilizing the specific gravity of the current moment video frame, so that the distance of an instance object can be judged to be far or near, the anomaly behavior of the instance object can be judged, and the accuracy of anomaly event judgment is improved.
In order to achieve the above purpose, the invention adopts the following technical scheme:
in a first aspect, a method for detecting an anomaly of an unmanned sweeping robot based on video segmentation is provided, including:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a current time prediction instance mask according to the historical time instance mask and the trained generation model;
calculating to obtain the duty ratio of the current time instance mask in the current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask, and weighting the duty ratio and the intersection ratio to obtain the proportion of the current time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
Further, the video instance segmentation model detects instance targets from a current time video frame and a historical time video frame, a target detection set comprising a current time video frame target matrix and a historical time video frame target matrix is obtained, the similarity of the two adjacent time video frame target matrices is calculated, an affinity matrix set is obtained, the initial positions of the instance targets in the time video frames are defined through the target detection set, the initial positions of the instance targets are corrected through the affinity matrix set, and pixel areas, in which the similarity of the two adjacent time video frames in the initial positions of the instance targets is smaller than or equal to a set threshold value, are eliminated, so that instance masks of all the time instances are obtained.
Further, the convolutional neural network model is adopted as the generating model.
Further, the penalty functions employed by the generative model include an active range penalty function of the instance, an instance mask prediction error penalty function, and an optical flow penalty function of the instance mask.
Further, comparing the specific gravity of the video frame at the current moment with an abnormal event specific gravity threshold value, and judging that an abnormal event occurs at the current moment when the specific gravity of the video frame at the current moment is greater than or equal to the abnormal event specific gravity threshold value; and when the proportion of the video frame at the current moment is smaller than the proportion threshold value of the abnormal event, judging that the abnormal event does not occur at the current moment.
Further, the duty ratio and the cross ratio are weighted by the following formula to obtain the proportion of the video frame at the current moment
Figure SMS_1
Figure SMS_2
wherein ,
Figure SMS_3
calculating a specific gravity for the weighting; />
Figure SMS_4
Masking the duty cycle of the video frame at the current time for the current time instance; />
Figure SMS_5
The cross ratio of the instance mask to the instance mask for the current time is predicted for the current time.
Further, when it is judged that an abnormal event occurs at the current time, the abnormal instance is highlighted in the video frame at the current time, and braking operation is performed on the unmanned sweeping machine.
In a second aspect, an anomaly detection system for an unmanned sweeping robot based on video segmentation is provided, including:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current instance mask and a historical instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask of the current moment according to the historical moment instance mask and the trained generation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
In a third aspect, an electronic device is provided, including a memory, a processor, and computer instructions stored on the memory and running on the processor, where the computer instructions, when executed by the processor, perform the steps of a method for detecting anomalies in an unmanned sweeping vehicle based on video segmentation.
In a fourth aspect, a computer readable storage medium is provided for storing computer instructions that, when executed by a processor, perform the steps of a method for detecting anomalies in an unmanned floor sweeping vehicle based on video segmentation.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention obtains the current time instance mask and the predicted instance mask, calculates the duty ratio of the current time instance mask in the current time video frame and the intersection ratio of the current time predicted instance mask and the current time instance mask, calculates the proportion of the current time video frame according to the duty ratio and the intersection ratio, and judges whether an abnormal event occurs at the current time by utilizing the proportion of the current time video frame, thereby judging the distance of an instance object, judging the abnormal behavior of the instance object and improving the accuracy of judging the abnormal event.
2. According to the method, the built video example segmentation model is not trained by taking the fountain, the leaf blown by wind and the like which can be used as the background as examples, so that when the monitoring video is subjected to example segmentation through the trained video example segmentation model, objects which can be used as the background, such as the fountain, the leaf blown by wind and the like, are not segmented, and the error rate of abnormal event judgment is reduced.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application.
FIG. 1 is a flow chart of the method disclosed in example 1;
fig. 2 is a flowchart of example object segmentation learning of the method disclosed in embodiment 1.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
In this embodiment, a method for detecting an anomaly of an unmanned sweeping robot based on video segmentation is disclosed, as shown in fig. 1 and fig. 2, including:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a current time prediction instance mask according to the historical time instance mask and the trained generation model;
calculating to obtain the duty ratio of the current time instance mask in the current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask, and weighting the duty ratio and the intersection ratio to obtain the proportion of the current time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
The historical moment video frame is a video frame positioned before the current moment and comprises a plurality of video frames.
The video instance segmentation model detects instance targets from a current moment video frame and a historical moment video frame, obtains a target detection set comprising a current moment video frame target matrix and a historical moment video frame target matrix, calculates the similarity of two adjacent moment video frame target matrices, obtains an affinity matrix set, marks the initial positions of the instance targets in the moment video frames through the target detection set, corrects the initial positions of the instance targets through the affinity matrix set, and excludes pixel areas with the similarity of two adjacent moment video frames in the initial positions of the instance targets being smaller than or equal to a set threshold value, so as to obtain instance masks of each moment.
The video instance segmentation model comprises an object detector and a mask generator, wherein the object detector detects instance objects from a current moment video frame and a historical moment video frame, obtains an object detection set containing a current moment video frame object matrix and a historical moment video frame object matrix, calculates similarity of two adjacent moment video frame object matrices, obtains an affinity matrix set, enables the affinity matrix set and the object detection set to be simultaneously applied to the video frame set, inputs the three to the mask generator, delineates initial positions of instance objects in corresponding moment video frames through the object detection set, corrects the initial positions of the instance objects through the affinity matrix of the two adjacent moment video frames, excludes pixel areas with similarity of the two adjacent moment video frames in the initial positions of the instance objects smaller than or equal to a set threshold, and obtains instance masks of each moment, including the instance masks of the current moment and the historical moment.
The process of obtaining the trained video instance segmentation model is as follows:
i-a) acquiring a training video set comprisingNA bar video; dividing each video into
Figure SMS_6
Zhang Shipin frames, constituting a video frame set->
Figure SMS_7
When (when)K5->
Figure SMS_8
I.e. each video frame set contains 5 consecutive video frames +.>
Figure SMS_9
Is->
Figure SMS_10
Time video frames.
I-b) aggregating all video frames
Figure SMS_11
Input constructed video instance segmentation model +.>
Figure SMS_12
In the step (a), an example mask of a video frame at each moment is obtained, specifically:
video frame aggregation using object detector
Figure SMS_25
Target detection is carried out on five video frames in the video frame to obtain a target detection set
Figure SMS_19
, wherein />
Figure SMS_26
,/>
Figure SMS_17
Is->
Figure SMS_24
The method comprises the steps of calculating the similarity of two adjacent moment video frame target matrixes to obtain the affinity matrixes of the two adjacent moment video frames to form an affinity matrix set->
Figure SMS_29
,/>
Figure SMS_30
,/>
Figure SMS_14
Is->
Figure SMS_21
And->
Figure SMS_13
A similarity matrix between the two. The affinity matrix is assembled->
Figure SMS_22
Target detection set->
Figure SMS_15
Simultaneously acting on a set of video frames>
Figure SMS_23
And inputs the three into a mask generator, through the target detection set +.>
Figure SMS_16
The initial position of an instance target in each corresponding moment video frame is outlined, then the initial position of the instance target is corrected through the affinity matrix of two adjacent moment video frames, the pixel area of which the similarity of two adjacent moment video frames in the initial position of the instance target is smaller than or equal to a set threshold value is eliminated, if the pixel area of the part is set to be 0, the specific position of the instance target is obtained, the instance target is output in the form of a mask, the instance masks of each moment are obtained, and a video instance mask set (I/II) is formed>
Figure SMS_27
。/>
Figure SMS_20
,/>
Figure SMS_28
Is->
Figure SMS_18
Temporal video frame instance mask wherein the mask generator is comprised of an automatic encoder.
In the embodiment, after the initial position of the instance target in the video frame is detected through the target detection set, the initial position of the instance target is corrected through the affinity matrix, and the pixel area which does not belong to the instance target is excluded through the similarity of the video frames at adjacent moments, so that the accuracy of the obtained instance mask is ensured.
I-c) calculating a loss function, and carrying out loss calculation on the generated video frame instance mask and a real instance label, so as to restrict the instance object segmentation model to obtain a more accurate mask.
Wherein the penalty functions include a size penalty function of the instance mask and an error penalty function of the instance mask prediction.
For the size of the instance mask in video, the segmentation mask of the instance object in the video frame should be within the range of the target detection bounding box, by the size loss function of the instance mask
Figure SMS_31
The size penalty of its video instance mask is calculated, thereby constraining the size of the instance mask to be within the range of target detection:
Figure SMS_32
wherein ,
Figure SMS_33
is->
Figure SMS_38
Time instance mask->
Figure SMS_39
Is->
Figure SMS_34
Time video frame target matrix,/->
Figure SMS_37
To calculate +.>
Figure SMS_40
Spill->
Figure SMS_41
Function of the outer matrix of pixels, ">
Figure SMS_35
For calculating the pixel point count function of the instance mask, < +.>
Figure SMS_36
Is the number of instance target objects in the video frame.
Error loss function predicted by instance mask
Figure SMS_42
An error of the instance mask prediction is calculated.
Figure SMS_43
wherein ,
Figure SMS_45
for calculating the pixel point count function of the instance mask, < +.>
Figure SMS_51
For the number of instance object objects in the video frame, < >>
Figure SMS_54
Is->
Figure SMS_47
Moment video frame object matrix->
Figure SMS_49
Middle->
Figure SMS_52
Mask matrix of individual instance objects,/>
Figure SMS_55
Is->
Figure SMS_44
Time instance mask->
Figure SMS_48
Middle->
Figure SMS_53
Mask matrix of individual instance objects,/>
Figure SMS_56
Is->
Figure SMS_46
and />
Figure SMS_50
Is a matrix of intersections of (a) and (b).
I-d) optimizing video instance segmentation model using loss function
Figure SMS_57
And obtaining an optimized instance object segmentation model.
I-e) repeating the steps I-b) to I-d) to set training times to obtain an iterated model
Figure SMS_58
For the trained video example segmentation model, the training times can be 3000 times when the training is implemented.
I-f) inputting all videos in the training video set into the model iterated in step I-e)
Figure SMS_59
In, a sequence set of video instance masks is obtained +.>
Figure SMS_60
The generating model is constructed by adopting a convolutional neural network.
The specific process for obtaining the trained generation model comprises the following steps:
II-a) training a video instance segmentation model to obtain a set of sequences of video instance masks
Figure SMS_61
As a training set for generating models. Sequence set->
Figure SMS_62
The number of the videos is the same as that used in the video instance segmentation model training, and the videos are shared
Figure SMS_63
Pieces of video data, each piece of video takes 5 consecutive sets of video instance masks one by one>
Figure SMS_64
,
Figure SMS_65
,/>
Figure SMS_66
Is->
Figure SMS_67
Temporal video frame instance mask.
II-b) mask-sequence of instances
Figure SMS_68
Four frames before->
Figure SMS_69
Input to the generative model
Figure SMS_70
In (3) an instance mask for predicting a fifth frame, obtaining a predicted instance mask +.>
Figure SMS_71
. Specific generative model
Figure SMS_72
Consists of an automatic encoder whose convolutional neural network learns the effective information of the normal event video mask, including the action information of the instance object mask and the appearance information, i.e. the scale of the mask in the video frameWhether the size is changed drastically, and the motion information is whether the shape of the mask in the video frame is shifted drastically. The future fifth frame is predicted from two features of the four virtual video instance object mask frames. The method comprises the steps of adopting a multi-branch generator, wherein the multi-branch generator is composed of a generating branch and a combining branch, the generating branch predicts different example objects by using U-Net, and the combining branch carries out cascade addition on the generated analysis prediction results to obtain a prediction example mask of a video frame.
II-c) calculating a loss function, wherein the generated prediction mask is required to be compared with the mask of the real video frame so as to calculate loss, and thus the generated model is optimized. The penalty consists of a number of constraints to continually generate a more accurate prediction mask. Furthermore, the scope of activity of the instance object mask should also be constrained, i.e., the scope of activity of the detection object.
The penalty functions employed by the generation model of the present embodiment include the active range penalty function of the instance, the instance mask prediction error penalty function, and the optical flow penalty function of the instance mask.
The process of obtaining the trained generative model comprises the following steps:
II-c-1) in real-time video, not the whole video frame has a target which needs to detect abnormality, under the condition of no abnormality, the behavior of the target can be in a predictable space range, correspondingly, the index variation of the example object mask is also in a predictable range, and the activity range constraint of the example object is added to improve the video abnormality detection performance. Calculating range losses by the active range loss function of an instance
Figure SMS_73
The instance object exists in the form of pixels in the video frame, and the pixels are in a certain area, namely the range of the instance object mask and the image matrix scale where the instance object mask is located, so that the effect of predicting the instance object in a range is better than that of predicting the instance object in the whole video frame. The activity range loss function of the example is:
Figure SMS_74
wherein ,
Figure SMS_75
is->
Figure SMS_76
Time of day prediction instance mask->
Figure SMS_77
Is->
Figure SMS_78
Time instance mask.
II-c-2) each instance object needs to be separately detected for abnormal event, a plurality of groups of instance object masks are arranged in one video frame, different instance object masks have different mask values for processing, and the difference mode can be used for calculating
Figure SMS_79
and />
Figure SMS_80
And loss of mask. Calculating error of instance mask prediction by mask prediction error loss function>
Figure SMS_81
The mask prediction error loss function is: />
Figure SMS_82
wherein ,
Figure SMS_84
pixel point count function for calculating an instance object mask,/-, for example>
Figure SMS_88
For the number of instance objects in a video frame, +.>
Figure SMS_91
Is->
Figure SMS_85
Time mask frame->
Figure SMS_87
Mask matrix of individual instance objects,/>
Figure SMS_90
Is->
Figure SMS_93
The>
Figure SMS_83
Prediction mask matrix of individual instance objects, +.>
Figure SMS_86
Is->
Figure SMS_89
and />
Figure SMS_92
Is a matrix of intersections of (a) and (b).
II-c-3) computing optical flow loss of instance object mask by optical flow loss function of instance mask
Figure SMS_94
The optical flow loss function of the instance mask is:
Figure SMS_95
wherein ,
Figure SMS_97
calculating a function of the optical flow for the 2D image, wherein +.>
Figure SMS_99
For the number of instance objects in a video frame, +.>
Figure SMS_102
Is->
Figure SMS_98
Time mask frame->
Figure SMS_100
Mask matrix of individual instance objects,/>
Figure SMS_104
Is->
Figure SMS_106
Time mask frame->
Figure SMS_96
Mask matrix of individual instance objects,/>
Figure SMS_101
Is->
Figure SMS_103
The>
Figure SMS_105
The prediction mask matrix of the instance objects, the mask frame is an instance mask obtained by the video instance segmentation model.
II-d) optimization by loss function
Figure SMS_107
And obtaining an optimized generation model.
II-e) repeating the steps II-b) to II-d) for a set number of times to obtain an iterative generation model, wherein the set number of times can be 300 times when the iterative generation model is a trained generation model.
And inputting the historical moment instance mask into a trained generation model to obtain the current moment prediction instance mask.
Calculating and obtaining the duty ratio of the video frame of the current time instance mask at the current time
Figure SMS_108
Merging of current time instance mask with current time instance maskRatio->
Figure SMS_109
Weighting the duty ratio and the cross-over ratio to obtain the proportion of the video frame at the current moment
Figure SMS_110
Figure SMS_112
According to the principle of near-large and far-small, the larger the size of the example object in the video frame is, the closer the example object is to the automatic unmanned sweeping robot vehicle is illustrated; and vice versa +.>
Figure SMS_115
Mask +.>
Figure SMS_116
Mask with actual instance
Figure SMS_113
The smaller its value indicates that the predicted instance mask deviates from the truly occurring instance object, i.e., the instance object is more likely to experience abnormal behavior. In calculating->
Figure SMS_114
At the time, due to the duty ratioSSum-to-cross ratioIoUIs not synchronous with the abnormality of the instance object, then the following formula will be used to take up the ratio +.>
Figure SMS_117
And cross ratio->
Figure SMS_118
Weighting to obtain specific gravity of current video frame
Figure SMS_111
Figure SMS_119
wherein ,
Figure SMS_120
for weight calculation of the specific gravity, the value range is +.>
Figure SMS_121
Weighting the duty ratio and the cross ratio to obtain the specific gravity of the current video frame +.>
Figure SMS_122
By->
Figure SMS_123
The method can judge whether the distance of the instance object is far or near or not, and can judge the abnormal behavior of the instance object.
Judging whether an abnormal event occurs at the current moment according to the proportion of the video frame at the current moment, wherein the abnormal event is specifically:
specific gravity of video frame at current moment
Figure SMS_124
And an abnormal event specific gravity threshold ++>
Figure SMS_125
Comparing, when the specific gravity of the video frame at the current moment is greater than or equal to the abnormal event specific gravity threshold value, namely +.>
Figure SMS_126
When the abnormal event occurs at the current moment, judging that the abnormal event occurs; when the specific gravity of the video frame at the current moment is smaller than the specific gravity threshold value of the abnormal event, namely
Figure SMS_127
And judging that no abnormal event occurs at the current moment.
When judging that an abnormal event occurs at the current moment, highlighting an abnormal instance in a video frame at the current moment, and performing braking operation on the unmanned sweeping machine, and when judging that the abnormal event does not occur, performing no braking operation on the unmanned sweeping machine, wherein the camera of the unmanned sweeping machine is always in a working mode, and after waiting for the end of the abnormal eventI.e. the detection of abnormal events in real-time video is restored to
Figure SMS_128
And after the state, the automatic unmanned sweeping robot car resumes the working mode. When the road is crowded and the crowd is dense, the automatic unmanned sweeping machine vehicle can respond to the pause work due to the fact that the occupation of the example object mask is larger, and the real-time detection of the automatic unmanned sweeping machine vehicle is restored to the ++when the road is not crowded and the crowd is sparse any more>
Figure SMS_129
The state will revert to the operational mode.
According to the method for detecting the abnormality of the unmanned sweeping robot based on video segmentation, disclosed by the embodiment, the objects to be detected are identified by fusing the instance segmentation of the real-time video, and the abnormal event detection is carried out on the instances in the video independently, so that the accuracy of the abnormal event detection is improved. In addition, the self-judgment of work when the road is clear and pause when the road is crowded can be realized, and other normal work can not be influenced. Judging whether an abnormal event occurs at the current moment by utilizing the proportion of the video frame at the current moment, and judging whether the distance of the instance object is far or near or judging the abnormal behavior of the instance object, thereby improving the accuracy of judging the abnormal event; the built video example segmentation model is not trained by taking a fountain, a leaf blown by wind and the like which can be used as a background as examples, so that when a monitoring video is subjected to example segmentation by the trained video example segmentation model, objects which can be used as the background, such as the fountain, the leaf blown by wind and the like, are not segmented, and the error rate of abnormal event judgment is reduced.
Example 2
In this embodiment, an anomaly detection system of an unmanned sweeping robot based on video segmentation is disclosed, comprising:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current instance mask and a historical instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask of the current moment according to the historical moment instance mask and the trained generation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
Example 3
In this embodiment, an electronic device is disclosed that includes a memory, a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the method for detecting anomalies in an unmanned sweeping robot based on video segmentation disclosed in embodiment 1.
Example 4
In this embodiment, a computer readable storage medium is disclosed for storing computer instructions that, when executed by a processor, perform the steps of the method for detecting anomalies in an unmanned sweeping vehicle based on video segmentation disclosed in embodiment 1.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.

Claims (8)

1. The method for detecting the abnormality of the unmanned sweeping robot based on video segmentation is characterized by comprising the following steps of:
acquiring a monitoring video of the unmanned sweeping robot;
performing frame division on the monitoring video to obtain a current moment video frame and a historical moment video frame;
obtaining a current time instance mask and a historical time instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
obtaining a predicted instance mask at the current moment according to a historical moment instance mask and a trained generation model, wherein the generation model adopts a convolutional neural network model, a loss function of the generation model comprises an active range loss function of an instance, an instance mask prediction error loss function and an optical flow loss function of the instance mask, and the active range loss function of the instance is as follows:
Figure QLYQS_1
wherein ,
Figure QLYQS_2
for loss of range->
Figure QLYQS_3
Is->
Figure QLYQS_4
Time of day prediction instance mask->
Figure QLYQS_5
Is->
Figure QLYQS_6
A time instance mask;
the example mask prediction error loss function is:
Figure QLYQS_7
wherein ,
Figure QLYQS_11
error predicted for instance mask, +.>
Figure QLYQS_14
Pixel point count function for calculating an instance object mask,/-, for example>
Figure QLYQS_18
For the number of instance objects in a video frame, +.>
Figure QLYQS_10
Is->
Figure QLYQS_13
Time mask frame->
Figure QLYQS_17
Mask matrix of individual instance objects,/>
Figure QLYQS_19
Is->
Figure QLYQS_8
The>
Figure QLYQS_12
Prediction mask matrix of individual instance objects, +.>
Figure QLYQS_15
Is->
Figure QLYQS_16
and />
Figure QLYQS_9
Is a matrix of intersections of (a);
the optical flow loss function of the instance mask is:
Figure QLYQS_20
wherein ,
Figure QLYQS_23
optical flow loss for instance object mask, +.>
Figure QLYQS_26
Calculating a function of the optical flow for the 2D image, wherein +.>
Figure QLYQS_29
For the number of instance objects in a video frame, +.>
Figure QLYQS_24
Is->
Figure QLYQS_28
Time mask frame->
Figure QLYQS_31
Mask matrix of individual instance objects,/>
Figure QLYQS_32
Is->
Figure QLYQS_21
Time mask frame->
Figure QLYQS_25
Mask matrix of individual instance objects,/>
Figure QLYQS_27
Is->
Figure QLYQS_30
The>
Figure QLYQS_22
A prediction mask matrix of the individual instance objects, the mask frame being an instance mask obtained by a video instance segmentation model;
calculating to obtain the duty ratio of the current time instance mask in the current time video frame and the intersection ratio of the current time prediction instance mask and the current time instance mask, and weighting the duty ratio and the intersection ratio to obtain the proportion of the current time video frame;
and judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
2. The method for detecting the anomaly of the unmanned sweeping machine vehicle based on video segmentation according to claim 1, wherein a video instance segmentation model detects instance targets from a current time video frame and a historical time video frame, obtains a target detection set comprising a current time video frame target matrix and a historical time video frame target matrix, calculates similarity of two adjacent time video frame target matrices, obtains an affinity matrix set, circles initial positions of instance targets in each time video frame through the target detection set, corrects the initial positions of the instance targets through the affinity matrix set, and excludes pixel areas with similarity of two adjacent time video frames in the initial positions of the instance targets being smaller than or equal to a set threshold value, so as to obtain instance masks of each time.
3. The method for detecting the anomaly of the unmanned sweeping robot based on video segmentation according to claim 1, wherein the specific gravity of the video frame at the current moment is compared with an anomaly event specific gravity threshold value, and when the specific gravity of the video frame at the current moment is greater than or equal to the anomaly event specific gravity threshold value, the anomaly event is judged to occur at the current moment; and when the proportion of the video frame at the current moment is smaller than the proportion threshold value of the abnormal event, judging that the abnormal event does not occur at the current moment.
4. The method for detecting the abnormality of the unmanned sweeping robot based on the video segmentation according to claim 1, wherein when the occurrence of the abnormal event at the current moment is judged, the abnormal instance is highlighted in the video frame at the current moment, and the unmanned sweeping robot is braked.
5. The anomaly detection method for unmanned sweeping vehicle based on video segmentation as set forth in claim 1, wherein the duty ratio and the cross-over ratio are weighted by the following formula to obtain the specific gravity of the video frame at the current moment
Figure QLYQS_33
Figure QLYQS_34
wherein ,
Figure QLYQS_35
calculating a specific gravity for the weighting; />
Figure QLYQS_36
Masking the duty cycle of the video frame at the current time for the current time instance; />
Figure QLYQS_37
The cross ratio of the instance mask to the instance mask for the current time is predicted for the current time.
6. An unmanned robot car anomaly detection system that sweeps floor based on video segmentation, characterized in that includes:
the video acquisition module is used for acquiring a monitoring video of the unmanned sweeping robot;
the frame dividing module is used for dividing the frame of the monitoring video to obtain a current moment video frame and a historical moment video frame;
the real instance mask obtaining module is used for obtaining a current instance mask and a historical instance mask according to the current time video frame, the historical time video frame and the trained video instance segmentation model;
the predicted instance mask obtaining module is used for obtaining a predicted instance mask at the current moment according to the historical moment instance mask and a trained generating model, the generating model adopts a convolutional neural network model, a loss function of the generating model comprises an active range loss function of an instance, an instance mask prediction error loss function and an optical flow loss function of the instance mask, and the active range loss function of the instance is:
Figure QLYQS_38
wherein ,
Figure QLYQS_39
for loss of range->
Figure QLYQS_40
Is->
Figure QLYQS_41
Time of day prediction instance mask->
Figure QLYQS_42
Is->
Figure QLYQS_43
A time instance mask;
the example mask prediction error loss function is:
Figure QLYQS_44
wherein ,
Figure QLYQS_48
error predicted for instance mask, +.>
Figure QLYQS_50
To calculatePixel point count function of instance object mask, +.>
Figure QLYQS_54
For the number of instance objects in a video frame, +.>
Figure QLYQS_47
Is->
Figure QLYQS_51
Time mask frame->
Figure QLYQS_52
Mask matrix of individual instance objects,/>
Figure QLYQS_55
Is->
Figure QLYQS_45
The>
Figure QLYQS_49
Prediction mask matrix of individual instance objects, +.>
Figure QLYQS_53
Is->
Figure QLYQS_56
and />
Figure QLYQS_46
Is a matrix of intersections of (a);
the optical flow loss function of the instance mask is:
Figure QLYQS_57
wherein ,
Figure QLYQS_60
masking instance objectsOptical flow loss->
Figure QLYQS_63
Calculating a function of the optical flow for the 2D image, wherein +.>
Figure QLYQS_66
For the number of instance objects in a video frame, +.>
Figure QLYQS_61
Is->
Figure QLYQS_65
Time mask frame->
Figure QLYQS_68
Mask matrix of individual instance objects,/>
Figure QLYQS_69
Is->
Figure QLYQS_58
Time mask frame->
Figure QLYQS_62
Mask matrix of individual instance objects,/>
Figure QLYQS_64
Is->
Figure QLYQS_67
The>
Figure QLYQS_59
A prediction mask matrix of the individual instance objects, the mask frame being an instance mask obtained by a video instance segmentation model;
the specific gravity acquisition module of the video frame is used for calculating and obtaining the duty ratio of the current moment instance mask in the video frame at the current moment and the intersection ratio of the current moment prediction instance mask and the current moment instance mask, and weighting the duty ratio and the intersection ratio to obtain the specific gravity of the video frame at the current moment;
the abnormal event judging module is used for judging whether an abnormal event occurs at the current moment according to the specific gravity of the video frame at the current moment.
7. An electronic device comprising a memory and a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of a method for detecting anomalies in an unmanned sweeping robot based on video segmentation as defined in any one of claims 1 to 6.
8. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of a method of unmanned robot cleaner anomaly detection based on video segmentation of any one of claims 1 to 6.
CN202310252874.5A 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system Active CN115965899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310252874.5A CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310252874.5A CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Publications (2)

Publication Number Publication Date
CN115965899A CN115965899A (en) 2023-04-14
CN115965899B true CN115965899B (en) 2023-06-06

Family

ID=85889850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310252874.5A Active CN115965899B (en) 2023-03-16 2023-03-16 Video segmentation-based unmanned sweeping robot anomaly detection method and system

Country Status (1)

Country Link
CN (1) CN115965899B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597864A (en) * 2020-12-16 2021-04-02 佳都新太科技股份有限公司 Monitoring video abnormity detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268563B (en) * 2014-09-15 2017-05-17 合肥工业大学 Video abstraction method based on abnormal behavior detection
US20160381320A1 (en) * 2015-06-25 2016-12-29 Nokia Technologies Oy Method, apparatus, and computer program product for predictive customizations in self and neighborhood videos
CN112989942A (en) * 2021-02-09 2021-06-18 四川警察学院 Target instance segmentation method based on traffic monitoring video
CN114067251B (en) * 2021-11-18 2023-09-15 西安交通大学 Method for detecting anomaly of unsupervised monitoring video prediction frame
CN115035432A (en) * 2022-03-10 2022-09-09 云从科技集团股份有限公司 Abnormal video detection method, device, medium and equipment
CN114724060A (en) * 2022-03-14 2022-07-08 中国人民解放军国防科技大学 Method and device for unsupervised video anomaly detection based on mask self-encoder

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597864A (en) * 2020-12-16 2021-04-02 佳都新太科技股份有限公司 Monitoring video abnormity detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于时空感知级联神经网络的视频前背景分离;杨敬钰;师雯;李坤;宋晓林;岳焕景;;天津大学学报(自然科学与工程技术版)(06);第87-94页 *

Also Published As

Publication number Publication date
CN115965899A (en) 2023-04-14

Similar Documents

Publication Publication Date Title
Zhang et al. Deep convolutional neural networks for forest fire detection
CN110070074B (en) Method for constructing pedestrian detection model
CN110084165B (en) Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation
CN104574439A (en) Kalman filtering and TLD (tracking-learning-detection) algorithm integrated target tracking method
CN103246896A (en) Robust real-time vehicle detection and tracking method
CN103488993A (en) Crowd abnormal behavior identification method based on FAST
CN113592905B (en) Vehicle driving track prediction method based on monocular camera
CN111031266B (en) Method, system and medium for filtering background activity noise of dynamic visual sensor based on hash function
Toyungyernsub et al. Double-prong convlstm for spatiotemporal occupancy prediction in dynamic environments
CN114202803A (en) Multi-stage human body abnormal action detection method based on residual error network
CN115760921A (en) Pedestrian trajectory prediction method and system based on multi-target tracking
Wang et al. Multi-agent trajectory prediction with spatio-temporal sequence fusion
Kanu-Asiegbu et al. Leveraging trajectory prediction for pedestrian video anomaly detection
Mann et al. Predicting future occupancy grids in dynamic environment with spatio-temporal learning
CN115965899B (en) Video segmentation-based unmanned sweeping robot anomaly detection method and system
CN113392817A (en) Vehicle density estimation method and device based on multi-row convolutional neural network
Srilekha et al. A novel approach for detection and tracking of vehicles using Kalman filter
Katariya et al. A pov-based highway vehicle trajectory dataset and prediction architecture
CN112255141B (en) Thermal imaging gas monitoring system
CN117237676B (en) Method for processing small target drop track of nuclear power plant based on event camera
Sun et al. Visual perception based situation analysis of traffic scenes for autonomous driving applications
Lo Optical Flow Based Motion Detection for Autonomous Driving
CN116246492B (en) Vehicle lane change collision risk prediction method based on space-time attention LSTM and super-threshold model
Yugendar et al. Analysis of crowd flow parameters using artificial neural network
Liu et al. Region-Based Illumination-Temperature Awareness and Cross-Modality Enhancement for Multispectral Pedestrian Detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20230414

Assignee: SHANDONG JUXIANG MACHINERY CO.,LTD.

Assignor: SHANDONG KAILIN ENVIRONMENTAL PROTECTION EQUIPMENT Co.,Ltd.

Contract record no.: X2023980047848

Denomination of invention: A Video Segmentation Based Anomaly Detection Method and System for Unmanned Sweeping Machine Vehicles

Granted publication date: 20230606

License type: Common License

Record date: 20231123

EE01 Entry into force of recordation of patent licensing contract