CN114241401A - Abnormality determination method, apparatus, device, medium, and product

Info

Publication number
CN114241401A
CN114241401A (application CN202111288937.XA)
Authority
CN
China
Prior art keywords: target, detected, image data, target detection, judgment result
Prior art date
Legal status: Pending
Application number
CN202111288937.XA
Other languages
Chinese (zh)
Inventor
周超
杜呈欣
韩佩瑶
李樊
孟宇坤
王志飞
吴跃
赵俊华
王越彤
吴卉
孙同庆
郭悦
蔡晓蕾
李高科
李帅
宫玉昕
高洪波
宗慧曦
Current Assignee
China Academy of Railway Sciences Corp Ltd CARS
Institute of Computing Technologies of CARS
Beijing Jingwei Information Technology Co Ltd
Original Assignee
China Academy of Railway Sciences Corp Ltd CARS
Institute of Computing Technologies of CARS
Beijing Jingwei Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Academy of Railway Sciences Corp Ltd CARS, Institute of Computing Technologies of CARS, Beijing Jingwei Information Technology Co Ltd filed Critical China Academy of Railway Sciences Corp Ltd CARS
Priority to CN202111288937.XA
Publication of CN114241401A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an abnormality determination method, apparatus, device, medium, and product. The method includes: determining the scene category of video data obtained through a camera device; judging whether the target to be detected corresponding to the scene category needs to move, to obtain a first judgment result; inputting the video data into a target detection model and outputting, through the target detection model, a target detection value of the target to be detected in the video data, wherein the target detection model is obtained by training on video sample data and target detection sample values corresponding to sample targets to be detected; judging whether the target to be detected moves based on the target detection value, to obtain a second judgment result; and judging whether the target to be detected is abnormal based on the first judgment result and the second judgment result. The method and the device overcome the defect of the prior art that, because video content is identified and analyzed manually, abnormal conditions such as a passenger showing no movement for a long time or an important article being displaced cannot be discovered in time.

Description

Abnormality determination method, apparatus, device, medium, and product
Technical Field
The invention relates to the technical field of rail transit, and in particular to an abnormality determination method, apparatus, device, medium, and product.
Background
In recent years, the rail industry in China has developed rapidly; owing to advantages such as no land occupation, large transport capacity, and low energy consumption, many cities are actively investing in rail transit construction. The huge passenger flow puts great passenger-traffic pressure on stations, and the pressure on rail transit operation management, public safety assurance, and related aspects grows day by day. At the present stage, management mainly relies on long-term operation experience combined with traditional video monitoring, where station video monitoring provides only functions such as video acquisition, preview, transmission, and storage, and the analysis and identification of video content is performed mainly by manual labor.
Identifying video content through manual analysis has many defects. For example, the accuracy of analyzing small-size targets in specific scenes, such as a passenger showing no movement for a long time or an important article being displaced, is not high, and the position of an object cannot be judged accurately. Moreover, such abnormal conditions cannot be discovered in time, which leads to problems such as casualties and untimely emergency treatment.
Disclosure of Invention
The invention provides an abnormality determination method, apparatus, device, medium, and product, which are used to overcome the defect of the prior art that, because video content is identified and analyzed manually, abnormal conditions such as a passenger showing no movement for a long time or an important article being displaced cannot be discovered in time, thereby achieving the purpose of effectively and promptly judging whether the target to be detected is abnormal.
The invention provides an abnormality determination method, which comprises the following steps:
determining a scene type of video data obtained through a camera device;
judging whether the target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result;
inputting the video data into a target detection model, and outputting a target detection value of the target to be detected in the video data through the target detection model, wherein the target detection model is obtained through training video sample data and a target detection sample value corresponding to the target to be detected;
judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result;
and judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
According to an abnormality determination method provided by the present invention, the inputting of the video data into a target detection model and the outputting of a target detection value of the target to be detected in the video data by the target detection model includes:
cutting each frame of data in the video data to obtain image data, and sequencing the image data according to the time stamp to obtain target image data;
and sequentially inputting the target image data into the target detection model, and sequentially outputting the target detection value of the target to be detected in each frame of the target image data through the target detection model.
According to an abnormality determination method provided by the present invention, the sequentially inputting the target image data into the target detection model, and sequentially outputting the target detection value of the target to be detected in each frame of the target image data through the target detection model includes:
sequentially inputting the target image data into the target detection model;
executing the following processing procedures on each frame of the target image data through the target detection model: extracting target characteristics of the target to be detected in the target image data; generating a candidate box for the target feature; mapping the candidate box onto the target image data; determining a target detection value of the target to be detected through the candidate frame on the target image data;
and determining a target candidate frame from the candidate frames, and outputting a target detection value of the target to be detected corresponding to the target candidate frame.
According to an abnormality determination method provided by the present invention, the extracting a target feature of the target to be detected in the target image data includes:
performing shallow feature extraction on the target image data to obtain target shallow features;
carrying out deep feature extraction on the target image data to obtain target deep features;
and fusing the target shallow layer feature and the target deep layer feature to obtain the target feature.
According to an abnormality determination method provided by the present invention, the determining whether the target to be detected moves based on the target detection value to obtain a second determination result includes:
based on the target detection value of the target to be detected in the target image data corresponding to the current time, executing the following calculation process:
determining a target detection value of the target to be detected in each frame of target image data at the current time; calculating a target detection value of the target to be detected in the first frame of target image data at the current time, and taking a first position intersection value of the target detection value of the target to be detected in other frames of target image data at the current time, wherein the target detection value of the target to be detected in the first frame of target image data at the current time is taken as a current target detection value; determining a target detection value of the target to be detected in each frame of target image data corresponding to a time point after a first preset time; calculating the current target detection value, and calculating a second position intersection value of the target detection value of the target to be detected in each frame of target image data corresponding to the time point after the first preset time; when the first position intersection value and the second position intersection value are determined to be respectively larger than a first preset threshold value, determining a target detection value of the target to be detected in the last frame of target image data corresponding to a time point after at least one second preset time; calculating the current target detection value, and calculating a third position intersection value of the target detection value of the target to be detected in the last frame of target image data corresponding to the time point after the at least one second preset time;
when the third position intersection value is determined to be larger than a second preset threshold value, determining that the second judgment result is that the target to be detected does not move;
and when the third position intersection value is determined to be smaller than or equal to the second preset threshold value, determining that the second judgment result is that the target to be detected moves.
According to an abnormality determination method provided by the present invention, determining whether an abnormality exists in the target to be detected based on the first determination result and the second determination result includes:
when the first judgment result indicates that the target to be detected needs to move and the second judgment result indicates that the target to be detected does not move, judging that the target to be detected is abnormal;
when the first judgment result is that the target to be detected needs to move and the second judgment result is that the target to be detected moves, judging that the target to be detected is not abnormal;
when the first judgment result indicates that the target to be detected does not need to move and the second judgment result indicates that the target to be detected does not move, judging that the target to be detected is not abnormal;
and when the first judgment result indicates that the target to be detected does not need to move and the second judgment result indicates that the target to be detected moves, judging that the target to be detected is abnormal.
The present invention also provides an abnormality determination device including:
the determining module is used for determining the scene type of the video data obtained by the camera device;
the first judgment module is used for judging whether the target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result;
the output module is used for inputting the video data into a target detection model and outputting a target detection value of the target to be detected in the video data through the target detection model, wherein the target detection model is obtained by training video sample data and a target detection sample value corresponding to the target to be detected;
the second judgment module is used for judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result;
and the judging module is used for judging whether the target to be detected is abnormal or not based on the first judging result and the second judging result.
The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any of the above-mentioned abnormality determination methods when executing the program.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the abnormality determination method as in any one of the above.
The present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, implement the steps of the abnormality determination method according to any one of the above.
According to the abnormality determination method, apparatus, device, medium, and product provided by the invention, the scene category of the video data obtained through the camera device is determined; whether the target to be detected corresponding to the scene category needs to move is judged to obtain a first judgment result; the video data is input into the target detection model, and the target detection value of the target to be detected in the video data is output through the target detection model; whether the target to be detected moves is judged based on the target detection value to obtain a second judgment result; and whether the target to be detected is abnormal is judged based on the first judgment result and the second judgment result. By judging both whether the target to be detected needs to move and whether it actually moves, the purpose of effectively and quickly determining whether the target to be detected is abnormal is achieved. This solves the problem in the prior art that, with manual identification and analysis of video content, abnormal conditions such as a passenger showing no movement for a long time or an important article being displaced cannot be judged timely and effectively, leading to casualties and untimely emergency treatment, and it effectively improves user experience.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow chart of an anomaly determination method according to the present invention;
FIG. 2 is a second schematic flow chart of the abnormality determination method according to the present invention;
FIG. 3A is a schematic diagram of a target detection network according to the present invention;
FIG. 3B is a second schematic diagram of a target detection network according to the present invention;
FIG. 4 is a third schematic flow chart of an anomaly determination method according to the present invention;
fig. 5 is a schematic structural view of an abnormality determination device provided by the present invention;
fig. 6 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The abnormality determination method of the present invention is described below with reference to fig. 1 to 4.
The embodiment of the invention provides an abnormality judgment method which can be applied to intelligent terminals such as mobile phones, computers, tablets and the like and can also be applied to servers. The method is described below by using the server as an example, but the method is only described by way of example and is not intended to limit the scope of the present invention. The other descriptions in the embodiments of the present invention are also for illustration purposes, and are not intended to limit the scope of the present invention.
Specifically, targets to be detected in a subway, such as target users and fire-fighting equipment, are generally small or medium-sized targets within 96 pixels × 96 pixels; large targets are almost absent. Existing target detection algorithms have difficulty identifying large numbers of small-size targets, and in some scenes background areas, such as portrait posters, are identified as targets to be detected, which deviates from the identification task. Therefore, the target detection model used by the invention has stronger detection performance on small-size targets.
The specific implementation of the abnormality determination method of the present invention is shown in fig. 1:
in step 101, a scene type of video data obtained by an image pickup device is determined.
The camera device includes a camera.
Specifically, each monitored area includes at least one camera, and the cameras in each monitored area can be arranged at different angles to avoid a camera being occluded, which would affect the judgment result.
Specifically, the video data sent by each camera in each monitored area is acquired, and the scene category in each piece of video data is identified and analyzed. The scene categories include: fire scenes, public rest scenes, and the like.
And 102, judging whether the target to be detected corresponding to the scene type needs to move or not to obtain a first judgment result.
Specifically, the corresponding relationship between a scene category and the target to be detected for that scene category is stored in advance, and the corresponding relationship indicates whether the target to be detected corresponding to the scene category needs to move. After the scene category is determined, whether the corresponding target to be detected needs to move is determined based on the pre-stored corresponding relationship.
For example, the target to be detected corresponding to the fire scene is a fire-fighting article and does not need to be moved; a target to be detected corresponding to a public rest scene is a target user and needs to move; and the like.
For example, when the scene type is a fire scene, the first judgment result is that the target to be detected does not need to move; and when the scene type is a public rest scene, the first judgment result indicates that the target to be detected needs to move.
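As an illustration of the pre-stored corresponding relationship described above, the sketch below encodes the two examples just given as a lookup table. This is a minimal Python sketch; the scene names, field names, and helper function are assumptions for illustration and are not given in the patent.

```python
# Hypothetical lookup table for the scene-category -> target correspondence;
# keys and field names are illustrative assumptions.
SCENE_TARGET_RULES = {
    "fire_scene": {"target": "fire_fighting_article", "needs_to_move": False},
    "public_rest_scene": {"target": "target_user", "needs_to_move": True},
}

def first_judgment(scene_category: str) -> bool:
    """First judgment result: True if the target to be detected for this
    scene category needs to move."""
    return SCENE_TARGET_RULES[scene_category]["needs_to_move"]
```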
And 103, inputting the video data into a target detection model, and outputting a target detection value of a target to be detected in the video data through the target detection model.
The target detection model is obtained by training video sample data and target detection sample values corresponding to a sample target to be detected.
In a specific embodiment, each frame of data in the video data is cut to obtain image data, and the image data is sequenced according to the time stamps to obtain target image data; and sequentially inputting the target image data into the target detection model, and sequentially outputting the target detection value of the target to be detected in each frame of target image data through the target detection model.
Specifically, a digital IP camera, a Digital Video Recorder (DVR), a Network Video Recorder (NVR), and the like are used to collect video streams to obtain video data. Specifically, the video stream is acquired according to a Real Time Streaming Protocol (RTSP) so as to conform to a preset file format.
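As a minimal sketch of this acquisition and ordering step, assuming OpenCV is used to read the RTSP stream (the URL and variable names are placeholders, not values from the patent):

```python
import cv2

def frames_sorted_by_timestamp(rtsp_url: str):
    """Read an RTSP stream, cut it into per-frame image data, and return
    the frames ordered by timestamp (the target image data)."""
    cap = cv2.VideoCapture(rtsp_url)
    frames = []
    while True:
        ok, image = cap.read()                   # one frame of image data
        if not ok:
            break
        ts_ms = cap.get(cv2.CAP_PROP_POS_MSEC)   # frame timestamp in milliseconds
        frames.append((ts_ms, image))
    cap.release()
    frames.sort(key=lambda item: item[0])        # order frames by timestamp
    return [image for _, image in frames]

# target_image_data = frames_sorted_by_timestamp("rtsp://<camera-address>/stream")
```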
In a specific embodiment, the specific implementation of determining the target detection value of the target to be detected in the target image data through the target detection model is as follows: and sequentially inputting the target image data into a target detection model according to the time stamps, executing a processing process on each frame of target image data through the target detection model, further determining a target candidate frame from the candidate frames through an output layer, and outputting a target detection value of a target to be detected corresponding to the target candidate frame.
The specific implementation of the processing process of each frame of target image data by the target detection model is as shown in fig. 2:
step 201, extracting target features of a target to be detected in target image data.
Before the target image data is input into the target detection model, the target image data is preprocessed, for example cut to obtain target image data of a preset size.
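A minimal sketch of this preprocessing step, again assuming OpenCV; the 512 × 512 preset size is an illustrative assumption:

```python
import cv2

PRESET_SIZE = (512, 512)  # (width, height); the actual preset size is not given

def preprocess(frame):
    """Cut/resize one frame of target image data to the preset size."""
    return cv2.resize(frame, PRESET_SIZE)
```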
In a specific embodiment, shallow feature extraction is carried out on target image data to obtain target shallow features; carrying out deep feature extraction on the target image data to obtain target deep features; and fusing the target shallow layer characteristic and the target deep layer characteristic to obtain the target characteristic.
Specifically, feature extraction is performed on target image data through a feature extraction layer of a target detection network to obtain target features.
Specifically, the description takes video data collected in a subway as an example; the target image data of the invention is obtained based on video data captured in a subway. A deep neural network is used to perform deep feature extraction on the target image data. A deep neural network can extract a large amount of semantic information, which helps distinguish the target to be detected from the background, but it loses detail, which reduces the detection effect in specific scenes. Therefore, to compensate for this shortcoming of the deep neural network, a shallow neural network is used to perform shallow feature extraction on the target image data, and the two feature maps of different scales, the target shallow feature and the target deep feature, are then fused to form a new feature map, namely the target feature, so that the target to be detected can be identified more easily, as shown in formula (1):
fa_n = pooling(conv_{n_1}) + pooling(conv_{n_2})
conv_{n_1} = conv_{3×3}(conv_{2_2}) + conv_{3×3}(deconv_{2×2}(conv_{4_3}))
conv_{n_2} = conv_{3×3}(conv_{3_3}) + conv_{3×3}(deconv_{2×2}(conv_{5_3}))    (1)
where fa_n denotes the target feature; pooling(conv_{n_1}) and pooling(conv_{n_2}) denote pooling operations applied to the fused convolutions conv_{n_1} and conv_{n_2}; conv_{n_1} denotes the convolution features obtained by fusing the second-layer features with the fourth-layer features of the network; conv_{n_2} denotes the convolution features obtained by fusing the third-layer features with the fifth-layer features of the network; conv_{3×3}(conv_{2_2}) denotes a 3×3 convolution applied to the second convolution of the second layer; conv_{3×3}(deconv_{2×2}(conv_{4_3})) denotes a 2×2 deconvolution followed by a 3×3 convolution applied to the third convolution of the fourth layer; conv_{3×3}(conv_{3_3}) denotes a 3×3 convolution applied to the third convolution of the third layer; and conv_{3×3}(deconv_{2×2}(conv_{5_3})) denotes a 2×2 deconvolution followed by a 3×3 convolution applied to the third convolution of the fifth layer.
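As a hedged illustration of formula (1), the following PyTorch sketch fuses shallow and deep feature maps with 2×2 deconvolutions, 3×3 convolutions, and pooling. It assumes a VGG-style backbone naming (conv2_2, conv3_3, conv4_3, conv5_3); the channel arguments, the bilinear resizing used to align spatial sizes, and the 38 × 38 pooled grid are illustrative assumptions, not values given in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowDeepFusion(nn.Module):
    def __init__(self, c2: int, c3: int, c4: int, c5: int, out_c: int):
        super().__init__()
        # 2x2 deconvolutions (deconv_{2x2}) upsample the deep feature maps
        self.up4 = nn.ConvTranspose2d(c4, out_c, kernel_size=2, stride=2)
        self.up5 = nn.ConvTranspose2d(c5, out_c, kernel_size=2, stride=2)
        # 3x3 convolutions (conv_{3x3}) project each map before fusion
        self.p2 = nn.Conv2d(c2, out_c, kernel_size=3, padding=1)
        self.p3 = nn.Conv2d(c3, out_c, kernel_size=3, padding=1)
        self.p4 = nn.Conv2d(out_c, out_c, kernel_size=3, padding=1)
        self.p5 = nn.Conv2d(out_c, out_c, kernel_size=3, padding=1)

    def forward(self, conv2_2, conv3_3, conv4_3, conv5_3):
        # conv_{n_1}: fuse second-layer features with upsampled fourth-layer features
        d4 = F.interpolate(self.p4(self.up4(conv4_3)),
                           size=conv2_2.shape[-2:], mode="bilinear",
                           align_corners=False)
        conv_n1 = self.p2(conv2_2) + d4
        # conv_{n_2}: fuse third-layer features with upsampled fifth-layer features
        d5 = F.interpolate(self.p5(self.up5(conv5_3)),
                           size=conv3_3.shape[-2:], mode="bilinear",
                           align_corners=False)
        conv_n2 = self.p3(conv3_3) + d5
        # fa_n: pool both fused maps to a common grid and sum them
        fa_n = (F.adaptive_max_pool2d(conv_n1, (38, 38)) +
                F.adaptive_max_pool2d(conv_n2, (38, 38)))
        return fa_n
```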
Step 202, a candidate box is generated for the target feature.
Specifically, a target feature is input into a candidate region generation network through a feature extraction layer, a candidate frame is generated for the target feature, and the category of the candidate frame is judged through a wavelet neural network, wherein the category comprises: foreground and background.
Specifically, image data obtained in a subway contains more noise than image data obtained in natural scenes, and partial occlusion in the target image data means that some targets to be detected are annotated only by their visible parts, so detection can only treat such a visible part of a target as the whole. Therefore, a wavelet neural network is adopted to judge whether a candidate frame is foreground or background, so as to highlight the details of the problem to be processed, effectively extract local information, and improve the generalization performance of the target detection network. The wavelet neural network is part of the candidate region generation network.
To quantify the degree of incompleteness of the target to be detected during video detection, a function S_i is constructed using the intersection-over-foreground (IoF) criterion to jointly consider IoF and a confidence factor; see formula (2):
(Formula (2) is rendered as an image in the original publication; it combines I_i and C_i, weighted by λ and offset by δ, into the selection probability S_i.)
where C_i denotes the confidence of the i-th detection result, I_i denotes the corresponding maximum IoF of the i-th detection result, λ denotes the balance coefficient that adjusts the weighting between IoF and confidence, δ is a bias factor used to eliminate noise effects, and S_i denotes the selection probability of the candidate box.
A candidate frame is selected when its selection probability is greater than a preset probability. After candidate frames are generated for the target feature using the candidate region generation network, the IoF and confidence parameter values of each candidate frame are output and substituted into formula (2) to obtain the selection probability of each candidate frame.
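Because formula (2) survives only as an image, the sketch below assumes one plausible functional form, S_i = (C_i + λ·I_i + δ) / (1 + λ), purely to illustrate how IoF and confidence could be combined and thresholded; the actual form used by the patent may differ.

```python
def selection_probability(confidence: float, max_iof: float,
                          lam: float = 0.5, delta: float = 1e-3) -> float:
    """Combine confidence C_i and maximum IoF I_i into a selection probability
    S_i. The functional form here is an assumption standing in for formula (2),
    which is only available as an image in the source."""
    return (confidence + lam * max_iof + delta) / (1.0 + lam)

def keep_candidates(candidates, preset_probability: float = 0.5):
    """Keep candidate frames whose selection probability exceeds the preset
    probability; each candidate is assumed to be a (confidence, max_iof) pair."""
    return [c for c in candidates
            if selection_probability(*c) > preset_probability]
```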
Step 203, mapping the candidate frame to the target image data.
Specifically, the target features with the candidate frames are input into the pooling layer through the candidate region generation network, and the candidate frames are mapped onto the target image data through the pooling layer.
And step 204, determining a target detection value of the target to be detected through the candidate frame on the target image data.
Specifically, the target detection value of the target to be detected is finally determined through the full connection layer and the PrRoIPooling layer.
Specifically, the specific structure of the target detection network can be seen in fig. 3A, and the target detection network includes: an input layer 301, a feature extraction layer 302, a candidate area generation network 303, a pooling layer 304, a fully connected layer 305 and an output layer 306, wherein the candidate area generation network 303 comprises: a wavelet neural network. An input layer 301 for inputting target image data; a feature extraction layer 302 for extracting a target feature; a candidate area generation network 303 for generating a candidate frame; a wavelet neural network for determining a category of the generated candidate box, wherein the category includes: foreground and background; a pooling layer 304 for mapping candidate boxes in the feature map onto the target image data; the full connection layer 305 is used for determining a target detection value of a target to be detected; and the output layer 306 is configured to determine a target candidate frame from the candidate frames, and output a target detection value of the target to be detected corresponding to the target candidate frame.
The candidate region generation network 303 is a Region Proposal Network (RPN). The pooling layer 304 includes PrRoIPooling.
Specifically, the structure, data processing, and data interaction of the target detection network can be seen in fig. 3B, the target image data is input to the feature extraction layer 302, and feature extraction operations are performed on the target image data by the feature extraction layer, where the feature extraction operations include: performing operations such as convolution operation and deconvolution operation through a convolution layer (conv) of the feature extraction layer 302 to obtain convolution characteristics, performing pooling operation on the convolution characteristics through a pooling layer (pooling) of the feature extraction layer 302 to obtain shallow features and deep features, and finally fusing the shallow features and the deep features to obtain target features.
The feature extraction layer 302 transmits the target feature to the candidate region generation network 303, and the candidate region generation network 303 generates a plurality of candidate frames and determines their categories using its internal wavelet neural network. The candidate region generation network 303 then transmits the feature map with the discriminated candidate frames to the PrRoIPooling layer 304. The PrRoIPooling layer 304 maps the candidate frames onto the target image data and is connected to the fully connected layer 305, through which the target detection value and the classification result of the target to be detected are determined.
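The data flow just described can be summarized as the following PyTorch-style sketch of the forward pass. Every sub-module is a placeholder passed in by the caller (the region proposal module here stands in for the candidate region generation network with its internal wavelet neural network), so the implementations are assumptions rather than the patent's components.

```python
import torch
import torch.nn as nn

class TargetDetectionNetwork(nn.Module):
    def __init__(self, backbone: nn.Module, rpn: nn.Module,
                 roi_pool: nn.Module, head: nn.Module):
        super().__init__()
        self.backbone = backbone  # feature extraction layer 302
        self.rpn = rpn            # candidate region generation network 303
        self.roi_pool = roi_pool  # PrRoIPooling layer 304
        self.head = head          # fully connected layer 305

    def forward(self, target_image_data: torch.Tensor):
        fa_n = self.backbone(target_image_data)  # fused target feature
        boxes, fg_scores = self.rpn(fa_n)        # candidate frames + fg/bg category
        pooled = self.roi_pool(fa_n, boxes)      # map candidate frames onto the image data
        detections = self.head(pooled)           # target detection values + classification
        return boxes, fg_scores, detections
```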
And 104, judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result.
In a specific embodiment, the specific obtaining process of the second judgment result is as follows: executing a calculation process based on a target detection value of a target to be detected in the target image data corresponding to the current time to obtain a third position intersection value; when the third position intersection value is determined to be larger than a second preset threshold value, determining that the second judgment result is that the target to be detected does not move; and when the intersection value of the third position is determined to be smaller than or equal to a second preset threshold value, determining that the second judgment result is that the target to be detected moves.
The specific implementation of the calculation process is shown in fig. 4:
step 401, determining a target detection value of a target to be detected in each frame of target image data at the current time.
The current time has n frames of target image data, where n is a number of frames greater than 1. For example, assuming the current time is 1 s, the n frames of target image data are: frame 1 at 1 s, frame 2 at 1 s, …, frame n at 1 s.
Step 402, calculating a target detection value of the target to be detected in the first frame of target image data at the current time, and a first position intersection value of the target detection value of the target to be detected in other frames of target image data at the current time.
And recording the target detection value of the target to be detected in the first frame target image data of the current time as the current target detection value.
For example, the target detection value of the target to be detected in frame 1 at 1 s is denoted loc_{1,1}, the target detection value in frame 2 at 1 s is denoted loc_{1,2}, …, and the target detection value in frame n at 1 s is denoted loc_{1,n}.
The first position intersection value is a set comprising: the IoU value of loc_{1,1} and loc_{1,2}, the IoU value of loc_{1,1} and loc_{1,3}, …, and the IoU value of loc_{1,1} and loc_{1,n}. A position intersection value is an IoU (intersection over union) value.
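The IoU values above follow the standard intersection-over-union computation for two boxes; the (x1, y1, x2, y2) box representation in this sketch is an assumption, since the patent does not fix a box format.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two axis-aligned boxes in (x1, y1, x2, y2) form."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1) +
             (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

# first position intersection value: IoU of loc_{1,1} with every other frame
# at the current time, e.g. [iou(loc[0], loc[k]) for k in range(1, n)]
```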
Step 403, determining a target detection value of the target to be detected in each frame of target image data corresponding to the time point after the first preset time.
Taking the first preset time equal to 5 s as an example, the target detection values of the target to be detected in each frame of target image data corresponding to the 5 s time point are denoted: loc_{5,1} for frame 1 at 5 s, loc_{5,2} for frame 2 at 5 s, …, and loc_{5,n} for frame n at 5 s.
Step 404, calculating a current target detection value, and a second position intersection value of the target detection value of the target to be detected in each frame of target image data corresponding to the time point after the first preset time.
The second position intersection value is a set comprising: the IoU value of loc_{1,1} and loc_{5,1}, the IoU value of loc_{1,1} and loc_{5,2}, the IoU value of loc_{1,1} and loc_{5,3}, …, and the IoU value of loc_{1,1} and loc_{5,n}.
Step 405, when it is determined that the first position cross value and the second position cross value are respectively greater than the first preset threshold, determining a target detection value of a target to be detected in the last frame of target image data corresponding to a time point after at least one second preset time.
Specifically, when the IoU values of loc_{1,1} with loc_{1,2}, loc_{1,1} with loc_{1,3}, …, loc_{1,1} with loc_{1,n}, and of loc_{1,1} with loc_{5,1}, loc_{1,1} with loc_{5,2}, loc_{1,1} with loc_{5,3}, …, loc_{1,1} with loc_{5,n} are each greater than the first preset threshold, the target detection value of the target to be detected in the last frame of target image data corresponding to the time point after at least one second preset time is determined.
Wherein the second preset time includes 1min, 3min, 5min, 10min, 12min, 15min and the like.
And 406, calculating a current target detection value, and a third position intersection value of the target detection value of the target to be detected in the last frame of target image data corresponding to the time point after at least one second preset time.
The second preset time includes 1min, 3min, 5min, and 10min for example. Wherein 1min equals 60s, 3min equals 180s, 5min equals 300s, and 10min equals 600 s.
The third position intersection value is a set including: the IoU value of loc_{1,1} and loc_{60,n}, the IoU value of loc_{1,1} and loc_{180,n}, the IoU value of loc_{1,1} and loc_{300,n}, …, and the IoU value of loc_{1,1} and loc_{600,n}. Specifically, each second preset time corresponds to its own second preset threshold; for example, 1 min corresponds to the second preset first sub-threshold, 3 min to the second preset second sub-threshold, 5 min to the second preset third sub-threshold, and 10 min to the second preset fourth sub-threshold. The second preset fourth sub-threshold is greater than the second preset third sub-threshold, which is greater than the second preset second sub-threshold, which is greater than the second preset first sub-threshold.
When the IoU value of loc_{1,1} and loc_{60,n} is greater than the second preset first sub-threshold, the IoU value of loc_{1,1} and loc_{180,n} is greater than the second preset second sub-threshold, the IoU value of loc_{1,1} and loc_{300,n} is greater than the second preset third sub-threshold, and the IoU value of loc_{1,1} and loc_{600,n} is greater than the second preset fourth sub-threshold, the second judgment result is that the target to be detected does not move; otherwise, when any one of these conditions is not met, the second judgment result is that the target to be detected moves.
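A hedged sketch of this multi-horizon check, reusing the iou() helper sketched earlier; the horizon times and sub-threshold values are illustrative assumptions, not values fixed by the patent.

```python
# seconds after the current time -> second preset sub-threshold (assumed
# values, increasing with the horizon as the text requires)
HORIZON_SUB_THRESHOLDS = {60: 0.90, 180: 0.92, 300: 0.94, 600: 0.96}

def target_has_not_moved(current_box, last_frame_box_at,
                         sub_thresholds=HORIZON_SUB_THRESHOLDS) -> bool:
    """current_box is loc_{1,1}; last_frame_box_at maps a horizon in seconds
    to loc_{t,n}, the detection value in the last frame at that time point.
    Returns True (second judgment: no movement) only if every
    third-position-intersection IoU exceeds its sub-threshold."""
    return all(iou(current_box, last_frame_box_at[t]) > thr
               for t, thr in sub_thresholds.items())
```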
Specifically, this embodiment first judges whether the target to be detected corresponding to the scene category needs to move, and then judges whether the target to be detected moves according to the target detection value. It should be noted, however, that this embodiment does not limit the order of these steps; the above is only an example, and whether the target to be detected moves may be judged first through the target detection value, after which whether the target to be detected corresponding to the scene category needs to move is judged.
And 105, judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
In a specific embodiment, the determining whether the target to be detected is abnormal specifically includes: when the first judgment result is that the target to be detected needs to move and the second judgment result is that the target to be detected does not move, judging that the target to be detected is abnormal; when the first judgment result is that the target to be detected needs to move and the second judgment result is that the target to be detected moves, judging that the target to be detected is not abnormal; when the first judgment result is that the target to be detected does not need to move and the second judgment result is that the target to be detected does not move, judging that the target to be detected does not have abnormality; and when the first judgment result is that the target to be detected does not need to move and the second judgment result is that the target to be detected moves, judging that the target to be detected is abnormal.
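The decision table above reduces to: an abnormality exists exactly when the first judgment result (needs to move) and the second judgment result (has moved) disagree. A minimal sketch:

```python
def is_abnormal(needs_to_move: bool, has_moved: bool) -> bool:
    """Abnormal iff the required behaviour and the observed behaviour differ."""
    return needs_to_move != has_moved

# a passenger (needs to move) showing no movement  -> is_abnormal(True, False) == True
# a fire-fighting article (must stay put) that moved -> is_abnormal(False, True) == True
```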
Specifically, when it is determined that the target to be detected is abnormal, a worker is prompted in various forms such as voice and text, for example by sending a short message to the worker's intelligent terminal, by pushing a message to the worker's intelligent terminal, or by raising a voice alarm, so that the worker can visit the monitored area in person and handle the situation based on the actual conditions of the monitored area.
By utilizing technologies such as artificial intelligence and deep learning, the invention realizes real-time intelligent analysis of abnormal conditions such as a passenger showing no movement for a long time or an important article being displaced; when such a condition occurs, relevant workers are reminded to check in time, which can effectively reduce the casualties, untimely emergency treatment, and similar problems caused by these abnormal conditions.
According to the abnormality determination method, apparatus, device, medium, and product provided by the invention, the scene category of the video data obtained through the camera device is determined; whether the target to be detected corresponding to the scene category needs to move is judged to obtain a first judgment result; the video data is input into the target detection model, and the target detection value of the target to be detected in the video data is output through the target detection model; whether the target to be detected moves is judged based on the target detection value to obtain a second judgment result; and whether the target to be detected is abnormal is judged based on the first judgment result and the second judgment result. By judging both whether the target to be detected needs to move and whether it actually moves, the purpose of effectively and quickly determining whether the target to be detected is abnormal is achieved. This solves the problem in the prior art that, with manual identification and analysis of video content, abnormal conditions such as a passenger showing no movement for a long time or an important article being displaced cannot be judged timely and effectively, leading to casualties and untimely emergency treatment, and it effectively improves user experience.
The following describes the abnormality determination device provided by the present invention, and the abnormality determination device described below and the abnormality determination method described above may be referred to correspondingly, and repeated details are not repeated, and the device is specifically shown in fig. 5:
a determining module 501, configured to determine a scene type of video data obtained by a camera;
the first judging module 502 is configured to judge whether a target to be detected corresponding to a scene category needs to move, so as to obtain a first judgment result;
the output module 503 is configured to input the video data into a target detection model, and output a target detection value of a target to be detected in the video data through the target detection model, where the target detection model is obtained by training video sample data and a target detection sample value corresponding to the target to be detected;
a second judging module 504, configured to judge whether the target to be detected moves based on the target detection value, so as to obtain a second judgment result;
and a determining module 505, configured to determine whether the target to be detected is abnormal based on the first determination result and the second determination result.
In a specific embodiment, the output module 503 is specifically configured to cut each frame of data in the video data to obtain image data, and sort the image data according to the timestamp to obtain target image data; and sequentially inputting the target image data into the target detection model, and sequentially outputting the target detection value of the target to be detected in each frame of target image data through the target detection model.
In a specific embodiment, the output module 503 is specifically configured to sequentially input the target image data into the target detection model; executing the following processing procedures on each frame of target image data through a target detection model: extracting target characteristics of a target to be detected in target image data; generating a candidate frame for the target feature; mapping the candidate frame to the target image data; determining a target detection value of a target to be detected through a candidate frame on the target image data; and determining a target candidate frame from the candidate frames, and outputting a target detection value of the target to be detected corresponding to the target candidate frame.
In a specific embodiment, the output module 503 is specifically configured to perform shallow feature extraction on target image data to obtain a target shallow feature; carrying out deep feature extraction on the target image data to obtain target deep features; and fusing the target shallow layer characteristic and the target deep layer characteristic to obtain the target characteristic.
In a specific embodiment, the second judging module is specifically configured to execute the following calculation process based on the target detection value of the target to be detected in the target image data corresponding to the current time: determining a target detection value of a target to be detected in each frame of target image data at the current time; calculating a target detection value of a target to be detected in a first frame of target image data at the current time, and taking a first position intersection value of the target detection value of the target to be detected in other frames of target image data at the current time, wherein the target detection value of the target to be detected in the first frame of target image data at the current time is taken as a current target detection value; determining a target detection value of a target to be detected in each frame of target image data corresponding to a time point after a first preset time; calculating a current target detection value, and calculating a second position intersection value of the target detection value of the target to be detected in each frame of target image data corresponding to a time point after the first preset time; when the first position intersection value and the second position intersection value are determined to be respectively larger than a first preset threshold value, determining a target detection value of a target to be detected in the last frame of target image data corresponding to a time point after at least one second preset time; calculating a current target detection value and a third position intersection value of the target detection value of the target to be detected in the last frame of target image data corresponding to at least one time point after a second preset time; when the third position intersection value is determined to be larger than a second preset threshold value, determining that the second judgment result is that the target to be detected does not move; and when the intersection value of the third position is determined to be smaller than or equal to a second preset threshold value, determining that the second judgment result is that the target to be detected moves.
In a specific embodiment, the determining module is specifically configured to determine that the target to be detected is abnormal when the first determination result indicates that the target to be detected needs to move and the second determination result indicates that the target to be detected does not move; when the first judgment result is that the target to be detected needs to move and the second judgment result is that the target to be detected moves, judging that the target to be detected is not abnormal; when the first judgment result is that the target to be detected does not need to move and the second judgment result is that the target to be detected does not move, judging that the target to be detected does not have abnormality; and when the first judgment result is that the target to be detected does not need to move and the second judgment result is that the target to be detected moves, judging that the target to be detected is abnormal.
Fig. 6 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 6: a processor (processor)601, a communication Interface (Communications Interface)602, a memory (memory)603 and a communication bus 604, wherein the processor 601, the communication Interface 602 and the memory 603 complete communication with each other through the communication bus 604. The processor 601 may call logic instructions in the memory 603 to perform an exception determination method comprising: determining a scene type of video data obtained through a camera device; judging whether a target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result; inputting video data into a target detection model, and outputting a target detection value of a target to be detected in the video data through the target detection model, wherein the target detection model is obtained by training video sample data and a target detection sample value corresponding to the target of the sample to be detected; judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result; and judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
In addition, the logic instructions in the memory 603 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the anomaly determination method provided by the above methods, the method comprising: determining a scene type of video data obtained through a camera device; judging whether a target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result; inputting video data into a target detection model, and outputting a target detection value of a target to be detected in the video data through the target detection model, wherein the target detection model is obtained by training video sample data and a target detection sample value corresponding to the target of the sample to be detected; judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result; and judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, is implemented to perform the abnormality determination method provided above, the method including: determining a scene type of video data obtained through a camera device; judging whether a target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result; inputting video data into a target detection model, and outputting a target detection value of a target to be detected in the video data through the target detection model, wherein the target detection model is obtained by training video sample data and a target detection sample value corresponding to the target of the sample to be detected; judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result; and judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An abnormality determination method characterized by comprising:
determining a scene type of video data obtained through a camera device;
judging whether the target to be detected corresponding to the scene category needs to move or not to obtain a first judgment result;
inputting the video data into a target detection model, and outputting a target detection value of the target to be detected in the video data through the target detection model, wherein the target detection model is obtained through training video sample data and a target detection sample value corresponding to the target to be detected;
judging whether the target to be detected moves or not based on the target detection value to obtain a second judgment result;
and judging whether the target to be detected is abnormal or not based on the first judgment result and the second judgment result.
2. The abnormality determination method according to claim 1, wherein the inputting of the video data into a target detection model and outputting of a target detection value of the target to be detected in the video data through the target detection model includes:
cutting each frame of data in the video data to obtain image data, and sorting the image data according to their timestamps to obtain target image data;
and sequentially inputting the target image data into the target detection model, and sequentially outputting the target detection value of the target to be detected in each frame of the target image data through the target detection model.
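A minimal sketch of the frame-cutting and timestamp-ordering step of claim 2, assuming OpenCV (`cv2`) as the frame grabber; sequential decoding already yields temporal order, so the explicit sort merely mirrors the claimed sequencing step:

```python
import cv2  # assumption: OpenCV is available; any decoder would do


def frames_sorted_by_timestamp(video_path):
    """Cut the video into per-frame image data and sort it by timestamp."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, image = capture.read()
        if not ok:
            break
        # Timestamp (in milliseconds) of the frame just decoded.
        timestamp_ms = capture.get(cv2.CAP_PROP_POS_MSEC)
        frames.append((timestamp_ms, image))
    capture.release()
    frames.sort(key=lambda pair: pair[0])  # order by timestamp
    return [image for _, image in frames]  # the "target image data"
```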
3. The abnormality determination method according to claim 2, wherein the sequentially inputting the target image data into the target detection model and sequentially outputting the target detection value of the target to be detected in each frame of the target image data by the target detection model includes:
sequentially inputting the target image data into the target detection model;
executing the following processing procedure on each frame of the target image data through the target detection model: extracting target features of the target to be detected from the target image data; generating candidate boxes for the target features; mapping the candidate boxes onto the target image data; and determining a target detection value of the target to be detected from each candidate box mapped onto the target image data;
and determining a target candidate box from the candidate boxes, and outputting the target detection value of the target to be detected corresponding to the target candidate box.
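For illustration, the per-frame processing of claim 3 can be sketched as below; `extract_features`, `propose_boxes`, and `score_box` are hypothetical stand-ins for the feature extractor, candidate-box generator, and box scorer, and the highest-scoring box is taken as the target candidate box:

```python
def detect_in_frame(frame, extract_features, propose_boxes, score_box):
    # Extract target features of the target to be detected (claim 4
    # describes one way of obtaining them).
    features = extract_features(frame)

    # Generate candidate boxes for the target features.
    candidate_boxes = propose_boxes(features)

    # Map each candidate box onto the frame and score it there.
    scored = [(score_box(frame, box), box) for box in candidate_boxes]

    # Determine the target candidate box, here simply the best-scoring
    # one, and output its target detection value (the box coordinates).
    _, target_box = max(scored, key=lambda pair: pair[0])
    return target_box
```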
4. The abnormality determination method according to claim 3, wherein the extracting of the target features of the target to be detected from the target image data includes:
performing shallow feature extraction on the target image data to obtain target shallow features;
performing deep feature extraction on the target image data to obtain target deep features;
and fusing the target shallow features and the target deep features to obtain the target features.
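A minimal PyTorch sketch of the shallow/deep fusion of claim 4, with arbitrary layer sizes (the patent does not specify a network architecture); the deep features are upsampled to the shallow resolution and concatenated channel-wise:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusedFeatureExtractor(nn.Module):
    """Illustrative shallow/deep feature fusion; layer sizes are assumptions."""

    def __init__(self):
        super().__init__()
        # Shallow branch: one convolution close to the input image.
        self.shallow = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # Deep branch: further strided convolutions on the shallow features.
        self.deep = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, image):
        s = self.shallow(image)                     # target shallow features
        d = self.deep(s)                            # target deep features
        d_up = F.interpolate(d, size=s.shape[-2:])  # match spatial size
        return torch.cat([s, d_up], dim=1)          # fused target features
```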
5. The abnormality determination method according to any one of claims 2 to 4, wherein the judging whether the target to be detected has moved based on the target detection value to obtain a second judgment result includes:
executing the following calculation process based on the target detection values of the target to be detected in the target image data corresponding to the current time:
determining a target detection value of the target to be detected in each frame of target image data at the current time; taking the target detection value of the target to be detected in the first frame of target image data at the current time as a current target detection value, and calculating a first position intersection value between the current target detection value and the target detection values of the target to be detected in the other frames of target image data at the current time; determining a target detection value of the target to be detected in each frame of target image data corresponding to a time point after a first preset time; calculating a second position intersection value between the current target detection value and the target detection values of the target to be detected in each frame of target image data corresponding to the time point after the first preset time; when the first position intersection value and the second position intersection value are each determined to be larger than a first preset threshold, determining a target detection value of the target to be detected in the last frame of target image data corresponding to a time point after at least one second preset time; and calculating a third position intersection value between the current target detection value and the target detection value of the target to be detected in the last frame of target image data corresponding to the time point after the at least one second preset time;
when the third position intersection value is determined to be larger than a second preset threshold, determining that the second judgment result is that the target to be detected has not moved;
and when the third position intersection value is determined to be smaller than or equal to the second preset threshold, determining that the second judgment result is that the target to be detected has moved.
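One plausible reading of the claimed "position intersection value" is intersection-over-union (IoU) between bounding boxes; under that assumption, the movement test of claim 5 reduces to a sketch like the following, with an arbitrary threshold:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def second_judgment(current_box, later_boxes, threshold=0.5):
    """Second judgment result: True means the target has moved.

    The target is deemed not to have moved when every later box still
    overlaps the current target detection value above the threshold.
    """
    return not all(iou(current_box, box) > threshold for box in later_boxes)
```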
6. The abnormality determination method according to claim 5, wherein the judging whether the target to be detected is abnormal based on the first judgment result and the second judgment result includes:
when the first judgment result indicates that the target to be detected needs to move and the second judgment result indicates that the target to be detected has not moved, judging that the target to be detected is abnormal;
when the first judgment result indicates that the target to be detected needs to move and the second judgment result indicates that the target to be detected has moved, judging that the target to be detected is not abnormal;
when the first judgment result indicates that the target to be detected does not need to move and the second judgment result indicates that the target to be detected has not moved, judging that the target to be detected is not abnormal;
and when the first judgment result indicates that the target to be detected does not need to move and the second judgment result indicates that the target to be detected has moved, judging that the target to be detected is abnormal.
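The four enumerated cases of claim 6 can be written out as an explicit truth table; the table collapses to "abnormal exactly when the two judgment results disagree":

```python
# (needs to move, has moved) -> abnormal?
ABNORMALITY_TABLE = {
    (True,  False): True,   # must move but did not move  -> abnormal
    (True,  True):  False,  # must move and moved         -> not abnormal
    (False, False): False,  # need not move, did not move -> not abnormal
    (False, True):  True,   # need not move but moved     -> abnormal
}


def is_abnormal(first_judgment_result, second_judgment_result):
    return ABNORMALITY_TABLE[(first_judgment_result, second_judgment_result)]
```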
7. An abnormality determination device, comprising:
the determining module is used for determining a scene category of video data obtained through a camera device;
the first judgment module is used for judging whether or not a target to be detected corresponding to the scene category needs to move, to obtain a first judgment result;
the output module is used for inputting the video data into a target detection model and outputting a target detection value of the target to be detected in the video data through the target detection model, wherein the target detection model is obtained by training on video sample data and target detection sample values corresponding to a sample target to be detected;
the second judgment module is used for judging whether the target to be detected has moved based on the target detection value, to obtain a second judgment result;
and the judging module is used for judging whether the target to be detected is abnormal based on the first judgment result and the second judgment result.
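For illustration, the module decomposition of claim 7 maps naturally onto a class holding one callable per claimed module; all injected callables are hypothetical stand-ins rather than the actual apparatus:

```python
class AbnormalityDeterminationDevice:
    """Illustrative grouping of the five claimed modules."""

    def __init__(self, determining, first_judgment, output, second_judgment):
        self.determining = determining          # scene-category module
        self.first_judgment = first_judgment    # "needs to move?" module
        self.output = output                    # target-detection module
        self.second_judgment = second_judgment  # "has it moved?" module

    def judge(self, video_data):                # the judging module
        scene_category = self.determining(video_data)
        first = self.first_judgment(scene_category)
        detection_values = self.output(video_data)
        second = self.second_judgment(detection_values)
        return first != second                  # abnormal when they disagree
```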
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the abnormality determination method according to any one of claims 1 to 6 when executing the program.
9. A non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of the abnormality determination method according to any one of claims 1 to 6.
10. A computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, implement the steps of the abnormality determination method according to any one of claims 1 to 6.
CN202111288937.XA 2021-11-02 2021-11-02 Abnormality determination method, apparatus, device, medium, and product Pending CN114241401A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111288937.XA CN114241401A (en) 2021-11-02 2021-11-02 Abnormality determination method, apparatus, device, medium, and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111288937.XA CN114241401A (en) 2021-11-02 2021-11-02 Abnormality determination method, apparatus, device, medium, and product

Publications (1)

Publication Number Publication Date
CN114241401A true CN114241401A (en) 2022-03-25

Family

ID=80743539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111288937.XA Pending CN114241401A (en) 2021-11-02 2021-11-02 Abnormality determination method, apparatus, device, medium, and product

Country Status (1)

Country Link
CN (1) CN114241401A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035395A (en) * 2022-07-07 2022-09-09 北京拙河科技有限公司 Safety analysis device and method for airport terminal scene
CN115035395B (en) * 2022-07-07 2023-11-10 北京拙河科技有限公司 Safety analysis device and method for airport terminal scene

Similar Documents

Publication Publication Date Title
CN108921159B (en) Method and device for detecting wearing condition of safety helmet
CN111382623B (en) Live broadcast auditing method, device, server and storage medium
CN110084165B (en) Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation
CN112906463A (en) Image-based fire detection method, device, equipment and storage medium
CN110516529A (en) It is a kind of that detection method and system are fed based on deep learning image procossing
CN110689054A (en) Worker violation monitoring method
CN111222478A (en) Construction site safety protection detection method and system
CN112200081A (en) Abnormal behavior identification method and device, electronic equipment and storage medium
CN114070654B (en) Safety management and control method and system based on big data
CN113642474A (en) Hazardous area personnel monitoring method based on YOLOV5
CN110458126B (en) Pantograph state monitoring method and device
CN110942456B (en) Tamper image detection method, device, equipment and storage medium
CN113177469A (en) Training method and device for human body attribute detection model, electronic equipment and medium
CN114782897A (en) Dangerous behavior detection method and system based on machine vision and deep learning
CN113792578A (en) Method, device and system for detecting abnormity of transformer substation
CN112686595A (en) Method, device, equipment and storage medium for detecting illegal behavior of logistics operation
CN112819068A (en) Deep learning-based real-time detection method for ship operation violation behaviors
CN114708555A (en) Forest fire prevention monitoring method based on data processing and electronic equipment
CN112434612A (en) Smoking detection method and device, electronic equipment and computer readable storage medium
CN114821414A (en) Smoke and fire detection method and system based on improved YOLOV5 and electronic equipment
CN114665608A (en) Intelligent sensing inspection system and method for transformer substation
CN114241401A (en) Abnormality determination method, apparatus, device, medium, and product
CN113158963B (en) Method and device for detecting high-altitude parabolic objects
CN113947744A (en) Fire image detection method, system, equipment and storage medium based on video
CN106803937B (en) Double-camera video monitoring method, system and monitoring device with text log

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination