WO2022116322A1 - Anomaly detection model generation method and apparatus, and abnormal event detection method and apparatus - Google Patents
Anomaly detection model generation method and apparatus, and abnormal event detection method and apparatus
- Publication number
- WO2022116322A1 (PCT/CN2020/139499)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- frame
- training
- anomaly detection
- feature information
- Prior art date
Classifications
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22—Matching criteria, e.g. proximity measures
- G06F18/253—Fusion techniques of extracted features
Description
- The embodiments of the present application relate to the field of computer technologies, and in particular, to a method and apparatus for generating an anomaly detection model, and a method and apparatus for detecting abnormal events.
- Anomaly detection is a common application of machine learning: a system learns the features of normal behavior from a large amount of unlabeled data so that it can recognize abnormal data. In essence, anomaly detection finds objects that differ from the majority of objects, that is, outliers. Anomaly detection is defined differently in different fields; in video, it refers to identifying events that do not match expected behavior and distinguishing normal events from abnormal ones.
- The purpose of the embodiments of the present application is to propose an improved method and apparatus for generating an anomaly detection model, to solve the technical problems mentioned in the background section above.
- An embodiment of the present application provides a method for generating an anomaly detection model. The method includes: acquiring a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; training, based on the first image and the second image, a prediction frame generator included in an initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; training, based on the predicted frame and the second image, a frame discriminator included in the initial model; and, in response to the end of training, determining the trained initial model as the anomaly detection model.
- In some embodiments, training the prediction frame generator included in the initial model includes: optimizing the parameters of the feature extraction network based on a preset first loss function, wherein the first loss function includes at least one of the following: an L2 distance loss, a gradient constraint loss, and an optical flow loss; and optimizing the parameters of the generation network based on a preset second loss function, wherein the second loss function includes a least squares loss.
- In some embodiments, training the frame discriminator included in the initial model includes: superimposing a preset number of image frames preceding the second image, together with the predicted frame, into a multi-channel image; extracting feature information of the multi-channel image; performing optical flow estimation on the feature information of the multi-channel image to determine the optical flow loss between the predicted frame and the second image; and optimizing the parameters of the frame discriminator based on the optical flow loss.
- In some embodiments, the number of first images is at least two.
- In some embodiments, the method further includes: acquiring multiple anomaly detection models obtained through multiple trainings; and determining the detection performance of the multiple anomaly detection models, and determining the anomaly detection model with the best detection performance as the model used for abnormal event detection.
- An embodiment of the present application provides a method for detecting an abnormal event. The method includes: acquiring a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; inputting the first image into a prediction frame generator included in a pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method described in the first aspect above; inputting the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and, in response to determining that the numerical value is less than or equal to a preset threshold, outputting information indicating that an abnormal event occurs at the time point corresponding to the second image.
- An embodiment of the present application provides an apparatus for generating an anomaly detection model. The apparatus includes: a first acquisition module, configured to acquire a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; a first training module, configured to train, based on the first image and the second image, a prediction frame generator included in an initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; a second training module, configured to train, based on the predicted frame and the second image, a frame discriminator included in the initial model; and a first determination module, configured to determine, in response to the end of training, the trained initial model as the anomaly detection model.
- An embodiment of the present application provides an abnormal event detection apparatus. The apparatus includes: a third acquisition module, configured to acquire a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; a prediction module, configured to input the first image into a prediction frame generator included in a pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method described in the first aspect; a discrimination module, configured to input the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and an output module, configured to output, in response to determining that the value is less than or equal to a preset threshold, information indicating that an abnormal event occurs at the time point corresponding to the second image.
- Embodiments of the present application provide an electronic device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect or the second aspect.
- An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method described in any implementation of the first aspect or the second aspect is implemented.
- In the method and apparatus for generating an anomaly detection model and the method and apparatus for detecting abnormal events provided by the embodiments of the present application, the prediction frame generator included in the initial model is trained based on the first image and the second image included in the acquired sample image frame sequences; the prediction frame generator generates a predicted frame; the frame discriminator included in the initial model is trained based on the predicted frame and the second image; and the trained initial model is finally determined as the anomaly detection model. Because the frame generator fuses feature information of various depths, it can generate predicted frames that are closer to reality, thereby improving the accuracy of anomaly detection.
- FIG. 1 is an exemplary system architecture diagram to which the present application can be applied.
- FIG. 2 is a flowchart of an embodiment of an anomaly detection model generation method according to the present application.
- FIG. 3 is a schematic structural diagram of an initial model according to the anomaly detection model generation method of the present application.
- FIG. 4 is a flowchart of another embodiment of the method for generating an anomaly detection model according to the present application.
- FIG. 5 is a flowchart of an embodiment of an abnormal event detection method according to the present application.
- FIG. 6 is a schematic structural diagram of an embodiment of an anomaly detection model generating apparatus according to the present application.
- FIG. 7 is a schematic structural diagram of an embodiment of an abnormal event detection apparatus according to the present application.
- FIG. 8 is a schematic structural diagram of a computer system suitable for implementing the electronic device according to the embodiment of the present application.
- FIG. 1 shows an exemplary system architecture 100 to which an anomaly detection model generation method according to an embodiment of the present application may be applied.
- The system architecture 100 may include a terminal device 101, a network 102, and a server 103.
- The network 102 is a medium used to provide a communication link between the terminal device 101 and the server 103.
- The network 102 may include various connection types, such as wired links, wireless communication links, or fiber optic cables, among others.
- The user can use the terminal device 101 to interact with the server 103 through the network 102 to receive or send messages and the like.
- Various communication client applications such as monitoring applications, image processing applications, video processing applications, etc., may be installed on the terminal device 101 .
- The terminal device 101 may be any of various electronic devices, including but not limited to mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and in-vehicle terminals (for example, car navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
- The server 103 may be a server that provides various services, such as an image processing server that processes the sequences of image frames uploaded by the terminal device 101.
- The image processing server can perform model training, anomaly detection, and other processing on the received image frame sequences, and obtain processing results (such as an anomaly detection model or anomaly detection information).
- The anomaly detection model generation method or the abnormal event detection method provided in the embodiments of the present application may be executed by the terminal device 101 or the server 103; accordingly, the anomaly detection model generation apparatus or the abnormal event detection apparatus may be provided in the terminal device 101 or the server 103.
- The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs. It should be noted that, in the case where the samples for training the model or the images used for anomaly detection do not need to be obtained remotely, the above system architecture may include no network and only a server or a terminal device.
- FIG. 2 shows a flow 200 of an embodiment of the method for generating an anomaly detection model according to the present application.
- The method includes the following steps:
- Step 201: acquire multiple sample image frame sequences.
- The execution body of the method for generating an anomaly detection model may acquire multiple sample image frame sequences locally or remotely.
- Each sequence of sample image frames may consist of the image frames included in video clips cut from different videos.
- The multiple sample image sequences can come from preset datasets, such as the UCSD-Ped2 or CUHK datasets.
- Each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image.
- The number of first images can be set arbitrarily, for example, three.
- In some embodiments, the number of first images is at least two.
- The first images may include F_{t-1}, F_{t-2}, ..., F_{t-n}, and the second image is F_t.
- The image frames included in a sample image sequence may be color images of a fixed size obtained by scaling the original images, for example 256 × 256 × 3, where 3 is the number of color channels.
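- As an illustration, loading and scaling frames along these lines might look as follows. This is a minimal Python/OpenCV sketch; the normalization to [-1, 1] is our assumption, chosen to match the Tanh output range of the generator described below.

```python
import cv2
import numpy as np

def load_frame_sequence(frame_paths, size=(256, 256)):
    """Load video frames and scale each to a fixed size (e.g. 256x256x3)."""
    frames = [cv2.resize(cv2.imread(p), size) for p in frame_paths]
    seq = np.stack(frames).astype(np.float32)  # (n_frames, 256, 256, 3)
    # Assumed normalization to [-1, 1], matching a Tanh-output generator.
    return seq / 127.5 - 1.0
```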
- Step 202: train, based on the first image and the second image, the prediction frame generator included in the initial model.
- The above execution body may train the prediction frame generator included in the initial model based on the first image and the second image.
- The prediction frame generator includes a multi-level feature extraction network and a generation network; the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate the predicted frame from the fused feature information.
- The fused feature information can be a feature map.
- A feature extraction network can include 20 convolutional layers (using 1x1 and 3x3 convolutions), 4 max pooling layers, and 1 activation layer.
- Multi-layer convolution is used to extract feature information of different depths from the first image (that is, the normal behavior image), and these pieces of feature information are fused.
- The final convolutions and a Tanh activation function produce a 256 × 256 × 3 image, which is the predicted frame.
- In FIG. 3, p_1, p_2, p_3, and p_4 are the first images, and p_{t+1} is the second image.
- 301 is the initial network.
- The first images are passed through multiple convolutions in the initial network to obtain the fused feature information, then through three Conv(3,3) convolution operations and a Tanh activation function, and the predicted frame (denoted p̂_{t+1}) is output.
- The predicted frame can be compared with the corresponding second image; a loss value representing the gap between the predicted frame and the second image is determined using preset loss functions, and the parameters of the feature extraction network and the generation network are adjusted to make the predicted frame close to the second image. When the training end conditions are met (for example, the loss value converges, the training duration reaches a preset duration, or the number of training iterations reaches a preset number), the training ends.
- 302 represents three loss functions: by comparing p_{t+1} with the predicted frame p̂_{t+1}, the parameters of the initial model are optimized to minimize the loss values of the three loss functions.
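- To make the structure concrete, below is a minimal PyTorch sketch of a prediction frame generator of this kind: features at several depths are extracted with convolutions and max pooling, fused through skip connections, and decoded into a 256 × 256 × 3 frame via Tanh. The layer counts are simplified relative to the description (20 convolutional layers, 4 max pooling layers), and all layer widths are our assumptions.

```python
import torch
import torch.nn as nn

class PredictionFrameGenerator(nn.Module):
    """Illustrative multi-level generator (a sketch, not the patented network):
    features of different depths are extracted, fused via skip connections,
    and decoded into a 256x256x3 predicted frame with a Tanh output."""

    def __init__(self, in_frames=4):
        super().__init__()
        c = in_frames * 3  # the first images stacked along the channel axis
        self.enc1 = self._block(c, 64)           # shallow features
        self.enc2 = self._block(64, 128)         # mid-level features
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = self._block(128, 256)  # deep features
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec2 = self._block(256 + 128, 128)  # fuse deep and mid features
        self.dec1 = self._block(128 + 64, 64)    # fuse with shallow features
        self.head = nn.Sequential(nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    @staticmethod
    def _block(cin, cout):
        return nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):                      # x: (B, in_frames*3, 256, 256)
        f1 = self.enc1(x)                      # (B, 64, 256, 256)
        f2 = self.enc2(self.pool(f1))          # (B, 128, 128, 128)
        f3 = self.bottleneck(self.pool(f2))    # (B, 256, 64, 64)
        d2 = self.dec2(torch.cat([self.up(f3), f2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), f1], dim=1))
        return self.head(d1)                   # (B, 3, 256, 256), values in [-1, 1]
```

- Usage would stack the first images channel-wise, e.g. `pred = PredictionFrameGenerator()(torch.cat(first_imgs, dim=1))`, where `first_imgs` are four (B, 3, 256, 256) tensors.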
- Step 202 may include the following steps:
- Step 1: optimize the parameters of the feature extraction network based on the preset first loss function.
- The first loss function includes at least one of the following: an L2 distance loss, a gradient constraint loss, and an optical flow loss.
- Step 2: optimize the parameters of the generation network based on a preset second loss function, wherein the second loss function includes a least squares loss.
- The above loss functions can be added to obtain a summed loss value, and the network parameters can be optimized using that sum.
- The optical flow loss addresses the problem of detecting object motion under complex lighting conditions and allows the underlying patterns of normal behavior features to be learned to the greatest extent.
- In this implementation, the network parameters can thus be optimized from several aspects, which helps improve the accuracy of the predicted frames generated by the trained frame generator.
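- A hedged sketch of how such a combined generator objective could be computed in PyTorch is shown below. The relative weights lam_* are illustrative assumptions, and the optical flow term is treated separately in the discriminator section that follows.

```python
import torch

def intensity_loss(pred, gt):
    # L2 distance loss between the predicted frame and the ground truth
    return torch.mean((pred - gt) ** 2)

def gradient_loss(pred, gt):
    # Gradient constraint loss: match horizontal and vertical pixel differences
    dx_p = torch.abs(pred[..., :, 1:] - pred[..., :, :-1])
    dy_p = torch.abs(pred[..., 1:, :] - pred[..., :-1, :])
    dx_g = torch.abs(gt[..., :, 1:] - gt[..., :, :-1])
    dy_g = torch.abs(gt[..., 1:, :] - gt[..., :-1, :])
    return torch.mean(torch.abs(dx_p - dx_g)) + torch.mean(torch.abs(dy_p - dy_g))

def generator_adv_loss(d_on_pred):
    # Least squares (LSGAN-style) loss: push discriminator scores on
    # predicted frames toward the "real" target 1
    return torch.mean((d_on_pred - 1.0) ** 2)

def generator_objective(pred, gt, d_on_pred,
                        lam_int=1.0, lam_grad=1.0, lam_adv=0.05):
    # The individual losses are summed, as described above; the weights
    # lam_* are assumptions, not values given in the source
    return (lam_int * intensity_loss(pred, gt)
            + lam_grad * gradient_loss(pred, gt)
            + lam_adv * generator_adv_loss(d_on_pred))
```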
- Step 203: train, based on the predicted frame and the second image, the frame discriminator included in the initial model.
- The above execution body may train the frame discriminator included in the initial model based on the predicted frame and the second image.
- The frame discriminator is used to discriminate whether the two input images are the same.
- Frame discriminators are usually built on convolutional neural networks. During training, the predicted frame and the actual frame (i.e., the second image) are used as input, annotation information distinguishing predicted from actual frames is used as the expected output, and the frame discriminator is trained using machine learning methods. The training goal is to maximize the discriminative accuracy of the frame discriminator.
- The prediction frame generator and the frame discriminator are trained alternately. For example, the parameters of the frame discriminator are fixed first, and the parameters of the prediction frame generator are optimized until the frame discriminator can no longer correctly distinguish the predicted frame from the actual frame; then the parameters of the prediction frame generator are fixed, and the parameters of the frame discriminator are optimized until the frame discriminator can again accurately distinguish the predicted frame from the actual frame.
- In FIG. 3, D is the frame discriminator: p_{t+1} and the predicted frame p̂_{t+1} are input into D to obtain information indicating whether the current frame is normal or abnormal.
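- A minimal sketch of this alternating optimization is given below (standard GAN-style training; the discriminator is assumed to map a frame to a realness score, and `objective` is the combined generator loss from the previous sketch):

```python
import torch

def train_step(gen, disc, gen_opt, disc_opt, first_imgs, second_img, objective):
    """One alternating update: frame discriminator first, then generator."""
    # --- update the frame discriminator with the generator fixed ---
    with torch.no_grad():
        pred = gen(first_imgs)
    d_real, d_fake = disc(second_img), disc(pred)
    # least squares discriminator loss: actual frame -> 1, predicted frame -> 0
    d_loss = 0.5 * (torch.mean((d_real - 1.0) ** 2) + torch.mean(d_fake ** 2))
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # --- update the prediction frame generator with the discriminator fixed ---
    pred = gen(first_imgs)
    g_loss = objective(pred, second_img, disc(pred))
    gen_opt.zero_grad()
    g_loss.backward()
    gen_opt.step()
    return d_loss.item(), g_loss.item()
```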
- Step 203 may be performed as follows:
- Step 2031: superimpose a preset number of image frames preceding the second image, together with the predicted frame, into a multi-channel image.
- For example, when the preset number is 5, the preceding 5 image frames are superimposed into multi-channel image data.
- The multi-channel image can also be cropped to meet the input requirements of the subsequent neural network, for example, cropped to a 512 × 384 image.
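- For example, the stacking step could be sketched as follows; channel-wise concatenation is our reading of 'superimposing', and the crop size follows the example above:

```python
import torch
import torchvision.transforms.functional as TF

def build_discriminator_input(prev_frames, predicted, crop_hw=(384, 512)):
    """Superimpose the frames preceding the second image with the predicted
    frame into one multi-channel image, then crop to the network input size.
    prev_frames: list of 5 (3, H, W) tensors; predicted: (3, H, W)."""
    x = torch.cat(prev_frames + [predicted], dim=0)  # (6 * 3, H, W) = (18, H, W)
    # center_crop pads with zeros if the source is smaller than the target
    return TF.center_crop(x, list(crop_hw))          # (18, 384, 512)
```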
- Step 2032: extract feature information of the multi-channel image.
- A neural network model can be used to extract the feature information of the multi-channel image.
- A Flownet (optical flow neural network) model can be used for feature extraction.
- Flownet may include 12 3x3 convolutional layers for extracting features from the input image.
- Flownet can perform optical flow estimation on the input image, and the resulting feature information can reflect the relationship between adjacent frames.
- Step 2033: perform optical flow estimation on the feature information of the multi-channel image to determine the optical flow loss between the predicted frame and the second image.
- The method of optical flow estimation can be an existing method.
- The above Flownet model can be used to perform optical flow estimation on the extracted feature information, and a preset optical flow loss function can be used to determine the optical flow loss.
- Step 2034: optimize the parameters of the frame discriminator based on the optical flow loss.
- Multiple pairs of image data can be fed into the model repeatedly, and the parameters of the Flownet model iteratively optimized to minimize the optical flow loss. It should be understood that optical flow estimation and optical flow loss are existing techniques, and their specific implementations are not repeated here.
- This implementation optimizes the frame discriminator by stacking multiple image frames and applying optical flow estimation, which can accurately reflect object motion under complex lighting conditions and improve discrimination accuracy.
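- One plausible reading of steps 2033-2034, sketched below, follows the common future-frame-prediction formulation: a flow network estimates the flow from the last observed frame to the predicted frame and to the actual second image, and the optical flow loss is the L1 difference between the two flow fields. The `flow_net` interface is an assumption:

```python
import torch

def optical_flow_loss(flow_net, last_frame, predicted, actual):
    """Optical flow loss between the predicted frame and the second image.
    flow_net is assumed to map a stacked frame pair (B, 6, H, W) to a flow
    field (B, 2, H, W); per step 2034 above, its parameters can themselves
    be optimized against this loss."""
    flow_pred = flow_net(torch.cat([last_frame, predicted], dim=1))
    flow_true = flow_net(torch.cat([last_frame, actual], dim=1))
    return torch.mean(torch.abs(flow_pred - flow_true))
```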
- Step 204: in response to the end of training, determine the trained initial model as the anomaly detection model.
- The above execution body may determine the trained initial model as the anomaly detection model in response to the end of training.
- The training end condition may include, but is not limited to, at least one of the following: the loss value of the loss function converges, the number of training iterations reaches a preset number, or the training duration reaches a preset duration.
- The resulting anomaly detection model consists of the trained prediction frame generator and frame discriminator.
- The method may further include the following steps:
- First, multiple anomaly detection models obtained through multiple trainings are acquired; each of these models is trained in the same manner as steps 201 to 204 above.
- Then, the detection performance of the multiple anomaly detection models is determined, and the anomaly detection model with the best detection performance is determined as the model used for abnormal event detection.
- The performance of an anomaly detection model can be characterized by various indicators, for example at least one of the following: detection accuracy (the higher the accuracy, the better the performance) and detection time (with accuracy ensured, the shorter a single detection takes, the better the performance).
- An anomaly detection model with better performance can thus be obtained by screening the performance of the multiple anomaly detection models obtained through multiple trainings.
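- A sketch of this screening step, assuming frame-level AUC as the accuracy indicator (a common choice on UCSD-Ped2 and CUHK; the `score_fn` helper that turns a model into per-frame abnormality scores is hypothetical):

```python
from sklearn.metrics import roc_auc_score

def select_best_model(models, score_fn, val_frames, val_labels):
    """Screen multiple trained anomaly detection models and keep the one
    with the best detection performance. score_fn(model, frames) is a
    hypothetical helper returning one abnormality score per frame."""
    best_model, best_auc = None, -1.0
    for model in models:
        auc = roc_auc_score(val_labels, score_fn(model, val_frames))
        if auc > best_auc:
            best_model, best_auc = model, auc
    return best_model, best_auc
```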
- In the method provided by this embodiment, the prediction frame generator included in the initial model is trained based on the first image and the second image included in the acquired sample image frame sequences; the prediction frame generator generates a predicted frame; the frame discriminator included in the initial model is trained based on the predicted frame and the second image; and the trained initial model is finally determined as the anomaly detection model. Because the frame generator fuses feature information of various depths, the generated predicted frames are closer to reality, thereby improving the accuracy of anomaly detection.
- FIG. 5 shows a flow 500 of an embodiment of an abnormal event detection method according to the present application.
- The method includes the following steps:
- Step 501: acquire a sequence of image frames collected by an image acquisition device.
- The above execution subject may acquire the image frame sequence collected by the image acquisition device locally or remotely.
- The image acquisition device may be a device such as a camera included in the above execution subject, or a device such as a camera included in another device communicatively connected to the above execution subject.
- The image frame sequence may be included in a video collected in real time, or in a pre-stored video file.
- The image frame sequence includes a first image and a second image, and the second image is the frame following the first image.
- The definitions of the first image and the second image are basically the same as in step 201 above and are not repeated here.
- Step 502: input the first image into the prediction frame generator included in the pre-trained anomaly detection model to obtain a predicted frame.
- The above execution subject may input the first image into the prediction frame generator included in the pre-trained anomaly detection model to obtain the predicted frame.
- The anomaly detection model is pre-trained based on the method described in the embodiment corresponding to FIG. 2 above.
- For a description of the prediction frame generator, reference may be made to the embodiment corresponding to FIG. 2 above.
- Step 503: input the predicted frame and the second image into the pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image.
- The above execution subject may input the predicted frame and the second image into the pre-trained frame discriminator and obtain a numerical value representing the degree of similarity between the predicted frame and the second image.
- For a description of the frame discriminator, reference may be made to the embodiment corresponding to FIG. 2 above.
- The numerical value representing the degree of similarity can be calculated in various ways, for example by computing the cosine distance or the Euclidean distance between the images.
- Step 504: in response to determining that the value is less than or equal to a preset threshold, output information indicating that an abnormal event occurs at the time point corresponding to the second image.
- The execution subject may, in response to determining that the value is less than or equal to the preset threshold, output information indicating that an abnormal event occurs at the time point corresponding to the second image.
- When the value is less than or equal to the preset threshold, there is a large gap between the predicted image frame and the actual image frame, indicating that an abnormal situation may have occurred within the camera's shooting range; information is then output in various forms to alert the user that an abnormal situation has occurred.
- The information indicating the occurrence of an abnormal event may include, but is not limited to, information in at least one of the following forms: text, image, alarm sound, and the like.
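- End to end, steps 502-504 could be sketched as follows. For brevity this uses the cosine-similarity option mentioned in step 503 directly on the frames rather than a learned frame discriminator, and the threshold value is an assumption:

```python
import torch
import torch.nn.functional as F

def detect_abnormal_event(generator, first_imgs, second_img, threshold=0.8):
    """Predict the next frame, score its similarity to the actually observed
    frame, and flag an abnormal event when the similarity value is less than
    or equal to the preset threshold. Assumes batch size 1."""
    with torch.no_grad():
        pred = generator(first_imgs)
    similarity = F.cosine_similarity(
        pred.flatten(start_dim=1), second_img.flatten(start_dim=1), dim=1
    ).item()
    if similarity <= threshold:
        print(f"Abnormal event at the current frame's time point "
              f"(similarity {similarity:.3f} <= {threshold})")
    return similarity
```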
- By using the anomaly detection model trained in the embodiment corresponding to FIG. 2, the abnormal event detection method provided by the above embodiments of the present application can output information indicating an abnormal phenomenon when the predicted frame differs greatly from the actual frame, so that abnormal behavior can be monitored efficiently and accurately.
- As an implementation of the method shown in FIG. 2, the present application provides an embodiment of an anomaly detection model generation apparatus. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
- The anomaly detection model generation apparatus 600 in this embodiment includes: a first acquisition module 601, configured to acquire a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; a first training module 602, configured to train, based on the first image and the second image, the prediction frame generator included in the initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; a second training module 603, configured to train, based on the predicted frame and the second image, the frame discriminator included in the initial model; and a first determination module 604, configured to determine, in response to the end of training, the trained initial model as the anomaly detection model.
- The first acquisition module 601 may acquire multiple sample image frame sequences locally or remotely.
- Each sequence of sample image frames may consist of the image frames included in video clips cut from different videos.
- The multiple sample image sequences can come from preset datasets, such as the UCSD-Ped2 or CUHK datasets.
- Each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image.
- The number of first images can be set arbitrarily, for example, three.
- The first training module 602 may train the prediction frame generator included in the initial model based on the first image and the second image.
- The prediction frame generator includes a multi-level feature extraction network and a generation network; the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate the predicted frame from the fused feature information.
- The fused feature information can be a feature map.
- A feature extraction network can include 20 convolutional layers (using 1x1 and 3x3 convolutions), 4 max pooling layers, and 1 activation layer.
- Multi-layer convolution is used to extract feature information of different depths from the first image (that is, the normal behavior image), and these pieces of feature information are fused.
- The final convolutions and a Tanh activation function produce a 256 × 256 × 3 image, which is the predicted frame.
- In FIG. 3, p_1, p_2, p_3, and p_4 are the first images, and p_{t+1} is the second image.
- 301 is the initial network: the first images are passed through multiple convolutions in the initial network to obtain the fused feature information, then through three Conv(3,3) convolution operations and a Tanh activation function, and the predicted frame is output.
- The predicted frame can be compared with the corresponding second image; a loss value representing the gap between the predicted frame and the second image is determined using preset loss functions, and the parameters of the feature extraction network and the generation network are adjusted to make the predicted frame close to the second image. When the training end conditions are met (for example, the loss value converges, the training duration reaches a preset duration, or the number of training iterations reaches a preset number), the training ends.
- 302 represents three loss functions: by comparing p_{t+1} with the predicted frame p̂_{t+1}, the parameters of the initial model are optimized to minimize the loss values of the three loss functions.
- The second training module 603 may train the frame discriminator included in the initial model based on the predicted frame and the second image.
- The frame discriminator is used to discriminate whether the two input images are the same.
- Frame discriminators are usually built on convolutional neural networks. During training, the predicted frame and the actual frame (i.e., the second image) are used as input, annotation information distinguishing predicted from actual frames is used as the expected output, and the frame discriminator is trained using machine learning methods. The training goal is to maximize the discriminative accuracy of the frame discriminator.
- The prediction frame generator and the frame discriminator are trained alternately. For example, the parameters of the frame discriminator are fixed first, and the parameters of the prediction frame generator are optimized until the frame discriminator can no longer correctly distinguish the predicted frame from the actual frame; then the parameters of the prediction frame generator are fixed, and the parameters of the frame discriminator are optimized until the frame discriminator can again accurately distinguish the predicted frame from the actual frame.
- In FIG. 3, D is the frame discriminator: p_{t+1} and the predicted frame p̂_{t+1} are input into D to obtain information indicating whether the current frame is normal or abnormal.
- The first determination module 604 may determine the trained initial model as the anomaly detection model in response to the end of training.
- The training end conditions may include, but are not limited to, at least one of the following: the loss value of the loss function converges, the number of training iterations reaches a preset number, or the training duration reaches a preset duration.
- The resulting anomaly detection model consists of the trained prediction frame generator and frame discriminator.
- The first training module may include: a first optimization unit (not shown in the figure), configured to optimize the parameters of the feature extraction network based on the preset first loss function, wherein the first loss function includes at least one of the following: an L2 distance loss, a gradient constraint loss, and an optical flow loss; and a second optimization unit (not shown in the figure), configured to optimize the parameters of the generation network based on a preset second loss function, wherein the second loss function includes a least squares loss.
- The second training module 603 may include: a superimposing unit (not shown in the figure), configured to superimpose a preset number of image frames preceding the second image, together with the predicted frame, into a multi-channel image; an extraction unit (not shown in the figure), configured to extract feature information of the multi-channel image; an estimation unit (not shown in the figure), configured to perform optical flow estimation on the feature information of the multi-channel image to determine the optical flow loss between the predicted frame and the second image; and a third optimization unit (not shown in the figure), configured to optimize the parameters of the frame discriminator based on the optical flow loss.
- In some embodiments, the number of first images is at least two.
- The apparatus 600 may further include: a second acquisition module (not shown in the figure), configured to acquire multiple anomaly detection models obtained through multiple trainings; and a second determination module (not shown in the figure), configured to determine the detection performance of the multiple anomaly detection models and determine the anomaly detection model with the best detection performance as the model used for abnormal event detection.
- In the apparatus provided by this embodiment, the prediction frame generator included in the initial model is trained based on the first image and the second image included in the acquired sample image frame sequences; the prediction frame generator generates a predicted frame; the frame discriminator included in the initial model is trained based on the predicted frame and the second image; and the trained initial model is finally determined as the anomaly detection model. Because the frame generator fuses feature information of various depths, the generated predicted frames are closer to reality, thereby improving the accuracy of anomaly detection.
- As an implementation of the method shown in FIG. 5, the present application provides an embodiment of an abnormal event detection apparatus.
- The apparatus embodiment corresponds to the method embodiment shown in FIG. 5.
- The abnormal event detection apparatus 700 in this embodiment includes: a third acquisition module 701, configured to acquire a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; a prediction module 702, configured to input the first image into the prediction frame generator included in the pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method described in the first aspect above; a discrimination module 703, configured to input the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and an output module 704, configured to output, in response to determining that the value is less than or equal to a preset threshold, information indicating that an abnormal event occurs at the time point corresponding to the second image.
- The third acquisition module 701 may acquire the image frame sequence collected by the image acquisition device locally or remotely.
- The image acquisition device may be a device such as a camera included in the apparatus 700, or a device such as a camera included in another device communicatively connected to the apparatus 700.
- The image frame sequence may be included in a video collected in real time, or in a pre-stored video file.
- The image frame sequence includes a first image and a second image, and the second image is the frame following the first image.
- The definitions of the first image and the second image are basically the same as in step 201 above and are not repeated here.
- The prediction module 702 may input the first image into the prediction frame generator included in the pre-trained anomaly detection model to obtain the predicted frame.
- The anomaly detection model is pre-trained based on the method described in the embodiment corresponding to FIG. 2 above.
- For a description of the prediction frame generator, reference may be made to the embodiment corresponding to FIG. 2 above.
- The discrimination module 703 may input the predicted frame and the second image into the pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image.
- For a description of the frame discriminator, reference may be made to the embodiment corresponding to FIG. 2 above.
- The numerical value representing the degree of similarity can be calculated in various ways, for example by computing the cosine distance or the Euclidean distance between the images.
- The output module 704 may, in response to determining that the value is less than or equal to the preset threshold, output information indicating that an abnormal event occurs at the time point corresponding to the second image.
- When the value is less than or equal to the preset threshold, there is a large gap between the predicted image frame and the actual image frame, indicating that an abnormal situation may have occurred within the camera's shooting range; information is then output in various forms to alert the user that an abnormal situation has occurred.
- The information indicating the occurrence of an abnormal event may include, but is not limited to, information in at least one of the following forms: text, image, alarm sound, and the like.
- By using the anomaly detection model trained in the embodiment corresponding to FIG. 2, the apparatus provided by the above embodiment of the present application can output information indicating an abnormal phenomenon when the predicted frame differs greatly from the actual frame, so that abnormal behavior can be monitored efficiently and accurately.
- FIG. 8 shows a schematic structural diagram of a computer system 800 suitable for implementing the electronic device of the embodiment of the present application.
- The electronic device shown in FIG. 8 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
- The computer system 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage section 808 into a random access memory (RAM) 803.
- In the RAM 803, various programs and data required for the operation of the system 800 are also stored.
- The CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
- An input/output (I/O) interface 805 is also connected to the bus 804.
- The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a liquid crystal display (LCD), a speaker, and the like; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a modem.
- The communication section 809 performs communication processing via a network such as the Internet.
- A drive 810 is also connected to the I/O interface 805 as needed.
- A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom can be installed into the storage section 808 as needed.
- Embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
- The computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable medium 811.
- The computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
- The computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
- A computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
- Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
- The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
- Each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- The functions noted in the blocks may also occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
- The modules involved in the embodiments of the present application may be implemented in software or in hardware.
- The described modules may also be provided in a processor; for example, a processor may be described as including a first acquisition module, a first training module, a second training module, and a first determination module.
- The names of these modules do not, under certain circumstances, constitute a limitation on the modules themselves; for example, the first acquisition module may also be described as "a module for acquiring multiple sample image frame sequences".
- As another aspect, the present application also provides a computer-readable storage medium.
- The computer-readable storage medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device.
- The computer-readable storage medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; train, based on the first image and the second image, the prediction frame generator included in the initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; train, based on the predicted frame and the second image, the frame discriminator included in the initial model; and, in response to the end of training, determine the trained initial model as the anomaly detection model.
- The one or more programs may also cause the electronic device to: acquire a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; input the first image into the prediction frame generator included in the pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method described in the first aspect above; input the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and, in response to determining that the numerical value is less than or equal to a preset threshold, output information indicating that an abnormal event occurs at the time point corresponding to the second image.
Claims (10)
- 1. A method for generating an anomaly detection model, characterized in that the method comprises: acquiring a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; training, based on the first image and the second image, a prediction frame generator included in an initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; training, based on the predicted frame and the second image, a frame discriminator included in the initial model; and, in response to the end of training, determining the trained initial model as the anomaly detection model.
- 2. The method according to claim 1, characterized in that training, based on the first image and the second image, the prediction frame generator included in the initial model comprises: optimizing the parameters of the feature extraction network based on a preset first loss function, wherein the first loss function includes at least one of the following: an L2 distance loss, a gradient constraint loss, and an optical flow loss; and optimizing the parameters of the generation network based on a preset second loss function, wherein the second loss function includes a least squares loss.
- 3. The method according to claim 1, characterized in that training, based on the predicted frame and the second image, the frame discriminator included in the initial model comprises: superimposing a preset number of image frames preceding the second image, together with the predicted frame, into a multi-channel image; extracting feature information of the multi-channel image; performing optical flow estimation on the feature information of the multi-channel image to determine an optical flow loss between the predicted frame and the second image; and optimizing the parameters of the frame discriminator based on the optical flow loss.
- 4. The method according to any one of claims 1-3, characterized in that the number of first images is at least two.
- 5. The method according to any one of claims 1-3, characterized in that the method further comprises: acquiring multiple anomaly detection models obtained through multiple trainings; and determining the detection performance of the multiple anomaly detection models, and determining the anomaly detection model with the best detection performance as the model used for abnormal event detection.
- 6. An abnormal event detection method, characterized in that the method comprises: acquiring a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; inputting the first image into a prediction frame generator included in a pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method according to any one of claims 1-5; inputting the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and, in response to determining that the numerical value is less than or equal to a preset threshold, outputting information indicating that an abnormal event occurs at the time point corresponding to the second image.
- 7. An apparatus for generating an anomaly detection model, characterized in that the apparatus comprises: a first acquisition module, configured to acquire a plurality of sample image frame sequences, wherein each sample image frame sequence includes a first image and a second image, and the second image is the frame following the first image; a first training module, configured to train, based on the first image and the second image, a prediction frame generator included in an initial model, wherein the prediction frame generator includes a multi-level feature extraction network and a generation network, the feature extraction network is used to extract feature information of different depths from the first image and fuse the feature information, and the generation network is used to generate a predicted frame from the fused feature information; a second training module, configured to train, based on the predicted frame and the second image, a frame discriminator included in the initial model; and a first determination module, configured to determine, in response to the end of training, the trained initial model as the anomaly detection model.
- 8. An abnormal event detection apparatus, characterized in that the apparatus comprises: a third acquisition module, configured to acquire a sequence of image frames collected by an image acquisition device, wherein the sequence of image frames includes a first image and a second image, and the second image is the frame following the first image; a prediction module, configured to input the first image into a prediction frame generator included in a pre-trained anomaly detection model to obtain a predicted frame, wherein the anomaly detection model is pre-trained based on the method according to any one of claims 1-5; a discrimination module, configured to input the predicted frame and the second image into a pre-trained frame discriminator to obtain a numerical value representing the degree of similarity between the predicted frame and the second image; and an output module, configured to output, in response to determining that the numerical value is less than or equal to a preset threshold, information indicating that an abnormal event occurs at the time point corresponding to the second image.
- 9. An electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-6.
- 10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method according to any one of claims 1-6 is implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202011405894.4 | 2020-12-02 | |
CN202011405894.4A | 2020-12-02 | 2020-12-02 | Anomaly detection model generation method and apparatus, and abnormal event detection method and apparatus (CN112465049A)
Publications (1)
Publication Number | Publication Date
---|---
WO2022116322A1 | 2022-06-09
Family
Family ID: 74806531
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/CN2020/139499 | Anomaly detection model generation method and apparatus, and abnormal event detection method and apparatus | 2020-12-02 | 2020-12-25
Country Status (2)
Country | Link
---|---
CN | CN112465049A
WO | WO2022116322A1
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN113468945A * | 2021-03-26 | 2021-10-01 | 厦门大学 | Swimmer drowning detection method
CN113364792B * | 2021-06-11 | 2022-07-12 | 奇安信科技集团股份有限公司 | Training method for traffic detection model, traffic detection method, apparatus and device
CN113435432B * | 2021-08-27 | 2021-11-30 | 腾讯科技(深圳)有限公司 | Video anomaly detection model training method, video anomaly detection method and apparatus
CN113743607B * | 2021-09-15 | 2023-12-05 | 京东科技信息技术有限公司 | Training method for anomaly detection model, anomaly detection method and apparatus
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US20180189610A1 * | 2015-08-24 | 2018-07-05 | Carl Zeiss Industrielle Messtechnik Gmbh | Active machine learning for training an event classification
CN109522828A * | 2018-11-01 | 2019-03-26 | 上海科技大学 | Abnormal event detection method and system, storage medium and terminal
CN110705376A * | 2019-09-11 | 2020-01-17 | 南京邮电大学 | Abnormal behavior detection method based on generative adversarial network
CN112016500A * | 2020-09-04 | 2020-12-01 | 山东大学 | Group abnormal behavior recognition method and system based on multi-scale temporal information fusion
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN111259814B * | 2020-01-17 | 2023-10-31 | 杭州涂鸦信息技术有限公司 | Living body detection method and system
CN111881750A * | 2020-06-24 | 2020-11-03 | 北京工业大学 | Crowd anomaly detection method based on generative adversarial network
- 2020-12-02: CN application CN202011405894.4A filed (published as CN112465049A, status: pending)
- 2020-12-25: PCT application PCT/CN2020/139499 filed (published as WO2022116322A1)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN115238805A * | 2022-07-29 | 2022-10-25 | 中国电信股份有限公司 | Training method for abnormal data recognition model and related device
CN115238805B * | 2022-07-29 | 2023-12-15 | 中国电信股份有限公司 | Training method for abnormal data recognition model and related device
CN115296984A * | 2022-08-08 | 2022-11-04 | 中国电信股份有限公司 | Method and apparatus for detecting abnormal network nodes, device, and storage medium
CN115296984B * | 2022-08-08 | 2023-12-19 | 中国电信股份有限公司 | Method and apparatus for detecting abnormal network nodes, device, and storage medium
CN115546293A * | 2022-12-02 | 2022-12-30 | 广汽埃安新能源汽车股份有限公司 | Obstacle information fusion method and apparatus, electronic device and computer-readable medium
CN115546293B * | 2022-12-02 | 2023-03-07 | 广汽埃安新能源汽车股份有限公司 | Obstacle information fusion method and apparatus, electronic device and computer-readable medium
CN115984757A * | 2023-03-20 | 2023-04-18 | 松立控股集团股份有限公司 | Abnormal event detection method based on global-local two-stream feature mutual learning
CN115984757B * | 2023-03-20 | 2023-05-16 | 松立控股集团股份有限公司 | Abnormal event detection method based on global-local two-stream feature mutual learning
CN117115740A * | 2023-09-05 | 2023-11-24 | 北京智芯微电子科技有限公司 | Deep-learning-based elevator door state detection method, apparatus and device
Also Published As
Publication number | Publication date
---|---
CN112465049A | 2021-03-09
Legal Events
Code | Title | Description
---|---|---
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20964167; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: PCT application non-entry in European phase | Ref document number: 20964167; Country of ref document: EP; Kind code of ref document: A1
32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.11.2023)