CN115035432A - Abnormal video detection method, device, medium and equipment - Google Patents

Abnormal video detection method, device, medium and equipment

Info

Publication number
CN115035432A
CN115035432A (application CN202210240166.5A)
Authority
CN
China
Prior art keywords
video
image
abnormal
target image
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210240166.5A
Other languages
Chinese (zh)
Inventor
于淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yuncong Technology Group Co Ltd
Original Assignee
Yuncong Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yuncong Technology Group Co Ltd filed Critical Yuncong Technology Group Co Ltd
Priority to CN202210240166.5A priority Critical patent/CN115035432A/en
Publication of CN115035432A publication Critical patent/CN115035432A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an abnormal-video detection method comprising the following steps: acquiring a video to be detected; converting the video into a sequence of frames ordered by time; extracting pairs of adjacent frames from the sequence to obtain target images; classifying the target images and computing their similarity to obtain a classification result and a similarity calculation result; and judging from both results whether the video to be detected is an abnormal video. The method captures subtle changes between consecutive frames instead of discriminating on the texture features of a single frame, and it makes the final judgment by combining the similarity measure of adjacent frames with the model's classification result rather than relying on classification alone. The algorithm is therefore robust, generalizes well, and can be applied in scenes with highly variable environmental conditions.

Description

Abnormal video detection method, device, medium and equipment
Technical Field
The invention relates to the technical field of image processing, and in particular to a method, an apparatus, a medium, and a device for detecting abnormal videos.
Background
With the development of the internet and deep-learning technology, video splicing and tampering are increasingly common, and some online videos are at risk of such abnormal transformations. Lawbreakers obtain videos through various channels, splice them illegally with illicit software, and redistribute them, creating copyright-infringement and even privacy-infringement problems. In other scenarios, frame loss or stalling in an otherwise normal video causes unnatural jumps between consecutive frames and degrades the user experience. An automated algorithm is therefore needed to detect such abnormally spliced videos effectively.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, it is an object of the present invention to provide a method, an apparatus, a medium and a device for detecting abnormal video, which are used to solve at least one of the shortcomings in the prior art.
To achieve the above and other related objects, the present invention provides a method for detecting abnormal video, including:
acquiring a video to be detected;
converting the video to be detected into a video continuous frame sequence ordered according to a time sequence;
extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image;
classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result;
and judging whether the video to be detected is an abnormal video or not according to the classification result and the similarity calculation result.
Optionally, the method further comprises:
carrying out image enhancement on the target image to obtain an enhanced image; and carrying out classification and similarity calculation on the enhanced image.
Optionally, the image enhancing the target image to obtain an enhanced image includes:
adjusting image parameters of the target image to obtain an adjusted image; wherein the image parameters include at least one of: brightness, saturation, hue;
and carrying out image denoising on the adjusted image to obtain an enhanced image.
Optionally, the method further comprises:
and carrying out image segmentation on the target image to obtain a foreground image comprising key information.
Optionally, the classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result includes:
performing feature extraction on the target image to obtain target features;
and classifying the target image and calculating the similarity based on the target features.
Optionally, performing feature extraction on the target image to obtain a target feature, including:
and carrying out feature extraction on the target image through a feature extraction network in a classification model based on the convolutional neural network to obtain target features.
Optionally, the determining whether the video to be detected is an abnormal video according to the classification result and the similarity calculation result includes:
carrying out weighted summation on the classification result and the similarity calculation result to obtain a comprehensive probability value; the classification result and the similarity calculation result are respectively used for representing the probability value that the video is an abnormal video;
and comparing the comprehensive probability value with a preset threshold, if the comprehensive probability value is greater than or equal to the preset threshold, determining that the video to be detected is an abnormal video, and if the comprehensive probability value is less than the preset threshold, determining that the video to be detected is a non-abnormal video.
To achieve the above and other related objects, the present invention provides an abnormal video detection apparatus, comprising:
the video acquisition module is used for acquiring a video to be detected;
the conversion module is used for converting the video to be detected into a video continuous frame sequence ordered according to a time sequence;
the extraction module is used for extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image;
the result calculation module is used for classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result;
and the abnormity judgment module is used for judging whether the video to be detected is an abnormal video or not according to the classification result and the similarity calculation result.
To achieve the above and other related objects, the present invention provides an abnormal video detection apparatus, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform one or more of the methods described.
To achieve the above objects and other related objects, the present invention provides one or more machine-readable media having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform one or more of the described methods.
As described above, the method, apparatus, medium, and device for detecting an abnormal video according to the present invention have the following advantages:
the invention discloses a method for detecting abnormal videos, which comprises the following steps: acquiring a video to be detected; converting the video to be detected into a video continuous frame sequence ordered according to a time sequence; extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image; classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result; and judging whether the video to be detected is an abnormal video or not according to the classification result and the similarity calculation result. According to the method, the slight change between frames is captured through continuous frames, and algorithm differentiation is not performed through texture features of single frames; and the final judgment is carried out by combining the similarity measurement of adjacent frames instead of only depending on a simple model classification result, so that the specific algorithm has the characteristics of strong robustness and good generalization, and can be applied to scenes with more variable environmental conditions.
Description of the drawings
Fig. 1 is a flowchart illustrating a method for detecting abnormal video according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a first Block unit according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a second Block unit according to one embodiment of the present invention;
fig. 4 is a schematic diagram illustrating a hardware structure of an abnormal video detection apparatus according to an embodiment of the present invention;
fig. 5 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below by way of specific examples, and those skilled in the art can easily understand other advantages and effects of the invention from the disclosure of this specification. The invention can also be implemented or applied through other, different embodiments, and the details of this specification can be modified in various respects without departing from the spirit and scope of the invention. It should be noted that the features of the following embodiments and examples may be combined with one another provided they do not conflict.
It should be noted that the drawings provided with the following embodiments only illustrate the basic idea of the invention in a schematic way: they show only the components related to the invention rather than the number, shape, and size of components in an actual implementation, where the type, quantity, and proportion of each component may vary freely and the component layout may be more complicated.
As shown in fig. 1, an embodiment of the present application provides a method for detecting an abnormal video, including the following steps:
s100, acquiring a video to be detected;
s101, converting the video to be detected into a continuous video frame sequence ordered according to a time sequence;
s102, extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image;
s103, classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result;
s104, judging whether the video to be detected is an abnormal video or not according to the classification result and the similarity calculation result.
In the invention, classifying the target images provides one judgment of whether the video is abnormal, and the similarity calculation provides a second, complementary judgment. Fusing the two makes the algorithm robust and general, so it can be applied in scenes with widely varying environmental conditions.
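The overall flow of steps S100 to S104 can be sketched as follows (a minimal illustration in Python; `classify`, `similarity`, and `judge` are hypothetical placeholders for the model components described later, and decoding the video into an ordered frame list is assumed to have happened already, e.g. via a decoder such as OpenCV's `cv2.VideoCapture`):

```python
from typing import Callable, List, Sequence, Tuple

def detect_abnormal(frames: Sequence,
                    classify: Callable,
                    similarity: Callable,
                    judge: Callable[[List[Tuple[float, float]]], bool]) -> bool:
    """Sketch of S102-S104: for every pair of adjacent frames (the 'target
    image'), obtain the classification result and the similarity result,
    then let `judge` decide from the per-pair results whether the whole
    video is abnormal."""
    results = [(classify(a, b), similarity(a, b))      # S103
               for a, b in zip(frames, frames[1:])]    # S102: adjacent pairs
    return judge(results)                              # S104
```

A caller supplies the fusion rule, for example flagging the video when any pair's classification score reaches 0.5 (an illustrative rule, not the patent's).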
In an embodiment, the method further comprises:
carrying out image enhancement on the target image to obtain an enhanced image; and carrying out classification and similarity calculation on the enhanced image. After image enhancement processing, the quality of an original video frame is enhanced, irrelevant information such as background interference does not exist, subsequent feature extraction is facilitated, and the accuracy of the algorithm is improved.
Specifically, the image enhancement of the target image to obtain an enhanced image includes:
adjusting image parameters of the target image to obtain an adjusted image; wherein the image parameters include at least one of: brightness, saturation, hue;
and carrying out image denoising on the adjusted image to obtain an enhanced image.
Image brightness is, informally, how light or dark the image is. Modeling a digital image as f(x, y) = i(x, y)·r(x, y) (the illumination-reflectance model) with gray values in [0, 255], the closer f is to 0 the lower the brightness, and the closer f is to 255 the higher the brightness.
Saturation describes the richness of the image's colors. If the gray levels of the image span [Lmin, Lmax], then the more intermediate values between Lmin and Lmax are occupied, the richer the colors and the higher the saturation, and the more vivid the image looks. Adjusting saturation can correct an over- or under-exposed picture so that it looks more natural.
Hue is determined by the dominant wavelength of the light reflected by an object; different wavelengths produce different color sensations, and hue is the essential characteristic that distinguishes one color from another.
Since the brightness, saturation, and hue affect the quality of an image, the image can be enhanced by adjusting the brightness, saturation, and hue.
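As one concrete illustration of such a parameter adjustment, brightness can be changed by scaling gray values and clipping to [0, 255] (a sketch only; the patent does not prescribe a specific formula, and saturation or hue would be adjusted analogously, e.g. in an HSV color space):

```python
def adjust_brightness(image, factor):
    """Scale every gray value by `factor` and clip to [0, 255].
    `image` is a grayscale image given as a list of rows of int values;
    factor > 1 brightens, factor < 1 darkens."""
    clip = lambda v: max(0, min(255, int(round(v))))
    return [[clip(p * factor) for p in row] for row in image]
```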
During digitization and transmission, images are often corrupted by noise from the imaging equipment and the external environment, so the image needs to be denoised. Image denoising is the process of reducing noise in a digital image; common methods include mean filtering, adaptive Wiener filtering, median filtering, morphological noise filtering, and wavelet denoising.
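A median filter, one of the denoising methods listed above, can be sketched as follows (pure Python on a grayscale image given as a list of rows; keeping border pixels unchanged is one common convention assumed here, not mandated by the text):

```python
def median_filter3(image):
    """3x3 median filter: replace each interior pixel with the median of
    its 3x3 neighborhood, which suppresses impulse ('salt and pepper')
    noise while preserving edges better than mean filtering."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # border pixels stay unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]            # median of the 9 values
    return out
```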
The enhanced image obtained after image enhancement may contain some background information unrelated to the foreground, and therefore, the background information unrelated to the foreground needs to be removed. Based on this, in an embodiment, the method further comprises:
and carrying out image segmentation on the target image to obtain a foreground image comprising key information.
Specifically, an image threshold segmentation method may be employed to segment the image. It should be noted that, in an embodiment, in the threshold foreground segmentation stage, the best matching segmentation threshold may be dynamically selected according to the video frame. Of course, in another embodiment, a single fixed threshold may be used for segmentation.
The image segmentation means that an image is divided into a plurality of mutually disjoint areas according to characteristics such as gray scale, color, spatial texture, geometric shape and the like, so that the characteristics show consistency or similarity in the same area and obviously differ among different areas. Briefly, in one image, objects are separated from the background for further processing.
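Threshold-based foreground segmentation, as described above, reduces to a per-pixel comparison (a sketch; the threshold may be fixed or selected dynamically per frame, e.g. by a histogram method such as Otsu's, which is named here only as an example and not taken from the patent text):

```python
def threshold_segment(image, threshold):
    """Binary foreground mask: 1 where the gray value exceeds `threshold`,
    else 0. Pixels marked 1 form the foreground region containing the
    key information; pixels marked 0 are discarded as background."""
    return [[1 if p > threshold else 0 for p in row] for row in image]
```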
In an embodiment, the classifying and similarity calculating the target image to obtain a classification result and a similarity calculation result includes:
performing feature extraction on the target image to obtain target features;
and classifying the target image and calculating the similarity based on the target features.
Specifically, performing feature extraction on the target image to obtain a target feature, including:
and carrying out feature extraction on the target image through a feature extraction network in a classification model based on the convolutional neural network to obtain target features.
In the present embodiment, feature extraction is performed on a target image by a feature extraction network composed of a convolutional neural network. Since the convolutional neural network requires that a video frame image of a fixed size be input, a target image needs to be adjusted to an image of the same size and then input to the feature extraction network.
In the feature extraction process, the target image first passes through a convolution layer (followed by a Batch Normalization layer and a ReLU activation layer), then through several Block modules in sequence, and finally through a global average pooling layer, which yields the final feature map of the target image. Each Block module consists of a first Block unit and a second Block unit: within a Block module, the input of the second Block unit is the output of the first, and between two adjacent Block modules the output of the previous module is the input of the next. As shown in fig. 2, the first Block unit contains a Channel Split layer with two output branches. One branch passes sequentially through 3 convolution layers (Conv: convolution, DWConv: depthwise convolution, Conv: convolution), all with the same channel count; the output of these 3 layers is then concatenated (Concat) with the other branch of the Channel Split layer, and finally a Channel Shuffle is performed. As shown in fig. 3, the input of the second Block unit is split into two branches: one branch passes through a DWConv (depthwise convolution) followed by a Conv layer, while the other passes sequentially through 3 layers (Conv, DWConv, Conv); the two outputs are concatenated (Concat) and a Channel Shuffle is performed.
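The Channel Shuffle operation that closes each Block unit interleaves channels across groups so that information from the two branches mixes. Its effect can be illustrated on a flat list of channels (a framework-independent sketch of the operation as used in ShuffleNet-style blocks, which the units described above resemble; this is not code from the patent):

```python
def channel_shuffle(channels, groups):
    """Reorder a flat channel list [g0c0, g0c1, ..., g1c0, g1c1, ...] so
    that channels from different groups alternate. Conceptually: reshape
    to (groups, channels_per_group), transpose, and flatten."""
    n = len(channels)
    assert n % groups == 0, "channel count must be divisible by groups"
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]
```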
The extracted feature map is then processed into a 1-dimensional vector, which is called the feature of the sample; from this feature the classification model computes its final classification result, and the classification result expresses the probability that the video is an abnormal video.
In an embodiment, according to the extracted target feature, the cosine distance between feature vectors of adjacent frame images is calculated, feature similarity between adjacent video frames is further obtained, and whether the video is an abnormal video is judged according to the similarity.
In another embodiment, other distance metric methods may be used to calculate the similarity.
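The cosine similarity between adjacent-frame feature vectors can be computed as follows (a standard formulation; the vectors here stand for the 1-dimensional features extracted by the network):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors: 1.0 means identical
    direction (very similar adjacent frames); values near 0 or below mean
    dissimilar frames, which the method treats as evidence of an anomaly
    such as a splice point or a dropped frame."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```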
In an embodiment, the determining whether the video to be detected is an abnormal video according to the classification result and the similarity calculation result includes:
carrying out weighted summation on the classification result and the similarity calculation result to obtain a comprehensive probability value; the classification result and the similarity calculation result are respectively used for representing the probability value that the video is abnormal; note that, the smaller the similarity, the greater the probability that the video is an abnormal video, and the greater the similarity, the smaller the probability that the video is an abnormal video.
And comparing the comprehensive probability value with a preset threshold, if the comprehensive probability value is greater than or equal to the preset threshold, determining that the video to be detected is an abnormal video, and if the comprehensive probability value is less than the preset threshold, determining that the video to be detected is a non-abnormal video.
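The weighted fusion and threshold comparison described above can be sketched as follows (the weights and the threshold are illustrative values, not values specified in the patent):

```python
def judge_abnormal(p_cls, similarity, w_cls=0.5, w_sim=0.5, threshold=0.6):
    """Weighted fusion of the two signals. `p_cls` is the classifier's
    abnormality probability; a low adjacent-frame `similarity` maps to a
    high abnormality probability, matching the note above that smaller
    similarity means a greater probability of an abnormal video."""
    p_sim = 1.0 - similarity                  # smaller similarity -> larger abnormality probability
    combined = w_cls * p_cls + w_sim * p_sim  # comprehensive probability value
    return combined >= threshold              # >= threshold -> abnormal video
```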
As shown in fig. 4, an embodiment of the present application provides an abnormal video detection apparatus, including:
a video acquisition module 400, configured to acquire a video to be detected;
a conversion module 401, configured to convert the video to be detected into a video continuous frame sequence ordered according to a time sequence;
an extracting module 402, configured to extract two adjacent frames of images from the continuous video frame sequence to obtain a target image;
a result calculating module 403, configured to classify the target image and calculate a similarity, so as to obtain a classification result and a similarity calculation result;
and an anomaly judgment module 404, configured to judge whether the video to be detected is an abnormal video according to the classification result and the similarity calculation result.
In the invention, classifying the target images provides one judgment of whether the video is abnormal, and the similarity calculation provides a second, complementary judgment. Fusing the two makes the algorithm robust and general, so it can be applied in scenes with widely varying environmental conditions.
In one embodiment, the apparatus further comprises:
the image enhancement module is used for carrying out image enhancement on the target image to obtain an enhanced image; and carrying out classification and similarity calculation on the enhanced image. After image enhancement processing, the quality of an original video frame is enhanced, irrelevant information such as background interference does not exist, subsequent feature extraction is facilitated, and the accuracy of the algorithm is improved.
Specifically, the image enhancement of the target image to obtain an enhanced image includes:
adjusting image parameters of the target image to obtain an adjusted image; wherein the image parameters include at least one of: brightness, saturation, hue;
and carrying out image denoising on the adjusted image to obtain an enhanced image.
Image brightness is, informally, how light or dark the image is. Modeling a digital image as f(x, y) = i(x, y)·r(x, y) (the illumination-reflectance model) with gray values in [0, 255], the closer f is to 0 the lower the brightness, and the closer f is to 255 the higher the brightness.
Saturation describes the richness of the image's colors. If the gray levels of the image span [Lmin, Lmax], then the more intermediate values between Lmin and Lmax are occupied, the richer the colors and the higher the saturation, and the more vivid the image looks. Adjusting saturation can correct an over- or under-exposed picture so that it looks more natural.
Hue is determined by the dominant wavelength of the light reflected by an object; different wavelengths produce different color sensations, and hue is the essential characteristic that distinguishes one color from another.
Since brightness, saturation, and hue all affect image quality, the image can be enhanced by adjusting them.
During digitization and transmission, images are often corrupted by noise from the imaging equipment and the external environment, so an image denoising module is needed to denoise the image. Image denoising is the process of reducing noise in a digital image; common methods include mean filtering, adaptive Wiener filtering, median filtering, morphological noise filtering, and wavelet denoising.
The enhanced image obtained after image enhancement may contain some background information unrelated to the foreground, and therefore, the background information unrelated to the foreground needs to be removed. Based on this, in an embodiment, the apparatus further comprises:
and the image segmentation module is used for carrying out image segmentation on the target image to obtain a foreground image comprising key information.
Specifically, an image threshold segmentation method may be employed to segment the image. It should be noted that, in an embodiment, in the threshold foreground segmentation stage, the best matching segmentation threshold may be dynamically selected according to the video frame. Of course, in another embodiment, a single fixed threshold may be used for segmentation.
The image segmentation means that an image is divided into a plurality of mutually disjoint areas according to characteristics such as gray scale, color, spatial texture, geometric shape and the like, so that the characteristics show consistency or similarity in the same area and obviously differ among different areas. In brief, in an image, objects are separated from the background for further processing.
In one embodiment, the result calculation module comprises:
the characteristic extraction submodule is used for extracting the characteristics of the target image to obtain target characteristics;
and the result calculation submodule is used for classifying the target image and calculating the similarity based on the target feature.
Specifically, the feature extraction of the target image to obtain a target feature includes:
and carrying out feature extraction on the target image through a feature extraction network in a classification model based on the convolutional neural network to obtain target features.
In the present embodiment, feature extraction is performed on a target image by a feature extraction network composed of a convolutional neural network. Since the convolutional neural network requires that a video frame image of a fixed size be input, a target image needs to be adjusted to an image of the same size and then input to the feature extraction network.
In the feature extraction process, the target image first passes through a convolution layer (followed by a Batch Normalization layer and a ReLU activation layer), then through several Block modules in sequence, and finally through a global average pooling layer, which yields the final feature map of the target image. Each Block module consists of a first Block unit and a second Block unit: within a Block module, the input of the second Block unit is the output of the first, and between two adjacent Block modules the output of the previous module is the input of the next. As shown in fig. 2, the first Block unit contains a Channel Split layer with two output branches. One branch passes sequentially through 3 convolution layers (GConv: group convolution, DWConv: depthwise convolution, GConv: group convolution), all with the same channel count; the output of these 3 layers is then concatenated (Concat) with the other branch of the Channel Split layer, and finally a Channel Shuffle is performed. As shown in fig. 3, the input of the second Block unit is split into two branches: one branch passes through a DWConv (depthwise convolution) followed by a Conv layer, while the other passes sequentially through 3 layers (Conv, DWConv, Conv); the two outputs are concatenated (Concat) and a Channel Shuffle is performed.
The extracted feature map is then processed into a 1-dimensional vector, which is called the feature of the sample; from this feature the classification model computes its final classification result, and the classification result expresses the probability that the video is an abnormal video.
In an embodiment, according to the extracted target feature, the cosine distance between feature vectors of adjacent frame images is calculated, feature similarity between adjacent video frames is further obtained, and whether the video is an abnormal video is judged according to the similarity.
In another embodiment, other distance metric methods may be used to calculate the similarity.
In one embodiment, the result calculation module further comprises:
the weighting submodule is used for carrying out weighted summation on the classification result and the similarity calculation result to obtain a comprehensive probability value; the classification result and the similarity calculation result are respectively used for representing the probability value that the video is an abnormal video; note that, the smaller the similarity, the greater the probability that the video is an abnormal video, and the greater the similarity, the smaller the probability that the video is an abnormal video.
And the comprehensive comparison submodule is used for comparing the comprehensive probability value with a preset threshold value, if the comprehensive probability value is greater than or equal to the preset threshold value, the video to be detected is an abnormal video, and if the comprehensive probability value is smaller than the preset threshold value, the video to be detected is a non-abnormal video.
Since the device embodiment corresponds to the method embodiment, the implementation of the functions of the modules in the device embodiment may refer to the implementation manner of the method embodiment, and details are not described here.
An embodiment of the present application further provides an apparatus, which may include: one or more processors; and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of fig. 1. In practical applications, the device may be used as a terminal device, and may also be used as a server, where examples of the terminal device may include: the mobile terminal includes a smart phone, a tablet computer, an electronic book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a vehicle-mounted computer, a desktop computer, a set-top box, an intelligent television, a wearable device, and the like.
The present application further provides a non-transitory readable storage medium, where one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a device, the device may be caused to execute instructions (instructions) of steps included in the method in fig. 1 according to the present application.
Fig. 5 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown, the terminal device may include: an input device 1100, a first processor 1101, an output device 1102, a first memory 1103, and at least one communication bus 1104. The communication bus 1104 is used to implement communication connections between the elements. The first memory 1103 may include a high-speed RAM memory, and may also include a non-volatile storage NVM, such as at least one disk memory, and the first memory 1103 may store various programs for performing various processing functions and implementing the method steps of the present embodiment.
Alternatively, the first processor 1101 may be, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the first processor 1101 is coupled to the input device 1100 and the output device 1102 through a wired or wireless connection.
Optionally, the input device 1100 may include a variety of input devices, such as at least one of a user-oriented user interface, a device-oriented device interface, a software programmable interface, a camera, and a sensor. Optionally, the device-oriented device interface may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices. Optionally, the user-oriented user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, or a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input. Optionally, the software programmable interface may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip. The output device 1102 may include an output device such as a display, a speaker, and the like.
In this embodiment, the processor of the terminal device includes modules for executing the functions of the modules of the foregoing devices; for the specific functions and technical effects, reference may be made to the foregoing embodiments, which are not repeated here.
Fig. 6 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application; Fig. 6 shows a specific implementation of the embodiment of Fig. 5. As shown, the terminal device of the present embodiment may include a second processor 1201 and a second memory 1202.
The second processor 1201 executes the computer program code stored in the second memory 1202 to implement the method described in fig. 1 in the above embodiment.
The second memory 1202 is configured to store various types of data to support operations at the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, such as messages, pictures, videos, and so forth. The second memory 1202 may include a Random Access Memory (RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
Optionally, a second processor 1201 is provided in the processing component 1200. The terminal device may further include: a communication component 1203, a power component 1204, a multimedia component 1205, a voice component 1206, an input/output interface 1207, and/or a sensor component 1208. The specific components included in the terminal device are set according to actual requirements, which is not limited in this embodiment.
The processing component 1200 generally controls the overall operation of the terminal device. The processing component 1200 may include one or more second processors 1201 to execute instructions to perform all or part of the steps of the data processing method described above. Further, the processing component 1200 may include one or more modules that facilitate interaction between the processing component 1200 and other components. For example, the processing component 1200 may include a multimedia module to facilitate interaction between the multimedia component 1205 and the processing component 1200.
The power supply component 1204 provides power to the various components of the terminal device. The power components 1204 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the terminal device.
The multimedia components 1205 include a display screen that provides an output interface between the terminal device and the user. In some embodiments, the display screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The voice component 1206 is configured to output and/or input voice signals. For example, the voice component 1206 includes a Microphone (MIC) configured to receive external voice signals when the terminal device is in an operational mode, such as a voice recognition mode. The received voice signal may further be stored in the second memory 1202 or transmitted via the communication component 1203. In some embodiments, the voice component 1206 further comprises a speaker for outputting voice signals.
The input/output interface 1207 provides an interface between the processing component 1200 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor component 1208 includes one or more sensors for providing various aspects of status assessment for the terminal device. For example, the sensor component 1208 may detect an open/closed state of the terminal device, a relative positioning of the components, a presence or absence of user contact with the terminal device. The sensor assembly 1208 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor assembly 1208 may also include a camera or the like.
The communication component 1203 is configured to facilitate communications between the terminal device and other devices in a wired or wireless manner. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot therein for inserting a SIM card therein, so that the terminal device may log onto a GPRS network to establish communication with the server via the internet.
As can be seen from the above, the communication component 1203, the voice component 1206, the input/output interface 1207 and the sensor component 1208 referred to in the embodiment of fig. 6 can be implemented as the input device in the embodiment of fig. 5.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical scope of the present invention shall be covered by the claims of the present invention.

Claims (10)

1. A method for detecting abnormal videos, comprising:
acquiring a video to be detected;
converting the video to be detected into a video continuous frame sequence ordered according to a time sequence;
extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image;
classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result;
and judging whether the video to be detected is an abnormal video or not according to the classification result and the similarity calculation result.
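The steps of claim 1 can be sketched as a minimal, illustrative pipeline. This is not the patented implementation: `score_pair` stands in for the unspecified classification-plus-similarity scoring of each adjacent frame pair, and all names and the default threshold are hypothetical.

```python
def detect_abnormal_video(frames, score_pair, threshold=0.5):
    """Judge a video abnormal if any adjacent frame pair scores at or above the threshold.

    frames     -- video frames ordered by timestamp (the "video continuous frame sequence")
    score_pair -- callable mapping (prev, curr) to a combined abnormality probability
                  (classification result plus similarity result, per the claims)
    """
    # zip(frames, frames[1:]) yields each pair of adjacent frames in time order
    return any(score_pair(prev, curr) >= threshold
               for prev, curr in zip(frames, frames[1:]))
```

For example, with a scorer that always reports high abnormality probability, any multi-frame sequence is flagged; with a low-probability scorer, none is.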
2. The method for detecting abnormal video according to claim 1, further comprising:
carrying out image enhancement on the target image to obtain an enhanced image; and carrying out classification and similarity calculation on the enhanced image.
3. The method for detecting abnormal video according to claim 2, wherein the image enhancement of the target image to obtain an enhanced image comprises:
adjusting image parameters of the target image to obtain an adjusted image; wherein the image parameters include at least one of: brightness, saturation, hue;
and carrying out image denoising on the adjusted image to obtain an enhanced image.
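The enhancement of claim 3 (parameter adjustment followed by denoising) can be illustrated with a NumPy-only sketch. The brightness factor and the naive 3x3 box blur are stand-ins chosen for self-containment; the claim does not specify a particular adjustment or denoising algorithm, and a practical system would likely use dedicated filters (e.g., non-local means).

```python
import numpy as np

def enhance(image, brightness=1.2):
    """Adjust brightness, then apply a simple mean-filter denoise (grayscale sketch)."""
    # step 1: image parameter adjustment (here, brightness scaling with clipping)
    adjusted = np.clip(image.astype(np.float32) * brightness, 0, 255)
    # step 2: denoising -- a naive 3x3 box blur as a stand-in for the
    # unspecified denoising step of the claim
    padded = np.pad(adjusted, 1, mode="edge")
    denoised = sum(
        padded[i:i + adjusted.shape[0], j:j + adjusted.shape[1]]
        for i in range(3) for j in range(3)
    ) / 9.0
    return denoised.astype(np.uint8)
```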
4. The method for detecting abnormal video according to claim 1 or 2, wherein the method further comprises:
and carrying out image segmentation on the target image to obtain a foreground image comprising key information.
5. The method for detecting an abnormal video according to claim 1, wherein the classifying and similarity calculating the target image to obtain a classification result and a similarity calculation result comprises:
performing feature extraction on the target image to obtain target features;
and classifying the target image and calculating the similarity based on the target features.
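One common way to turn the extracted target features of claim 5 into a similarity score is cosine similarity between the two frames' feature vectors, rescaled to a probability-like range. The claims do not name a specific metric, so this is an illustrative choice:

```python
import numpy as np

def frame_similarity(feat_a, feat_b):
    """Cosine similarity between two feature vectors, mapped from [-1, 1] to [0, 1]."""
    cos = float(np.dot(feat_a, feat_b) /
                (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-8))
    return (cos + 1.0) / 2.0
```

Identical feature vectors score near 1 (the adjacent frames look alike), while orthogonal vectors score 0.5; a low score indicates a subtle inter-frame change of the kind the method is designed to capture.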
6. The method for detecting the abnormal video according to claim 5, wherein the extracting the features of the target image to obtain the target features comprises:
and performing feature extraction on the target image through a feature extraction network in a classification model based on a convolutional neural network to obtain target features.
7. The method for detecting the abnormal video according to claim 1, wherein the determining whether the video to be detected is the abnormal video according to the classification result and the similarity calculation result comprises:
carrying out weighted summation on the classification result and the similarity calculation result to obtain a comprehensive probability value; the classification result and the similarity calculation result are respectively used for representing the probability value that the video is an abnormal video;
and comparing the comprehensive probability value with a preset threshold, if the comprehensive probability value is greater than or equal to the preset threshold, determining that the video to be detected is an abnormal video, and if the comprehensive probability value is less than the preset threshold, determining that the video to be detected is a non-abnormal video.
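The weighted-summation decision of claim 7 reduces to a few lines. The weights and threshold below are hypothetical placeholders; the claim only requires that both inputs are probability values that the video is abnormal:

```python
def composite_probability(p_class, p_similarity, w_class=0.6, w_similarity=0.4):
    """Weighted sum of the classification and similarity abnormality probabilities."""
    return w_class * p_class + w_similarity * p_similarity

def is_abnormal(p_class, p_similarity, threshold=0.5):
    """Abnormal if the comprehensive probability meets or exceeds the preset threshold."""
    return composite_probability(p_class, p_similarity) >= threshold
```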
8. An apparatus for detecting an abnormal video, comprising:
the video acquisition module is used for acquiring a video to be detected;
the conversion module is used for converting the video to be detected into a video continuous frame sequence ordered according to a time sequence;
the extraction module is used for extracting two adjacent frames of images from the continuous video frame sequence to obtain a target image;
the result calculation module is used for classifying the target image and calculating the similarity to obtain a classification result and a similarity calculation result;
and the abnormality judgment module is used for judging whether the video to be detected is an abnormal video according to the classification result and the similarity calculation result.
9. An apparatus for detecting an abnormal video, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method recited by one or more of claims 1-7.
10. One or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the method recited by one or more of claims 1-7.
CN202210240166.5A 2022-03-10 2022-03-10 Abnormal video detection method, device, medium and equipment Pending CN115035432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210240166.5A CN115035432A (en) 2022-03-10 2022-03-10 Abnormal video detection method, device, medium and equipment

Publications (1)

Publication Number Publication Date
CN115035432A true CN115035432A (en) 2022-09-09

Family

ID=83118982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210240166.5A Pending CN115035432A (en) 2022-03-10 2022-03-10 Abnormal video detection method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN115035432A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965899A (en) * 2023-03-16 2023-04-14 山东省凯麟环保设备股份有限公司 Unmanned sweeping robot vehicle abnormality detection method and system based on video segmentation

Similar Documents

Publication Publication Date Title
CN108229277B (en) Gesture recognition method, gesture control method, multilayer neural network training method, device and electronic equipment
US20240112035A1 (en) 3d object recognition using 3d convolutional neural network with depth based multi-scale filters
CN112040337B (en) Video watermark adding and extracting method, device, equipment and storage medium
CN110335216B (en) Image processing method, image processing apparatus, terminal device, and readable storage medium
CN111771226A (en) Electronic device, image processing method thereof, and computer-readable recording medium
CN112602088B (en) Method, system and computer readable medium for improving quality of low light images
CN109116129B (en) Terminal detection method, detection device, system and storage medium
CN107690804B (en) Image processing method and user terminal
KR20210092138A (en) System and method for multi-frame contextual attention for multi-frame image and video processing using deep neural networks
CN112308797A (en) Corner detection method and device, electronic equipment and readable storage medium
CN115631122A (en) Image optimization method and device for edge image algorithm
CN110689478B (en) Image stylization processing method and device, electronic equipment and readable medium
US10853921B2 (en) Method and apparatus for image sharpening using edge-preserving filters
CN115035432A (en) Abnormal video detection method, device, medium and equipment
CN110991412A (en) Face recognition method and device, storage medium and electronic equipment
CN112529939A (en) Target track matching method and device, machine readable medium and equipment
CN108805883B (en) Image segmentation method, image segmentation device and electronic equipment
CN110705653A (en) Image classification method, image classification device and terminal equipment
EP4287110A1 (en) Method and device for correcting image on basis of compression quality of image in electronic device
CN115880347B (en) Image processing method, electronic device, storage medium, and program product
CN111710011B (en) Cartoon generation method and system, electronic device and medium
CN115330610A (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN115830362A (en) Image processing method, apparatus, device, medium, and product
CN114943872A (en) Training method and device of target detection model, target detection method and device, medium and equipment
CN112052863B (en) Image detection method and device, computer storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination