CN116681734A - Video detection method, device, electronic equipment and storage medium


Info

Publication number
CN116681734A
CN116681734A (Application CN202210156196.8A)
Authority
CN
China
Prior art keywords: video frame, frame, optical flow, predicted, current
Prior art date
Legal status
Pending
Application number
CN202210156196.8A
Other languages
Chinese (zh)
Inventor
徐璐
樊顺利
刘阳兴
Current Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202210156196.8A priority Critical patent/CN116681734A/en
Publication of CN116681734A publication Critical patent/CN116681734A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • G06T2207/10021Stereoscopic video; Stereoscopic image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a video detection method, apparatus, electronic device and storage medium. The method comprises: obtaining a current optical flow and a next-frame predicted optical flow according to the current video frame and the previous video frame; performing motion compensation on the current video frame according to the current optical flow and the next-frame predicted optical flow to obtain a predicted video frame; obtaining a next-frame predicted video frame according to the current video frame, the previous video frame and the predicted video frame; and determining whether the next frame is abnormal according to the next-frame predicted video frame and the real next video frame. In the detection and judgment process, predicting the residual of the video frame reduces the prediction inaccuracy caused by nonlinear motion, while compensating the motion features in the video frame improves the accuracy of video frame prediction, so the method achieves better accuracy in abnormality judgment.

Description

Video detection method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video detection technologies, and in particular, to a video detection method, a video detection device, an electronic device, and a storage medium.
Background
Video anomaly detection is an important and challenging application in the field of computer vision. A conventional approach is prediction-based: a convolutional neural network directly predicts the future frame or the future optical flow, and an anomaly is then determined from the prediction error. However, optical flow estimation is itself a difficult problem; under occlusion, large motion or uneven illumination, the predicted future optical flow and future frame carry large errors.
Disclosure of Invention
The embodiments of the present application aim to provide a video detection method, apparatus, electronic device and storage medium that improve the accuracy of future-frame prediction and of video frame abnormality judgment.
In a first aspect, to achieve the above object, an embodiment of the present application provides a video detection method, including:
obtaining a current optical flow and a next frame predicted optical flow according to a current video frame and a video frame of a previous frame of the current video frame;
according to the current optical flow and the next frame predicted optical flow, performing motion compensation on the current video frame to obtain a predicted video frame;
obtaining a predicted video frame of a next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and determining a detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
In a second aspect, to solve the same technical problem, an embodiment of the present application provides a video detection apparatus, including:
the optical flow estimation module is used for obtaining a current optical flow and a predicted optical flow of a next frame according to a current video frame and a video frame of a previous frame of the current video frame;
the motion compensation module is used for performing motion compensation on the current video frame according to the current optical flow and the next frame predicted optical flow to obtain a predicted video frame;
the prediction output module is used for obtaining a predicted video frame of the next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and the abnormality detection module is used for determining the detection result of the video frame of the next frame of the current video frame according to the video frame of the next frame prediction and the video frame of the next frame of the current video frame.
In a third aspect, to solve the same technical problem, an embodiment of the present application provides an electronic device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the memory is coupled to the processor, and where the processor executes the computer program to implement the steps in the video detection method described in any one of the above.
In a fourth aspect, to solve the same technical problem, an embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where a device where the computer readable storage medium is controlled to execute the steps in the video detection method according to any one of the above when the computer program runs.
The embodiments of the present application provide a video detection method, apparatus, electronic device and storage medium. During video detection and judgment, the current video frame and the previous video frame are used to predict the next-frame optical flow and the next frame. Specifically, the next-frame predicted optical flow is obtained from the current video frame and the previous video frame using an optical flow estimator and a next-frame optical flow prediction network; motion compensation is then performed with the next-frame predicted optical flow and the current video frame to produce a coarse prediction of the next frame; a residual of the next video frame is obtained from the current video frame and the previous video frame; and finally the coarse prediction and the next-frame residual are superposed to yield the next-frame predicted video frame, from which the abnormality judgment of the video frame is made. In this process, predicting the residual of the video frame reduces the prediction inaccuracy caused by nonlinear motion, while compensating the motion features in the video frame improves the accuracy of video frame prediction, so the method achieves better accuracy in abnormality judgment.
Drawings
Fig. 1 is a schematic flow chart of a video detection method according to an embodiment of the present application;
FIG. 2 is a flow chart illustrating steps for obtaining a predicted optical flow for a next frame according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a video frame prediction network according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating steps for obtaining a predicted video frame according to an embodiment of the present application;
FIG. 5 is a flow chart illustrating steps for obtaining a next-frame predicted video frame according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a video detection device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 8 is a schematic diagram of another structure of an electronic device according to an embodiment of the present application.
Detailed Description
The following clearly and completely describes the embodiments of the present application with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a video detection method according to an embodiment of the present application, and as shown in fig. 1, the video detection method according to an embodiment of the present application includes steps S101 to S104.
Step S101, obtaining a next-frame predicted optical flow according to the video frame to be detected and the video frame preceding the current video frame.
When a video frame is checked for abnormality, the video frame is predicted and then compared with the real video frame to determine whether the frame to be judged is normal. Specifically, the video frame currently requiring detection and judgment is determined, and the previous video frame adjacent to the current video frame is acquired at the same time, so that the next-frame predicted optical flow is obtained from the current video frame and the previous video frame.
The video frame to be detected is the current video frame. In practice, when optical flow calculation is performed, the optical flow corresponding to each video frame can be calculated with an optical flow estimator, which can be built and optimized on the basis of a RAFT network. The optical flow estimator thus estimates the current optical flow from the current video frame and the previous video frame. To obtain the next-frame predicted optical flow from these two frames, the next-frame optical flow must itself first be predicted, as described below.
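As a concrete illustration of this step, the following is a minimal sketch that uses torchvision's RAFT implementation as a stand-in for the patent's optical flow estimator; the patent only states that the estimator is built on a RAFT network, so the specific model, weights and preprocessing below are assumptions.

```python
import torch
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

weights = Raft_Large_Weights.DEFAULT
raft = raft_large(weights=weights).eval()
preprocess = weights.transforms()  # normalizes both frames as RAFT expects

def estimate_current_flow(frame_prev: torch.Tensor, frame_cur: torch.Tensor) -> torch.Tensor:
    """frame_prev, frame_cur: (N, 3, H, W) batches; H and W divisible by 8."""
    img1, img2 = preprocess(frame_prev, frame_cur)
    with torch.no_grad():
        flow_iters = raft(img1, img2)  # list of iteratively refined flow fields
    return flow_iters[-1]              # current optical flow, (N, 2, H, W)
```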
In practical applications, the loss threshold may be set directly on the loss value here, or the loss may be configured globally for the entire anomaly detection system or for each constituent network of the apparatus, as described below.
Referring to fig. 2, fig. 2 is a flow chart illustrating the step of obtaining the next-frame predicted optical flow according to an embodiment of the present application. This step includes steps S201 to S202.
Step S201, performing optical flow estimation according to the current video frame and the video frame preceding the current video frame to obtain a current optical flow;
step S202, obtaining a predicted optical flow of a next frame according to the current video frame, the previous frame video frame and the current optical flow.
When predicting the next-frame optical flow, optical flow estimation is first performed on the current video frame and the previous video frame to obtain the current optical flow corresponding to the current video frame; the next-frame predicted optical flow is then obtained by processing the current optical flow together with the previously acquired current and previous video frames.
In practice, the optical flow estimator performs this estimation: the current video frame and the previous video frame are input into the optical flow estimator to calculate the current optical flow, after which the next-frame optical flow is predicted to obtain the next-frame predicted optical flow.
Obtaining the next-frame predicted optical flow according to the current video frame, the previous video frame and the current optical flow comprises: inputting the current video frame and the previous video frame into an optical flow prediction network to obtain a next-frame optical flow residual; and superposing the next-frame optical flow residual on the current optical flow to obtain the next-frame predicted optical flow.
Specifically, during processing, the current video frame and the previous video frame are input into the optical flow prediction network to obtain the next-frame optical flow residual, and the obtained residual is then superposed on the current optical flow to obtain the next-frame predicted optical flow.
In practical application, once optimization of the optical flow estimator is complete, the optical flow of any input video frame can be estimated. Because optical flow represents the motion information of the image, predicting the next video frame requires the next-frame predicted optical flow. To obtain it, the optical flow estimator first estimates the current optical flow of the current video frame; at the same time, the optical flow prediction network predicts the next-frame optical flow residual, which determines the compensation applied to the next-frame predicted optical flow. After the next-frame optical flow residual is obtained, it is superposed on the current optical flow to yield the next-frame predicted optical flow.
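The residual superposition described above can be sketched as follows. FlowResidualNet is a hypothetical stand-in: the patent does not disclose the layers of its optical flow prediction network, so a small convolutional stack is assumed.

```python
import torch
import torch.nn as nn

class FlowResidualNet(nn.Module):
    """Hypothetical optical flow prediction network: maps the previous and
    current frames (6 input channels) to a 2-channel next-frame flow residual."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, 3, padding=1),
        )

    def forward(self, frame_prev: torch.Tensor, frame_cur: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([frame_prev, frame_cur], dim=1))

def predict_next_flow(frame_prev, frame_cur, flow_cur, residual_net):
    # Superpose the predicted residual on the current optical flow (step S202).
    return flow_cur + residual_net(frame_prev, frame_cur)
```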
Step S102, according to the current optical flow and the predicted optical flow of the next frame, performing motion compensation on the current video frame to obtain a predicted video frame.
After the next-frame predicted optical flow is obtained, the next video frame is predicted by processing the current optical flow and the next-frame predicted optical flow. Specifically, motion compensation is performed on the current video frame according to the current optical flow and the next-frame optical flow to obtain a predicted video frame, which is the next-frame video frame produced at this stage.
In practice, when predicting the next video frame, motion compensation is performed according to the current video frame and the next-frame predicted optical flow to obtain the next-frame video frame corresponding to this stage. The prediction of the next video frame is produced by a constructed video frame prediction network, which can be as shown in fig. 3.
Specifically, fig. 3 is a schematic structural diagram of a video frame prediction network according to an embodiment of the present application. The video frame prediction network includes an optical flow estimation network, a next-frame optical flow prediction network, a next-frame prediction network, and a motion compensation network. The optical flow estimation network is, for example, the optical flow estimator described above and calculates the optical flow of the input video frames; the next-frame optical flow prediction network predicts the optical flow of the next video frame from the current video frame; the next-frame prediction network predicts the next-frame video frame residual from the current video frame; and the motion compensation network performs motion compensation on the obtained next video frame, making the result more accurate.
In an embodiment, given the current video frame and the previous video frame, the video frame prediction network can produce the adjacent next video frame; the predicted next frame is then compared with the actually played next frame to determine whether the video frame is abnormal.
To obtain the final next video frame, the current optical flow is first obtained by the optical flow estimation network from the current and previous video frames, the corresponding optical flow residual is obtained by the next-frame optical flow prediction network, and the currently predicted next video frame is obtained by the next-frame prediction network. Motion compensation is then performed according to the current optical flow and the optical flow residual to obtain the corresponding next video frame, and the motion-compensated next video frame is processed, e.g., superposed, with the next-frame residual produced by the next-frame prediction network to obtain the finally output next video frame.
When obtaining the predicted video frame, the obtained next-frame predicted optical flow and the current video frame are input into the motion compensation network for processing. Referring to fig. 4, fig. 4 is a flow chart of the steps for obtaining the predicted video frame according to an embodiment of the present application. This step includes steps S401 to S403.
Step S401, performing feature restoration on the current video frame according to the predicted optical flow of the next frame to obtain a restored video frame of the next frame;
step S402, splicing the predicted optical flow of the next frame with the current video frame to obtain the spliced current video frame;
step S403, inputting the restored next frame video frame and the spliced current video frame into a prediction network to obtain a predicted video frame.
After the next-frame predicted optical flow is obtained, feature restoration is performed on the current video frame according to the next-frame predicted optical flow to obtain a restored next video frame; at the same time, the next-frame predicted optical flow is spliced with the current video frame to obtain a spliced current video frame; finally, the restored next video frame and the spliced current video frame are input into the corresponding prediction network to obtain the predicted video frame.
In practice, this is the process of inputting the next-frame predicted optical flow and the current video frame into the motion compensation network. Specifically, during processing, the current video frame undergoes a warping operation, i.e., a restoring operation, according to the next-frame predicted optical flow, which can be understood as predicting the next video frame from that flow; this yields a restored frame. Splicing the next-frame predicted optical flow with the current video frame yields a second prediction of the next video frame. Finally, inputting these two intermediate results into the corresponding prediction network produces the motion-compensated next video frame.
It should be noted that the prediction network used here may be a CNN with a U-Net structure. After the two different intermediate predictions are obtained, they are input into this CNN, and the structural characteristics of the network are used to predict and output the motion-compensated next video frame.
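A minimal sketch of steps S401 to S403 follows. Backward warping with grid_sample is an assumed realization of the feature-restoration step, and compensation_net stands in for the U-Net-style CNN, whose layers the patent does not disclose.

```python
import torch
import torch.nn.functional as F

def warp(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `frame` by `flow` (assumed realization of feature
    restoration, step S401). frame: (N, C, H, W); flow: (N, 2, H, W) in
    pixel units, channel 0 horizontal and channel 1 vertical."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(frame.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow                              # (N, 2, H, W)
    # grid_sample expects sampling coordinates normalized to [-1, 1].
    coords[:, 0] = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords[:, 1] = 2.0 * coords[:, 1] / (h - 1) - 1.0
    return F.grid_sample(frame, coords.permute(0, 2, 3, 1), align_corners=True)

def motion_compensate(frame_cur, flow_next, compensation_net):
    restored = warp(frame_cur, flow_next)                # S401: feature restoration
    stitched = torch.cat([frame_cur, flow_next], dim=1)  # S402: splicing
    # S403: the U-Net-style CNN fuses both intermediate predictions.
    return compensation_net(torch.cat([restored, stitched], dim=1))
```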
Step S103, obtaining a predicted video frame of the next frame according to the current video frame, the video frame of the previous frame and the predicted video frame.
After the predicted video frame is obtained through motion compensation, the finally output next-frame predicted video frame is obtained according to the current video frame, the previous video frame and the predicted video frame. Specifically, a next-frame video frame residual is first obtained from the current video frame and the previous video frame, and the next-frame predicted video frame is then obtained by superposing this residual.
Referring to fig. 5, fig. 5 is a flow chart illustrating the steps for obtaining the next-frame predicted video frame according to an embodiment of the present application. This step includes steps S501 to S502.
Step S501, performing residual prediction according to the current video frame and the previous frame video frame to obtain a next frame video frame residual;
and step S502, overlapping the video frame residual error of the next frame with the predicted video frame to obtain the predicted video frame of the next frame.
When the predicted video frame has been obtained, the current video frame and the previous video frame are input into the next-frame prediction network of the video frame prediction network to obtain the next-frame video frame residual, and the obtained residual is then superposed on the predicted video frame to obtain the finally output video frame. The next-frame prediction network is a pre-optimized network for predicting the video frame residual between two adjacent video frames.
The next-frame prediction network to be optimized can be optimized through pre-training: a corresponding training sample set is constructed, and the network is learned and optimized on it so that, once learning is complete, it predicts the next-frame video frame residual.
Similarly, for the video frame prediction network described above, each constituent network may be learned and optimized in advance, either independently or jointly; this is not limited here. The residual represents the difference between the predicted value and the true value, so superposing the residual makes that difference smaller, i.e., yields higher accuracy.
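Steps S501 and S502 thus reduce to a residual prediction followed by a superposition, as in this sketch; frame_residual_net is a hypothetical CNN analogous to FlowResidualNet above, but with a 3-channel image output instead of a 2-channel flow.

```python
def predict_next_frame(frame_prev, frame_cur, compensated_frame, frame_residual_net):
    """Predict the next-frame video frame residual from the two observed
    frames (step S501) and superpose it on the motion-compensated coarse
    prediction (step S502)."""
    residual = frame_residual_net(frame_prev, frame_cur)  # step S501
    return compensated_frame + residual                   # step S502
```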
Taking joint learning as an example: because video frames contain diverse information, the loss function can be constructed from an optical flow loss, a luminance loss and a gradient loss. The optical flow loss L_f uses RAFT to calculate the optical flow of the real next frame and takes a loss against the next-frame predicted optical flow; the luminance loss L_p takes a loss between the predicted next frame and the real next frame; and the gradient loss L_gd takes a loss between the gradient of the predicted next frame and the gradient of the real next frame. The total loss is therefore L = λ_f·L_f + λ_p·L_p + λ_gd·L_gd, where the parameters may be taken as λ_f = 0.05, λ_p = 1 and λ_gd = 2. Whether learning is complete is then determined by comparing the total loss with a set loss threshold.
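A sketch of this total loss follows. The patent names the three terms and the weights 0.05, 1 and 2 but not the distance measures, so the L1 choices below are assumptions.

```python
import torch.nn.functional as F

def total_loss(pred_flow_next, raft_flow_next, pred_frame_next, true_frame_next):
    """L = 0.05 * L_f + 1 * L_p + 2 * L_gd, per the weights given above."""
    l_f = F.l1_loss(pred_flow_next, raft_flow_next)    # optical flow loss
    l_p = F.l1_loss(pred_frame_next, true_frame_next)  # luminance loss

    def grads(x):  # horizontal and vertical image gradients
        return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]

    gx_p, gy_p = grads(pred_frame_next)
    gx_t, gy_t = grads(true_frame_next)
    l_gd = F.l1_loss(gx_p, gx_t) + F.l1_loss(gy_p, gy_t)  # gradient loss
    return 0.05 * l_f + 1.0 * l_p + 2.0 * l_gd
```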
Step S104, determining the detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
In one embodiment, after the next-frame predicted video frame is obtained, whether the video frame is abnormal is determined by comparing the next-frame predicted video frame with the actually played next video frame of the current video frame, producing the detection result; the detection result includes, but is not limited to, whether the video frame plays normally.
Determining the detection result of the actually played next video frame comprises: acquiring the next video frame of the current video frame from the video stream, and calculating the peak signal-to-noise ratio over each pixel point between the next-frame predicted video frame and the next video frame; normalizing the peak signal-to-noise ratio to obtain a normalized value; and comparing the normalized value with a preset threshold to determine the detection result of the next video frame of the current video frame.
Specifically, during processing, the peak signal-to-noise ratio over each pixel point between the next-frame predicted video frame and the next video frame of the current video frame is calculated and normalized to obtain a normalized value; the normalized value is compared with the preset threshold, and the detection result of the next video frame is determined from the comparison. When the normalized value is smaller than the preset threshold, the next video frame played at that moment is abnormal; otherwise it is normal.
In practical applications, the peak signal-to-noise ratio is often used to measure signal reconstruction quality in fields such as image compression. It is defined as PSNR = 10·log10(MAX² / MSE), where MAX is the maximum possible pixel value and MSE is the mean squared error between the next-frame predicted video frame and the real next video frame.
in order to better choose the threshold, the PSNR is also normalized to [0,1], where the normalization process formula can be as follows:
finally, by setting a threshold value, for example, a threshold value of 0.6, all values lower than 0.6 are judged to be abnormal, and otherwise, normal.
In summary, in the video detection method provided by the embodiments of the present application, when video detection and judgment are performed, the current video frame and the previous video frame are used to predict the next-frame optical flow and the next frame. Specifically, the next-frame predicted optical flow is obtained from the current and previous video frames using the optical flow estimator and the next-frame optical flow prediction network; motion compensation is then performed with the next-frame predicted optical flow and the current video frame to produce a coarse prediction of the next frame; the residual of the next video frame is obtained from the current and previous video frames; and finally the coarse prediction and the next-frame residual are superposed to yield the next-frame predicted video frame, from which the abnormality judgment of the video frame is made. In this process, predicting the residual of the video frame reduces the prediction inaccuracy caused by nonlinear motion, while compensating the motion features in the video frame improves the accuracy of video frame prediction, so the method achieves better accuracy in abnormality detection and judgment.
The method according to the above embodiment is further described below from the perspective of a video detection device, which may be implemented as a separate entity or integrated in an electronic device such as a terminal, e.g., a mobile phone or a tablet computer.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a video detection device according to an embodiment of the present application, and as shown in fig. 6, a video detection device 600 according to an embodiment of the present application includes:
the optical flow estimation module 601 is configured to obtain a current optical flow and a predicted optical flow of a next frame according to a current video frame and a video frame of a previous frame of the current video frame;
the motion compensation module 602 is configured to perform motion compensation on the current video frame according to the current optical flow and the predicted optical flow of the next frame, so as to obtain a predicted video frame;
a prediction output module 603, configured to obtain a predicted video frame of a next frame according to the current video frame, the previous frame video frame, and the predicted video frame;
the anomaly detection module 604 is configured to determine a detection result of a video frame next to the current video frame according to the predicted video frame next to the current video frame and the real video frame next to the current video frame.
In the implementation, each module and/or unit may be implemented as an independent entity, or may be combined arbitrarily and implemented as the same entity or a plurality of entities, where the implementation of each module and/or unit may refer to the foregoing method embodiment, and the specific beneficial effects that may be achieved may refer to the beneficial effects in the foregoing method embodiment, which are not described herein again.
In addition, referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device may be a mobile terminal, such as a smart phone, a tablet computer, or the like. As shown in fig. 7, the electronic device 700 includes a processor 701, a memory 702. The processor 701 is electrically connected to the memory 702.
The processor 701 is a control center of the electronic device 700, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device 700 and processes data by running or loading application programs stored in the memory 702, and calling data stored in the memory 702, thereby performing overall monitoring of the electronic device 700.
In this embodiment, the processor 701 in the electronic device 700 loads the instructions corresponding to the processes of one or more application programs into the memory 702 according to the following steps, and the processor 701 executes the application programs stored in the memory 702, so as to implement various functions:
obtaining a current optical flow and a next frame predicted optical flow according to a current video frame and a video frame of a previous frame of the current video frame;
according to the current optical flow and the next frame predicted optical flow, performing motion compensation on the current video frame to obtain a predicted video frame;
obtaining a predicted video frame of a next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and determining a detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
The electronic device 700 may implement the steps in any embodiment of the video detection method provided by the embodiment of the present application, so that the beneficial effects that any video detection method provided by the embodiment of the present application can implement are described in detail in the previous embodiments, and are not described herein.
Referring to fig. 8, fig. 8 is another schematic structural diagram of an electronic device provided in an embodiment of the present application, and fig. 8 is a specific structural block diagram of the electronic device provided in the embodiment of the present application, where the electronic device may be used to implement the video detection method provided in the embodiment. The electronic device 800 may be a mobile terminal such as a smart phone or a notebook computer.
The RF circuit 810 is configured to receive and transmit electromagnetic waves and to convert between electromagnetic waves and electrical signals, thereby communicating with a communication network or other devices. RF circuitry 810 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. The RF circuitry 810 may communicate via wireless networks with various networks such as the internet, intranets, or other devices. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. The wireless network may use various communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (WCDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-Max), other protocols for mail, instant messaging, and short messaging, as well as any other suitable communication protocols, including those not yet developed.
The memory 820 may be used to store software programs and modules, such as program instructions/modules corresponding to the video detection method in the above embodiments, and the processor 880 executes the software programs and modules stored in the memory 820 to perform various functional applications and video detection, that is, to implement the following functions:
obtaining a current optical flow and a next frame predicted optical flow according to a current video frame and a video frame of a previous frame of the current video frame;
according to the current optical flow and the next frame predicted optical flow, performing motion compensation on the current video frame to obtain a predicted video frame;
obtaining a predicted video frame of a next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and determining a detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
Memory 820 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 820 may further include memory located remotely from processor 880, which may be connected to electronic device 800 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit 830 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit 830 may include a touch-sensitive surface 831 as well as other input devices 832. The touch-sensitive surface 831, also referred to as a touch screen or touch pad, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on the touch-sensitive surface 831 or thereabout by using any suitable object or accessory such as a finger, stylus, etc.), and actuate the corresponding connection device according to a predetermined program. Alternatively, touch-sensitive surface 831 can include both a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 880 and can receive commands from the processor 880 and execute them. In addition, the touch-sensitive surface 831 can be implemented using a variety of types, such as resistive, capacitive, infrared, and surface acoustic waves. In addition to the touch-sensitive surface 831, the input unit 830 may also include other input devices 832. In particular, other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc.
The display unit 840 may be used to display information entered by a user or provided to a user as well as various graphical user interfaces of the electronic device 800, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 840 may include a display panel 841, and optionally, the display panel 841 may be configured in the form of an LCD (Liquid Crystal Display ), an OLED (Organic Light-Emitting Diode), or the like. Further, touch-sensitive surface 831 can overlay display panel 841, and upon detection of a touch operation thereon or thereabout by touch-sensitive surface 831, is communicated to processor 880 for determining the type of touch event, whereupon processor 880 provides a corresponding visual output on display panel 841 based on the type of touch event. Although in the figures, touch-sensitive surface 831 and display panel 841 are implemented as two separate components, in some embodiments touch-sensitive surface 831 may be integrated with display panel 841 to implement input and output functions.
The electronic device 800 may also include at least one sensor 850, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 841 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or backlight when the flip cover is closed. As one type of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes) and the magnitude and direction of gravity when stationary, and can be used in applications that recognize mobile phone gestures (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and vibration-recognition related functions (such as a pedometer or tapping); other sensors such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor that may also be configured in the electronic device 800 are not described in detail herein.
Audio circuitry 860, speaker 861, and microphone 862 may provide an audio interface between the user and the electronic device 800. The audio circuit 860 may transmit the electrical signal converted from received audio data to the speaker 861, where it is converted into a sound signal and output; conversely, the microphone 862 converts collected sound signals into electrical signals, which are received by the audio circuit 860 and converted into audio data; the audio data are processed by the processor 880 and then transmitted via the RF circuit 810 to, for example, another terminal, or output to the memory 820 for further processing. The audio circuitry 860 may also include an earphone jack to provide communication between peripheral headphones and the electronic device 800.
Through the transmission module 870 (e.g., a Wi-Fi module), the electronic device 800 can help the user receive requests, transmit information, and so on; it provides the user with wireless broadband internet access. Although the transmission module 870 is shown in the figures, it is understood that it is not a necessary component of the electronic device 800 and may be omitted as desired without changing the essence of the application.
The processor 880 is the control center of the electronic device 800; it connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device 800 and processes data by running or executing software programs and/or modules stored in the memory 820 and calling data stored in the memory 820, thereby monitoring the electronic device as a whole. Optionally, the processor 880 may include one or more processing cores; in some embodiments, the processor 880 may integrate an application processor, which mainly handles the operating system, user interfaces, applications, and the like, and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may not be integrated into the processor 880.
The electronic device 800 also includes a power supply 890 (e.g., a battery) that provides power to the various components, and in some embodiments, may be logically connected to the processor 880 via a power management system to perform functions such as managing charging, discharging, and power consumption via the power management system. Power supply 890 may also include one or more of any components of a dc or ac power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, etc.
Although not shown, the electronic device 800 further includes a camera (e.g., front camera, rear camera), a bluetooth module, etc., which are not described herein. In particular, in this embodiment, the display unit of the electronic device is a touch screen display, the mobile terminal further includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
obtaining a current optical flow and a next frame predicted optical flow according to a current video frame and a video frame of a previous frame of the current video frame;
according to the current optical flow and the next frame predicted optical flow, performing motion compensation on the current video frame to obtain a predicted video frame;
obtaining a predicted video frame of a next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and determining a detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
In the implementation, each module may be implemented as an independent entity, or may be combined arbitrarily, and implemented as the same entity or several entities, and the implementation of each module may be referred to the foregoing method embodiment, which is not described herein again.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor. To this end, an embodiment of the present application provides a storage medium having stored therein a plurality of instructions capable of being loaded by a processor to perform the steps of any one of the embodiments of the video detection method provided by the embodiment of the present application.
Wherein the storage medium may include: read Only Memory (ROM), random access Memory (RAM, random Access Memory), magnetic or optical disk, and the like.
The steps in any embodiment of the video detection method provided by the embodiment of the present application can be executed by the instructions stored in the storage medium, so that the beneficial effects that any video detection method provided by the embodiment of the present application can achieve can be achieved, and detailed descriptions of the previous embodiments are omitted.
The foregoing describes in detail a video detection method, apparatus, electronic device and storage medium provided by the embodiments of the present application, and specific examples are applied to illustrate the principles and implementations of the present application, where the foregoing examples are only used to help understand the method and core idea of the present application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in light of the ideas of the present application, the present description should not be construed as limiting the present application. Moreover, it will be apparent to those skilled in the art that various modifications and variations can be made without departing from the principles of the present application, and such modifications and variations are also considered to be within the scope of the application.

Claims (10)

1. A video detection method, comprising:
obtaining a current optical flow and a next frame predicted optical flow according to a current video frame and a video frame of a previous frame of the current video frame;
according to the current optical flow and the next frame predicted optical flow, performing motion compensation on the current video frame to obtain a predicted video frame;
obtaining a predicted video frame of a next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and determining a detection result of the next video frame of the current video frame according to the next-frame predicted video frame and the next video frame of the current video frame.
2. The method of claim 1, wherein the obtaining the current optical flow and the predicted optical flow for the next frame from the current video frame and the video frame preceding the current video frame comprises:
performing optical flow estimation by using an optical flow estimator according to a current video frame and a video frame of a previous frame of the current video frame to obtain a current optical flow;
and obtaining a predicted optical flow of a next frame according to the current video frame, the previous frame video frame and the current optical flow.
3. The method of claim 2, wherein the deriving a next frame predicted optical flow from the current video frame, the previous frame video frame, and the current optical flow comprises:
inputting the current video frame and the previous frame video frame into an optical flow prediction network to obtain an optical flow residual error of the next frame;
and superposing the next frame optical flow residual error and the current optical flow to obtain a next frame predicted optical flow.
4. The method of claim 1, wherein the motion compensating the current video frame based on the current optical flow and the next frame predicted optical flow to obtain a predicted video frame comprises:
performing feature restoration on the current video frame according to the next frame prediction optical flow to obtain a restored next frame video frame;
splicing the predicted optical flow of the next frame with the current video frame to obtain the spliced current video frame;
and inputting the restored next frame video frame and the spliced current video frame into a prediction network to obtain a predicted video frame.
5. The method of claim 1, wherein the obtaining a next frame predicted video frame from the current video frame, the previous frame video frame, and the predicted video frame comprises:
residual prediction is carried out according to the current video frame and the previous frame video frame, so as to obtain a next frame video frame residual;
and superposing the video frame residual error of the next frame with the predicted video frame to obtain the predicted video frame of the next frame.
6. The method according to claim 1, wherein the determining the detection result of the next frame video frame of the current video frame according to the next frame predicted video frame and the next frame video frame of the current video frame comprises:
acquiring a next video frame of the current video frame in a video stream, and calculating a peak signal-to-noise ratio over each pixel point between the next-frame predicted video frame and the next video frame;
normalizing the peak signal-to-noise ratio of each pixel point to obtain a normalized value;
and comparing the normalized value with a preset threshold value, and determining the detection result of the video frame of the next frame of the current video frame.
7. The method according to claim 6, wherein the comparing the normalized value with a preset threshold value to determine the detection result of the video frame next to the current video frame comprises:
if the normalized value is smaller than the preset threshold value, determining that the predicted video frame of the next frame is abnormal;
and if the normalized value is greater than or equal to the preset threshold value, determining that the predicted video frame of the next frame is normal.
8. A video detection apparatus, comprising:
the optical flow estimation module is used for obtaining a current optical flow and a predicted optical flow of a next frame according to a current video frame and a video frame of a previous frame of the current video frame;
the motion compensation module is used for performing motion compensation on the current video frame according to the current optical flow and the next frame predicted optical flow to obtain a predicted video frame;
the prediction output module is used for obtaining a predicted video frame of the next frame according to the current video frame, the video frame of the previous frame and the predicted video frame;
and the abnormality detection module is used for determining the detection result of the video frame of the next frame of the current video frame according to the video frame of the next frame prediction and the video frame of the next frame of the current video frame.
9. An electronic device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the memory being coupled to the processor and the processor implementing the steps in the video detection method according to any one of claims 1 to 7 when the computer program is executed by the processor.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, wherein the computer program when run controls a device in which the computer readable storage medium is located to perform the steps in the video detection method according to any one of claims 1 to 7.
CN202210156196.8A 2022-02-21 2022-02-21 Video detection method, device, electronic equipment and storage medium Pending CN116681734A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210156196.8A CN116681734A (en) 2022-02-21 2022-02-21 Video detection method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210156196.8A CN116681734A (en) 2022-02-21 2022-02-21 Video detection method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116681734A (en) 2023-09-01

Family

ID=87777458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210156196.8A Pending CN116681734A (en) 2022-02-21 2022-02-21 Video detection method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116681734A (en)


Legal Events

Date Code Title Description
PB01 Publication