CN114079725B - Video anti-shake method, terminal device, and computer-readable storage medium

Info

Publication number: CN114079725B
Application number: CN202010811800.7A
Authority: CN
Inventors: 吴虹; 苗磊; 贾志平; 刘蒙; 刘志鹏
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2020-08-13
Filing date: 2020-08-13
Publication date: 2023-02-07
Anticipated expiration: 2040-08-13
Also published as: CN114079725A; WO2022033344A1

Abstract

Embodiments of the present application disclose a video anti-shake method, a terminal device, and a computer-readable storage medium. The method includes: acquiring an image to be processed, where the image to be processed is one frame of a video to be processed; performing foreground-background segmentation on the image to be processed to obtain a first background image and a first foreground image; performing anti-shake processing on the first foreground image to obtain a first foreground stability-enhanced image; performing anti-shake processing on the first background image to obtain a first background stability-enhanced image; and fusing the first foreground stability-enhanced image and the first background stability-enhanced image to obtain a first stability-enhanced image of the image to be processed. Because the video image is segmented into a foreground image and a background image that then undergo anti-shake processing separately, the foreground and background anti-shake processes are decoupled, so the foreground and the background can be stabilized simultaneously without sacrificing the anti-shake capability of the background.

Description

Video anti-shake method, terminal device and computer-readable storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a video anti-shake method, a terminal device, and a computer-readable storage medium.

Background

At present, common video anti-shake can be divided into Electronic Image Stabilization (EIS) and Optical Image Stabilization (OIS).

In general, EIS is a software compensation algorithm that needs no additional components: it post-processes the video image algorithmically to remove shake from the captured video. In an existing EIS method, background motion information of the video image is obtained through a gyroscope of the terminal device; the background motion information (also called background jitter) refers to motion information produced by the motion of the camera of the terminal device. Meanwhile, face features are collected from the video image, and a smooth motion trajectory of the face is calculated from them. Then, a smooth trajectory of the camera is obtained from the background motion information and the smooth motion trajectory of the face, according to a weight between the two. Finally, shake-elimination compensation is applied to the video image according to the smooth trajectory of the camera, so that the foreground and the background of the video are stabilized simultaneously.

However, because the background and the face move in opposite directions, the weight between the background motion information and the smooth motion trajectory of the face can only be chosen as a compromise, so the anti-shake effects of the foreground and of the background are both compromises as well. In other words, the foreground and the background can be stabilized simultaneously only by sacrificing background anti-shake capability in order to stabilize the foreground.

Disclosure of Invention

In view of this, embodiments of the present application provide a video anti-shake method, a terminal device, and a computer-readable storage medium, so as to solve the problem that in the existing EIS method, the foreground and the background can be simultaneously image-stabilized only by sacrificing the background anti-shake capability.

In a first aspect, an embodiment of the present application provides a video anti-shake method. The method may be applied to a terminal device, for example a mobile phone or a tablet, and may include: acquiring an image to be processed, where the image to be processed is one frame of a video to be processed; performing foreground-background segmentation on the image to be processed to obtain a first background image and a first foreground image; performing anti-shake processing on the first foreground image to obtain a first foreground stability-enhanced image; performing anti-shake processing on the first background image to obtain a first background stability-enhanced image; and fusing the first foreground stability-enhanced image and the first background stability-enhanced image to obtain a first stability-enhanced image of the image to be processed.

The foreground image and the background image are obtained by segmenting the foreground and the background of the video image, and then the foreground image and the background image are subjected to anti-shake processing respectively, namely the anti-shake processing processes of the foreground and the background are separated and not coupled, so that the anti-shake capability of the background is not sacrificed, and the simultaneous image stabilization of the foreground and the background can be realized.
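The decoupled flow described above (segment, stabilize each layer on its own, then fuse) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the foreground mask is taken as given, each layer's "anti-shake processing" is reduced to an integer translation, and the names `shift` and `stabilize_frame` are hypothetical and do not come from the application.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image (or 0/1 mask) stored as rows of values


def shift(img: Image, dx: int, dy: int, fill: int = 0) -> Image:
    """Translate an image by (dx, dy); pixels with no source get `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def stabilize_frame(frame: Image, mask: Image,
                    fg_corr: Tuple[int, int],
                    bg_corr: Tuple[int, int]) -> Image:
    """Split a frame by `mask` (1 = foreground), correct each layer
    with its own translation, and fuse the two corrected layers."""
    h, w = len(frame), len(frame[0])
    # Foreground-background segmentation (mask assumed to be given).
    fg = [[frame[y][x] if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    bg = [[0 if mask[y][x] else frame[y][x] for x in range(w)] for y in range(h)]
    # Independent corrections: neither layer's correction affects the other.
    fg_s = shift(fg, *fg_corr)
    mask_s = shift(mask, *fg_corr)
    bg_s = shift(bg, *bg_corr)
    # Fusion: take the foreground where the shifted mask is set.
    return [[fg_s[y][x] if mask_s[y][x] else bg_s[y][x] for x in range(w)]
            for y in range(h)]
```

Because the two corrections are separate arguments, no single weight has to trade foreground stability against background stability. Note that moving the foreground leaves unfilled background pixels where it used to be; this sketch does not fill them.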

In some possible implementation manners of the first aspect, the performing anti-shake processing on the first foreground image to obtain the first foreground stability enhancement image may include: extracting characteristic points of the first foreground image; obtaining a first motion track curve according to the characteristic points of a first foreground image and the characteristic points of a second foreground image, wherein the second foreground image is a foreground image of a first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence; according to the foreground anti-shake intensity of the image to be processed, smoothing the path of the first motion trajectory curve to obtain a second motion trajectory curve after the path is smoothed; obtaining a first foreground image after stability augmentation according to the second motion trail curve and the first foreground image; and performing edge compensation on the stabilized first foreground image to obtain a first foreground stability-enhancing image.
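As a rough illustration of the path-smoothing step, the sketch below accumulates per-frame displacements into a motion trajectory and smooths it with a moving average. The application does not specify the smoothing filter or how the anti-shake intensity enters it; mapping an intensity in [0, 1] to the window radius, the `max_radius` parameter, and the function names are all assumptions.

```python
from typing import List


def accumulate(deltas: List[float]) -> List[float]:
    """Turn inter-frame displacements into a cumulative motion trajectory."""
    path, total = [], 0.0
    for d in deltas:
        total += d
        path.append(total)
    return path


def smooth_path(path: List[float], intensity: float,
                max_radius: int = 15) -> List[float]:
    """Moving-average smoothing of a trajectory.

    `intensity` is clamped to [0, 1] and scales the window radius
    (an assumed mapping): 0 leaves the path unchanged, 1 uses `max_radius`.
    """
    radius = round(max(0.0, min(1.0, intensity)) * max_radius)
    smoothed = []
    for i in range(len(path)):
        lo, hi = max(0, i - radius), min(len(path), i + radius + 1)
        smoothed.append(sum(path[lo:hi]) / (hi - lo))
    return smoothed


def corrections(path: List[float], smoothed: List[float]) -> List[float]:
    """Per-frame offset that moves each frame from the raw path onto the smoothed one."""
    return [s - p for p, s in zip(path, smoothed)]
```

In this sketch a larger intensity widens the averaging window and therefore removes more jitter, at the cost of larger per-frame corrections and hence more pixels pushed outside the frame.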

Stabilizing the first foreground image produces meaningless pixel points at the edges of the image, which hinders fusion of the foreground and the background; edge compensation is therefore applied to the stabilized first foreground image.

In some possible implementations of the first aspect, performing edge compensation on the stabilized first foreground image to obtain the first foreground stability-enhanced image may include:

calculating, by a formula [formula missing from the source text], a target pixel value of each pixel point to be compensated in the stabilized first foreground image;

taking the target pixel value as the pixel value of the pixel point to be compensated and keeping the pixel values of the other pixel points unchanged, to obtain the first foreground stability-enhanced image, where the other pixel points are the pixel points in the stabilized first foreground image other than the pixel points to be compensated;

wherein v(q) is the target pixel value of point q, and point q is a pixel point to be compensated; p(i, j) is the pixel value of the pixel point in row i and column j; w(i, j) is the Gaussian weight of the pixel point in row i and column j; σ is a hyperparameter; and N is a positive integer.

In this implementation, the black borders at the edges of the image are inpainted by Gaussian-weighted interpolation.
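The exact formula is not available in this text, so the following is only one plausible reading of Gaussian-weighted interpolation that is consistent with the symbols v(q), p(i, j), w(i, j), σ and N defined above. It assumes a normalized weighted average over the valid pixels in a (2N+1)×(2N+1) window centered on q, with w(i, j) = exp(-d²/(2σ²)) where d is the distance from (i, j) to q. The function name and the use of `None` to mark pixels to be compensated are hypothetical.

```python
import math
from typing import List, Optional

Pixels = List[List[Optional[float]]]  # None marks a pixel to be compensated


def compensate_edges(img: Pixels, n: int = 2, sigma: float = 1.0) -> Pixels:
    """Fill each missing pixel q with a Gaussian-weighted mean of the valid
    pixels in its (2n+1) x (2n+1) neighborhood (assumed form of the formula)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # other pixel values stay unchanged
    for qy in range(h):
        for qx in range(w):
            if img[qy][qx] is not None:
                continue
            num = den = 0.0
            for i in range(max(0, qy - n), min(h, qy + n + 1)):
                for j in range(max(0, qx - n), min(w, qx + n + 1)):
                    p = img[i][j]
                    if p is None:
                        continue  # only valid pixels contribute
                    d2 = (i - qy) ** 2 + (j - qx) ** 2
                    wgt = math.exp(-d2 / (2.0 * sigma ** 2))
                    num += wgt * p
                    den += wgt
            if den > 0.0:
                out[qy][qx] = num / den  # v(q)
    return out
```

Pixels with no valid neighbor inside the window remain `None` in this sketch; a larger N or repeated passes would be needed to fill wide borders.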

In some possible implementations of the first aspect, the method further includes: extracting first edge band information of the image to be processed, where the first edge band information is pixel information of an edge band between the foreground and the background in the image to be processed; extracting second edge band information of the first stability-enhanced image, where the second edge band information is pixel information of an edge band between the foreground and the background in the first stability-enhanced image; and adjusting, according to the first edge band information and the second edge band information, the anti-shake intensity of the next frame of image to be processed or of the image to be processed, where the anti-shake intensity includes a foreground anti-shake intensity and/or a background anti-shake intensity.

In the implementation mode, the anti-shake intensity is adjusted according to the edge zone information of the original image (namely the image to be processed) and the edge zone information of the fused image (namely the first stability enhancement image), and then the artifact intensity generated during fusion of the foreground and the background is automatically adjusted, so that the obvious isolation zone generated during fusion due to excessive anti-shake is prevented as much as possible, and the natural transition of the fused edges of the foreground and the background is ensured.

In some possible implementations of the first aspect, the anti-shake intensity may be adjusted according to a structural similarity index (SSIM). That is, adjusting the anti-shake intensity of the next frame of image to be processed or of the image to be processed according to the first edge band information and the second edge band information may include: determining a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within a preset similarity threshold interval, adjusting the anti-shake intensity of the next frame of image to be processed or of the image to be processed. If the first structural similarity index falls within the preset similarity threshold interval, the anti-shake intensity does not need to be adjusted.
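For reference, a structural similarity index between two edge bands could be computed as sketched below. This is the standard SSIM expression evaluated once over all supplied pixels (a single global window), which is a simplification of the usual sliding-window mean; the application does not say which variant it uses. Treating each edge band as a flat list of 8-bit pixel values, and the function name `ssim`, are assumptions.

```python
from typing import Sequence


def ssim(x: Sequence[float], y: Sequence[float],
         dynamic_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Conventional stabilizing constants for SSIM.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))
```

Identical bands give an index of 1, and the value drops as the luminance, contrast, or structure of the fused edge band departs from that of the original.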

In some possible implementation manners of the first aspect, if the first structural similarity index does not fall within the preset similarity threshold interval, the adjusting the anti-shake intensity of the image to be processed of the next frame or the image to be processed may include: and if the first structural similarity index does not fall into the preset similarity threshold interval, multiplying the anti-shake intensity of the image to be processed by a preset numerical value to obtain an adjusted first anti-shake intensity, and taking the adjusted first anti-shake intensity as the anti-shake intensity of the next frame of image to be processed or the anti-shake intensity of the image to be processed.

Further, the first structural similarity index does not fall within the preset similarity threshold interval and can be divided into two cases, one case being: the first structural similarity index is higher than the preset similarity threshold interval, namely the first structural similarity index is higher than the maximum value of the preset similarity threshold interval, and at the moment, the anti-shake intensity can be reduced. The other situation is as follows: the first structural similarity index is lower than the preset similarity threshold interval, namely the first structural similarity index is lower than the minimum value of the preset similarity threshold interval, and at the moment, the anti-shake intensity can be increased.

Illustratively, when the anti-shake intensity needs to be reduced, the preset value is 0.9, that is, the anti-shake intensity of the image to be processed is multiplied by 0.9, and the obtained product is used as the adjusted first anti-shake intensity. When the anti-shake intensity needs to be increased, the preset value is 1.1, namely the anti-shake intensity of the image to be processed is multiplied by 1.1, and the obtained product is used as the adjusted first anti-shake intensity.
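The adjustment rule described above can be written down directly. The multipliers 0.9 and 1.1 and the direction of adjustment are taken from the text; the interval bounds `low` and `high` are hypothetical parameters, since no values are given for the preset similarity threshold interval.

```python
def adjust_intensity(intensity: float, ssim_index: float,
                     low: float, high: float,
                     down: float = 0.9, up: float = 1.1) -> float:
    """Return the adjusted anti-shake intensity.

    Above the interval the intensity is reduced, below it the intensity is
    increased, and inside the interval it is left unchanged.
    """
    if ssim_index > high:
        return intensity * down
    if ssim_index < low:
        return intensity * up
    return intensity
```

For instance, with an assumed interval of [0.80, 0.95], an index of 0.99 scales an intensity of 1.0 down to 0.9, and an index of 0.50 scales it up to 1.1.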

The adjusted first anti-shake intensity can be used as the foreground anti-shake intensity of the next frame of image and/or the background anti-shake intensity of the next frame of image, and at the moment, the adjusted first anti-shake intensity can be used for performing foreground anti-shake or background anti-shake on the next frame of image to be processed. Of course, the adjusted first anti-shake intensity may also be used as the foreground anti-shake intensity and/or the background anti-shake intensity of the image to be processed, and at this time, the foreground anti-shake is performed again according to the adjusted first anti-shake intensity, or the background anti-shake is performed until the structural similarity index falls into the preset similarity threshold interval. Therefore, in some possible implementations of the first aspect, the adjusted first anti-shake intensity includes a foreground anti-shake intensity of the image to be processed; the method may further comprise:

according to the adjusted first anti-shake intensity, performing path smoothing on the first motion track curve to obtain a third motion track curve after the path smoothing, wherein the first motion track curve is a motion track curve obtained according to the characteristic points of the first foreground image and the characteristic points of the second foreground image, the second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence;

obtaining a second foreground stability-increasing image according to the third motion track curve and the first foreground image;

fusing the second foreground stability enhancement image and the first background stability enhancement image to obtain a second stability enhancement image of the image to be processed;

determining a second structural similarity index between the second edge zone information and third edge zone information, wherein the third edge zone information is pixel information of an edge zone between a foreground and a background in a second stability augmentation image;

if the second structural similarity index does not fall into the preset similarity threshold interval, multiplying the adjusted first anti-shake intensity by a preset numerical value to obtain an adjusted second anti-shake intensity;

and taking the adjusted second anti-shake intensity as the adjusted first anti-shake intensity, returning to the step of smoothing the path of the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve with the smoothed path until the second structural similarity index falls into a preset similarity threshold interval.
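The loop formed by the steps above might look like the following sketch. The re-smoothing, re-fusion, and edge-band comparison are hidden behind a hypothetical callback `evaluate`, which takes an intensity and returns the resulting structural similarity index. The `max_iters` cap is also an assumption: the text does not state a stopping condition other than the index falling within the interval, and whether the loop converges depends on how the index responds to the intensity.

```python
from typing import Callable, Tuple


def refine_current_frame(intensity: float,
                         evaluate: Callable[[float], float],
                         low: float, high: float,
                         down: float = 0.9, up: float = 1.1,
                         max_iters: int = 50) -> Tuple[float, float]:
    """Re-stabilize the current frame until the SSIM lies in [low, high].

    Returns the final intensity and the SSIM obtained with it.
    """
    for _ in range(max_iters):
        index = evaluate(intensity)  # smooth path, fuse, compare edge bands
        if low <= index <= high:
            return intensity, index
        # Same rule as for the first adjustment: reduce above, increase below.
        intensity *= down if index > high else up
    return intensity, evaluate(intensity)
```

Keeping the image processing behind a callback makes the control flow easy to test on its own and leaves the choice of smoothing and fusion methods open.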

In some possible implementations of the first aspect, performing anti-shake processing on the first background image to obtain the first background stability-enhanced image may include: extracting feature points of the first background image; obtaining a fourth motion trajectory curve according to the feature points of the first background image and feature points of a second background image, where the second background image is a background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in an image sequence; performing path smoothing on the fourth motion trajectory curve according to the background anti-shake intensity of the image to be processed, to obtain a path-smoothed motion trajectory curve; and obtaining the first background stability-enhanced image according to the path-smoothed motion trajectory curve and the first background image.
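To show how feature points from two consecutive frames can yield a motion trajectory, the sketch below estimates the inter-frame motion as the mean displacement of already-matched feature points and appends it to a cumulative trajectory. The pure-translation motion model, the assumption that the points are already matched pair by pair, and the function names are illustrative choices, not details given in the application.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_displacement(prev_pts: List[Point], curr_pts: List[Point]) -> Point:
    """Average (dx, dy) between matched feature points of two consecutive frames."""
    if len(prev_pts) != len(curr_pts) or not prev_pts:
        raise ValueError("need the same non-zero number of matched points")
    n = len(prev_pts)
    dx = sum(c[0] - p[0] for p, c in zip(prev_pts, curr_pts)) / n
    dy = sum(c[1] - p[1] for p, c in zip(prev_pts, curr_pts)) / n
    return dx, dy


def extend_trajectory(trajectory: List[Point], step: Point) -> List[Point]:
    """Append the next cumulative position to a motion trajectory."""
    last = trajectory[-1] if trajectory else (0.0, 0.0)
    return trajectory + [(last[0] + step[0], last[1] + step[1])]
```

Running the same two functions on foreground feature points and on background feature points gives two independent trajectories, each of which can then be smoothed with its own anti-shake intensity.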

In a second aspect, an embodiment of the present application provides a video anti-shake apparatus, which is applied to a terminal device, and the apparatus may include:

the image acquisition module is used for acquiring an image to be processed, wherein the image to be processed is a frame of image in a video to be processed;

the foreground and background segmentation module is used for carrying out foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image;

the foreground anti-shake module is used for carrying out anti-shake processing on the first foreground image to obtain a first foreground stability-increasing image;

the background anti-shake module is used for carrying out anti-shake processing on the first background image to obtain a first background stability-increasing image;

and the image fusion module is used for fusing the first foreground stability enhancement image and the first background stability enhancement image to obtain a first stability enhancement image of the image to be processed.

In some possible implementations of the second aspect, the foreground anti-shake module is specifically configured to: extracting feature points of the first foreground image; obtaining a first motion track curve according to the characteristic points of the first foreground image and the characteristic points of the second foreground image, wherein the second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence; according to the foreground anti-shake intensity of the image to be processed, path smoothing is carried out on the first motion trajectory curve, and a second motion trajectory curve after the path smoothing is obtained; obtaining a first foreground image after stability augmentation according to the second motion trail curve and the first foreground image; and performing edge compensation on the stabilized first foreground image to obtain a first foreground stability-enhancing image.

In some possible implementations of the second aspect, the foreground anti-shake module is specifically configured to: calculate, by a formula [formula missing from the source text], a target pixel value of each pixel point to be compensated in the stabilized first foreground image;

take the target pixel value as the pixel value of the pixel point to be compensated and keep the pixel values of the other pixel points unchanged, to obtain the first foreground stability-enhanced image, where the other pixel points are the pixel points in the stabilized first foreground image other than the pixel points to be compensated.

In some possible implementations of the second aspect, the apparatus further comprises: the anti-shake intensity adjusting module is used for extracting first edge band information of the image to be processed, wherein the first edge band information is pixel information of an edge band between a foreground and a background in the image to be processed; extracting second edge zone information of the first stability enhancement image, wherein the second edge zone information is pixel information of an edge zone between a foreground and a background in the first stability enhancement image; and adjusting the anti-shake intensity of the next frame of image to be processed or the image to be processed according to the first edge zone information and the second edge zone information, wherein the anti-shake intensity comprises foreground anti-shake intensity and/or background anti-shake intensity.

In some possible implementations of the second aspect, the anti-shake intensity adjustment module is specifically configured to: determining a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall into the preset similarity threshold interval, adjusting the anti-shake intensity of the next frame of image to be processed or the image to be processed.

In some possible implementations of the second aspect, the anti-shake intensity adjustment module is specifically configured to: and if the first structural similarity index does not fall into the preset similarity threshold interval, multiplying the anti-shake intensity of the image to be processed by a preset numerical value to obtain an adjusted first anti-shake intensity, and taking the adjusted first anti-shake intensity as the anti-shake intensity of the next frame of image to be processed or the anti-shake intensity of the image to be processed.

In some possible implementations of the second aspect, the adjusted first anti-shake intensity includes a foreground anti-shake intensity of the image to be processed; the device also comprises a current frame anti-shake result adjusting module used for: according to the adjusted first anti-shake intensity, performing path smoothing on the first motion track curve to obtain a third motion track curve after the path smoothing, wherein the first motion track curve is a motion track curve obtained according to the characteristic points of the first foreground image and the characteristic points of the second foreground image, the second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence; obtaining a second foreground stability-enhancing image according to the third motion trail curve and the first foreground image; fusing the second foreground stability enhancement image and the first background stability enhancement image to obtain a second stability enhancement image of the image to be processed; determining a second structural similarity index between the second edge zone information and third edge zone information, wherein the third edge zone information is pixel information of an edge zone between a foreground and a background in a second stability-enhanced image; if the second structural similarity index does not fall into the preset similarity threshold interval, multiplying the adjusted first anti-shake intensity by a preset numerical value to obtain an adjusted second anti-shake intensity; and taking the adjusted second anti-shake intensity as the adjusted first anti-shake intensity, returning to the step of smoothing the path of the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve with the smoothed path until the second structural similarity index falls into a preset similarity threshold interval.

In some possible implementations of the second aspect, the background anti-shake module is specifically configured to: extract feature points of the first background image; obtain a fourth motion trajectory curve according to the feature points of the first background image and feature points of a second background image, where the second background image is a background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in an image sequence; perform path smoothing on the fourth motion trajectory curve according to the background anti-shake intensity of the image to be processed, to obtain a path-smoothed motion trajectory curve; and obtain the first background stability-enhanced image according to the path-smoothed motion trajectory curve and the first background image.

The video anti-shake apparatus has a function of implementing the video anti-shake method of the first aspect, and the function may be implemented by hardware, or may be implemented by hardware executing corresponding software, where the hardware or the software includes one or more modules corresponding to the above function, and the modules may be software and/or hardware.

In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the video anti-shake method according to any one of the above first aspects is implemented.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the video anti-shake method according to any one of the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip system, where the chip system includes a processor, the processor is coupled with a memory, and the processor executes a computer program stored in the memory to implement the video anti-shake method according to any one of the above first aspects. The chip system can be a single chip or a chip module consisting of a plurality of chips.

In a sixth aspect, an embodiment of the present application provides a computer program product, which, when running on a terminal device, causes the terminal device to execute the video anti-shake method according to any one of the above first aspects.

It is understood that the beneficial effects of the second to sixth aspects can be seen from the description of the first aspect, and are not described herein again.

Drawings

Fig. 1 is a schematic structural diagram of a terminal device 100 according to an embodiment of the present application;

Fig. 2 is a schematic block diagram of a software structure of the terminal device 100 according to an embodiment of the present application;

Fig. 3 is a schematic diagram of a video recording interface of a mobile phone according to an embodiment of the present application;

Fig. 4 is a schematic block diagram of a video anti-shake process provided in an embodiment of the present application;

Fig. 5 is a schematic diagram illustrating an anti-shake processing process of a video image frame according to an embodiment of the present application;

Fig. 6 is another schematic block diagram of a video anti-shake process provided in an embodiment of the present application;

Fig. 7 is a further schematic diagram of a video anti-shake process provided in an embodiment of the present application;

Fig. 8 is a further schematic diagram of a video anti-shake process provided in an embodiment of the present application;

Fig. 9 is a schematic diagram of a background anti-shake process provided in an embodiment of the present application;

Fig. 10 is a flowchart illustrating a video anti-shake process according to an embodiment of the present application;

Fig. 11 is a schematic flowchart of another specific video anti-shake process according to an embodiment of the present application;

Fig. 12 is a schematic diagram of edge compensation provided by an embodiment of the present application;

Fig. 13 is a schematic flowchart of another specific video anti-shake process according to an embodiment of the present application;

Fig. 14 is a schematic flowchart of another specific video anti-shake process provided in an embodiment of the present application;

Fig. 15 is a schematic flowchart of another specific video anti-shake process according to an embodiment of the present application;

Fig. 16 is a schematic block diagram of a flow of a video anti-shake method provided in an embodiment of the present application;

Fig. 17 is a schematic block diagram of a video anti-shake apparatus according to an embodiment of the present application.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application.

First, an exemplary description is given of a terminal device provided in an embodiment of the present application.

The video anti-shake scheme provided in the embodiments of the present application may be applied to a terminal device. The terminal device may have both an image capturing function and data processing capability; such a device generally includes a camera and can capture video images through it. Alternatively, the terminal device may lack an image capturing function but have data processing capability, in which case it may receive video images captured by another device.

The terminal device may be a mobile phone, a tablet computer, a wearable device, an in-vehicle device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), and other terminal devices, and the embodiment of the present application does not set any limit to the specific type of the terminal device.

For example, referring to fig. 1, a schematic structural diagram of a terminal device 100 provided in an embodiment of the present application is shown.

As shown in fig. 1, the terminal device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identification Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.

It is to be understood that the illustrated structure of the embodiment of the present application does not constitute a specific limitation to the terminal device 100. In other embodiments of the present application, terminal device 100 may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

Processor 110 may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.

The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.

A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. The cache may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.
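The reuse behavior described above can be illustrated with a minimal sketch. The class name `TinyCache`, the dictionary standing in for slower memory, and the least-recently-used eviction policy are all assumptions made for illustration; the application does not specify how the cache in the processor 110 is organized.

```python
from collections import OrderedDict


class TinyCache:
    """Hypothetical LRU cache placed in front of a slower backing store."""

    def __init__(self, backing_store, capacity=2):
        self.backing_store = backing_store  # dict standing in for slower memory
        self.capacity = capacity
        self.lines = OrderedDict()          # cached address -> value
        self.slow_reads = 0                 # accesses that reached the backing store

    def read(self, address):
        if address in self.lines:
            # Hit: serve from the cache and mark the entry as most recently used.
            self.lines.move_to_end(address)
            return self.lines[address]
        # Miss: fetch from the slow store and keep a copy for later reuse.
        self.slow_reads += 1
        value = self.backing_store[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used entry
        return value
```

Reading the same address twice reaches the backing store only once, which is the latency saving the paragraph refers to.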

In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.

The I2C interface is a bidirectional synchronous serial bus comprising a serial data line (SDA) and a Serial Clock Line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, a charger, a flash, a camera 193, etc. through different I2C bus interfaces, respectively. For example: the processor 110 may be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through an I2C bus interface to implement a touch function of the terminal device 100.

The I2S interface may be used for audio communication. In some embodiments, the processor 110 may include multiple sets of I2S buses. The processor 110 may be coupled to the audio module 170 through an I2S bus to enable communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 may transmit an audio signal to the wireless communication module 160 through the I2S interface, so as to implement the function of answering a call through a Bluetooth headset.

The PCM interface may also be used for audio communication, in which analog signals are sampled, quantized, and encoded. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled by a PCM bus interface. In some embodiments, the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to implement the function of answering a call through a Bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.

The UART interface is a universal serial data bus used for asynchronous communication. The bus may be a bidirectional communication bus that converts the data to be transmitted between serial and parallel forms. In some embodiments, a UART interface is generally used to connect the processor 110 with the wireless communication module 160. For example, the processor 110 communicates with a Bluetooth module in the wireless communication module 160 through a UART interface to implement a Bluetooth function. In some embodiments, the audio module 170 may transmit an audio signal to the wireless communication module 160 through a UART interface, so as to implement the function of playing music through a Bluetooth headset.
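As a rough illustration of the serial/parallel conversion mentioned above, the following sketch frames one byte using the common 8N1 convention (one start bit, eight data bits sent least significant bit first, no parity, one stop bit). The 8N1 framing and the function names `uart_frame` and `uart_unframe` are assumptions; the application does not state which UART configuration the terminal device 100 uses.

```python
def uart_frame(byte):
    """Serialize one byte (parallel form) into a list of 8N1 line bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]                     # start bit, data bits, stop bit


def uart_unframe(bits):
    """Recover the byte from a 10-bit 8N1 frame (serial form back to parallel)."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("not a valid 8N1 frame")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))
```

A real UART also handles baud-rate timing and optional parity, which this sketch leaves out.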

MIPI interfaces may be used to connect processor 110 with peripheral devices such as display screen 194, camera 193, and the like. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the capture function of terminal device 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the terminal device 100.

The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 193, the display 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, and the like.

The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 130 may be used to connect a charger to charge the terminal device 100, and may also be used to transmit data between the terminal device 100 and a peripheral device. It may also be used to connect a headset and play audio through the headset. The interface may also be used to connect other terminal devices, such as AR devices.

It should be understood that the interface connection relationship between the modules illustrated in the embodiment of the present application is only an exemplary illustration, and does not constitute a limitation on the structure of the terminal device 100. In other embodiments of the present application, the terminal device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.

The charging management module 140 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive charging input from a wired charger via the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive a wireless charging input through a wireless charging coil of the terminal device 100. The charging management module 140 may also supply power to the terminal device through the power management module 141 while charging the battery 142.

The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140, and supplies power to the processor 110, the internal memory 121, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 141 may also be disposed in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may also be disposed in the same device.

The wireless communication function of the terminal device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.

The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in terminal device 100 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.

The mobile communication module 150 may provide a solution including 2G/3G/4G/5G wireless communication applied on the terminal device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 150 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the electromagnetic wave to the modem processor for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the same device as at least some of the modules of the processor 110.

The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.) or displays an image or video through the display screen 194. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional modules, independent of the processor 110.

The wireless communication module 160 may provide a solution for wireless communication applied to the terminal device 100, including wireless local area network (WLAN) (e.g., wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering processing on the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, perform frequency modulation and amplification on the signal, and convert the signal into electromagnetic waves through the antenna 2 for radiation.

In some embodiments, the antenna 1 of the terminal device 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160, so that the terminal device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc. The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a satellite based augmentation system (SBAS).

The terminal device 100 implements a display function by the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing, connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.

The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the terminal device 100 may include 1 or N display screens 194, N being a positive integer greater than 1.

The terminal device 100 may implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.

The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 193.

The camera 193 is used to capture still images or video. An object generates an optical image through the lens, and the optical image is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the terminal device 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
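To make the relationship between the RGB and YUV formats mentioned above concrete, the following sketch converts a single pixel using the widely used BT.601 luma weights. The choice of BT.601 coefficients and the function name `rgb_to_yuv` are assumptions for illustration; the application does not state which conversion matrix the DSP applies.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to analog-style YUV using BT.601 weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted sum of the color channels
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v
```

For a neutral gray pixel the two chroma components are approximately zero, since all of the pixel's information is carried by the luma component.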

The digital signal processor is used for processing digital signals, and can process digital image signals and other digital signals. For example, when the terminal device 100 selects a frequency point, the digital signal processor is used to perform fourier transform or the like on the frequency point energy.

Video codecs are used to compress or decompress digital video. The terminal device 100 may support one or more video codecs. In this way, the terminal device 100 can play or record video in a plurality of encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.

The NPU is a neural-network (NN) computing processor, which processes input information quickly by referring to a biological neural network structure, for example, by referring to a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can implement applications such as intelligent recognition of the terminal device 100, for example: image recognition, face recognition, speech recognition, text understanding, and the like.

The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the terminal device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in the external memory card.

The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like. The data storage area may store data created during use of the terminal device 100 (such as audio data and a phonebook), and the like. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like. The processor 110 executes various functional applications and data processing of the terminal device 100 by running instructions stored in the internal memory 121 and/or instructions stored in the memory provided in the processor.

The terminal device 100 may implement audio functions, such as music playing and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, the application processor, and the like.

The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.

The speaker 170A, also called a "horn", is used to convert an electrical audio signal into a sound signal. A user can listen to music or to a hands-free call through the speaker 170A of the terminal device 100.

The receiver 170B, also called an "earpiece", is used to convert an electrical audio signal into a sound signal. When the terminal device 100 answers a call or plays voice information, the user can listen to the voice by holding the receiver 170B close to the ear.

The microphone 170C, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals. When making a call or sending voice information, the user can speak with the mouth close to the microphone 170C to input a sound signal into it. The terminal device 100 may be provided with at least one microphone 170C. In other embodiments, the terminal device 100 may be provided with two microphones 170C, which can implement a noise reduction function in addition to collecting sound signals. In still other embodiments, the terminal device 100 may be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording.

The earphone interface 170D is used to connect a wired earphone. The earphone interface 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.

The pressure sensor 180A is used to sense a pressure signal and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. There are many types of pressure sensor 180A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors. A capacitive pressure sensor may comprise at least two parallel plates made of an electrically conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes. The terminal device 100 determines the intensity of the pressure from the change in capacitance. When a touch operation is applied to the display screen 194, the terminal device 100 detects the intensity of the touch operation by means of the pressure sensor 180A. The terminal device 100 may also calculate the touched position from the detection signal of the pressure sensor 180A. In some embodiments, touch operations that are applied to the same touch position but have different intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction for viewing the short message is executed. When a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction for creating a new short message is executed.
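The short message example above amounts to a single comparison against the first pressure threshold. A minimal sketch follows; the normalized threshold value `0.5`, the function name `sms_icon_instruction`, and the instruction labels are hypothetical and are not taken from the application.

```python
FIRST_PRESSURE_THRESHOLD = 0.5  # hypothetical normalized value, for illustration only


def sms_icon_instruction(touch_intensity, threshold=FIRST_PRESSURE_THRESHOLD):
    """Map the intensity of a touch on the short message icon to an instruction."""
    if touch_intensity < threshold:
        return "view_short_message"    # light press: view the short message
    return "create_short_message"      # press at or above the threshold: new message
```

Note that an intensity exactly equal to the threshold selects the second instruction, matching the "greater than or equal to" wording of the example.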

The gyroscope sensor 180B may be used to determine the motion attitude of the terminal device 100. In some embodiments, the angular velocity of the terminal device 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyroscope sensor 180B. The gyroscope sensor 180B may be used for anti-shake during shooting. Illustratively, when the shutter or the video button is pressed, the gyroscope sensor 180B detects the shake angle of the terminal device 100, the distance that the lens module needs to compensate is calculated from the shake angle, and the lens counteracts the shake of the terminal device 100 by moving in the reverse direction, thereby achieving anti-shake. The gyroscope sensor 180B may also be used in navigation and motion-sensing game scenarios.
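The compensation step described above can be sketched under a simple geometric assumption: the shake angle is obtained by integrating the angular velocity samples, and the image displacement for a rotation by that angle is approximately the focal length multiplied by the tangent of the angle. This model, the single-axis treatment, and the function names `shake_angle` and `lens_compensation_mm` are illustrative assumptions, not the method of the application.

```python
import math


def shake_angle(angular_velocities, dt):
    """Integrate gyroscope samples (rad/s), taken every dt seconds, into an angle (rad)."""
    return sum(w * dt for w in angular_velocities)


def lens_compensation_mm(angle_rad, focal_length_mm):
    """Return the lens shift (mm) that opposes the displacement caused by angle_rad.

    Assumes image displacement is approximately focal_length * tan(angle);
    the negative sign expresses the reverse movement of the lens.
    """
    return -focal_length_mm * math.tan(angle_rad)
```

For example, two samples of 0.1 rad/s taken 10 ms apart integrate to a shake angle of 0.002 rad, and the returned shift has the opposite sign to that angle.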

The air pressure sensor 180C is used to measure air pressure. In some embodiments, the terminal device 100 calculates an altitude from the barometric pressure measured by the barometric pressure sensor 180C, and assists in positioning and navigation.
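One common way to turn a pressure reading into an altitude estimate is the international barometric formula, sketched below. The use of this particular formula, the standard sea-level pressure of 1013.25 hPa, and the function name `altitude_m` are assumptions; the application does not say how the terminal device 100 performs the calculation.

```python
SEA_LEVEL_PRESSURE_HPA = 1013.25  # standard atmosphere at sea level (assumed reference)


def altitude_m(pressure_hpa, sea_level_hpa=SEA_LEVEL_PRESSURE_HPA):
    """Estimate altitude in meters from air pressure in hPa.

    Uses the international barometric formula, which is a reasonable
    approximation in the lower atmosphere.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```

Lower pressure yields a higher estimated altitude; in practice the sea-level reference would need to be adjusted for local weather.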

The magnetic sensor 180D includes a Hall sensor. The terminal device 100 may detect the opening and closing of a flip holster using the magnetic sensor 180D. In some embodiments, when the terminal device 100 is a flip phone, the terminal device 100 may detect the opening and closing of the flip cover by means of the magnetic sensor 180D. Features such as automatic unlocking upon flipping open are then set according to the detected open or closed state of the holster or of the flip cover.

The acceleration sensor 180E can detect the magnitude of acceleration of the terminal device 100 in various directions (generally along three axes). The magnitude and direction of gravity can be detected when the terminal device 100 is stationary. The acceleration sensor 180E can also be used to identify the attitude of the terminal device, and is applied in landscape/portrait switching, pedometers, and other applications.

The distance sensor 180F is used to measure distance. The terminal device 100 may measure distance by infrared or laser. In some embodiments, in a shooting scenario, the terminal device 100 may use the distance sensor 180F to measure distance so as to achieve fast focusing.

The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The terminal device 100 emits infrared light outward through the light emitting diode. The terminal device 100 detects infrared light reflected from a nearby object using the photodiode. When sufficient reflected light is detected, the terminal device 100 can determine that there is an object nearby. When insufficient reflected light is detected, the terminal device 100 can determine that there is no object nearby. The terminal device 100 may use the proximity light sensor 180G to detect that the user is holding the terminal device 100 close to the ear during a call, so as to turn off the screen automatically and save power. The proximity light sensor 180G may also be used to automatically unlock and lock the screen in holster mode and pocket mode.
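The decision described above, in which the screen is turned off when enough reflected infrared light is detected during a call, can be sketched as follows. The numeric threshold and the function names `object_is_near` and `screen_should_turn_off` are hypothetical; the application does not quantify what counts as sufficient reflected light.

```python
def object_is_near(reflected_light, threshold=100):
    """Treat a reading at or above the (hypothetical) threshold as 'object nearby'."""
    return reflected_light >= threshold


def screen_should_turn_off(in_call, reflected_light, threshold=100):
    """Turn the screen off only when a call is active and an object is near."""
    return in_call and object_is_near(reflected_light, threshold)
```

Requiring an active call keeps the screen on when an object merely passes over the sensor during normal use.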

The ambient light sensor 180L is used to sense ambient light brightness. The terminal device 100 may adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness. The ambient light sensor 180L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to detect whether the terminal device 100 is in a pocket, in order to prevent accidental touches.

The fingerprint sensor 180H is used to collect fingerprints. The terminal device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access to an application lock, fingerprint-triggered photographing, fingerprint-based call answering, and the like.

The temperature sensor 180J is used to detect temperature. In some embodiments, the terminal device 100 executes a temperature processing policy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the terminal device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, the terminal device 100 heats the battery 142 when the temperature is below another threshold, to avoid an abnormal shutdown of the terminal device 100 caused by low temperature. In still other embodiments, when the temperature is below a further threshold, the terminal device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
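The three threshold-based policies above can be sketched together in one function. The threshold values, the action labels, and the decision to combine policies from separate embodiments in a single function are all assumptions made for illustration.

```python
def thermal_policy(temp_c, high=45.0, low=0.0, very_low=-10.0):
    """Return the list of actions for a temperature reading in degrees Celsius.

    All thresholds are hypothetical; each branch corresponds to one of the
    embodiments described in the text.
    """
    actions = []
    if temp_c > high:
        actions.append("throttle_processor")            # thermal protection
    if temp_c < low:
        actions.append("heat_battery")                  # avoid low-temperature shutdown
    if temp_c < very_low:
        actions.append("boost_battery_output_voltage")  # further low-temperature measure
    return actions
```

A reading inside the normal range returns an empty list, meaning no action is taken.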

The touch sensor 180K is also called a "touch device". The touch sensor 180K may be disposed on the display screen 194, and together the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch-controlled screen". The touch sensor 180K is used to detect a touch operation applied on or near it. The touch sensor can pass the detected touch operation to the application processor to determine the touch event type. Visual output associated with the touch operation may be provided via the display screen 194. In other embodiments, the touch sensor 180K may be disposed on the surface of the terminal device 100 at a position different from that of the display screen 194.

The bone conduction sensor 180M may acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire the vibration signal of the bone mass vibrated by the human vocal part. The bone conduction sensor 180M may also be placed in contact with the human pulse to receive a blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may also be disposed in a headset to form a bone conduction headset. The audio module 170 may parse out a voice signal based on the vibration signal of the bone mass vibrated by the vocal part, as acquired by the bone conduction sensor 180M, so as to implement a voice function. The application processor can parse heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 180M, so as to implement a heart rate detection function.

The keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys. The terminal device 100 may receive a key input and generate a key signal input related to user settings and function control of the terminal device 100.

The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration cues as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also produce different vibration feedback effects for touch operations applied to different areas of the display screen 194. Different application scenarios (such as time reminders, receiving information, alarm clocks, and games) may also correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized.

Indicator 192 may be an indicator light that may be used to indicate a state of charge, a change in charge, or a message, missed call, notification, etc.

The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the terminal device 100 by being inserted into or pulled out of the SIM card interface 195. The terminal device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 may support a Nano SIM card, a Micro SIM card, a SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards may be the same or different. The SIM card interface 195 may also be compatible with different types of SIM cards. The SIM card interface 195 may also be compatible with external memory cards. The terminal device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the terminal device 100 employs an eSIM, namely an embedded SIM card. The eSIM card may be embedded in the terminal device 100 and cannot be separated from the terminal device 100.

The software system of the terminal device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example to illustrate the software structure of the terminal device 100.

Fig. 2 is a schematic block diagram of a software structure of the terminal device 100 according to the embodiment of the present application.

The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, which are, from top to bottom, an application layer, an application framework layer, the Android runtime and system libraries, and a kernel layer.

The application layer may include a series of application packages.

As shown in fig. 2, the application package may include applications such as camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc.

The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions.

As shown in fig. 2, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like.

The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, capture screenshots, and the like.

The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.

The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.

The phone manager is used to provide the communication functions of the terminal device 100, for example, management of call status (including connected, hung up, etc.).

The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.

The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages, which can disappear automatically after a brief stay without user interaction. For example, the notification manager is used to notify that a download is complete, to give message alerts, and the like. A notification may also appear in the top status bar of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone sounds, the terminal device vibrates, or an indicator light flashes.

The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.

The core library comprises two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.

The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.

The system library may include a plurality of functional modules, for example: a surface manager, media libraries, a three-dimensional graphics processing library (e.g., OpenGL ES), and a 2D graphics engine (e.g., SGL).

The surface manager is used to manage the display subsystem and provide a fusion of the 2D and 3D layers for multiple applications.

The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media library may support a variety of audio, video, and image encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.

The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.

The 2D graphics engine is a drawing engine for 2D drawing.

The kernel layer is a layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.

The following describes an exemplary workflow of the software and hardware of the terminal device 100 in connection with a photographing scenario.

Referring to fig. 3, which is a schematic view of a video recording interface of a mobile phone according to an embodiment of the present disclosure, when the touch sensor 180K of the terminal device 100 receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including the touch coordinates, the time stamp of the touch operation, and other information). The raw input event is stored at the kernel layer. The application framework layer acquires the raw input event from the kernel layer and identifies the control corresponding to the input event. Take as an example a touch operation that is a click operation whose corresponding control is the control 31 of the camera application icon: the camera application calls an interface of the application framework layer to start the camera application, the camera driver is then started by calling the kernel layer, and a still image or a video is captured by the camera 193.

When the terminal device 100 receives a touch operation of the user on the control 32, the terminal device 100 records a video in response to the touch operation, and the terminal device 100 captures a video image and displays the video image 33 through the display screen 194. Of course, the terminal device 100 may also capture background motion information of the video image through the gyro sensor 180B when capturing the video image.

After the terminal device 100 captures the video images, the video anti-shake method provided by the embodiments of the present application can be used to perform anti-shake processing on each frame of video image, yielding a video in which the foreground and the background are stabilized simultaneously. The following describes the video anti-shake process provided in the embodiments of the present application.

Fig. 4 is a schematic block diagram of a video anti-shake process provided by an embodiment of the present application. As shown in fig. 4, the terminal device 100 first acquires an input video. The input video may be a video captured by the camera 193 in response to a video recording operation of the user. The camera 193 may be a front camera or a rear camera; that is, the terminal device 100 may perform anti-shake processing on a video shot by the front camera or on a video shot by the rear camera. The input video includes a plurality of frames of images.

Of course, in some other embodiments, the terminal device 100 may also obtain the input video by receiving a video that has been captured by another device. For example, the terminal device 100 receives a video that has been taken by another cell phone.

The terminal device 100 performs foreground and background segmentation on each frame of image in the input video to obtain a foreground image and a background image. Then, the terminal device 100 performs anti-shake processing on the foreground image to obtain an anti-shake foreground image, and performs anti-shake processing on the background image to obtain an anti-shake background image. Finally, the anti-shake foreground image and the anti-shake background image are fused to obtain an anti-shake image.
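The per-frame flow described above (segment, stabilize each layer separately, then fuse) can be outlined as follows. This is only a minimal sketch: the function names and the idea of passing the four steps in as callables are illustrative assumptions, not part of the application, which leaves the concrete segmentation, anti-shake, and fusion methods open.

```python
from typing import Any, Callable, Iterable, List, Tuple


def stabilize_frame(
    frame: Any,
    segment: Callable[[Any], Tuple[Any, Any]],
    stabilize_foreground: Callable[[Any], Any],
    stabilize_background: Callable[[Any], Any],
    fuse: Callable[[Any, Any], Any],
) -> Any:
    """Run one frame through the decoupled foreground/background pipeline."""
    # Step 1: split the frame into a foreground image and a background image.
    foreground, background = segment(frame)
    # Steps 2 and 3: the two anti-shake processes are independent of each other.
    stable_fg = stabilize_foreground(foreground)
    stable_bg = stabilize_background(background)
    # Step 4: merge both stabilized layers into one output image.
    return fuse(stable_fg, stable_bg)


def stabilize_video(frames: Iterable[Any], **steps: Callable) -> List[Any]:
    """Apply stabilize_frame to every frame; `steps` holds the four callables."""
    return [stabilize_frame(frame, **steps) for frame in frames]
```

Because each step is injected, any segmentation or anti-shake method can be swapped in without touching the overall flow.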

Referring to fig. 5, an exemplary anti-shake processing procedure of a video image frame according to an embodiment of the present application is shown. As shown in fig. 5, a certain frame of image in the input video acquired by the terminal device 100 is an image 51, and after the image 51 undergoes the foreground-background segmentation operation, a foreground image 52 and a background image 53 are obtained. For the foreground image 52, feature extraction is performed first, and the foreground image then passes through the anti-shake module 54 to obtain an anti-shake foreground image. The background image 53 passes through the anti-shake module 55 to obtain an anti-shake background image 56. The anti-shake background image 56 and the anti-shake foreground image are fused to output a video image 57, in which the foreground and the background are stabilized at the same time.
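The application does not specify how the anti-shake foreground image and the anti-shake background image are fused. As one possible illustration, the sketch below assumes a hard overlay: the foreground layer marks pixels that do not belong to it with `None`, and every other foreground pixel replaces the background pixel at the same position. The `Layer` representation and the name `fuse_layers` are hypothetical.

```python
from typing import List, Optional

# Assumption: a foreground layer stores None where the pixel is not foreground.
Layer = List[List[Optional[int]]]


def fuse_layers(foreground: Layer, background: List[List[int]]) -> List[List[int]]:
    """Overlay the stabilized foreground on the stabilized background."""
    return [
        [bg if fg is None else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


if __name__ == "__main__":
    fg = [[None, 9], [None, None]]
    bg = [[1, 2], [3, 4]]
    print(fuse_layers(fg, bg))
```

A production implementation would more likely blend the two layers near the boundary rather than switch abruptly, but that choice is outside what the text states.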

The terminal device 100 performs anti-shake processing on each frame of the input video to obtain a corresponding anti-shake image. The anti-shake images together form a video in which the foreground and the background are stabilized simultaneously.

In some embodiments, after acquiring the anti-shake video image, the terminal device 100 may further adjust the anti-shake intensity according to the edge band information of the input video image and the edge band information of the output video image. The edge band information of the input video image may refer to pixel information of an edge band extracted when a foreground scene and a background scene are divided, and the edge band information of the output video image may refer to pixel information of an edge band between the foreground scene and the background scene in an image after image fusion. The edge band may refer to an edge portion between the foreground and background in the image.
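To make the notion of an edge band concrete, the following sketch collects the pixels that lie within a small distance of the foreground/background boundary of a segmentation mask. The band width `radius`, the boolean-mask representation, and both function names are assumptions made for illustration; the application does not define how wide the band is or how it is extracted.

```python
from typing import List, Set, Tuple


def edge_band(mask: List[List[bool]], radius: int = 1) -> Set[Tuple[int, int]]:
    """Return (row, col) of pixels within `radius` of a pixel with the other label.

    mask[r][c] is True for foreground and False for background.
    """
    rows, cols = len(mask), len(mask[0])
    band = set()
    for r in range(rows):
        for c in range(cols):
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and mask[rr][cc] != mask[r][c]:
                        band.add((r, c))
    return band


def band_pixels(image: List[List[int]], band: Set[Tuple[int, int]]) -> List[int]:
    """Read the pixel information of the edge band in a fixed (sorted) order."""
    return [image[r][c] for r, c in sorted(band)]
```

Applying `band_pixels` to the input image and to the fused image, with the band derived from each image's own mask, would give two pixel sequences that play the roles of the edge band information of the input and output images.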

Fig. 6 is another schematic block diagram of a video anti-shake process provided in an embodiment of the present application. As in fig. 4, the terminal device 100 first acquires an input video and performs foreground and background segmentation on each frame of video image to obtain a foreground image and a background image. It then performs background anti-shake processing on the background image to obtain an anti-shake background image, and performs anti-shake processing on the foreground image to obtain an anti-shake foreground image. Finally, the anti-shake background image and the anti-shake foreground image are fused to obtain an anti-shake video image.

Unlike fig. 4, when the foreground and background are segmented in fig. 6, the pixel information of the edge band between the foreground and the background of the image to be processed may be extracted and recorded as first edge band information. In addition, the pixel information of the edge band between the foreground and the background may be extracted from the anti-shake video image (i.e., the image obtained by fusion) and recorded as second edge band information. Feedback control of the anti-shake range constraint, that is, adjustment of the anti-shake intensity, is performed according to the first edge band information and the second edge band information. The anti-shake range may refer to the magnitude of the anti-shake intensity. The anti-shake intensity may include a foreground anti-shake intensity and a background anti-shake intensity, both of which can be adjusted according to the first edge band information and the second edge band information.

In some other embodiments, only the foreground anti-shake intensity or only the background anti-shake intensity is adjusted according to the first edge band information and the second edge band information. Fig. 7 and fig. 8 are further schematic diagrams of video anti-shake processes provided by the embodiments of the present application. In fig. 7, the foreground anti-shake intensity is adjusted based on the first edge band information and the second edge band information, and the adjusted foreground anti-shake intensity acts on the foreground anti-shake process. In fig. 8, the background anti-shake intensity is adjusted according to the first edge band information and the second edge band information, and the adjusted background anti-shake intensity acts on the background anti-shake process.

Fig. 7 also shows, by way of example, that the foreground anti-shake method is full-frame anti-shake. Illustratively, the full-frame anti-shake process may include foreground image feature point extraction, path smoothing, edge compensation, and other processes. The feature points of the foreground image may be extracted in any manner, for example, by an optical flow method. Path smoothing refers to smoothing the motion trajectory curve of the foreground feature points according to the anti-shake intensity. Edge compensation may be pixel compensation of the edge part of the image; it can eliminate or reduce the black borders at the edges of the foreground image after anti-shake processing, which facilitates the subsequent fusion of the foreground image and the background image. Edge compensation may be performed in any manner, for example, by repairing the black borders at the image edges through Gaussian weight interpolation.

It should be noted that the anti-shake method for the foreground image and the anti-shake method for the background image may both be arbitrary; that is, any video anti-shake method may be used to perform anti-shake processing on the foreground image and the background image. For example, anti-shake processing of the background image may use gyro sensor data. Specifically, when the terminal device 100 records a video, the data output by the gyro sensor 180B may be read, and angle integration may be performed on the gyro data to obtain background motion information of the video image. Then, shake compensation is performed on the background image according to the background motion information to stabilize the background, that is, to perform anti-shake processing on the background. Referring to the schematic diagram of the background anti-shake process shown in fig. 9, the data output by the gyroscope undergoes 3-dimensional rotation vector estimation, 3-dimensional rotation vector smoothing, motion compensation amount calculation, image affine transformation (Warp) output, and other processes to obtain an anti-shake background image. In this process, the background anti-shake intensity may be applied to the 3-dimensional rotation vector smoothing process.
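The angle integration step can be illustrated with a simple per-axis accumulation of angular velocity samples. This is a rough sketch under stated assumptions: the samples are angular rates in rad/s with timestamps in seconds, a rectangular integration rule is used, and the three axes are integrated independently, which is only a small-angle approximation of a true 3-dimensional rotation. The name `integrate_gyro` is hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def integrate_gyro(rates: Sequence[Vec3], timestamps: Sequence[float]) -> List[Vec3]:
    """Accumulate angular velocity (rad/s) into rotation angles (rad) per axis.

    Returns one angle triple per sample; the first sample is the zero reference.
    """
    angles: List[Vec3] = [(0.0, 0.0, 0.0)]
    for i in range(1, len(rates)):
        dt = timestamps[i] - timestamps[i - 1]
        prev = angles[-1]
        # Rectangular rule: angle += rate * elapsed time, axis by axis.
        angles.append(tuple(a + w * dt for a, w in zip(prev, rates[i])))
    return angles


if __name__ == "__main__":
    rates = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(integrate_gyro(rates, [0.0, 0.5, 1.0]))
```

The resulting angle sequence is the kind of background motion trajectory that the subsequent smoothing and compensation steps would operate on.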

To better describe the video anti-shake scheme provided in the embodiments of the present application, the following description is provided with reference to fig. 10.

Fig. 10 is a schematic flowchart of a video anti-shake process according to an embodiment of the present disclosure. As shown in fig. 10, for the foreground image, a full-frame image stabilization (also called full-frame anti-shake) method is used to perform anti-shake processing, so as to obtain a stabilized foreground image (i.e., an anti-shake foreground image). The full-frame image stabilization process in fig. 10 includes foreground image feature point extraction, path smoothing, edge compensation, and other processes. Specifically, the feature points of multiple frames of foreground images may be extracted, and a motion trajectory curve of the foreground feature points (also called the foreground feature point path) may be obtained from them. The multiple frames of foreground images are the foreground images of consecutive frames. For example, the terminal device 100 performs foreground and background segmentation on the current frame image to obtain the foreground image and the background image of the current frame. At the current moment, the terminal device 100 has cached 10 consecutive images, which include the current frame image. Taking the current frame image as the boundary, n frames of images are taken before and after it, and the foreground images of these consecutive frames are obtained. The feature points of these foreground images are extracted respectively, and the foreground feature point path is obtained from the feature points of these foreground images.
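The frame window and the feature point path can be sketched as follows. Two simplifications are assumed here and are not taken from the application: the window is clipped at the ends of the cache, and the path is reduced to the centroid of each frame's feature points rather than tracking individual points. Both function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def frame_window(frames: Sequence, current: int, n: int) -> List:
    """Take up to n cached frames before and after the current frame."""
    start = max(0, current - n)
    end = min(len(frames), current + n + 1)
    return list(frames[start:end])


def centroid_path(points_per_frame: Sequence[Sequence[Point]]) -> List[Point]:
    """Summarize each frame's feature points by their centroid (assumption)."""
    path = []
    for points in points_per_frame:
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        path.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return path
```

With a cache of 10 frames and the current frame at index 4, `frame_window(cache, 4, 2)` selects the frames at indices 2 through 6.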

After the foreground feature point path is obtained, it is smoothed using the foreground anti-shake intensity of the current frame image to obtain a smoothed foreground feature point path. Then, the stabilized foreground image is obtained according to the smoothed foreground feature point path and the foreground image of the current frame.
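One way to picture intensity-controlled path smoothing is a moving average whose window radius grows with the anti-shake intensity. This mapping from intensity to window radius is purely an assumption for illustration (the application does not state which smoothing filter is used), and `smooth_path` and `compensation` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def smooth_path(path: Sequence[Point], strength: int) -> List[Point]:
    """Moving-average smoothing; `strength` is the window radius in frames."""
    smoothed = []
    for i in range(len(path)):
        lo = max(0, i - strength)
        hi = min(len(path), i + strength + 1)
        window = path[lo:hi]
        smoothed.append(
            (sum(p[0] for p in window) / len(window),
             sum(p[1] for p in window) / len(window))
        )
    return smoothed


def compensation(path: Sequence[Point], smoothed: Sequence[Point], index: int) -> Point:
    """Shift that moves frame `index` from its shaky position to the smooth path."""
    return (smoothed[index][0] - path[index][0],
            smoothed[index][1] - path[index][1])
```

A larger `strength` averages over more frames and therefore removes more motion, while `strength = 0` leaves the path unchanged; the compensation vector is what would be applied to the current foreground image.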

For the background image, path smoothing is performed using the background anti-shake intensity of the current frame image to obtain a stabilized background image (i.e., the anti-shake background image). The motion trajectory curve of the background image feature points can be obtained by feature point extraction. For example, the feature points of the background image of the current frame are extracted, the feature point path of the background image is obtained from the feature points of multiple consecutive frames of background images in the image sequence, and the feature point path of the background image is smoothed using the background anti-shake intensity. Of course, the motion trajectory curve of the background image can also be obtained in other manners, for example, from gyroscope data.

When the foreground and background of the video image are segmented, the pixel information of the edge band can be extracted to obtain the first edge band information. The stabilized foreground image and the stabilized background image are fused to obtain a video image in which the foreground and the background are stabilized simultaneously. Meanwhile, the pixel information of the edge band in the fused video image can be extracted to obtain the second edge band information.

The similarity between the first edge band information and the second edge band information is calculated. Based on this similarity, the difference between the foreground-background edge part of the input video image and that of the fused video image is evaluated, and the anti-shake intensity of the current frame image or of the next frame image is adjusted according to the difference.
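As an illustration of a similarity measure between two sets of edge band pixels, the sketch below computes a single-window structural similarity index over two equally long pixel sequences. The constants 0.01 and 0.03 and the dynamic range of 255 are the values commonly used with this index; treating the whole edge band as one window, rather than averaging over local windows, is a simplifying assumption, and the name `ssim` is chosen here for convenience.

```python
from typing import Sequence


def ssim(x: Sequence[float], y: Sequence[float], dynamic_range: float = 255.0) -> float:
    """Single-window structural similarity index of two pixel sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants that keep the ratio stable when means or variances are tiny.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical sequences give a value of 1, and the value drops as the two edge bands diverge in brightness, contrast, or structure.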

In a specific application, the structural similarity index between the first edge band information and the second edge band information is calculated, and the anti-shake intensity is adjusted according to it. Specifically, after the structural similarity index is calculated, it may be determined whether it falls within a preset similarity threshold interval. If the structural similarity index is larger than the maximum value of the preset similarity threshold interval, the anti-shake intensity is reduced until the structural similarity index falls within the interval. If the structural similarity index is smaller than the minimum value of the preset similarity threshold interval, the anti-shake intensity is increased until the structural similarity index falls within the interval.

The manner of increasing or decreasing the anti-shake intensity may be arbitrary. In some embodiments, the current anti-shake intensity may be multiplied by a corresponding coefficient to increase or decrease it. In general, when the anti-shake intensity needs to be increased, the coefficient is greater than 1, and when it needs to be decreased, the coefficient is less than 1. For example, if the current anti-shake intensity is A and the anti-shake intensity needs to be increased, A is multiplied by 1.1 and the product is used as the adjusted anti-shake intensity; when the anti-shake intensity needs to be reduced, A is multiplied by 0.9 and the product is used as the adjusted anti-shake intensity.
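The threshold check and the coefficient-based adjustment can be combined into one small function. The coefficients 1.1 and 0.9 come from the example in the text; the interval bounds 0.8 and 0.95 are placeholder values assumed only so that the sketch can run, and `adjust_intensity` is a hypothetical name.

```python
from typing import Tuple


def adjust_intensity(
    intensity: float,
    ssim_value: float,
    interval: Tuple[float, float] = (0.8, 0.95),  # assumed bounds, not from the text
    up: float = 1.1,
    down: float = 0.9,
) -> float:
    """Return the anti-shake intensity after one adjustment step."""
    low, high = interval
    if ssim_value > high:
        # Above the interval: reduce the anti-shake intensity.
        return intensity * down
    if ssim_value < low:
        # Below the interval: increase the anti-shake intensity.
        return intensity * up
    # Inside the interval: keep the intensity unchanged.
    return intensity
```

Calling the function repeatedly, with the index recomputed after each step, corresponds to adjusting until the index falls within the interval.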

In fig. 10, it is shown that the path smoothing in the foreground anti-shake process and the path smoothing in the background anti-shake process can be performed after the anti-shake intensity adjustment. In practical applications, the adjusted anti-shake intensity may include a foreground anti-shake intensity and/or a background anti-shake intensity of the current frame image, or a foreground anti-shake intensity and/or a background anti-shake intensity of the next frame image, and accordingly, the adjusted anti-shake intensity may act on the foreground anti-shake process and/or the background anti-shake process of the current frame image, or act on the foreground anti-shake process and/or the background anti-shake process of the next frame image. Various aspects will now be described with reference to the drawings.

Fig. 11 is another specific flowchart of a video anti-shake process provided in an embodiment of the present application. As shown in fig. 11, a deep neural network (DNN) model is used to perform foreground and background segmentation on the input video image, the feature points of the foreground image are extracted by an optical flow method, and the black borders at the image edges are filled by Gaussian weight interpolation to perform edge compensation. In addition, the structural similarity index between the two pieces of edge band information is calculated, the foreground anti-shake intensity of the next frame image is adjusted according to it, and the adjusted foreground anti-shake intensity is applied to the foreground feature point path smoothing process of the next frame image.

It is understood that the foreground and background segmentation method, the foreground image feature point extraction method, and the edge compensation method shown in fig. 11 are all exemplary methods. The gaussian weight edge compensation method is described below with reference to the edge compensation diagram shown in fig. 12.

As shown in fig. 12, in the stabilized foreground image, the edge 121 is the boundary between meaningful pixel points and meaningless pixel points; that is, taking the edge 121 as the boundary, the pixel points on the left side of fig. 12 are meaningful and those on the right side are meaningless. The meaningless pixel points may be the pixel points that form the black border at the edge. After the foreground image undergoes anti-shake processing, meaningless pixel points may be generated at the edge, and they are not conducive to fusing the foreground and the background. To further improve the fused video image, edge compensation can be performed on the meaningless pixel points, that is, pixel compensation is performed on the edge part of the image.

The q point in fig. 12 is a meaningless pixel point, and pixel compensation needs to be performed on it. The pixel value of the q point can be obtained from the surrounding pixel points by superposition according to Gaussian weights. The area 122 in fig. 12 serves as the surrounding area of the q point, and the pixel value of the q point is calculated from the pixel values of the pixel points in the area 122 by a Gaussian-weighted superposition formula (the formula itself is not reproduced in this text).

In the formula, v(q) is the pixel value of the q point, P(i, j) is the pixel value of the pixel point in the i-th row and the j-th column, w(i, j) is the Gaussian weight of the pixel point in the i-th row and the j-th column, σ is a hyperparameter, and N is a positive integer.

Each meaningless pixel point is traversed, its pixel value is calculated, and the calculated value is used as the pixel value of that meaningless pixel point, while the pixel values of the meaningful pixel points remain unchanged. The stabilized foreground image after edge compensation is thus obtained.
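A possible form of Gaussian-weighted edge compensation is sketched below. Several details are assumptions rather than statements of the application: the surrounding area is taken as a (2N+1) by (2N+1) window centered on the meaningless pixel, the weight of a neighbor at offset (i, j) is exp(-(i² + j²) / (2σ²)), only meaningful neighbors contribute, and the weighted sum is normalized by the sum of the weights. The name `gaussian_fill` is hypothetical.

```python
import math
from typing import List


def gaussian_fill(
    image: List[List[float]],
    valid: List[List[bool]],
    n: int = 2,
    sigma: float = 1.0,
) -> List[List[float]]:
    """Fill meaningless pixels (valid == False) from meaningful neighbors."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # meaningful pixels are copied unchanged
    for r in range(rows):
        for c in range(cols):
            if valid[r][c]:
                continue
            weighted_sum = 0.0
            weight_total = 0.0
            for i in range(-n, n + 1):
                for j in range(-n, n + 1):
                    rr, cc = r + i, c + j
                    if 0 <= rr < rows and 0 <= cc < cols and valid[rr][cc]:
                        w = math.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
                        weighted_sum += w * image[rr][cc]
                        weight_total += w
            if weight_total > 0.0:
                out[r][c] = weighted_sum / weight_total
    return out
```

Nearer meaningful pixels dominate the result, so the filled border continues the image content next to it; a meaningless pixel with no meaningful neighbor inside the window is left as it was.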

In fig. 11, after the structural similarity index is calculated, it is determined whether the structural similarity index falls within the preset similarity threshold interval. If it does not, the foreground anti-shake intensity is adjusted until the structural similarity index falls within the preset similarity threshold interval.

For example, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B. Anti-shake processing is performed on the foreground image and the background image according to the foreground anti-shake intensity and the background anti-shake intensity of the current frame image, respectively, and the structural similarity index of the two pieces of edge band information is calculated. Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The foreground anti-shake intensity A of the current frame image is then multiplied by 0.9 to obtain the adjusted foreground anti-shake intensity 0.9A; that is, the foreground anti-shake intensity of the next frame image is 0.9A.

The stabilized foreground image and the stabilized background image of the current frame image are fused to obtain the stabilized image (i.e., the anti-shake image) of the current frame image, and the stabilized image is output.

Then, the next frame image is acquired, and foreground and background segmentation is performed on it to obtain a foreground image and a background image. Anti-shake processing is performed on the background image with the background anti-shake intensity B to obtain the stabilized background image of the next frame image. Anti-shake processing is performed on the foreground image with the foreground anti-shake intensity 0.9A to obtain the stabilized foreground image of the next frame image. The stabilized background image and the stabilized foreground image of the next frame image are then fused to obtain the stabilized image of the next frame image, which is output. The structural similarity index of the next frame image is calculated according to the first edge band information and the second edge band information of the next frame image. If it falls within the preset similarity threshold interval, the foreground anti-shake intensity is not adjusted, that is, it remains 0.9A. If it does not fall within the preset similarity threshold interval, the foreground anti-shake intensity for the subsequent frame is obtained by adjusting the foreground anti-shake intensity 0.9A.

In some embodiments, the background anti-shake intensity of the next frame image may also be adjusted according to the structural similarity index of the current frame image, and the following description is given with reference to another specific flow diagram of the video anti-shake process provided in the embodiment of the present application shown in fig. 13.

As shown in fig. 13, after the structural similarity index between the two pieces of edge band information is calculated for the current frame image, the background anti-shake intensity of the next frame image is adjusted according to the structural similarity index; that is, the adjusted background anti-shake intensity acts on the background anti-shake process of the next frame image. The parts of fig. 13 that are similar to fig. 11 have been described with reference to fig. 11 and are not described again here.

For example, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B. Anti-shake processing is performed on the foreground image and the background image according to the foreground anti-shake intensity and the background anti-shake intensity of the current frame image, respectively, and the structural similarity index of the two pieces of edge band information is calculated. Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The background anti-shake intensity B of the current frame image is then multiplied by 0.8 to obtain the adjusted background anti-shake intensity 0.8B; that is, the background anti-shake intensity of the next frame image is 0.8B.

Then, the next frame image is acquired, and foreground and background segmentation is performed on it to obtain a foreground image and a background image. Anti-shake processing is performed on the background image with the background anti-shake intensity 0.8B to obtain the stabilized background image of the next frame image. Anti-shake processing is performed on the foreground image with the foreground anti-shake intensity A to obtain the stabilized foreground image of the next frame image. The stabilized background image and the stabilized foreground image of the next frame image are then fused to obtain the stabilized image of the next frame image, which is output. The structural similarity index of the next frame image is calculated according to the first edge band information and the second edge band information of the next frame image. If it falls within the preset similarity threshold interval, the background anti-shake intensity is not adjusted, that is, it remains 0.8B. If it does not fall within the preset similarity threshold interval, the background anti-shake intensity for the subsequent frame is obtained by adjusting the background anti-shake intensity 0.8B.

In some embodiments, the background anti-shake intensity and the foreground anti-shake intensity of the next frame image may also be adjusted simultaneously according to the structural similarity index of the current frame image, and the following description is given with reference to another specific flow diagram of the video anti-shake process provided in the embodiment of the present application shown in fig. 14.

As shown in fig. 14, for the current frame image, according to the structural similarity index, the background anti-shake intensity and the foreground anti-shake intensity of the next frame image are adjusted, that is, the adjusted background anti-shake intensity acts on the background anti-shake process of the next frame image, and the adjusted foreground anti-shake intensity acts on the foreground anti-shake process of the next frame image.

For example, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B. Anti-shake processing is performed on the foreground image and the background image according to the foreground anti-shake intensity and the background anti-shake intensity of the current frame image, respectively, and the structural similarity index of the two pieces of edge band information is calculated. Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The background anti-shake intensity B of the current frame image is then multiplied by 0.8 to obtain the adjusted background anti-shake intensity 0.8B; that is, the background anti-shake intensity of the next frame image is 0.8B. Likewise, the foreground anti-shake intensity A of the current frame image is multiplied by 0.9 to obtain the adjusted foreground anti-shake intensity 0.9A; that is, the foreground anti-shake intensity of the next frame image is 0.9A.

The stabilized foreground image and the stabilized background image of the current frame image are fused to obtain the stabilized image (i.e., the anti-shake image) of the current frame image, and the stabilized image is output.

Then, the next frame image is acquired, and foreground and background segmentation is performed on it to obtain a foreground image and a background image. Anti-shake processing is performed on the background image with the background anti-shake intensity 0.8B to obtain the stabilized background image of the next frame image. Anti-shake processing is performed on the foreground image with the foreground anti-shake intensity 0.9A to obtain the stabilized foreground image of the next frame image. The stabilized background image and the stabilized foreground image of the next frame image are then fused to obtain the stabilized image of the next frame image, which is output. The structural similarity index of the next frame image is calculated according to the first edge band information and the second edge band information of the next frame image. If it falls within the preset similarity threshold interval, neither intensity is adjusted, that is, the background anti-shake intensity remains 0.8B and the foreground anti-shake intensity remains 0.9A. If it does not fall within the preset similarity threshold interval, the background anti-shake intensity 0.8B and the foreground anti-shake intensity 0.9A are each adjusted to obtain the foreground anti-shake intensity and the background anti-shake intensity for the subsequent frame.
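The next-frame feedback described in this example can be written as a loop in which each frame is always output, and the structural similarity index measured on it only influences the intensities used for the following frame. The callable `process_frame`, which is assumed to return the fused image together with its structural similarity index, the interval bounds, and the function name are all hypothetical; only the coefficients 0.9 and 0.8 are taken from the example, and only the case of an index above the interval is handled, as in the example.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_with_next_frame_feedback(
    frames: Sequence[Any],
    process_frame: Callable[[Any, float, float], Tuple[Any, float]],
    fg_intensity: float,
    bg_intensity: float,
    interval: Tuple[float, float] = (0.8, 0.95),  # assumed bounds
    fg_down: float = 0.9,
    bg_down: float = 0.8,
) -> Tuple[List[Any], float, float]:
    """Stabilize all frames; adjustments take effect on the next frame."""
    _, high = interval
    outputs = []
    for frame in frames:
        stabilized, ssim_value = process_frame(frame, fg_intensity, bg_intensity)
        outputs.append(stabilized)  # the current frame is output regardless
        if ssim_value > high:
            # Index above the interval: lower both intensities for the next frame.
            fg_intensity *= fg_down
            bg_intensity *= bg_down
    return outputs, fg_intensity, bg_intensity
```

Adjusting only the foreground or only the background intensity corresponds to setting the other coefficient to 1. An index below the interval would be handled symmetrically with coefficients greater than 1.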

In some embodiments, the foreground anti-shake intensity and/or the background anti-shake intensity of the current frame image itself may be adjusted. For details, refer to fig. 15, which shows another specific flowchart of the video anti-shake process provided in the embodiment of the present application. Parts of fig. 15 that are similar to fig. 11 are not described in detail herein.

When the anti-shake intensity of the current frame image is adjusted according to its structural similarity index, if the index does not fall within the preset similarity threshold interval, the fused video image is not output. Instead, the foreground anti-shake intensity and/or the background anti-shake intensity is adjusted and the calculation is repeated, and the fused video image is output only once the structural similarity index of the current frame image falls within the preset similarity threshold interval.

For example, at this time, the foreground anti-shake intensity of the current frame image is adjusted. In the first calculation, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B.

And respectively carrying out anti-shake processing on the foreground image and the background image according to the foreground anti-shake intensity and the background anti-shake intensity of the current frame image, and then calculating to obtain structural similarity indexes of the two pieces of edge band information. At this time, the structural similarity index of the current frame image does not fall within the preset similarity threshold interval, and the structural similarity index is higher than the maximum value of the preset similarity threshold interval. And multiplying the foreground anti-shake intensity A of the current frame image by 0.9 to obtain the adjusted foreground anti-shake intensity of 0.9A, namely the foreground anti-shake intensity in the next foreground anti-shake process is 0.9A.

And carrying out image fusion on the stability-enhanced foreground image and the stability-enhanced background image of the current frame image to obtain a stability-enhanced image (namely the anti-shake image) calculated for the first time, and not outputting the stability-enhanced image.

The adjusted foreground anti-shake intensity is then used for a second calculation, in which some steps of the first calculation need not be repeated. The adjusted foreground anti-shake intensity is used to perform anti-shake processing on the foreground image again, that is, the foreground feature point path is smoothed using the adjusted anti-shake intensity to obtain the stability-enhanced foreground image of the second calculation. The stability-enhanced foreground image of the second calculation and the stability-enhanced background image of the first calculation are then fused to obtain the stability-enhanced image of the second calculation. A structural similarity index is calculated according to the second edge band information obtained in the second calculation and the first edge band information obtained in the first calculation. If this structural similarity index falls within the preset similarity threshold interval, the foreground anti-shake intensity is not adjusted further, and the stability-enhanced image of the second calculation is output, that is, it is used as the output video image of the current frame image. If the structural similarity index does not fall within the preset similarity threshold interval, the foreground anti-shake intensity is adjusted again, and a third calculation is performed using the newly adjusted foreground anti-shake intensity. This cycle repeats until the structural similarity index of some calculation falls within the preset similarity threshold interval, and the stability-enhanced image of that calculation is taken as the output video image of the current frame image.
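The repeated calculation for the current frame can be sketched as a loop. This is an illustrative sketch only: `stabilize_current_frame` and the `evaluate` callback are hypothetical names, the callback stands in for the re-smoothing, fusion, and similarity steps described above, and the `max_rounds` guard is an assumption that the text does not mention.

```python
from typing import Any, Callable, Tuple


def stabilize_current_frame(
    initial_intensity: float,
    evaluate: Callable[[float], Tuple[Any, float]],
    interval: Tuple[float, float],
    factor: float = 0.9,   # multiplier from the example above
    max_rounds: int = 10,  # safety bound; an assumption of this sketch
) -> Tuple[Any, float, int]:
    """Repeat the calculation until the index falls inside the interval.

    evaluate(intensity) is expected to smooth the foreground path with the
    given intensity, fuse the result with the background, and return
    (fused_image, structural_similarity_index).
    """
    low, high = interval
    intensity = initial_intensity
    fused, ssim = evaluate(intensity)  # first calculation
    rounds = 1
    while not (low <= ssim <= high) and rounds < max_rounds:
        # The fused image of this round is discarded, not output.
        intensity *= factor
        fused, ssim = evaluate(intensity)  # next calculation
        rounds += 1
    return fused, intensity, rounds
```

The sketch always multiplies by the same factor, which matches the example where the index starts above the interval; a fuller version would also choose the direction of the adjustment.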

For another example, at this time, the background anti-shake intensity of the current frame image is adjusted. In the first calculation, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B. In the first calculation, the structural similarity index does not fall into the preset similarity threshold interval, and the background anti-shake intensity B of the current frame image is multiplied by 0.8 to obtain an adjusted background anti-shake intensity of 0.8B, that is, the background anti-shake intensity of the next background anti-shake process is 0.8B.

The adjusted anti-shake intensity is applied to the path smoothing of the background image again to obtain the stability-enhanced background image of the second calculation. The stability-enhanced background image of the second calculation and the stability-enhanced foreground image of the first calculation are fused to obtain the stability-enhanced image of the second calculation. A structural similarity index is calculated between the second edge band information obtained in the second calculation and the first edge band information obtained in the first calculation. If the structural similarity index of the second calculation falls within the preset similarity threshold interval, the stability-enhanced image of the second calculation is taken as the output video image (that is, the image after anti-shake processing) of the current frame image. If it does not fall within the preset similarity threshold interval, the background anti-shake intensity is adjusted again, and the next calculation is performed using the adjusted background anti-shake intensity, until the structural similarity index of some calculation falls within the preset similarity threshold interval.

For another example, at this time, the background anti-shake intensity and the foreground anti-shake intensity of the current frame image are adjusted. In the first calculation, the foreground anti-shake intensity of the current frame image is A, and the background anti-shake intensity is B. In the first calculation, the structural similarity index does not fall into the preset similarity threshold interval, and the background anti-shake intensity B of the current frame image is multiplied by 0.8 to obtain the adjusted background anti-shake intensity of 0.8B, namely the background anti-shake intensity of the next background anti-shake process is 0.8B. And multiplying the foreground anti-shake intensity A of the current frame image by 0.9 to obtain the adjusted foreground anti-shake intensity of 0.9A, namely the foreground anti-shake intensity in the next foreground anti-shake process is 0.9A.

A second calculation is then performed using the adjusted anti-shake intensities. Specifically, the foreground feature point path is smoothed using 0.9A to obtain the stability-enhanced foreground image of the second calculation, and the motion trajectory curve of the background image is smoothed using 0.8B to obtain the stability-enhanced background image of the second calculation. The stability-enhanced foreground image and the stability-enhanced background image of the second calculation are fused to obtain the stability-enhanced image of the second calculation. A second structural similarity index is calculated according to the second edge band information obtained in the second calculation and the first edge band information obtained in the first calculation. If the structural similarity index of the second calculation falls within the preset similarity threshold interval, the stability-enhanced image of the second calculation is taken as the output video image (that is, the image after anti-shake processing) of the current frame image. If it does not fall within the preset similarity threshold interval, the anti-shake intensities are adjusted again, and the next calculation is performed using the adjusted anti-shake intensities, until the structural similarity index of some calculation falls within the preset similarity threshold interval.

Therefore, by adjusting, according to the similarity index, the foreground anti-shake intensity and/or the background anti-shake intensity of the current frame image or of the next frame image, the strength of the artifacts generated when the foreground and the background are fused can be adjusted automatically. This prevents excessive anti-shake from producing an obvious isolation band during fusion, and makes the transition at the fusion edge between the foreground and the background more natural.

The anti-shake intensity may act on the path smoothing process. The larger the anti-shake intensity, the larger the curve smoothing intensity (also called the curve smoothing degree); the smaller the anti-shake intensity, the smaller the curve smoothing intensity.

Specifically, when the similarity index is lower than the preset similarity threshold interval, the anti-shake intensity is increased, that is, the curve smoothing intensity (or called path smoothing intensity) is enhanced. When the similarity index is higher than the preset similarity threshold interval, the anti-shake intensity is reduced, that is, the curve smoothing intensity (or called path smoothing intensity) is reduced.

The weight used in path smoothing may be adjusted through a Gaussian weight. According to the Gaussian distribution, the larger the Gaussian weight sigma, the smoother the curve. The anti-shake intensity may be regarded as equivalent to the Gaussian weight sigma.
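As one possible illustration of how a Gaussian weight sigma controls the smoothness of a motion trajectory curve, the sketch below smooths a one-dimensional path with a normalized Gaussian kernel. The function name `gaussian_smooth_path`, the kernel radius of three sigma, and the edge padding are assumptions of this sketch and are not taken from the embodiment.

```python
import numpy as np


def gaussian_smooth_path(path, sigma, radius=None):
    """Smooth a 1-D trajectory with a Gaussian kernel of width sigma.

    A larger sigma (a larger anti-shake intensity) gives a smoother curve.
    """
    path = np.asarray(path, dtype=float)
    if sigma <= 0:
        return path.copy()
    if radius is None:
        radius = max(1, int(3 * sigma))  # kernel support; an assumption
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    weights /= weights.sum()  # normalize so a constant path is unchanged
    # Repeat the end values so the output has the same length as the input.
    padded = np.pad(path, radius, mode="edge")
    return np.convolve(padded, weights, mode="valid")
```

For a two-dimensional feature point path, the same function could be applied to the x and y coordinates separately.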

The higher the anti-shake intensity is, the better the video anti-shake effect is, the more meaningless pixel points are at the edge of the video image, and the smaller the field angle is. On the contrary, the smaller the anti-shake intensity is, the poorer the video anti-shake effect is, the fewer meaningless pixel points at the edge of the video image are, and the larger the field angle is. And adjusting the anti-shake intensity to enable the field angle and the meaningless pixel points to reach a proper interval.

To better describe the video anti-shaking scheme, the following description will be made with reference to the flowchart.

Referring to fig. 16, a schematic block diagram of a flow chart of a video anti-shake method provided in an embodiment of the present application may include the following steps:

Step S1601, the terminal device obtains an image to be processed, wherein the image to be processed is a frame of image in the video to be processed.

In a specific application, the terminal device may acquire the image to be processed in any manner. For example, referring to fig. 3, the terminal device is a mobile phone 100. When the mobile phone 100 receives a trigger operation of a user on a control 32, where the trigger operation instructs the mobile phone 100 to record a video, the mobile phone 100 captures video images in response to the trigger operation to acquire the image to be processed. After anti-shake processing, the processed image is displayed on the display screen of the mobile phone 100.

Step S1602, the terminal device performs foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image.

It is to be understood that the foreground and background segmentation method for one frame of image is arbitrary, and the foreground and background segmentation method is not limited herein.

It should be noted that the foreground image in the embodiment of the present application may be a human face or may not be a human face, for example, the foreground image is a foreground shown in fig. 5.

Step S1603, the terminal equipment performs anti-shake processing on the first foreground image to obtain a first foreground stability-increasing image.

The first foreground image may be subjected to anti-shake processing in any manner. In some embodiments, in order to eliminate black borders at the edges of the image after anti-shake processing, the foreground image may be processed in a full-frame anti-shake manner. Specifically, feature points of the first foreground image are extracted, and a first motion trajectory curve is then obtained according to the feature points of the first foreground image and the feature points of the second foreground image, where the first motion trajectory curve may be a feature point path of the foreground image or a shaking trajectory of the foreground image. The second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in the image sequence. The first target image typically comprises multiple frames.

For example, the image sequence contains the following image frames in chronological order: image 1, image 2, image 3, image 4, ..., and image n, where n is a positive integer. At a certain moment, the image to be processed is image 5, and the first target image may then include image 1, image 2, image 3, and image 4, as well as image 6, image 7, image 8, and image 9. The foreground images of images 1 to 4 and images 6 to 9 are extracted, and feature points are extracted from these foreground images. The shaking track of the foreground image is obtained based on the foreground feature points of the image to be processed and the foreground feature points of the first target image.
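The selection of the first target image in this example can be sketched as a simple window around the current frame. The function name `neighbor_frames`, the window radius of 4 (inferred from the example), and the clipping at the ends of the sequence are assumptions of this sketch.

```python
from typing import List


def neighbor_frames(current_index: int, frame_count: int, radius: int = 4) -> List[int]:
    """Return the 1-based indices of the frames around the current frame.

    Frames are numbered from 1, as in the example (image 1 ... image n).
    The window is clipped at the ends of the sequence (an assumption).
    """
    first = max(1, current_index - radius)
    last = min(frame_count, current_index + radius)
    return [i for i in range(first, last + 1) if i != current_index]
```

For image 5 in a sequence of 20 frames, this returns images 1 to 4 and images 6 to 9, which matches the example.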

And after the first motion trail curve is obtained, according to the foreground anti-shake intensity of the image to be processed, performing path smoothing on the first motion trail curve to obtain a second motion trail curve with a smooth path. And obtaining the first foreground image after the stability is increased according to the second motion trail curve and the first foreground image. And finally, performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image.
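To illustrate why edge compensation follows the stabilization step, the sketch below derives a per-frame correction from the original and smoothed curves and applies it as an integer translation. This is a simplification under stated assumptions: the embodiment does not limit the motion model to pure translation, and the names `stabilizing_offsets` and `shift_image` are hypothetical.

```python
import numpy as np


def stabilizing_offsets(original_path, smoothed_path):
    """Correction per frame: how far each frame must move to follow the smoothed path."""
    original = np.asarray(original_path, dtype=float)
    smoothed = np.asarray(smoothed_path, dtype=float)
    return smoothed - original


def shift_image(image, dx, dy, fill=0):
    """Translate an image by integer (dx, dy); uncovered pixels are set to fill.

    The pixels set to fill are the pixel points that edge compensation
    would later have to fill in.
    """
    h, w = image.shape[:2]
    out = np.full_like(image, fill)
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    if src_x0 < src_x1 and src_y0 < src_y1:
        out[src_y0 + dy:src_y1 + dy, src_x0 + dx:src_x1 + dx] = \
            image[src_y0:src_y1, src_x0:src_x1]
    return out
```

A larger correction exposes a wider strip of uncovered pixels at the image edge, which is consistent with the trade-off between anti-shake intensity and field angle described above.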

The edge compensation may use a gaussian weight interpolation mode corresponding to fig. 12, and specific contents may refer to the corresponding contents above, which are not described herein again.

In other embodiments, the edge compensation may not be performed in the full-frame anti-shake mode. At this time, after the first foreground image after the stabilization is obtained, the first foreground image after the stabilization is taken as a first foreground stabilization image.

Step S1604, the terminal device performs anti-shake processing on the first background image to obtain a first background stability-enhanced image.

In a specific application, the background image may also be subjected to anti-shake processing in any manner. For example, feature points of the first background image are extracted first. A fourth motion trajectory curve is then obtained according to the feature points of the first background image and the feature points of the second background image. The second background image is a background image of the second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are continuous image frames in the image sequence. Like the first target image, the second target image is a set of consecutive frames in the image sequence, which is not described herein again. Finally, according to the background anti-shake intensity of the image to be processed, path smoothing is performed on the fourth motion trajectory curve to obtain a second motion trajectory curve after path smoothing. The first background stability-enhanced image is then obtained according to this smoothed curve and the first background image.

It is understood that the sequence between step S1603 and step S1604 may be arbitrary, and step S1603 and step S1604 may be executed simultaneously, and the execution sequence of these two steps is not limited herein.
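Because the foreground and background anti-shake processes are decoupled, the two steps can be run in parallel. The sketch below shows one way to do so with a thread pool; `stabilize_both` and its callable parameters are hypothetical names, and whether threads give a real speed-up depends on the underlying image routines.

```python
from concurrent.futures import ThreadPoolExecutor


def stabilize_both(foreground, background, stabilize_fg, stabilize_bg):
    """Run foreground and background anti-shake processing concurrently.

    stabilize_fg and stabilize_bg are placeholders for steps S1603 and S1604.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        fg_future = pool.submit(stabilize_fg, foreground)
        bg_future = pool.submit(stabilize_bg, background)
        # result() blocks until each task finishes and re-raises any error.
        return fg_future.result(), bg_future.result()
```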

Step S1605, the terminal device fuses the first foreground stability augmentation image and the first background stability augmentation image to obtain a first stability augmentation image of the image to be processed.

It should be noted that the first stability-enhanced image obtained by fusion may be used as an output video image of the image to be processed (i.e., a video image after the anti-shake processing). Of course, in some other embodiments, if the anti-shake intensity of the current frame image needs to be adjusted, the first stabilization image may not be the output video image of the image to be processed.
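The fusion method itself is not limited by the embodiment. As one common possibility, the sketch below composites the two stabilized images with a foreground mask; the function name `fuse` and the use of a mask with values in [0, 1] are assumptions of this sketch.

```python
import numpy as np


def fuse(foreground, background, mask):
    """Composite two images: mask = 1 selects foreground, mask = 0 selects background.

    Fractional mask values blend the two images, which softens the seam.
    """
    fg = np.asarray(foreground)
    bg = np.asarray(background)
    alpha = np.clip(np.asarray(mask, dtype=float), 0.0, 1.0)
    if fg.ndim == 3:
        alpha = alpha[..., None]  # broadcast the mask over color channels
    fused = alpha * fg + (1.0 - alpha) * bg
    if np.issubdtype(fg.dtype, np.integer):
        fused = np.rint(fused)  # round before casting back to integers
    return fused.astype(fg.dtype)
```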

Further, to prevent excessive anti-shake from producing an obvious isolation band during image fusion and thereby affecting the user experience, the anti-shake intensity can be adjusted according to the similarity index.

In a specific application, first edge band information of the image to be processed may be extracted first, where the first edge band information is pixel information of an edge band between the foreground and the background in the image to be processed. The first edge band information may be extracted during foreground and background segmentation of the image.

Second edge band information of the first stability enhancement image is then extracted, where the second edge band information is pixel information of an edge band between the foreground and the background in the first stability enhancement image. The second edge band information may be derived from the fused image.
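The embodiment does not specify how the edge band between the foreground and the background is located. One possible sketch, assuming a binary foreground mask is available from segmentation, takes the band as the difference between a dilated and an eroded mask; the function names and the `width` parameter are hypothetical.

```python
import numpy as np


def _dilate(mask):
    """Grow a boolean mask by one pixel using its 4-neighborhood."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    return (padded[1:-1, 1:-1] | padded[:-2, 1:-1] | padded[2:, 1:-1]
            | padded[1:-1, :-2] | padded[1:-1, 2:])


def edge_band(mask, width=1):
    """Boolean band of roughly 2 * width pixels straddling the mask boundary."""
    mask = np.asarray(mask, dtype=bool)
    grown, shrunk = mask, mask
    for _ in range(width):
        grown = _dilate(grown)
        # Erosion expressed as the complement of a dilated complement.
        shrunk = ~_dilate(~shrunk)
    return grown & ~shrunk
```

The pixel information of the band could then be read as `image[edge_band(mask)]`, both for the image to be processed and for the fused image.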

And then, adjusting the anti-shake intensity of the next frame of image to be processed or the image to be processed according to the first edge zone information and the second edge zone information, wherein the anti-shake intensity comprises foreground anti-shake intensity and/or background anti-shake intensity.

More specifically, the first structural similarity index between the first edge zone information and the second edge zone information may be calculated. And determining whether to adjust the anti-shake intensity according to the first structural similarity index. If the first structural similarity index does not fall into the preset similarity threshold interval, the anti-shake intensity of the next frame of image to be processed or the image to be processed (namely the current frame of image) is adjusted. The specific adjustment process may refer to the corresponding contents above, and is not described herein again.
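For illustration, a structural similarity index between two sets of edge band pixels can be computed with the standard SSIM formula evaluated over a single window. This is a simplified sketch: the function names are hypothetical, the constants follow the commonly used values (0.01 and 0.03 times the data range, squared), and both inputs are assumed to contain the same number of pixels sampled at corresponding positions.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized pixel arrays."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


def needs_adjustment(ssim, interval):
    """True if the index lies outside the preset similarity threshold interval."""
    low, high = interval
    return not (low <= ssim <= high)
```

Identical inputs give an index of 1, so an index above the upper bound of the interval indicates that the two edge bands are very similar.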

If the foreground anti-shake intensity of the current frame image is adjusted, multiple calculations may be performed. And when the structural similarity index calculated at a certain time falls into a preset similarity threshold interval, taking the corresponding stability-enhanced image as an output video image of the current frame image.

Specifically, according to the adjusted first anti-shake intensity, a path smoothing is performed on the first motion trajectory curve to obtain a third motion trajectory curve after the path smoothing. The content of the first motion trajectory curve may refer to the above corresponding content, and is not described herein again. And then, obtaining a second foreground stability-increasing image according to the third motion track curve and the first foreground image. And then, fusing the second foreground stability enhancement image and the first background stability enhancement image to obtain a second stability enhancement image of the image to be processed. And determining a second structural similarity index between the second edge zone information and third edge zone information, wherein the third edge zone information is pixel information of an edge zone between the foreground and the background in the second stability-enhanced image.

If the second structural similarity index does not fall within the preset similarity threshold interval, the adjusted first anti-shake intensity is multiplied by a preset numerical value to obtain an adjusted second anti-shake intensity. The adjusted second anti-shake intensity is then taken as the adjusted first anti-shake intensity, and the process returns to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain the third motion trajectory curve. This cycle repeats until the second structural similarity index falls within the preset similarity threshold interval, at which point the corresponding second stability enhancement image is taken as the output video image of the image to be processed.

Corresponding to the above method embodiments, the present application provides a video anti-shake apparatus, which is applied to a terminal device. Referring to fig. 17, a schematic block diagram of a video anti-shake apparatus provided in an embodiment of the present application may include:

the image obtaining module 171 is configured to obtain an image to be processed, where the image to be processed is a frame of image in a video to be processed.

The foreground and background segmentation module 172 is configured to perform foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image.

And a foreground anti-shake module 173, configured to perform anti-shake processing on the first foreground image to obtain a first foreground stability-enhanced image.

And a background anti-shake module 174, configured to perform anti-shake processing on the first background image to obtain a first background stability-enhanced image.

The image fusion module 175 is configured to fuse the first foreground stability enhancement image and the first background stability enhancement image to obtain a first stability enhancement image of the image to be processed.
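The module structure listed above can be sketched as a small composition of interchangeable components. This is only an illustration of the data flow: the class name, field names, and the use of plain callables are assumptions, and the image obtaining module 171 is represented by the caller passing in the image.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class VideoAntiShakeApparatus:
    # Module 172: returns (background_image, foreground_image).
    segment: Callable[[Any], Tuple[Any, Any]]
    # Module 173: foreground anti-shake processing.
    stabilize_foreground: Callable[[Any], Any]
    # Module 174: background anti-shake processing.
    stabilize_background: Callable[[Any], Any]
    # Module 175: fuses (foreground_stabilized, background_stabilized).
    fuse: Callable[[Any, Any], Any]

    def process(self, image: Any) -> Any:
        """Run one frame through segmentation, both anti-shake paths, and fusion."""
        background, foreground = self.segment(image)
        fg_stabilized = self.stabilize_foreground(foreground)
        bg_stabilized = self.stabilize_background(background)
        return self.fuse(fg_stabilized, bg_stabilized)
```

Keeping the foreground and background paths as separate components mirrors the decoupling of the two anti-shake processes described in the method embodiments.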

In some possible implementations, the foreground anti-shaking module is specifically configured to: extracting feature points of the first foreground image; obtaining a first motion track curve according to the characteristic points of the first foreground image and the characteristic points of the second foreground image, wherein the second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence; according to the foreground anti-shake intensity of the image to be processed, path smoothing is carried out on the first motion trajectory curve, and a second motion trajectory curve after the path smoothing is obtained; obtaining a first foreground image after stability augmentation according to the second motion trail curve and the first foreground image; and performing edge compensation on the first foreground image after the stability is increased to obtain a first foreground stability increasing image.

In some possible implementations, the foreground anti-shake module is specifically configured to: calculate a target pixel value for each pixel point to be compensated by a formula (the formula and its symbol definitions are not reproduced in this text); and take the target pixel value as the pixel value of the pixel point to be compensated, keeping the pixel values of other pixel points unchanged, to obtain the first foreground stability-enhanced image, wherein the other pixel points are pixel points in the first foreground image after stability enhancement other than the pixel point to be compensated.

In some possible implementations, the apparatus further includes: the anti-shake intensity adjusting module is used for extracting first edge band information of the image to be processed, wherein the first edge band information is pixel information of an edge band between a foreground and a background in the image to be processed; extracting second edge zone information of the first stability enhancement image, wherein the second edge zone information is pixel information of an edge zone between a foreground and a background in the first stability enhancement image; and adjusting the anti-shake intensity of the next frame of image to be processed or the image to be processed according to the first edge zone information and the second edge zone information, wherein the anti-shake intensity comprises foreground anti-shake intensity and/or background anti-shake intensity.

In some possible implementations, the anti-shake intensity adjustment module is specifically configured to: determining a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall into the preset similarity threshold interval, adjusting the anti-shake intensity of the next frame of image to be processed or the image to be processed.

In some possible implementations, the anti-shake intensity adjustment module is specifically configured to: and if the first structural similarity index does not fall into the preset similarity threshold interval, multiplying the anti-shake intensity of the image to be processed by a preset numerical value to obtain an adjusted first anti-shake intensity, and taking the adjusted first anti-shake intensity as the anti-shake intensity of the image to be processed of the next frame or the anti-shake intensity of the image to be processed.

In some possible implementations, the adjusted first anti-shake intensity includes a foreground anti-shake intensity of the image to be processed; the device also comprises a current frame anti-shake result adjusting module used for: according to the adjusted first anti-shake intensity, performing path smoothing on the first motion track curve to obtain a third motion track curve after the path smoothing, wherein the first motion track curve is a motion track curve obtained according to the characteristic points of the first foreground image and the characteristic points of the second foreground image, the second foreground image is a foreground image of the first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are continuous image frames in an image sequence; obtaining a second foreground stability-enhancing image according to the third motion trail curve and the first foreground image; fusing the second foreground stability augmentation image and the first background stability augmentation image to obtain a second stability augmentation image of the image to be processed; determining a second structural similarity index between the second edge zone information and third edge zone information, wherein the third edge zone information is pixel information of an edge zone between a foreground and a background in a second stability augmentation image; if the second structural similarity index does not fall into the preset similarity threshold interval, multiplying the adjusted first anti-shake intensity by a preset numerical value to obtain an adjusted second anti-shake intensity; and taking the adjusted second anti-shake intensity as the adjusted first anti-shake intensity, returning to the step of smoothing the path of the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve with the smoothed path until the second structural similarity index falls into a preset similarity threshold interval.

In some possible implementations, the background anti-shake module is specifically configured to: extracting feature points of the first background image; obtaining a fourth motion trail curve according to the feature points of the first background image and the feature points of the second background image, wherein the second background image is a background image of a second target image, the video to be processed comprises the second target image, and the second target image and the image to be processed are continuous image frames in an image sequence; according to the background anti-shake intensity of the image to be processed, path smoothing is carried out on the fourth motion trajectory curve to obtain a second motion trajectory curve after the path smoothing; and obtaining a first background stability enhancement image according to the second motion track curve and the first background image.

The video anti-shake device has the function of realizing the video anti-shake method, the function can be realized by hardware, and can also be realized by executing corresponding software by hardware, the hardware or the software comprises one or more modules corresponding to the function, and the modules can be software and/or hardware.

An embodiment of the present application further provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the video anti-shake method as described in any one of the above is implemented.

The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps that can be implemented in the foregoing method embodiments.

The embodiments of the present application provide a computer program product which, when run on a terminal device, enables the terminal device to implement the steps in the above method embodiments.

Embodiments of the present application further provide a chip system, where the chip system includes a processor, the processor is coupled with a memory, and the processor executes a computer program stored in the memory to implement the methods according to the above method embodiments. The chip system can be a single chip or a chip module formed by a plurality of chips.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment. It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance. Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather mean "one or more but not all embodiments" unless specifically stated otherwise.

Finally, it should be noted that: the above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope disclosed in the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A video anti-shake method, applied to a terminal device, comprising:

acquiring an image to be processed, wherein the image to be processed is a frame image in a video to be processed;

performing foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image;

performing anti-shake processing on the first foreground image to obtain a first foreground stability augmentation image;

performing anti-shake processing on the first background image to obtain a first background stability augmentation image;

fusing the first foreground stability augmentation image and the first background stability augmentation image to obtain a first stability augmentation image of the image to be processed;

wherein the method further comprises:

extracting first edge band information of the image to be processed, wherein the first edge band information is pixel information of an edge band between a foreground and a background in the image to be processed;

extracting second edge band information of the first stability augmentation image, wherein the second edge band information is pixel information of an edge band between a foreground and a background in the first stability augmentation image;

and adjusting the anti-shake intensity of the next frame image or of the image to be processed according to the first edge band information and the second edge band information, wherein the anti-shake intensity comprises a foreground anti-shake intensity and/or a background anti-shake intensity.
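The edge band extraction described in claim 1 can be illustrated with a minimal sketch. It assumes that the segmentation result is available as a binary foreground mask and that the edge band is the set of pixels lying within a fixed distance of the foreground/background boundary. The function names `edge_band_mask` and `edge_band_pixels` and the `width` parameter are hypothetical and do not come from the claims.

```python
def edge_band_mask(fg_mask, width=1):
    """Mark pixels that lie within `width` steps of a foreground/background
    boundary. `fg_mask` is a 2D list of 0/1 labels (assumed input format)."""
    rows, cols = len(fg_mask), len(fg_mask[0])

    def near_boundary(r, c):
        # True if any pixel within `width` steps carries the opposite label.
        return any(
            fg_mask[rr][cc] != fg_mask[r][c]
            for rr in range(max(0, r - width), min(rows, r + width + 1))
            for cc in range(max(0, c - width), min(cols, c + width + 1))
        )

    return [[near_boundary(r, c) for c in range(cols)] for r in range(rows)]


def edge_band_pixels(image, fg_mask, width=1):
    """Collect, in row-major order, the pixel values of `image` in the band."""
    band = edge_band_mask(fg_mask, width)
    return [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if band[r][c]
    ]


if __name__ == "__main__":
    mask = [[0, 0, 1, 1] for _ in range(3)]       # right half is foreground
    frame = [[10, 20, 30, 40] for _ in range(3)]
    print(edge_band_pixels(frame, mask))
```

Applying the same mask-based extraction to the image to be processed and to the first stability augmentation image would give two pixel sets that can then be compared, which is the role the first and second edge band information play in the claim.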

2. The method according to claim 1, wherein performing anti-shake processing on the first foreground image to obtain the first foreground stability augmentation image comprises:

extracting feature points of the first foreground image;

obtaining a first motion trajectory curve according to the feature points of the first foreground image and the feature points of a second foreground image, wherein the second foreground image is a foreground image of a first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are consecutive image frames in an image sequence;

performing path smoothing processing on the first motion trajectory curve according to the foreground anti-shake intensity of the image to be processed, to obtain a second motion trajectory curve after path smoothing processing;

obtaining a first foreground image after stability augmentation according to the second motion trajectory curve and the first foreground image;

and performing edge compensation on the first foreground image after stability augmentation to obtain the first foreground stability augmentation image.
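The trajectory and path smoothing steps of claim 2 can be sketched as follows. The claims do not specify the motion model or the smoothing filter, so this sketch makes loud assumptions: motion is a single horizontal translation estimated as the mean shift of matched feature points, the smoothing is a moving average, and the anti-shake intensity is interpreted as the window radius in frames. All function names are hypothetical.

```python
def mean_displacement(prev_points, curr_points):
    """Average horizontal shift of matched feature points (x coordinate only)."""
    shifts = [c - p for p, c in zip(prev_points, curr_points)]
    return sum(shifts) / len(shifts)


def accumulate_trajectory(frame_motions):
    """Turn per-frame displacements into a cumulative motion trajectory."""
    trajectory, position = [], 0.0
    for motion in frame_motions:
        position += motion
        trajectory.append(position)
    return trajectory


def smooth_trajectory(trajectory, intensity):
    """Moving-average path smoothing.

    Assumption: `intensity` is the window radius in frames, so a larger
    anti-shake intensity averages over more neighboring frames.
    """
    smoothed = []
    for k in range(len(trajectory)):
        lo = max(0, k - intensity)
        hi = min(len(trajectory), k + intensity + 1)
        window = trajectory[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def stabilizing_offsets(trajectory, smoothed):
    """Per-frame correction that moves each frame onto the smoothed path."""
    return [s - t for t, s in zip(trajectory, smoothed)]


if __name__ == "__main__":
    raw = accumulate_trajectory([1.0, -1.0, 1.0, -1.0])  # jittery camera path
    smooth = smooth_trajectory(raw, intensity=1)
    print(stabilizing_offsets(raw, smooth))
```

In this reading, warping the first foreground image by the offset for its frame would give the first foreground image after stability augmentation. A real implementation would more likely estimate a homography or affine transform per frame rather than a one-dimensional shift.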

3. The method according to claim 2, wherein performing edge compensation on the first foreground image after stability augmentation to obtain the first foreground stability augmentation image comprises:

calculating a target pixel value of a pixel point to be compensated by the following formula:

[formula image omitted in the source text]

taking the target pixel value as the pixel value of the pixel point to be compensated and leaving the pixel values of other pixel points unchanged, to obtain the first foreground stability augmentation image, wherein the other pixel points are the pixel points other than the pixel point to be compensated in the first foreground image after stability augmentation;

wherein v(q) is the target pixel value of point q, and point q is the pixel point to be compensated; p(i, j) is the pixel value of the pixel point in the i-th row and j-th column; w(i, j) is the Gaussian weight of the pixel point in the i-th row and j-th column; σ is a hyperparameter; and N is a positive integer.
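Because the formula of claim 3 is not available in this text, the following is only a sketch under an explicit assumption: the target pixel value is taken to be a normalized Gaussian-weighted average of the valid pixels in an N x N window centered on q, with the weight of pixel (i, j) equal to exp(-d² / (2σ²)), where d is its distance from q. This form is consistent with the symbol definitions in the claim but is not confirmed by it. The names `compensate_pixel`, `compensate_edges`, and the `valid` mask are hypothetical.

```python
import math


def compensate_pixel(image, valid, q, n=3, sigma=1.0):
    """Gaussian-weighted average of the valid pixels in the n x n window
    around q (assumed form of the omitted formula)."""
    qi, qj = q
    half = n // 2
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(qi - half, qi + half + 1):
        for j in range(qj - half, qj + half + 1):
            inside = 0 <= i < len(image) and 0 <= j < len(image[0])
            if not inside or not valid[i][j]:
                continue  # skip out-of-frame pixels and other holes
            w = math.exp(-((i - qi) ** 2 + (j - qj) ** 2) / (2.0 * sigma ** 2))
            weighted_sum += w * image[i][j]
            weight_total += w
    # Leave the pixel untouched when the window holds no valid neighbor.
    return weighted_sum / weight_total if weight_total > 0 else image[qi][qj]


def compensate_edges(image, valid, n=3, sigma=1.0):
    """Fill every invalid pixel; all other pixel values stay unchanged."""
    result = [row[:] for row in image]
    for i, row in enumerate(valid):
        for j, ok in enumerate(row):
            if not ok:
                result[i][j] = compensate_pixel(image, valid, (i, j), n, sigma)
    return result


if __name__ == "__main__":
    img = [[5.0, 5.0, 5.0], [5.0, 0.0, 5.0], [5.0, 5.0, 5.0]]
    ok = [[True, True, True], [True, False, True], [True, True, True]]
    print(compensate_edges(img, ok)[1][1])
```

Here `valid` marks the pixels that carry real image content after warping, so the pixels to be compensated are those left empty by the stabilizing transform.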

4. The method according to claim 1, wherein adjusting the anti-shake intensity of the next frame image or the image to be processed according to the first edge band information and the second edge band information comprises:

determining a first structural similarity index between the first edge band information and the second edge band information;

and if the first structural similarity index does not fall within a preset similarity threshold interval, adjusting the anti-shake intensity of the next frame image or of the image to be processed.

5. The method according to claim 4, wherein if the first structural similarity index does not fall within a preset similarity threshold interval, adjusting the anti-shake intensity of the next frame of image or the image to be processed comprises:

if the first structural similarity index does not fall within the preset similarity threshold interval, multiplying the anti-shake intensity of the image to be processed by a preset value to obtain an adjusted first anti-shake intensity, and taking the adjusted first anti-shake intensity as the anti-shake intensity of the next frame image or as the anti-shake intensity of the image to be processed.
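The check in claims 4 and 5 can be sketched as below. The sketch computes a single-window structural similarity index over two equally long lists of edge band pixel values, using the commonly used constants C1 = (0.01·L)² and C2 = (0.03·L)² for dynamic range L. The threshold interval `(0.8, 1.0)` and the preset value `0.5` are placeholder assumptions, since the claims leave both unspecified, and the function names are hypothetical.

```python
def ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM of two equally long sequences of pixel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def adjust_intensity(intensity, band_before, band_after,
                     interval=(0.8, 1.0), factor=0.5):
    """Scale the anti-shake intensity by `factor` when the similarity of the
    two edge bands falls outside `interval`; otherwise keep it unchanged.
    Both `interval` and `factor` are placeholder values."""
    low, high = interval
    if low <= ssim(band_before, band_after) <= high:
        return intensity
    return intensity * factor


if __name__ == "__main__":
    before = [0.0, 255.0, 0.0, 255.0]
    after = [255.0, 0.0, 255.0, 0.0]   # edge band badly distorted
    print(adjust_intensity(4.0, before, after))
```

A low similarity between the edge band before and after stabilization suggests that the foreground and background were moved too differently, so a factor below 1 weakens the stabilization. Whether the preset value is meant to be below 1 is an assumption of this sketch.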

6. The method according to claim 5, wherein the adjusted first anti-shake intensity comprises a foreground anti-shake intensity of the image to be processed, and the method further comprises:

performing path smoothing processing on a first motion trajectory curve according to the adjusted first anti-shake intensity, to obtain a third motion trajectory curve after path smoothing processing, wherein the first motion trajectory curve is a motion trajectory curve obtained according to feature points of the first foreground image and feature points of a second foreground image, the second foreground image is a foreground image of a first target image, the video to be processed comprises the first target image, and the first target image and the image to be processed are consecutive image frames in an image sequence;

obtaining a second foreground stability augmentation image according to the third motion trajectory curve and the first foreground image;

determining a second structural similarity index between the second edge band information and third edge band information, wherein the third edge band information is pixel information of an edge band between a foreground and a background in the second stability augmentation image;

if the second structural similarity index does not fall into the preset similarity threshold interval, multiplying the adjusted first anti-shake intensity by the preset numerical value to obtain an adjusted second anti-shake intensity;

and taking the adjusted second anti-shake intensity as the adjusted first anti-shake intensity, and returning to execute the step of performing path smoothing processing on the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve after path smoothing processing until the second structural similarity index falls into the preset similarity threshold interval.
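The iteration in claim 6 amounts to a feedback loop: re-stabilize with the current intensity, measure the similarity of the resulting edge band, and scale the intensity again until the similarity falls within the interval. A minimal sketch follows. The callbacks `restabilize` and `similarity` stand in for the smoothing, warping, and structural similarity steps and are hypothetical, as are the default interval and factor. The `max_rounds` guard is an addition of this sketch and is not part of the claim.

```python
def refine_intensity(intensity, restabilize, similarity,
                     interval=(0.8, 1.0), factor=0.5, max_rounds=10):
    """Scale `intensity` until the similarity lands inside `interval`.

    restabilize(intensity) -> edge band pixels of the re-stabilized frame
    similarity(band)       -> structural similarity index for that band
    """
    low, high = interval
    for _ in range(max_rounds):  # guard against endless loops (assumption)
        band = restabilize(intensity)
        if low <= similarity(band) <= high:
            break  # similarity is acceptable, keep this intensity
        intensity *= factor
    return intensity


if __name__ == "__main__":
    # Toy stand-ins: the "band" is just the intensity itself, and the
    # similarity drops linearly as the intensity grows.
    result = refine_intensity(
        8.0,
        restabilize=lambda s: s,
        similarity=lambda band: 1.0 - band / 8.0,
        interval=(0.75, 1.0),
    )
    print(result)
```

With the toy callbacks above, the intensity is halved from 8.0 to 4.0 and then to 2.0, at which point the similarity reaches 0.75 and the loop stops.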

7. The method according to claim 1, wherein performing anti-shake processing on the first background image to obtain the first background stability augmentation image comprises:

extracting feature points of the first background image;

obtaining a fourth motion trajectory curve according to the feature points of the first background image and the feature points of a second background image, wherein the second background image is a background image of a second target image, the video to be processed comprises the second target image, and the second target image and the image to be processed are continuous image frames in an image sequence;

performing path smoothing processing on the fourth motion trajectory curve according to the background anti-shake intensity of the image to be processed, to obtain a second motion trajectory curve after path smoothing processing;

and obtaining the first background stability augmentation image according to the second motion trajectory curve and the first background image.

8. A video anti-shake apparatus, applied to a terminal device, comprising:

an image fusion module, configured to fuse the first foreground stability augmentation image and the first background stability augmentation image to obtain a first stability augmentation image of the image to be processed;

wherein the apparatus further comprises an anti-shake intensity adjusting module, configured to: extract first edge band information of the image to be processed, wherein the first edge band information is pixel information of an edge band between a foreground and a background in the image to be processed; extract second edge band information of the first stability augmentation image, wherein the second edge band information is pixel information of an edge band between a foreground and a background in the first stability augmentation image; and adjust the anti-shake intensity of the next frame image or of the image to be processed according to the first edge band information and the second edge band information, wherein the anti-shake intensity comprises a foreground anti-shake intensity and/or a background anti-shake intensity.

9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the video anti-shake method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video anti-shake method according to any one of claims 1 to 7.