WO2022033344A1 - Video anti-shake method, terminal device, and computer-readable storage medium - Google Patents

Video anti-shake method, terminal device, and computer-readable storage medium

Info

Publication number
WO2022033344A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
foreground
shake
stabilized
processed
Prior art date
Application number
PCT/CN2021/110028
Other languages
English (en)
French (fr)
Inventor
吴虹
苗磊
贾志平
刘蒙
刘志鹏
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2022033344A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • the present application relates to the technical field of image processing, and in particular, to a video anti-shake method, a terminal device, and a computer-readable storage medium.
  • EIS Electronic Image Stabilization
  • OIS Optical Image Stabilization
  • EIS requires no additional components. It is a software compensation algorithm that post-processes the video images to stabilize the captured video.
  • the background motion information of the video image is first obtained through the gyroscope of the terminal device.
  • the background motion information (or background jitter) of the video image may refer to the motion information generated by the movement of the camera of the terminal device.
  • the face features of the video images are collected, and the smooth motion trajectory of the face is calculated according to the face features.
  • the camera smooth trajectory is finally obtained according to the background motion information and the smooth motion trajectory of the face.
  • de-shake compensation is performed on the video image according to the smooth trajectory of the camera to achieve simultaneous image stabilization of the video foreground and background.
  • the weight between the background motion information and the smooth motion trajectory of the face can only be chosen as a compromise, so the anti-shake effect of the foreground and the anti-shake effect of the background are also a compromise.
  • the background image stabilization capability must be sacrificed so that the foreground and background images can be stabilized at the same time.
  • the embodiments of the present application provide a video anti-shake method, a terminal device, and a computer-readable storage medium to solve the problem that the existing EIS method must sacrifice background anti-shake capability in order to stabilize the foreground and the background simultaneously.
  • an embodiment of the present application provides a video anti-shake method.
  • the method can be applied to a terminal device.
  • the terminal device can be, for example, a mobile phone or a tablet.
  • the method can include: acquiring a to-be-processed image, where the to-be-processed image is a frame of image in the video to be processed; performing foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image; performing anti-shake processing on the first foreground image to obtain a first foreground stabilized image; performing anti-shake processing on the first background image to obtain a first background stabilized image; and fusing the first foreground stabilized image and the first background stabilized image to obtain a first stabilized image of the image to be processed.
  • the foreground image and the background image are obtained by performing foreground and background segmentation on the video image, and anti-shake processing is then performed on the foreground image and the background image separately. Foreground stabilization and background stabilization are thereby decoupled, so the foreground and the background can be stabilized simultaneously without sacrificing background stabilization capability.
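The segment, stabilize, and fuse flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the application's implementation: the names `split_layers`, `fuse_layers`, and `stabilize_frame` are hypothetical, the frame is a small grayscale grid, the foreground mask is assumed to be given, and the two stabilizers are placeholder callables that are assumed to leave the mask unchanged.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale frame as rows of pixel values (assumption)
Mask = List[List[bool]]    # True marks a foreground pixel (assumption)


def split_layers(frame: Image, mask: Mask) -> Tuple[Image, Image]:
    """Split a frame into a foreground layer and a background layer.

    Pixels that do not belong to a layer are set to 0.0 in that layer.
    """
    fg = [[p if m else 0.0 for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, mask)]
    bg = [[0.0 if m else p for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, mask)]
    return fg, bg


def fuse_layers(fg: Image, bg: Image, mask: Mask) -> Image:
    """Take foreground pixels where the mask is set, background pixels elsewhere."""
    return [[f if m else b for f, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(fg, bg, mask)]


def stabilize_frame(
    frame: Image,
    mask: Mask,
    stabilize_fg: Callable[[Image], Image],
    stabilize_bg: Callable[[Image], Image],
) -> Image:
    """Stabilize the two layers independently, then fuse them."""
    fg, bg = split_layers(frame, mask)
    return fuse_layers(stabilize_fg(fg), stabilize_bg(bg), mask)
```

Because each layer gets its own stabilizer, the strength used for one layer does not constrain the other, which is the decoupling the text describes.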
  • the process of performing anti-shake processing on the first foreground image to obtain the first foreground stabilized image may include: extracting feature points of the first foreground image; obtaining a first motion trajectory curve according to the feature points of the first foreground image and the feature points of a second foreground image, where the second foreground image is the foreground image of a first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; performing path smoothing on the first motion trajectory curve according to the foreground anti-shake intensity of the image to be processed to obtain a path-smoothed second motion trajectory curve; obtaining a stabilized first foreground image according to the second motion trajectory curve and the first foreground image; and performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image.
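As a rough illustration of the path-smoothing step, the sketch below smooths a one-dimensional motion trajectory with a moving average. The text does not state which smoothing filter is used, so the moving average, the interpretation of the anti-shake intensity as a window radius in frames, and the names `smooth_path` and `compensation` are all assumptions made for this example.

```python
from typing import List, Sequence


def smooth_path(path: Sequence[float], strength: int) -> List[float]:
    """Moving-average smoothing of a trajectory.

    `strength` is treated as the window radius in frames (an assumption):
    0 leaves the path unchanged, larger values smooth more aggressively.
    """
    n = len(path)
    smoothed = []
    for t in range(n):
        lo = max(0, t - strength)
        hi = min(n, t + strength + 1)
        smoothed.append(sum(path[lo:hi]) / (hi - lo))
    return smoothed


def compensation(path: Sequence[float], strength: int) -> List[float]:
    """Per-frame offset that moves each frame from the raw path onto the smoothed path."""
    return [s - p for s, p in zip(smooth_path(path, strength), path)]
```

In this sketch the per-frame offset returned by `compensation` is what would be applied to the foreground layer to obtain the stabilized foreground image.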
  • meaningless pixels are generated at the edge of the stabilized first foreground image, which is not conducive to the fusion of the foreground and the background. Edge compensation is therefore performed on the stabilized first foreground image so that the first foreground stabilized image does not contain black borders, which further improves the video image obtained by the subsequent image fusion.
  • performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image may include:
  • the first foreground stabilized image is obtained, where the other pixels are the pixels in the stabilized first foreground image other than the pixels to be compensated;
  • where V(q) is the target pixel value of point q, and point q is the pixel to be compensated; P(i, j) is the pixel value of the pixel in the i-th row and the j-th column; w(i, j) is the Gaussian weight of the pixel in the i-th row and the j-th column; ⁇ is a hyperparameter; and N is a positive integer.
  • the black edges of the image are patched by Gaussian weighted interpolation.
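The formula itself is not reproduced in this text, so the following is only a sketch of one common form of Gaussian weighted interpolation, not the published equation. It assumes a normalized weighted average over a (2N+1) x (2N+1) window centered on q, with weights that decay with squared distance from q; the function name `gaussian_fill`, the `valid` mask, and the parameter `sigma` are hypothetical.

```python
import math
from typing import List


def gaussian_fill(
    img: List[List[float]],
    valid: List[List[bool]],
    q_row: int,
    q_col: int,
    n: int = 2,
    sigma: float = 1.0,
) -> float:
    """Estimate the value of pixel q from valid pixels in a (2n+1) x (2n+1) window.

    `valid[i][j]` is False for pixels that themselves need compensation,
    so they do not contribute to the estimate.
    """
    num = 0.0
    den = 0.0
    for i in range(max(0, q_row - n), min(len(img), q_row + n + 1)):
        for j in range(max(0, q_col - n), min(len(img[0]), q_col + n + 1)):
            if not valid[i][j]:
                continue
            dist2 = (i - q_row) ** 2 + (j - q_col) ** 2
            w = math.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian weight (assumed form)
            num += w * img[i][j]
            den += w
    # With no valid neighbour in the window, fall back to 0.0 (black).
    return num / den if den > 0.0 else 0.0
```

Normalizing by the sum of weights keeps the result within the range of the contributing pixels, so a hole surrounded by a uniform region is filled with that region's value.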
  • the method further includes: extracting first edge band information of the image to be processed, where the first edge band information is the pixel information of the edge band between the foreground and the background in the image to be processed; extracting second edge band information of the first stabilized image, where the second edge band information is the pixel information of the edge band between the foreground and the background in the first stabilized image; and adjusting, according to the first edge band information and the second edge band information, the image stabilization intensity of the next frame to be processed or of the image to be processed, where the image stabilization intensity includes the foreground image stabilization intensity and/or the background image stabilization intensity.
  • the anti-shake intensity is adjusted according to the edge band information of the original image (that is, the image to be processed) and the edge band information of the fused image (that is, the first stabilized image), so that the image stabilization intensity is adjusted automatically in response to artifacts generated when the foreground and background are fused. This prevents, as far as possible, excessive anti-shake from producing obvious isolation bands during fusion and ensures a natural transition at the edge where the foreground and background are fused.
  • the anti-shake intensity may be adjusted according to a structural similarity index (Structural SIMilarity, SSIM). That is, the process of adjusting the image stabilization intensity of the next frame to be processed or of the image to be processed according to the first edge band information and the second edge band information may include: determining a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within the preset similarity threshold range, adjusting the image stabilization intensity of the next frame to be processed or of the image to be processed. If the first structural similarity index falls within the preset similarity threshold range, the anti-shake intensity does not need to be adjusted.
  • SSIM structural similarity index
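For reference, the sketch below computes a single-window (global) SSIM between two edge bands given as flat lists of pixel values. It uses the standard SSIM definition with the customary constants K1 = 0.01 and K2 = 0.03; the text does not say whether a windowed or global variant is used, so treating each edge band as one window, and the function name `global_ssim`, are assumptions.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized pixel sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )
```

Identical edge bands give a value of 1, and the value drops as the fused edge band departs from the original one.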
  • the process of adjusting the image stabilization intensity of the next frame to be processed or of the image to be processed may include: if the first structural similarity index does not fall within the preset similarity threshold range, multiplying the anti-shake intensity of the image to be processed by a preset value to obtain an adjusted first anti-shake intensity, and using the adjusted first anti-shake intensity as the anti-shake intensity of the next frame to be processed or as the anti-shake intensity of the image to be processed.
  • the case where the first structural similarity index does not fall within the preset similarity threshold interval can be divided into two situations. In the first, the first structural similarity index is higher than the preset similarity threshold interval, that is, greater than the maximum value of the interval; in this case, the anti-shake intensity can be reduced. In the second, the first structural similarity index is lower than the preset similarity threshold interval, that is, less than the minimum value of the interval; in this case, the anti-shake intensity can be increased.
  • the preset value is 0.9, that is, the anti-shake intensity of the image to be processed is multiplied by 0.9, and the product obtained is used as the adjusted first anti-shake intensity.
  • the preset value is 1.1, that is, the anti-shake intensity of the image to be processed is multiplied by 1.1, and the obtained product is used as the adjusted first anti-shake intensity.
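The adjustment rule can be sketched as below. The factors 0.9 and 1.1 come from the text; pairing 0.9 with the "reduce" case and 1.1 with the "increase" case is inferred from the arithmetic, and the interval bounds `low` and `high` as well as the function name `adjust_strength` are hypothetical.

```python
def adjust_strength(
    strength: float,
    ssim_value: float,
    low: float,
    high: float,
    down_factor: float = 0.9,
    up_factor: float = 1.1,
) -> float:
    """Return the adjusted anti-shake intensity for a given SSIM value.

    Above the interval [low, high] the intensity is reduced, below it the
    intensity is increased, and inside it the intensity is left unchanged.
    """
    if ssim_value > high:
        return strength * down_factor
    if ssim_value < low:
        return strength * up_factor
    return strength
```

For example, with a hypothetical interval of 0.80 to 0.95, an SSIM of 0.99 would scale an intensity of 10 down to about 9, and an SSIM of 0.50 would scale it up to about 11.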
  • the adjusted first image stabilization intensity can be used as the foreground image stabilization intensity of the next frame of image, and/or the background image stabilization intensity of the next frame image.
  • the adjusted first image stabilization intensity can be used to perform foreground stabilization or background stabilization on the next frame to be processed.
  • the adjusted first image stabilization intensity can also be used as the foreground image stabilization intensity and/or the background image stabilization intensity of the image to be processed.
  • in that case, foreground image stabilization and/or background image stabilization is performed again according to the adjusted first image stabilization intensity until the structural similarity index falls within the preset similarity threshold range. Therefore, in some possible implementations of the first aspect, the adjusted first anti-shake intensity includes the foreground anti-shake intensity of the image to be processed; the method may further include:
  • the first motion trajectory curve is the motion trajectory curve obtained according to the feature points of the first foreground image and the feature points of the second foreground image; the second foreground image is the foreground image of the first target image; the video to be processed includes the first target image; and the first target image and the image to be processed are consecutive image frames in the image sequence;
  • the third edge band information is the pixel information of the edge band between the foreground and the background in the second stabilized image
  • the process of performing anti-shake processing on the first background image to obtain the first background stabilized image may include: extracting feature points of the first background image; obtaining a fourth motion trajectory curve according to the feature points of the first background image and the feature points of a second background image, where the second background image is the background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in the image sequence; performing path smoothing on the fourth motion trajectory curve according to the background anti-shake intensity of the image to be processed to obtain a path-smoothed second motion trajectory curve; and obtaining a first foreground stabilized image according to the second motion trajectory curve and the first foreground image.
  • an embodiment of the present application provides a video anti-shake apparatus, which is applied to a terminal device, and the apparatus may include:
  • an image acquisition module configured to acquire a to-be-processed image, where the to-be-processed image is a frame of image in the to-be-processed video;
  • a foreground and background segmentation module which is used to perform foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image
  • a foreground anti-shake module configured to perform anti-shake processing on the first foreground image to obtain a first foreground stabilized image
  • a background image stabilization module used for performing image stabilization processing on the first background image to obtain a first stabilized background image
  • the image fusion module is used for fusing the first foreground stabilized image and the first background stabilized image to obtain the first stabilized image of the image to be processed.
  • the foreground anti-shake module is specifically configured to: extract the feature points of the first foreground image; obtain the first motion trajectory curve according to the feature points of the first foreground image and the feature points of the second foreground image, where the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the first motion trajectory curve according to the foreground anti-shake intensity of the image to be processed to obtain the path-smoothed second motion trajectory curve; obtain the stabilized first foreground image according to the second motion trajectory curve and the first foreground image; and perform edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image.
  • the foreground anti-shake module is specifically configured to obtain the first foreground stabilized image, where the other pixels are the pixels in the stabilized first foreground image other than the pixels to be compensated; V(q) is the target pixel value of point q, and point q is the pixel to be compensated; P(i, j) is the pixel value of the pixel in the i-th row and the j-th column; w(i, j) is the Gaussian weight of the pixel in the i-th row and the j-th column; ⁇ is a hyperparameter; and N is a positive integer.
  • the apparatus further includes: an anti-shake intensity adjustment module, configured to extract the first edge band information of the image to be processed, where the first edge band information is the pixel information of the edge band between the foreground and the background in the image to be processed.
  • the anti-shake intensity adjustment module is specifically configured to: determine a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within the preset similarity threshold range, adjust the image stabilization intensity of the next frame to be processed or of the image to be processed.
  • the anti-shake intensity adjustment module is specifically configured to: if the first structural similarity index does not fall within the preset similarity threshold range, multiply the anti-shake intensity of the image to be processed by the preset value to obtain the adjusted first anti-shake intensity, and use the adjusted first anti-shake intensity as the anti-shake intensity of the next frame to be processed or as the anti-shake intensity of the image to be processed.
  • the adjusted first anti-shake intensity includes the foreground anti-shake intensity of the image to be processed; the apparatus further includes a current frame anti-shake result adjustment module, configured to: perform path smoothing on the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a smoothed third motion trajectory curve, where the first motion trajectory curve is the motion trajectory curve obtained according to the feature points of the first foreground image and the feature points of the second foreground image, the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; obtain a second foreground stabilized image according to the third motion trajectory curve and the first foreground image; fuse the second foreground stabilized image and the first background stabilized image to obtain a second stabilized image of the image to be processed; determine the second structural similarity index between the second edge band information and the third edge band information, the third edge band information is
  • the background image stabilization module is specifically configured to: extract the feature points of the first background image; obtain a fourth motion trajectory curve according to the feature points of the first background image and the feature points of the second background image, where the second background image is the background image of the second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the fourth motion trajectory curve according to the background image stabilization intensity of the image to be processed to obtain a path-smoothed second motion trajectory curve; and obtain the first foreground stabilized image according to the second motion trajectory curve and the first foreground image.
  • the above-mentioned video anti-shake device has the function of realizing the video anti-shake method of the above-mentioned first aspect.
  • This function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above function, and the modules can be software and/or hardware.
  • an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the video anti-shake method of any one of the above-mentioned first aspect is implemented.
  • an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, implements the video anti-shake method according to any one of the above-mentioned first aspect.
  • an embodiment of the present application provides a chip system, the chip system includes a processor, the processor is coupled to a memory, and the processor executes a computer program stored in the memory to implement the video anti-shake method of any one of the above-mentioned first aspect.
  • the chip system may be a single chip, or a chip module composed of multiple chips.
  • an embodiment of the present application provides a computer program product that, when the computer program product runs on a terminal device, enables the terminal device to execute the video anti-shake method described in any one of the first aspects above.
  • FIG. 1 is a schematic structural diagram of a terminal device 100 provided by an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a software structure of a terminal device 100 according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a mobile phone video recording interface provided by an embodiment of the present application.
  • FIG. 4 is a schematic block diagram of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an anti-shake processing process of a video image frame provided by an embodiment of the present application.
  • FIG. 6 is another schematic block diagram of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 7 is another schematic diagram of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 8 is another schematic diagram of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a background image stabilization process provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a specific process of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 11 is another specific schematic flowchart of a video anti-shake process provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of edge compensation provided by an embodiment of the present application.
  • FIG. 13 is another specific schematic flowchart of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 14 is another specific schematic flowchart of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 15 is another specific schematic flowchart of a video anti-shake process provided by an embodiment of the present application.
  • FIG. 16 is a schematic block diagram of the flow of a video anti-shake method provided by an embodiment of the present application.
  • FIG. 17 is a schematic block diagram of a video anti-shake apparatus provided by an embodiment of the present application.
  • the terminal device provided by the embodiment of the present application is firstly introduced exemplarily below.
  • the video anti-shake solution provided in this embodiment of the present application can be applied to a terminal device.
  • the terminal device can be a terminal device with an image capturing function and a data processing capability.
  • Such a terminal device generally includes a camera, through which it can capture video images. The terminal device can also be one that has no image capturing function but has data processing capability; in this case, the terminal device can receive video images captured by other devices.
  • the terminal device can be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or another terminal device. The embodiments of this application do not impose any restrictions on the specific type of the terminal device.
  • AR augmented reality
  • VR virtual reality
  • PDA personal digital assistant
  • Referring to FIG. 1, a schematic structural diagram of a terminal device 100 provided by an embodiment of the present application is shown.
  • the terminal device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
  • the terminal device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
  • AP application processor
  • GPU graphics processing unit
  • ISP image signal processor
  • DSP digital signal processor
  • NPU neural-network processing unit
  • the controller can generate an operation control signal according to the instruction operation code and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110, and thereby improves system efficiency.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
  • I2C inter-integrated circuit
  • I2S inter-integrated circuit sound
  • PCM pulse code modulation
  • UART universal asynchronous receiver/transmitter
  • MIPI mobile industry processor interface
  • GPIO general-purpose input/output
  • SIM subscriber identity module
  • USB universal serial bus
  • the I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 110 may contain multiple sets of I2C buses.
  • the processor 110 can be respectively coupled to the touch sensor 180K, the charger, the flash, the camera 193 and the like through different I2C bus interfaces.
  • the processor 110 may couple the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate with each other through the I2C bus interface, so as to realize the touch function of the terminal device 100 .
  • the I2S interface can be used for audio communication.
  • the processor 110 may contain multiple sets of I2S buses.
  • the processor 110 may be coupled with the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170 .
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface, so as to realize the function of answering calls through a Bluetooth headset.
  • the PCM interface can also be used for audio communication, and for sampling, quantizing and encoding analog signals.
  • the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
  • the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
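The three PCM steps named above (sampling, quantizing, encoding) can be sketched as follows. The function name, the normalized input range of -1.0 to 1.0, and the unsigned integer code layout are assumptions for illustration, not details taken from the source.

```python
import math


def pcm_encode(signal, sample_rate, num_samples, bits=8):
    """Sample a continuous signal, quantize each sample, and return integer codes."""
    levels = 2 ** bits
    codes = []
    for n in range(num_samples):
        # Sampling: evaluate the analog signal at discrete instants n / sample_rate.
        x = signal(n / sample_rate)
        # Clip to the assumed normalized input range [-1.0, 1.0].
        x = max(-1.0, min(1.0, x))
        # Quantizing: map the value onto the nearest of `levels` uniform steps.
        code = round((x + 1.0) / 2.0 * (levels - 1))
        # Encoding: store the step index as an unsigned integer code.
        codes.append(code)
    return codes


# Example: eight samples of a 1 kHz tone at an 8 kHz sampling rate.
tone_codes = pcm_encode(lambda t: math.sin(2 * math.pi * 1000 * t), 8000, 8)
```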
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus may be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel form.
  • a UART interface is typically used to connect the processor 110 with the wireless communication module 160.
  • the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function.
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
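The serial/parallel conversion performed by a UART can be illustrated with a small sketch. It assumes the common 8N1 frame layout (one start bit, eight data bits sent least significant bit first, no parity, one stop bit); the source does not state which frame format the terminal device 100 uses, and the function names are hypothetical.

```python
def uart_serialize(byte):
    """Parallel to serial: one byte becomes start bit, 8 data bits (LSB first), stop bit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def uart_deserialize(frame):
    """Serial to parallel: check the framing bits and rebuild the byte."""
    if len(frame) != 10 or frame[0] != 0 or frame[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(frame[1:9]))
```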
  • the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193 .
  • MIPI interfaces include camera serial interface (CSI), display serial interface (DSI), etc.
  • the processor 110 communicates with the camera 193 through the CSI interface, so as to realize the shooting function of the terminal device 100 .
  • the processor 110 communicates with the display screen 194 through the DSI interface to implement the display function of the terminal device 100 .
  • the GPIO interface can be configured by software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface may be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like.
  • the GPIO interface can also be configured as I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 130 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the terminal device 100, and can also be used to transmit data between the terminal device 100 and peripheral devices. It can also be used to connect headphones to play audio through the headphones. This interface can also be used to connect other terminal devices, such as AR devices.
  • the interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic illustration, and does not constitute a structural limitation of the terminal device 100 .
  • the terminal device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through the wireless charging coil of the terminal device 100 . While the charging management module 140 charges the battery 142 , it can also supply power to the terminal device through the power management module 141 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the terminal device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in terminal device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 may provide a wireless communication solution including 2G/3G/4G/5G, etc. applied on the terminal device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), and the like.
  • the mobile communication module 150 can receive electromagnetic waves from the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modulation and demodulation processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modulation and demodulation processor, and then turn it into an electromagnetic wave for radiation through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the same device as at least part of the modules of the processor 110 .
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low frequency baseband signal is processed by the baseband processor and passed to the application processor.
  • the application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110, and may be provided in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the terminal device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared technology (IR).
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2 .
  • the antenna 1 of the terminal device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the terminal device 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
  • the terminal device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • Display screen 194 is used to display images, videos, and the like.
  • Display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), and so on.
  • the terminal device 100 may include one or N display screens 194 , where N is a positive integer greater than one.
  • the terminal device 100 can realize the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used to process the data fed back by the camera 193 .
  • when the shutter is opened, light is transmitted through the lens to the camera photosensitive element, which converts the optical signal into an electrical signal and transmits it to the ISP for processing into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin tone.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • an object is projected through the lens to form an optical image on the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
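As an illustration of converting between the image signal formats mentioned above, the following sketch converts one RGB pixel to YUV. It assumes the BT.601 luma weights and the classic analog YUV scaling factors; the source does not say which variant the DSP uses, and `rgb_to_yuv` is a hypothetical name.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 luma weights (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of the channels
    u = 0.492 * (b - y)                     # blue-difference chroma
    v = 0.877 * (r - y)                     # red-difference chroma
    return y, u, v
```

A grey or white pixel has equal R, G and B, so both chroma components come out at (approximately) zero and only the luma carries information.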
  • the terminal device 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
  • a digital signal processor is used to process digital signals, in addition to processing digital image signals, it can also process other digital signals. For example, when the terminal device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy, and the like.
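The frequency-point example can be sketched with a single-bin discrete Fourier transform. This is only an illustration under the assumption that "frequency point energy" means the squared magnitude of one DFT coefficient; the function name and the test signal are hypothetical.

```python
import cmath
import math


def bin_energy(samples, k):
    """Energy (squared magnitude) of DFT bin k for a block of real samples."""
    n = len(samples)
    coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples))
    return abs(coeff) ** 2


# A cosine that completes exactly 3 cycles in 16 samples concentrates its
# energy in bin 3 (and its mirror bin 13); other bins stay near zero.
block = [math.cos(2 * math.pi * 3 * i / 16) for i in range(16)]
```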
  • Video codecs are used to compress or decompress digital video.
  • the terminal device 100 may support one or more video codecs.
  • the terminal device 100 can play or record videos in various encoding formats, for example, moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of the terminal device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal device 100 .
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example to save files like music, video etc in external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the like.
  • the storage data area may store data (such as audio data, phone book, etc.) created during the use of the terminal device 100 and the like.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the processor 110 executes various functional applications and data processing of the terminal device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the terminal device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playback, recording, etc.
  • the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal. Audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
  • the speaker 170A, also referred to as a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the user can listen to music or to a hands-free call through the speaker 170A of the terminal device 100.
  • the receiver 170B, also referred to as an "earpiece", is used to convert audio electrical signals into sound signals.
  • when the terminal device 100 answers a call or plays a voice message, the voice can be heard by placing the receiver 170B close to the ear.
  • the microphone 170C, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals.
  • the user can speak with the mouth close to the microphone 170C to input a sound signal into the microphone 170C.
  • the terminal device 100 may be provided with at least one microphone 170C.
  • the terminal device 100 may be provided with two microphones 170C, which may implement a noise reduction function in addition to collecting sound signals.
  • the terminal device 100 may further be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions.
  • the earphone interface 170D is used to connect wired earphones.
  • the earphone interface 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense pressure signals, and can convert the pressure signals into electrical signals.
  • the pressure sensor 180A may be provided on the display screen 194 .
  • the capacitive pressure sensor may be comprised of at least two parallel plates of conductive material. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
  • the terminal device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 194, the terminal device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the terminal device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than the first pressure threshold acts on the short message application icon, the instruction for viewing the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, the instruction to create a new short message is executed.
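The short message example above amounts to a threshold comparison, sketched below. The threshold value, the normalized intensity scale and the instruction names are hypothetical; the source only names a "first pressure threshold".

```python
FIRST_PRESSURE_THRESHOLD = 0.5  # hypothetical value on a normalized 0..1 scale


def sms_icon_instruction(intensity, threshold=FIRST_PRESSURE_THRESHOLD):
    """Choose the instruction for a touch on the short message application icon."""
    if intensity < threshold:
        return "view_short_message"      # light press: view the short message
    return "create_short_message"        # press at or above the threshold: create one
```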
  • the gyro sensor 180B may be used to determine the motion attitude of the terminal device 100 .
  • the angular velocity of the terminal device 100 about three axes (i.e., the x, y and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization. For example, when the shutter is pressed or the record button is pressed, the gyro sensor 180B detects the angle by which the terminal device 100 shakes, calculates from that angle the distance the lens module needs to compensate, and lets the lens counteract the shake of the terminal device 100 through reverse motion, thereby achieving anti-shake.
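The angle-to-distance step can be sketched as follows. The sketch assumes a simple pinhole model in which a shake of angle θ displaces the image by f·tan θ, and it obtains the angle by integrating angular velocity samples; the source does not state which formula the terminal device 100 uses, and the function names and focal length are hypothetical.

```python
import math


def shake_angle(angular_velocities, dt):
    """Integrate gyro angular velocity samples (rad/s) into a shake angle (rad)."""
    return sum(w * dt for w in angular_velocities)


def lens_compensation(angle_rad, focal_length_mm):
    """Lens shift (mm) in the direction opposite to the image displacement."""
    return -focal_length_mm * math.tan(angle_rad)


# Example: two gyro samples of 0.1 rad/s taken 10 ms apart, 4 mm focal length.
angle = shake_angle([0.1, 0.1], 0.01)
shift = lens_compensation(angle, 4.0)
```

The negative sign expresses the "reverse motion" in the text: the lens moves against the shake so that the image stays in place.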
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenarios.
  • the air pressure sensor 180C is used to measure air pressure.
  • the terminal device 100 calculates the altitude through the air pressure value measured by the air pressure sensor 180C to assist in positioning and navigation.
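One common way to turn an air pressure value into an altitude is the international barometric formula, sketched below. This is an assumption for illustration: the source does not say which formula is used, and the standard sea-level pressure of 1013.25 hPa is a default that real devices may calibrate differently.

```python
def altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Estimate altitude in metres from air pressure using the barometric formula."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```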
  • the magnetic sensor 180D includes a Hall sensor.
  • the terminal device 100 can detect the opening and closing of a flip holster using the magnetic sensor 180D.
  • the terminal device 100 can also detect the opening and closing of a flip cover using the magnetic sensor 180D. Features such as automatic unlocking when the cover is flipped open can then be set according to the detected open or closed state of the holster or of the flip cover.
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the terminal device 100 in various directions (generally three axes).
  • the magnitude and direction of gravity can be detected when the terminal device 100 is stationary. It can also be used to identify the posture of terminal devices, and can be used in applications such as horizontal and vertical screen switching, pedometers, etc.
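The posture identification used for horizontal and vertical screen switching can be sketched from the gravity components reported by the acceleration sensor. The axis convention (x along the short edge of the screen, y along the long edge) and the function names are assumptions; the source does not define them.

```python
import math


def gravity_magnitude(ax, ay, az):
    """Magnitude of the measured acceleration; close to 9.8 m/s^2 when stationary."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def screen_orientation(ax, ay):
    """Pick the orientation from the axis along which gravity mostly acts."""
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```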
  • the terminal device 100 can measure the distance through infrared or laser. In some embodiments, when shooting a scene, the terminal device 100 can use the distance sensor 180F to measure the distance to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the terminal device 100 emits infrared light to the outside through the light emitting diode.
  • the terminal device 100 detects infrared reflected light from nearby objects using a photodiode. When sufficient reflected light is detected, it can be determined that there is an object near the terminal device 100 . When insufficient reflected light is detected, the terminal device 100 may determine that there is no object near the terminal device 100 .
  • the terminal device 100 can use the proximity light sensor 180G to detect that the user holds the terminal device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the terminal device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the terminal device 100 is in a pocket, so as to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the terminal device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, accessing application locks, taking photos with fingerprints, answering incoming calls with fingerprints, and the like.
  • the temperature sensor 180J is used to detect the temperature.
  • the terminal device 100 uses the temperature detected by the temperature sensor 180J to execute the temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the terminal device 100 reduces the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the terminal device 100 heats the battery 142 to avoid abnormal shutdown of the terminal device 100 caused by the low temperature.
  • the terminal device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
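The temperature processing strategy can be sketched as a set of threshold checks. All three threshold values and the action names are hypothetical, and combining the three behaviours in a single function is an assumption made for illustration; the source describes them as separate options without giving numbers.

```python
def thermal_actions(temp_c, high=45.0, low_heat=0.0, low_boost=-10.0):
    """Return the protective actions for a reported temperature in degrees Celsius."""
    actions = []
    if temp_c > high:
        # Too hot: reduce processor performance to cut power consumption.
        actions.append("throttle_processor")
    if temp_c < low_heat:
        # Cold: heat the battery to avoid an abnormal shutdown.
        actions.append("heat_battery")
    if temp_c < low_boost:
        # Colder still: boost the battery output voltage as well.
        actions.append("boost_battery_voltage")
    return actions
```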
  • the touch sensor 180K is also called a "touch device".
  • the touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch-control screen".
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to touch operations may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the terminal device 100 , which is different from the position where the display screen 194 is located.
  • the bone conduction sensor 180M can acquire vibration signals.
  • the bone conduction sensor 180M can acquire the vibration signal of the bone that vibrates when a person speaks.
  • the bone conduction sensor 180M can also contact the human pulse and receive the blood pressure pulsation signal.
  • the bone conduction sensor 180M can also be disposed in an earphone to form a bone conduction earphone.
  • the audio module 170 can parse out the voice signal from the vibration signal of the vocal vibrating bone acquired by the bone conduction sensor 180M, so as to implement the voice function.
  • the application processor can parse heart rate information from the blood pressure pulsation signal acquired by the bone conduction sensor 180M, so as to implement the heart rate detection function.
  • the keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys.
  • the terminal device 100 may receive key input and generate key signal input related to user settings and function control of the terminal device 100 .
  • Motor 191 can generate vibrating cues.
  • the motor 191 can be used for vibrating alerts for incoming calls, and can also be used for touch vibration feedback.
  • touch operations acting on different applications can correspond to different vibration feedback effects.
  • the motor 191 can also correspond to different vibration feedback effects for touch operations on different areas of the display screen 194 .
  • different application scenarios (for example: time reminder, receiving information, alarm clock, games, etc.) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 can be an indicator light, which can be used to indicate the charging state, the change of the power, and can also be used to indicate a message, a missed call, a notification, and the like.
  • the SIM card interface 195 is used to connect a SIM card.
  • the SIM card can be brought into contact with or separated from the terminal device 100 by inserting it into or pulling it out of the SIM card interface 195.
  • the terminal device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card and so on. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the plurality of cards may be the same or different.
  • the SIM card interface 195 can also be compatible with different types of SIM cards.
  • the SIM card interface 195 is also compatible with external memory cards.
  • the terminal device 100 interacts with the network through the SIM card to realize functions such as calls and data communication.
  • the terminal device 100 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the terminal device 100 and cannot be separated from the terminal device 100 .
  • the software system of the terminal device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiments of the present application take an Android system with a layered architecture as an example to exemplarily describe the software structure of the terminal device 100 .
  • FIG. 2 is a schematic block diagram of a software structure of a terminal device 100 according to an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, which are, from top to bottom, an application layer, an application framework layer, an Android runtime and system library layer, and a kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message and so on.
  • the application framework layer provides an application programming interface (API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include window managers, content providers, view systems, telephony managers, resource managers, notification managers, and the like.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, etc.
  • Content providers are used to store and retrieve data and make these data accessible to applications.
  • the data may include video, images, audio, calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on. View systems can be used to build applications.
  • a display interface can consist of one or more views.
  • the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
  • the telephony manager is used to provide the communication function of the terminal device 100, for example, the management of call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localization strings, icons, pictures, layout files, video files and so on.
  • the notification manager enables applications to display notification information in the status bar, which can be used to convey notification-type messages, and can disappear automatically after a brief pause without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
  • the notification manager can also display notifications in the status bar at the top of the system in the form of graphs or scroll bar text, such as notifications of applications running in the background, and notifications on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a prompt sound is issued, the terminal device vibrates, and the indicator light flashes.
  • Android Runtime includes core libraries and a virtual machine. Android runtime is responsible for scheduling and management of the Android system.
  • the core library consists of two parts: one is the functions that the Java language needs to call, and the other is the core library of Android.
  • the application layer and the application framework layer run in virtual machines.
  • the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
  • a system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
  • the Surface Manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display drivers, camera drivers, audio drivers, and sensor drivers.
  • the touch sensor 180K of the terminal device 100 receives a touch operation, and a corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes touch operations into raw input events (including touch coordinates, timestamps of touch operations, etc.). Raw input events are stored at the kernel layer.
  • the application framework layer obtains the original input event from the kernel layer, and identifies the control corresponding to the input event.
  • The camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer. Still images or video are captured through the camera 193.
  • When the terminal device 100 receives the user's touch operation on the control 32, the terminal device 100 responds to the touch operation and performs video recording, and displays the captured video image 33 on the display screen 194.
  • While the terminal device 100 captures the video image, the background motion information of the video image can also be collected by the gyro sensor 180B.
  • the terminal device 100 may perform anti-shake processing on each frame of the video image by using the video anti-shake method provided in the embodiment of the present application, and after the anti-shake processing, a video in which the foreground and background images are stabilized at the same time is obtained.
  • the video anti-shake process provided by the embodiments of the present application will be described below.
  • The terminal device 100 obtains an input video. The input video may be the video captured by the camera 193 when the terminal device 100 responds to a user's video recording operation. In this case, the camera 193 can be a front camera or a rear camera; that is, the terminal device 100 can perform anti-shake processing on the video captured by the front camera, and can also perform anti-shake processing on the video captured by the rear camera.
  • the input video includes multiple frames of images.
  • the terminal device 100 can also obtain the input video by receiving the video that has been shot by other devices.
  • the terminal device 100 receives a video that has been shot by another mobile phone.
  • The terminal device 100 performs foreground-background segmentation on each frame of image in the input video to obtain a foreground image and a background image. Then, the terminal device 100 performs anti-shake processing on the foreground image to obtain a stabilized foreground image, and performs anti-shake processing on the background image to obtain a stabilized background image. Finally, image fusion is performed on the stabilized foreground image and the stabilized background image to obtain a stabilized image.
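  • The per-frame flow described above (segment, stabilize each layer, fuse) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: images are small 2D lists of grayscale values, the foreground mask is assumed to be given, and the names `split_layers`, `fuse_layers` and `process_frame` are hypothetical. The two stabilizers default to identity functions because the actual stabilization methods are described separately.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]
Mask = List[List[bool]]


def split_layers(image: Image, mask: Mask) -> Tuple[Image, Image]:
    """Split an image into a foreground layer and a background layer.

    Pixels where the mask is True go to the foreground; all others go to
    the background. Missing pixels in each layer are set to 0.
    """
    fg = [[p if m else 0 for p, m in zip(row, mrow)] for row, mrow in zip(image, mask)]
    bg = [[0 if m else p for p, m in zip(row, mrow)] for row, mrow in zip(image, mask)]
    return fg, bg


def fuse_layers(fg: Image, bg: Image, mask: Mask) -> Image:
    """Fuse the two layers: take the foreground where the mask is True."""
    return [
        [f if m else b for f, b, m in zip(frow, brow, mrow)]
        for frow, brow, mrow in zip(fg, bg, mask)
    ]


def process_frame(
    image: Image,
    mask: Mask,
    stabilize_fg: Callable[[Image], Image] = lambda layer: layer,
    stabilize_bg: Callable[[Image], Image] = lambda layer: layer,
) -> Image:
    """Segment, stabilize each layer independently, then fuse.

    Assumption: the stabilizers keep the layer aligned with the mask.
    A real stabilizer warps the layer, so the mask would be warped too.
    """
    fg, bg = split_layers(image, mask)
    return fuse_layers(stabilize_fg(fg), stabilize_bg(bg), mask)
```

  • With identity stabilizers, `process_frame` returns the input frame unchanged, which is a convenient sanity check before real stabilizers are plugged in.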
  • FIG. 5 is a schematic diagram of an anti-shake processing process of a video image frame provided by an embodiment of the present application.
  • A certain frame of image in the input video acquired by the terminal device 100 is an image 51, and a foreground image 52 and a background image 53 are obtained after performing a foreground and background segmentation operation on the image 51.
  • After the foreground image 52 passes through the anti-shake module 54, a stabilized foreground image is obtained.
  • After the background image 53 passes through the anti-shake module 55, a stabilized background image 56 is obtained. Image fusion is performed on the stabilized background image 56 and the stabilized foreground image, and a video image 57 is output.
  • the terminal device 100 performs anti-shake processing on each frame of the input video image to obtain a frame-by-frame image after anti-shake.
  • the images after frame-by-frame anti-shake constitute a video, and a video in which the foreground and background are stabilized at the same time is obtained.
  • the terminal device 100 may further adjust the anti-shake intensity according to the edge band information of the input video image and the edge band information of the output video image.
  • The edge band information of the input video image may refer to the pixel information of the edge band extracted during the segmentation of the foreground and the background.
  • The edge band information of the output video image may refer to the pixel information of the edge band between the foreground and the background in the image after image fusion.
  • An edge band can refer to the portion of an image at the edge between the foreground and the background.
  • FIG. 6 is another schematic block diagram of a video anti-shake process provided by an embodiment of the present application.
  • The terminal device 100 first acquires the input video; then, for each frame of video image, it performs foreground-background segmentation to obtain a foreground image and a background image. Anti-shake processing is performed on the background image to obtain a stabilized background image.
  • Anti-shake processing is performed on the foreground image to obtain a stabilized foreground image. Finally, image fusion is performed on the stabilized background image and the stabilized foreground image to obtain an anti-shake video image.
  • the pixel information of the edge band between the foreground and the background of the image to be processed can also be extracted, which is recorded as the first edge band information.
  • the pixel information of the edge band between the foreground and the background can also be extracted from the image after stabilization (ie, the image obtained by fusion), which is recorded as the second edge band information.
  • According to the first edge band information and the second edge band information, feedback control is performed on the anti-shake range constraint; that is, the magnitude of the anti-shake intensity is adjusted.
  • the anti-shake range may refer to the size of the anti-shake strength.
  • the anti-shake strength may include foreground anti-shake strength and background anti-shake strength, that is, the foreground anti-shake strength and the background anti-shake strength may be adjusted according to the first edge band information and the second edge band information.
  • Alternatively, the foreground anti-shake intensity or the background anti-shake intensity alone is adjusted according to the first edge band information and the second edge band information.
  • FIG. 7 and FIG. 8 are each a further schematic diagram of a video anti-shake process provided by an embodiment of the present application.
  • When the foreground anti-shake intensity is adjusted according to the first edge band information and the second edge band information, the adjusted foreground anti-shake strength is used in the foreground anti-shake process.
  • When the background image stabilization intensity is adjusted according to the first edge band information and the second edge band information, the adjusted background image stabilization intensity acts on the background image stabilization process.
  • FIG. 7 also exemplarily shows that the way of anti-shake of the foreground is full-frame anti-shake.
  • the full-frame anti-shake process may include processes such as foreground image feature point extraction, path smoothing, and edge compensation.
  • the method for extracting the feature points of the foreground image can be arbitrary, for example, extracting the feature points of the foreground image by the optical flow method.
  • Path smoothing refers to performing path smoothing on the motion trajectory curve of the foreground feature points according to the anti-shake strength.
  • Edge compensation may refer to pixel compensation for the edge part of the image.
  • Through edge compensation, the black borders at the edges of the foreground image after anti-shake processing can be eliminated or reduced, which is more conducive to the subsequent fusion of the foreground image and the background image.
  • the way of edge compensation can be arbitrary, for example, the black edge of the image is patched by Gaussian weight interpolation.
  • the anti-shake method of the foreground image and the anti-shake method of the background image can be arbitrary, that is, any video anti-shake method can be used to perform anti-shake processing on the foreground image and the background image.
  • gyroscope sensor data can be used for stabilization. Specifically, when the terminal device 100 records a video, it can read the data output by the gyro sensor 180B, and perform angular integration on the gyro data to obtain the background motion information of the video image. Then, according to the background motion information, de-shake compensation is performed on the background image to achieve stabilization of the background image, that is, anti-shake processing is performed on the background image.
  • Specifically, 3-dimensional rotation vector estimation, 3-dimensional rotation vector smoothing, motion compensation, and image affine transformation (warp) output are performed on the data output by the gyroscope to stabilize the background image. In this process, the background image stabilization strength can act on the 3D rotation vector smoothing process.
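  • The angular integration of gyroscope data mentioned above can be illustrated with a short sketch. It assumes the gyroscope delivers timestamped angular rates in rad/s and integrates each axis independently with the trapezoidal rule, which is only a small-angle approximation; a full 3D rotation estimate would normally compose rotations (for example with quaternions). The function name `integrate_gyro` and the sample layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def integrate_gyro(samples: Sequence[Tuple[float, Vec3]]) -> List[Vec3]:
    """Integrate angular rates into cumulative rotation angles.

    samples: (timestamp_in_seconds, (wx, wy, wz)) with rates in rad/s.
    Returns one cumulative angle triple per sample, starting at zero.
    Assumption: per-axis trapezoidal integration (small-angle case).
    """
    if not samples:
        return []
    angles: List[Vec3] = [(0.0, 0.0, 0.0)]
    for (t0, w0), (t1, w1) in zip(samples, samples[1:]):
        dt = t1 - t0
        prev = angles[-1]
        # Trapezoidal rule: average of the two rates times the time step.
        step = tuple(a + 0.5 * (r0 + r1) * dt for a, r0, r1 in zip(prev, w0, w1))
        angles.append((step[0], step[1], step[2]))
    return angles
```

  • The resulting angle sequence plays the role of the background motion trajectory; smoothing it and compensating the difference is what the background anti-shake strength would control.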
  • FIG. 10 is a schematic diagram of a specific flow of a video anti-shake process according to an embodiment of the present application.
  • a full-frame image stabilization (or full-frame anti-shake) method is used to perform anti-shake processing to obtain a stabilized foreground image (ie, an anti-shake foreground image).
  • the full-frame image stabilization process in FIG. 10 includes processes such as foreground image feature point extraction, path smoothing, and edge compensation.
  • the multi-frame foreground image is a foreground image of consecutive multi-frame images.
  • the terminal device 100 performs foreground and background segmentation on the current frame image to obtain the foreground image and the background image of the current frame.
  • For example, the terminal device 100 caches 10 consecutive frames of images, and the 10 consecutive frames of images include the current frame image. Taking the current frame image as the reference, n frames of images are taken forward and backward, respectively, to obtain the foreground images of the extracted consecutive frames.
  • The feature points of the multiple foreground images are then extracted respectively, and the foreground feature point path is obtained according to the feature points of the multiple foreground images.
  • After the foreground feature point path is obtained, the foreground anti-shake intensity of the current frame image is used to perform path smoothing on the foreground feature point path to obtain a smoothed foreground feature point path. Then, the stabilized foreground image is obtained according to the smoothed foreground feature point path and the foreground image of the current frame.
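  • Two small pieces of this flow can be sketched directly: selecting the window of frames around the current frame from the cache, and deriving the compensation for the current frame from the raw and smoothed paths. The sketch assumes the feature point path is reduced to one 2D position per frame and that stabilization is a pure translation by the difference between the smoothed and raw positions; both are simplifying assumptions, and `frame_window` and `compensation_offset` are hypothetical names.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Point = Tuple[float, float]


def frame_window(cache: Sequence[T], current: int, n: int) -> List[T]:
    """Return up to n cached frames before and after the current frame.

    The window is clipped at the ends of the cache, so it may contain
    fewer than 2 * n + 1 frames near the start or end.
    """
    start = max(0, current - n)
    return list(cache[start : current + n + 1])


def compensation_offset(raw_path: Sequence[Point],
                        smoothed_path: Sequence[Point],
                        index: int) -> Point:
    """Translation that moves the frame from its raw to its smoothed position."""
    (rx, ry), (sx, sy) = raw_path[index], smoothed_path[index]
    return (sx - rx, sy - ry)
```

  • Shifting the current foreground image by the returned offset moves it onto the smoothed path; a practical implementation would typically estimate a full homography or affine warp rather than a translation.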
  • the background image stabilization intensity of the current frame image is used to perform path smoothing to obtain a stabilized background image (ie, the background image after stabilization).
  • The motion trajectory curve of the feature points of the background image can be obtained by means of feature point extraction. For example, the feature points of the background image of the current frame are extracted, the feature point path of the background image is obtained from the feature points of the consecutive multiple frames of background images in the image sequence, and path smoothing is then performed on the feature point path of the background image using the background image stabilization intensity.
  • the motion trajectory curve of the background image can also be obtained in other ways, for example, the motion trajectory curve of the background image can be obtained through gyroscope data.
  • the pixel information of the edge band can be extracted to obtain the first edge band information.
  • After image fusion of the stabilized foreground image and the stabilized background image, a video image in which both the foreground and the background are stabilized is obtained.
  • the pixel information of the edge band in the fused video image can also be extracted to obtain the second edge band information.
  • The anti-shake intensity is adjusted by calculating the structural similarity index between the first edge band information and the second edge band information and then adjusting according to that index. Specifically, after the structural similarity index is calculated, it is determined whether the index falls within the preset similarity threshold interval. If it does not, and the structural similarity index is greater than the maximum value of the preset similarity threshold interval, the anti-shake intensity is reduced until the structural similarity index falls within the interval; if the structural similarity index is smaller than the minimum value of the preset similarity threshold interval, the anti-shake intensity is increased until the structural similarity index falls within the interval.
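  • The decision rule above can be sketched as follows. The structural similarity index is computed here over the whole edge band as a single window, using the standard SSIM formula with the usual constants C1 = (0.01·L)² and C2 = (0.03·L)²; common SSIM implementations average over local windows instead, so this is a simplification. The threshold values in the usage note and the names `ssim_global` and `adjustment_direction` are assumptions for illustration.

```python
from typing import Sequence


def ssim_global(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def adjustment_direction(index: float, low: float, high: float) -> str:
    """Map the index to the action described in the text."""
    if index > high:
        return "decrease"  # above the interval: reduce anti-shake intensity
    if index < low:
        return "increase"  # below the interval: raise anti-shake intensity
    return "keep"          # inside the interval: leave it unchanged
```

  • For instance, with an assumed interval of [0.80, 0.95], an index of 0.99 maps to "decrease" and an index of 0.50 maps to "increase".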
  • the manner of increasing or decreasing the anti-shake intensity may be arbitrary.
  • For example, the current anti-shake strength may be multiplied by a corresponding coefficient to increase or decrease the anti-shake strength.
  • When the anti-shake strength needs to be increased, the corresponding coefficient is greater than 1, and when the anti-shake strength needs to be decreased, the corresponding coefficient is less than 1.
  • For example, suppose the current anti-shake strength is A. If the anti-shake strength needs to be decreased, the current anti-shake strength A is multiplied by 0.9, and the obtained product is used as the adjusted anti-shake strength. If the anti-shake strength needs to be increased, A is multiplied by a coefficient greater than 1 in the same way.
  • The adjusted anti-shake strength can be applied to the path smoothing in the foreground anti-shake process and the path smoothing in the background anti-shake process.
  • the adjusted anti-shake strength may include the foreground anti-shake strength and/or the background anti-shake strength of the current frame image, or the foreground anti-shake strength and/or the background anti-shake strength of the next frame image, correspondingly.
  • the adjusted anti-shake intensity can act on the foreground anti-shake process and/or the background anti-shake process of the current frame image, or act on the foreground anti-shake process and/or the background anti-shake process of the next frame image.
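  • A minimal sketch of the multiplicative adjustment, applied to the foreground strength, the background strength, or both. The decrease coefficients 0.9 (foreground) and 0.8 (background) follow the worked examples in this description; the increase coefficient 1.1 is an assumption, since the text only requires it to be greater than 1. The function name `adjust_strengths` and its parameters are hypothetical.

```python
from typing import Tuple


def adjust_strengths(foreground: float, background: float, direction: str,
                     adjust_foreground: bool = True,
                     adjust_background: bool = True,
                     fg_down: float = 0.9, bg_down: float = 0.8,
                     up: float = 1.1) -> Tuple[float, float]:
    """Scale the selected anti-shake strengths by a coefficient.

    direction: "decrease", "increase" or "keep".
    The 'up' coefficient (1.1) is an assumed value greater than 1.
    """
    if direction == "keep":
        return foreground, background
    if direction not in ("decrease", "increase"):
        raise ValueError("direction must be 'decrease', 'increase' or 'keep'")
    if adjust_foreground:
        foreground *= fg_down if direction == "decrease" else up
    if adjust_background:
        background *= bg_down if direction == "decrease" else up
    return foreground, background
```

  • Whether the returned values are used to reprocess the current frame or are carried over to the next frame is a separate choice, as the two variants in the text describe.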
  • FIG. 11 is another specific schematic flowchart of a video anti-shake process provided by an embodiment of the present application.
  • As shown in FIG. 11, a deep neural network (DNN) model is used to segment the input video image into foreground and background, the optical flow method is used to extract the feature points of the foreground image, and Gaussian weight interpolation is used to repair the black borders at the edges, that is, to perform edge compensation on the image.
  • The structural similarity index between the two pieces of edge band information is calculated, and the foreground anti-shake intensity of the next frame image is adjusted according to the structural similarity index; that is, the adjusted foreground anti-shake intensity acts on the foreground feature point path smoothing process of the next frame image.
  • the foreground and background segmentation method, the foreground image feature point extraction method and the edge compensation method shown in FIG. 11 are all exemplary methods.
  • the Gaussian weighted edge compensation method will be introduced below with reference to the schematic diagram of edge compensation shown in FIG. 12 .
  • The edge 121 is the boundary between the meaningful pixels and the meaningless pixels; that is, taking the edge 121 as the boundary, the pixels on the left in FIG. 12 are meaningful pixels and the pixels on the right are meaningless pixels. Meaningless pixels may refer to the pixels that form black borders.
  • edge compensation may be performed on meaningless pixels, that is, pixel compensation may be performed on the edge portion of the image.
  • Point q in Figure 12 is a meaningless pixel point, and pixel supplementation needs to be performed on point q.
  • The pixel value of point q can be calculated from the surrounding pixels by Gaussian-weighted summation. Taking the area 122 in FIG. 12 as the surrounding area of point q, the pixel value of point q is obtained from the pixel value of each pixel in the area 122. Specifically, the pixel value of each pixel in the area is weighted by its Gaussian weight and the weighted values are summed, where w(i,j) is the Gaussian weight of the pixel in the i-th row and the j-th column, σ is a hyperparameter, and N is a positive integer.
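  • Because the exact formula is not reproduced here, the following is only one plausible form of Gaussian weight interpolation, stated as an assumption: each valid pixel in a square neighborhood of radius N around q contributes with weight exp(-d²/(2σ²)), where d is its distance to q, and the result is normalized by the sum of the weights. Meaningless pixels are represented by `None`, and the function name `fill_pixel` is hypothetical.

```python
import math
from typing import List, Optional

Grid = List[List[Optional[float]]]


def fill_pixel(image: Grid, row: int, col: int,
               radius: int = 2, sigma: float = 1.0) -> Optional[float]:
    """Estimate a meaningless pixel from its valid neighbors.

    Assumed form: normalized Gaussian-weighted average over the valid
    pixels within 'radius' rows and columns of (row, col).
    Returns None when the neighborhood holds no valid pixel.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(max(0, row - radius), min(len(image), row + radius + 1)):
        for j in range(max(0, col - radius), min(len(image[i]), col + radius + 1)):
            value = image[i][j]
            if value is None:
                continue  # skip black-border pixels, including q itself
            dist_sq = (i - row) ** 2 + (j - col) ** 2
            weight = math.exp(-dist_sq / (2.0 * sigma * sigma))
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total
```

  • Nearer valid pixels dominate the estimate, and a larger σ spreads the influence over more distant pixels.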
  • For example, the foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B. The structural similarity index of the two pieces of edge band information is calculated.
  • Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The foreground image stabilization intensity A of the current frame image is then multiplied by 0.9, so the adjusted foreground image stabilization intensity is 0.9A; that is, the foreground image stabilization intensity of the next frame image is 0.9A.
  • the stabilized foreground image and the stabilized background image of the current frame image are image-fused to obtain the stabilized image of the current frame image (ie, the image after stabilization), and the stabilized image is output.
  • the next frame of image is acquired, and the foreground image and the background image are obtained by performing foreground and background segmentation on the next frame of image.
  • Anti-shake processing is performed on the background image to obtain a stabilized background image of the next frame of image.
  • the background anti-shake intensity is B.
  • Anti-shake processing is performed on the foreground image to obtain a stabilized foreground image of the next frame of image.
  • the foreground anti-shake intensity is 0.9A.
  • image fusion is performed on the stabilized background image and the stabilized foreground image of the next frame of image to obtain the stabilized image of the next frame of image, and the stabilized image is output.
  • The structural similarity index of the next frame image is then calculated. If it falls within the preset similarity threshold range, the foreground anti-shake intensity is not adjusted and remains 0.9A. If it does not fall within the preset similarity threshold range, the foreground anti-shake intensity is adjusted again, starting from 0.9A, to obtain the foreground anti-shake intensity for the subsequent frame.
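  • The frame-to-frame behavior of this example can be traced with a short sketch. It takes the structural similarity index measured for each output frame as given and returns the foreground strength actually used for each frame, so the adjustment derived from one frame only takes effect on the following frame. The decrease coefficient 0.9 comes from the example; the increase coefficient 1.1, the threshold values in the usage note, and the name `next_frame_strengths` are assumptions.

```python
from typing import List, Sequence


def next_frame_strengths(initial: float, ssim_per_frame: Sequence[float],
                         low: float, high: float,
                         down: float = 0.9, up: float = 1.1) -> List[float]:
    """Strength used for each frame under next-frame feedback.

    ssim_per_frame[k] is the index measured on frame k's fused output.
    The strength derived from frame k is applied to frame k + 1.
    """
    used: List[float] = []
    current = initial
    for index in ssim_per_frame:
        used.append(current)  # frame k is processed with the current strength
        if index > high:
            current *= down   # too similar: weaken stabilization next frame
        elif index < low:
            current *= up     # too dissimilar: strengthen it next frame
    return used
```

  • With A = 1.0, an assumed interval of [0.80, 0.95] and measured indices 0.99, 0.90, 0.90, the strengths used are 1.0, 0.9 and 0.9, matching the A then 0.9A progression in the text.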
  • The background image stabilization intensity of the next frame image may also be adjusted according to the structural similarity index of the current frame image, as described below with reference to another specific flow diagram.
  • The background image stabilization intensity of the next frame image is adjusted according to the structural similarity index; that is, the adjusted background image stabilization intensity acts on the background image stabilization process of the next frame of image.
  • For the parts of FIG. 13 that are the same as FIG. 11, please refer to the description of FIG. 11 above; details are not repeated here.
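  • For example, the foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B. The structural similarity index of the two pieces of edge band information is calculated.
  • Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The background image stabilization intensity B of the current frame image is then multiplied by 0.8, so the adjusted background image stabilization intensity is 0.8B; that is, the background image stabilization intensity of the next frame image is 0.8B.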
  • the stabilized foreground image and the stabilized background image of the current frame image are image-fused to obtain the stabilized image of the current frame image (ie, the image after stabilization), and the stabilized image is output.
  • the next frame of image is acquired, and the foreground image and the background image are obtained by performing foreground and background segmentation on the next frame of image.
  • Anti-shake processing is performed on the background image to obtain a stabilized background image of the next frame of image.
  • the background anti-shake intensity is 0.8B.
  • Anti-shake processing is performed on the foreground image to obtain a stabilized foreground image of the next frame of image.
  • the foreground anti-shake intensity is A.
  • image fusion is performed on the stabilized background image and the stabilized foreground image of the next frame of image to obtain the stabilized image of the next frame of image, and the stabilized image is output.
  • The structural similarity index of the next frame image is then calculated. If it falls within the preset similarity threshold range, the background image stabilization intensity is not adjusted and remains 0.8B. If it does not fall within the preset similarity threshold range, the background image stabilization intensity is adjusted again, starting from 0.8B, to obtain the background image stabilization intensity for the subsequent frame.
  • The background image stabilization intensity and the foreground image stabilization intensity of the next frame image can also both be adjusted according to the structural similarity index of the current frame image, as described below with reference to FIG. 14, another specific flow diagram of the video anti-shake process provided by an embodiment of the present application.
  • The background image stabilization intensity and the foreground image stabilization intensity of the next frame image are adjusted; that is, the adjusted background image stabilization intensity acts on the background image stabilization process of the next frame of image, and the adjusted foreground image stabilization intensity acts on the foreground image stabilization process of the next frame of image.
  • For example, the foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B. The structural similarity index of the two pieces of edge band information is calculated.
  • Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval.
  • The background image stabilization intensity B of the current frame image is multiplied by 0.8, so the adjusted background image stabilization intensity is 0.8B; that is, the background image stabilization intensity of the next frame image is 0.8B.
  • The foreground image stabilization intensity A of the current frame image is multiplied by 0.9, so the adjusted foreground image stabilization intensity is 0.9A; that is, the foreground image stabilization intensity of the next frame image is 0.9A.
  • the stabilized foreground image and the stabilized background image of the current frame image are image-fused to obtain the stabilized image of the current frame image (ie, the image after stabilization), and the stabilized image is output.
  • the next frame of image is acquired, and the foreground image and the background image are obtained by performing foreground and background segmentation on the next frame of image.
  • Anti-shake processing is performed on the background image to obtain a stabilized background image of the next frame of image.
  • the background anti-shake intensity is 0.8B.
  • Anti-shake processing is performed on the foreground image to obtain a stabilized foreground image of the next frame of image.
  • the foreground anti-shake intensity is 0.9A.
  • image fusion is performed on the stabilized background image and the stabilized foreground image of the next frame of image to obtain the stabilized image of the next frame of image, and the stabilized image is output.
  • The structural similarity index of the next frame image is then calculated. If it falls within the preset similarity threshold range, neither intensity is adjusted; that is, the background image stabilization intensity remains 0.8B and the foreground image stabilization intensity remains 0.9A. If it does not fall within the preset similarity threshold range, the background anti-shake intensity of 0.8B and the foreground anti-shake intensity of 0.9A are each adjusted again to obtain the background image stabilization intensity and the foreground image stabilization intensity for the subsequent frame.
  • The foreground anti-shake intensity and/or the background anti-shake intensity of the current frame image can also be adjusted.
  • In this case, when the structural similarity index of the current frame image does not fall within the preset similarity threshold range, the fusion result is not output. Instead, the foreground anti-shake intensity and/or the background anti-shake intensity is adjusted, and the fused video image is output only once the structural similarity index of the current frame image falls within the preset similarity threshold range.
  • Consider the case in which the foreground anti-shake strength of the current frame image is adjusted.
  • The foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B. The structural similarity index of the two pieces of edge band information is calculated.
  • Suppose the structural similarity index of the current frame image does not fall within the preset similarity threshold interval and is higher than the maximum value of the interval. The foreground image stabilization intensity A of the current frame image is then multiplied by 0.9, so the adjusted foreground image stabilization intensity is 0.9A; that is, the foreground image stabilization intensity of the next foreground image stabilization pass is 0.9A.
  • Image fusion is performed on the stabilized foreground image and the stabilized background image of the current frame image to obtain the stabilized image (ie, the image after stabilization) calculated for the first time, and the stabilized image is not output.
  • the adjusted foreground image stabilization intensity can be used to perform stabilization processing on the foreground image again. That is, use the adjusted anti-shake intensity to smooth the foreground feature point path to obtain the stabilized foreground image calculated for the second time.
  • The stabilized foreground image obtained by the second calculation and the stabilized background image obtained by the first calculation are fused to obtain the stabilized image of the second calculation. The structural similarity index is then calculated from the second edge band information obtained by the second calculation and the first edge band information obtained by the first calculation.
  • If the structural similarity index this time falls within the preset similarity threshold range, the foreground anti-shake intensity is not adjusted further, and the stabilized image calculated for the second time is output; that is, it is used as the output video image of the current frame image. If the structural similarity index this time does not fall within the preset similarity threshold range, the foreground anti-shake intensity is adjusted again, and the adjusted foreground anti-shake intensity is used for a third calculation. This cycle is repeated until the structural similarity index calculated in some iteration falls within the preset similarity threshold range, and the stabilized image of that iteration is used as the output video image of the current frame image.
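  • The current-frame variant is an iterate-until-accepted loop, sketched below. The callback `measure_ssim` is hypothetical: it stands for re-running stabilization with the given strength, fusing the result, and computing the structural similarity index of the edge bands. The iteration cap `max_iterations` is an added safeguard that is not part of the description, and the increase coefficient 1.1 is an assumption.

```python
from typing import Callable, Tuple


def tune_current_frame(strength: float,
                       measure_ssim: Callable[[float], float],
                       low: float, high: float,
                       down: float = 0.9, up: float = 1.1,
                       max_iterations: int = 20) -> Tuple[float, float]:
    """Repeat stabilize/fuse/measure until the index is in [low, high].

    Returns the accepted strength and its index. If the cap is reached,
    the last strength tried is returned with its index (assumed fallback).
    """
    for _ in range(max_iterations):
        index = measure_ssim(strength)
        if low <= index <= high:
            return strength, index  # this iteration's fused image is output
        # Above the interval: weaken; below the interval: strengthen.
        strength *= down if index > high else up
    return strength, measure_ssim(strength)
```

  • As a toy check, if the index is modeled as 0.7 + 0.3 × strength and the interval is [0.80, 0.95], a starting strength of 1.0 is reduced to 0.9 and then to 0.81 before the result is accepted.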
  • Consider the case in which the background image stabilization intensity of the current frame image is adjusted.
  • The foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B.
  • When the structural similarity index does not fall into the preset similarity threshold range, the background image stabilization intensity B of the current frame image is multiplied by 0.8, so the adjusted background image stabilization intensity is 0.8B; that is, the background image stabilization intensity of the next background image stabilization pass is 0.8B.
  • the stabilized background image calculated for the second time and the stabilized foreground image calculated for the first time are used for fusion to obtain the stabilized image calculated for the second time.
  • a structural similarity index between the second edge band information calculated for the second time and the first edge band information calculated for the first time is calculated. If the structural similarity index calculated for the second time falls within the preset similarity threshold range, the stabilized image calculated for the second time is used as the output video image of the current frame image (ie, the image after anti-shake processing).
  • Consider the case in which both the background image stabilization intensity and the foreground image stabilization intensity of the current frame image are adjusted.
  • The foreground image stabilization intensity of the current frame image is A, and the background image stabilization intensity is B.
  • When the structural similarity index does not fall into the preset similarity threshold range, the background image stabilization intensity B of the current frame image is multiplied by 0.8, so the adjusted background image stabilization intensity is 0.8B; that is, the background image stabilization intensity of the next background image stabilization pass is 0.8B. Likewise, the foreground image stabilization intensity A is multiplied by 0.9, so the adjusted foreground image stabilization intensity is 0.9A.
  • A second calculation is performed using the adjusted anti-shake intensities. Specifically, 0.9A is used to perform path smoothing on the feature point path of the foreground image to obtain the stabilized foreground image of the second calculation, and 0.8B is used to smooth the motion trajectory curve of the background image to obtain the stabilized background image of the second calculation. The stabilized foreground image and the stabilized background image of the second calculation are fused to obtain the stabilized image of the second calculation. The structural similarity index of the second calculation is computed from the second edge band information of the second calculation and the first edge band information of the first calculation.
  • If the structural similarity index calculated for the second time falls within the preset similarity threshold range, the stabilized image calculated for the second time is used as the output video image of the current frame image (i.e., the image after anti-shake processing). If it does not fall within the preset similarity threshold range, the anti-shake intensity is adjusted again and the adjusted anti-shake intensity is used for the next calculation, until a calculated structural similarity index falls within the preset similarity threshold range.
  • the foreground image stabilization intensity and/or the background image stabilization intensity of the current frame image can be adjusted, or the foreground image stabilization intensity and/or the background image stabilization intensity of the next frame image can be adjusted automatically.
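The repeated adjustment described above can be sketched as a small loop. This is a minimal illustration, not the patent's implementation: the function names `tune_strengths` and `toy_score`, the `max_rounds` guard, and the threshold range are all assumptions, and `toy_score` merely stands in for the real chain of smoothing, fusion and structural similarity calculation. Only the factors 0.9 (foreground) and 0.8 (background) come from the example in the text.

```python
def tune_strengths(fg, bg, score_fn, low, high,
                   fg_factor=0.9, bg_factor=0.8, max_rounds=20):
    """Scale both strengths until score_fn(fg, bg) lands in [low, high].

    score_fn is a hypothetical callable that re-runs smoothing and fusion
    with the given strengths and returns a structural similarity index.
    """
    score = score_fn(fg, bg)
    rounds = 0
    while not (low <= score <= high) and rounds < max_rounds:
        fg *= fg_factor   # e.g. A -> 0.9A
        bg *= bg_factor   # e.g. B -> 0.8B
        score = score_fn(fg, bg)
        rounds += 1
    return fg, bg, score, rounds


def toy_score(fg, bg):
    # Toy stand-in (assumption): stronger stabilization lowers similarity.
    return 1.0 - 0.05 * (fg + bg)


fg, bg, score, rounds = tune_strengths(4.0, 5.0, toy_score, low=0.7, high=1.0)
```

The `max_rounds` guard is added here only so that the sketch always terminates; the text itself does not state an iteration limit.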
  • in this way, the intensity of the artifacts generated when the foreground and background are fused is adjusted automatically, which helps prevent excessive anti-shake from causing an obvious isolation band during fusion and makes the transition at the fused edge of the foreground and background more natural.
  • the anti-shake strength can be used in the path smoothing process.
  • the anti-shake strength is increased, that is, the curve smoothing strength (or the path smoothing strength) is enhanced.
  • the anti-shake strength is decreased, that is, the curve smoothing strength (or the path smoothing strength) is decreased.
  • the weight of the path smoothing can be adjusted according to the Gaussian weight. It can be seen from the Gaussian distribution that the larger the Gaussian weight sigma, the smoother the curve.
  • the anti-shake strength can be equivalent to the Gaussian weight sigma.
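The relationship between the Gaussian sigma and curve smoothness can be illustrated with a short sketch. It assumes a one-dimensional trajectory, a kernel truncated at about three sigma, and clamping at the ends of the sequence; `gaussian_smooth` and `roughness` are illustrative names, and the text does not specify these details.

```python
import math


def gaussian_smooth(path, sigma, radius=None):
    """Smooth a 1-D trajectory with a normalized Gaussian kernel.

    sigma plays the role of the anti-shake strength: a larger sigma
    spreads the weights wider and yields a smoother curve.
    """
    if sigma <= 0:
        return list(path)
    if radius is None:
        radius = max(1, int(3 * sigma))
    n = len(path)
    smoothed = []
    for t in range(n):
        acc = 0.0
        wsum = 0.0
        for k in range(-radius, radius + 1):
            idx = min(max(t + k, 0), n - 1)  # clamp at the sequence ends
            w = math.exp(-(k * k) / (2.0 * sigma * sigma))
            acc += w * path[idx]
            wsum += w
        smoothed.append(acc / wsum)
    return smoothed


def roughness(path):
    """Sum of squared second differences; smaller means smoother."""
    return sum((path[i - 1] - 2 * path[i] + path[i + 1]) ** 2
               for i in range(1, len(path) - 1))


shaky = [0.0, 2.0, -1.0, 3.0, 0.5, 2.5, -0.5, 3.5, 1.0, 2.0]
weak = gaussian_smooth(shaky, sigma=0.5)
strong = gaussian_smooth(shaky, sigma=2.0)
```

Comparing `roughness(weak)` with `roughness(strong)` shows the effect of raising or lowering the strength on the same shaky path.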
  • FIG. 16 is a schematic block diagram of the flow of a video anti-shake method provided by an embodiment of the present application, the method may include the following steps:
  • Step S1601 The terminal device acquires an image to be processed, and the image to be processed is a frame of image in the video to be processed.
  • the manner in which the terminal device obtains the image to be processed may be arbitrary.
  • the terminal device is a mobile phone 100.
  • the trigger operation is used to instruct the mobile phone 100 to record a video, and in response to the trigger operation, a video image is captured to obtain the image to be processed.
  • after the image to be processed is subjected to anti-shake processing, it is displayed on the display screen of the mobile phone 100.
  • Step S1602 The terminal device performs foreground and background segmentation on the image to be processed, to obtain a first background image and a first foreground image.
  • the foreground image in this embodiment of the present application may be a human face, or may not be a human face, for example, the foreground image is the foreground shown in FIG. 5 .
  • Step S1603 The terminal device performs anti-shake processing on the first foreground image to obtain a first stabilized foreground image.
  • the anti-shake manner of the first foreground image may be arbitrary.
  • a full-frame anti-shake method may be used to perform anti-shake processing on the foreground image.
  • the feature points of the first foreground image are first extracted, and then a first motion trajectory curve is obtained according to the feature points of the first foreground image and the feature points of the second foreground image, and the first motion trajectory curve may be a feature of the foreground image.
  • the second foreground image is a foreground image of the first target image
  • the video to be processed includes the first target image
  • the first target image and the to-be-processed image are consecutive image frames in an image sequence.
  • the first target image is generally a multi-frame image.
  • in the image sequence, there are the following image frames: image 1, image 2, image 3, image 4, ..., image n, where n is a positive integer.
  • the image to be processed is image 5
  • the first target image may include image 1, image 2, image 3 and image 4, as well as image 6, image 7, image 8 and image 9.
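The frame selection in this example can be written as a small helper. This is only a sketch: the function name `neighbor_frames` and the window radius of 4 are assumptions derived from the image 5 example, and the text does not fix the number of neighboring frames.

```python
def neighbor_frames(current, total, radius=4):
    """Return 1-based indices of frames within `radius` of `current`,
    excluding `current` itself and clipped to the sequence bounds."""
    start = max(1, current - radius)
    end = min(total, current + radius)
    return [i for i in range(start, end + 1) if i != current]


# Image 5 in a 20-frame sequence uses images 1-4 and 6-9 as target images.
targets = neighbor_frames(5, 20)
```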
  • the shaking trajectory of the foreground image is obtained.
  • path smoothing is performed on the first motion trajectory curve according to the foreground anti-shake strength of the image to be processed, to obtain a second motion trajectory curve after the path is smoothed. Then, according to the second motion trajectory curve and the first foreground image, a stabilized first foreground image is obtained. Finally, edge compensation is performed on the stabilized first foreground image to obtain the first stabilized foreground image.
  • the edge compensation may use the Gaussian weight interpolation method corresponding to FIG. 12 , and the specific content may refer to the corresponding content above, which will not be repeated here.
  • edge compensation may not be performed.
  • the stabilized first foreground image is used as the first foreground stabilized image.
  • Step S1604 The terminal device performs anti-shake processing on the first background image to obtain a first stabilized background image.
  • the anti-shake processing method of the background image is also arbitrary. For example, first extract the feature points of the first background image. Then, according to the feature points of the first background image and the feature points of the second background image, a fourth motion trajectory curve is obtained.
  • the second background image is a background image of the second target image
  • the video to be processed includes the second target image
  • the second target image and the to-be-processed image are consecutive image frames in the image sequence.
  • the second target image is similar to the above-mentioned first target image, and both are consecutive multiple frames of images in the image sequence, and details are not described herein again.
  • according to the background anti-shake intensity of the image to be processed, the fourth motion trajectory curve is subjected to path smoothing to obtain a path-smoothed motion trajectory curve. Then, according to the path-smoothed motion trajectory curve and the first background image, the first stabilized background image is obtained.
  • the execution order of steps S1603 and S1604 can be arbitrary, and steps S1603 and S1604 can also be executed simultaneously; the execution order of these two steps is not limited here.
  • Step S1605 The terminal device fuses the first stabilized foreground image and the first stabilized background image to obtain a first stabilized image of the image to be processed.
  • the first stabilized image obtained by fusion can be used as the output video image of the image to be processed (ie, the video image after anti-shake processing).
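The text does not specify how the two stabilized images are fused. One common option, shown here purely as an assumption, is per-pixel alpha blending with a foreground mask; the function name `fuse` and the soft mask `alpha` are illustrative and not taken from the application.

```python
def fuse(foreground, background, alpha):
    """Blend two equally sized grayscale images given as lists of rows.

    alpha holds per-pixel foreground weights in [0, 1]: 1 keeps the
    stabilized foreground, 0 keeps the stabilized background, and
    intermediate values soften the edge band between them.
    """
    return [[a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
            for f_row, b_row, a_row in zip(foreground, background, alpha)]


fg = [[100.0, 100.0], [100.0, 100.0]]
bg = [[0.0, 0.0], [0.0, 0.0]]
alpha = [[1.0, 0.5], [0.0, 0.25]]
fused = fuse(fg, bg, alpha)
```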
  • the first stabilized image may not be the output video image of the image to be processed.
  • the anti-shake intensity can be adjusted according to the similarity index.
  • first edge band information of the image to be processed may be extracted first, and the first edge band information is pixel information of an edge band between the foreground and the background in the image to be processed.
  • the first edge band information may be extracted during the segmentation of the front and back of the image.
  • the second edge band information can be obtained from the fused image.
  • the image stabilization intensity of the next frame to be processed or the image to be processed is adjusted, and the image stabilization intensity includes foreground image stabilization intensity and/or background image stabilization intensity.
  • the first structural similarity index between the first edge band information and the second edge band information can be calculated. Whether to adjust the anti-shake intensity is determined according to the first structural similarity index. Wherein, if the first structural similarity index does not fall within the preset similarity threshold range, the anti-shake intensity of the next frame to be processed or the to-be-processed image (ie, the current frame image) is adjusted. For the specific adjustment process, please refer to the corresponding content above, which will not be repeated here.
  • the stabilization image corresponding to this time is used as the output video image of the current frame image.
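As a rough illustration of the similarity check, the sketch below computes a single-window (global) structural similarity index over two lists of edge band pixel values. It uses the usual SSIM constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2; treating the whole edge band as one window and using population variances are simplifying assumptions, and `global_ssim` is an illustrative name.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally long pixel sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


band_original = [10.0, 20.0, 30.0, 40.0]
band_fused = [40.0, 30.0, 20.0, 10.0]
same = global_ssim(band_original, band_original)
different = global_ssim(band_original, band_fused)
```

Identical edge bands give an index of 1, and the index drops as the fused edge band departs from the original one, which is the quantity compared against the preset similarity threshold range.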
  • path smoothing is performed on the first motion trajectory curve to obtain a third motion trajectory curve after the path is smoothed.
  • for the content of the first motion trajectory curve, reference may be made to the corresponding content above, which will not be repeated here.
  • a second foreground stabilization image is obtained.
  • the second stabilized foreground image and the first stabilized background image are fused to obtain a second stabilized image of the image to be processed.
  • a second structural similarity index between the second edge band information and the third edge band information is determined, where the third edge band information is pixel information of an edge band between the foreground and the background in the second stabilized image.
  • if the second structural similarity index does not fall within the preset similarity threshold range, the adjusted first anti-shake intensity is multiplied by the preset value to obtain the adjusted second anti-shake intensity. The adjusted second anti-shake intensity is then taken as the adjusted first anti-shake intensity, and the process returns to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve after the path is smoothed. This cycle is repeated until the second structural similarity index falls within the preset similarity threshold range, and the corresponding second stabilized image is used as the output video image of the image to be processed.
  • the embodiments of the present application provide a video anti-shake apparatus, which is applied to a terminal device.
  • the apparatus may include:
  • the image acquisition module 171 is configured to acquire a to-be-processed image, where the to-be-processed image is a frame of image in the to-be-processed video.
  • the foreground and background segmentation module 172 is configured to perform foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image.
  • the foreground anti-shake module 173 is configured to perform anti-shake processing on the first foreground image to obtain a first stabilized foreground image.
  • the background anti-shake module 174 is configured to perform anti-shake processing on the first background image to obtain a first stabilized background image.
  • the image fusion module 175 is configured to fuse the first stabilized foreground image and the first stabilized background image to obtain a first stabilized image of the image to be processed.
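The division of work among modules 171 to 175 can be sketched as a function that wires four interchangeable stages together. Everything here is illustrative: `stabilize_frame` and the toy stage functions are assumptions, used only to show that the foreground and background branches run independently before fusion.

```python
def stabilize_frame(frame, segment, stabilize_fg, stabilize_bg, fuse):
    """Run one frame through segmentation, two decoupled anti-shake
    branches, and fusion. Each stage is passed in as a callable."""
    foreground, background = segment(frame)
    stabilized_fg = stabilize_fg(foreground)   # foreground anti-shake module
    stabilized_bg = stabilize_bg(background)   # background anti-shake module
    return fuse(stabilized_fg, stabilized_bg)  # image fusion module


# Toy stages operating on labels instead of pixels (assumption).
result = stabilize_frame(
    {"fg": "person", "bg": "street"},
    segment=lambda f: (f["fg"], f["bg"]),
    stabilize_fg=lambda fg: fg + "_stabilized",
    stabilize_bg=lambda bg: bg + "_stabilized",
    fuse=lambda fg, bg: fg + "+" + bg,
)
```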
  • the foreground anti-shake module is specifically used to: extract the feature points of the first foreground image; obtain the first motion trajectory curve according to the feature points of the first foreground image and the feature points of the second foreground image, where the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the first motion trajectory curve according to the foreground anti-shake intensity of the image to be processed, to obtain a second motion trajectory curve after the path is smoothed; obtain a stabilized first foreground image according to the second motion trajectory curve and the first foreground image; and perform edge compensation on the stabilized first foreground image to obtain the first stabilized foreground image.
  • the foreground anti-shake module is specifically used to: calculate the target pixel value of each pixel to be compensated in the stabilized first foreground image; and use the target pixel value as the pixel value of the pixel to be compensated, with the pixel values of the other pixels unchanged, to obtain the first stabilized foreground image, where the other pixels are the pixels in the stabilized first foreground image other than the pixels to be compensated.
  • V(q) is the target pixel value of point q, and point q is the pixel to be compensated; P(i, j) is the pixel value of the pixel in the i-th row and the j-th column; w(i, j) is the Gaussian weight of the pixel in the i-th row and the j-th column; σ is a hyperparameter, and N is a positive integer.
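The formula itself is not reproduced in this text, so the following sketch is an assumption about its general shape rather than the formula of the application: each pixel to be compensated takes a normalized, Gaussian-weighted average of the valid pixels in an N x N neighborhood, with sigma controlling how fast the weights decay with distance. The names `compensate_edges` and `missing` are illustrative.

```python
import math


def compensate_edges(image, missing, n=3, sigma=1.0):
    """Fill pixels flagged in `missing` from their valid neighbors.

    image and missing are equally sized lists of rows. Only flagged
    pixels change; all other pixel values are copied unchanged.
    """
    height, width = len(image), len(image[0])
    r = n // 2
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not missing[y][x]:
                continue
            acc = 0.0
            wsum = 0.0
            for i in range(max(0, y - r), min(height, y + r + 1)):
                for j in range(max(0, x - r), min(width, x + r + 1)):
                    if missing[i][j]:
                        continue  # skip other meaningless pixels
                    d2 = (i - y) ** 2 + (j - x) ** 2
                    w = math.exp(-d2 / (2.0 * sigma * sigma))
                    acc += w * image[i][j]
                    wsum += w
            if wsum > 0:
                out[y][x] = acc / wsum
    return out


image = [[0.0, 10.0, 10.0], [10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
missing = [[True, False, False], [False, False, False], [False, False, False]]
filled = compensate_edges(image, missing)
```

In this toy case the black corner pixel is replaced by the value of its valid neighbors, while every other pixel keeps its original value.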
  • the apparatus further includes an anti-shake intensity adjustment module, configured to: extract first edge band information of the image to be processed, where the first edge band information is the pixel information of the edge band between the foreground and the background in the image to be processed; extract second edge band information of the first stabilized image, where the second edge band information is the pixel information of the edge band between the foreground and the background in the first stabilized image; and, according to the first edge band information and the second edge band information, adjust the anti-shake intensity of the next frame to be processed or of the image to be processed, where the anti-shake intensity includes the foreground anti-shake intensity and/or the background anti-shake intensity.
  • the anti-shake intensity adjustment module is specifically configured to: determine a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within the preset similarity threshold range, adjust the anti-shake intensity of the next frame to be processed or of the image to be processed.
  • the anti-shake intensity adjustment module is specifically configured to: if the first structural similarity index does not fall within the preset similarity threshold range, multiply the anti-shake intensity of the image to be processed by the preset value, The adjusted first anti-shake intensity is obtained, and the adjusted first anti-shake intensity is used as the anti-shake intensity of the next frame of the image to be processed or the anti-shake intensity of the to-be-processed image.
  • the adjusted first anti-shake intensity includes the foreground anti-shake intensity of the image to be processed; the apparatus further includes a current frame anti-shake result adjustment module, configured to: perform path smoothing on the first motion trajectory curve according to the adjusted first anti-shake intensity, to obtain a third motion trajectory curve after the path is smoothed, where the first motion trajectory curve is a motion trajectory curve obtained according to the feature points of the first foreground image and the feature points of the second foreground image, the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; obtain a second stabilized foreground image according to the third motion trajectory curve and the first foreground image; fuse the second stabilized foreground image with the first stabilized background image to obtain a second stabilized image of the image to be processed; determine a second structural similarity index between the second edge band information and the third edge band information, where the third edge band information is the pixel information of the edge band between the foreground and the background in the second stabilized image; if the second structural similarity index does not fall within the preset similarity threshold range, multiply the adjusted first anti-shake intensity by the preset value to obtain the adjusted second anti-shake intensity; and take the adjusted second anti-shake intensity as the adjusted first anti-shake intensity and return to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake intensity to obtain a third motion trajectory curve after the path is smoothed, until the second structural similarity index falls within the preset similarity threshold range.
  • the background anti-shake module is specifically used to: extract the feature points of the first background image; obtain the fourth motion trajectory curve according to the feature points of the first background image and the feature points of the second background image, where the second background image is the background image of the second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the fourth motion trajectory curve according to the background anti-shake intensity of the image to be processed, to obtain a path-smoothed motion trajectory curve; and obtain the first stabilized background image according to the path-smoothed motion trajectory curve and the first background image.
  • the above-mentioned video anti-shake apparatus has the function of implementing the above-mentioned video anti-shake method. This function can be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions, and a module may be software and/or hardware.
  • Embodiments of the present application further provide a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor implements any of the above video anti-shake methods when executing the computer program.
  • Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be implemented.
  • the embodiments of the present application provide a computer program product; when the computer program product runs on a terminal device, the terminal device is caused to implement the steps in the foregoing method embodiments.
  • An embodiment of the present application further provides a chip system, where the chip system includes a processor, the processor is coupled to a memory, and the processor executes a computer program stored in the memory, so as to implement the methods described in the foregoing method embodiments.
  • the chip system may be a single chip, or a chip module composed of multiple chips.
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise.

Abstract

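The embodiments of this application disclose a video anti-shake method, a terminal device, and a computer-readable storage medium. The method includes: acquiring an image to be processed, where the image to be processed is a frame of image in a video to be processed; performing foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image; performing anti-shake processing on the first foreground image to obtain a first stabilized foreground image; performing anti-shake processing on the first background image to obtain a first stabilized background image; and fusing the first stabilized foreground image and the first stabilized background image to obtain a first stabilized image of the image to be processed. In the embodiments of this application, foreground and background segmentation is performed on a video image to obtain a foreground image and a background image, and anti-shake processing is then performed on the foreground image and the background image separately. That is, the anti-shake processing of the foreground and that of the background are separate and decoupled, so the foreground and the background can be stabilized at the same time without sacrificing background anti-shake capability.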

Description

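Video anti-shake method, terminal device, and computer-readable storage medium
This application claims priority to Chinese Patent Application No. 202010811800.7, filed with the China National Intellectual Property Administration on August 13, 2020 and entitled "Video anti-shake method, terminal device, and computer-readable storage medium", which is incorporated herein by reference in its entirety.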
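Technical Field
This application relates to the field of image processing technologies, and in particular, to a video anti-shake method, a terminal device, and a computer-readable storage medium.
Background
Currently, common video anti-shake techniques can be divided into electronic image stabilization (EIS) and optical image stabilization (OIS).
Generally, EIS requires no additional components. It is a software compensation algorithm whose principle is to post-process video images so as to stabilize the captured video. In an existing EIS method, background motion information of the video image is first obtained through the gyroscope of the terminal device; the background motion information (also called background shake) may refer to motion information produced by movement of the camera of the terminal device. At the same time, facial features are collected from the video image, and a smooth motion trajectory of the face is calculated from the facial features. Then, based on the weights between the background motion information and the smooth motion trajectory of the face, a smooth camera trajectory is obtained from the background motion information and the smooth motion trajectory of the face. Finally, shake compensation is performed on the video image according to the smooth camera trajectory, so that the foreground and the background of the video are stabilized at the same time.
However, because the background motion direction is opposite to the face motion direction, the weights between the background motion information and the smooth motion trajectory of the face can only be chosen as a compromise, so the anti-shake effect of the foreground and that of the background are also a compromise. In other words, to achieve foreground anti-shake, background anti-shake capability must be sacrificed before the foreground and the background can be stabilized at the same time.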
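Summary
In view of this, the embodiments of this application provide a video anti-shake method, a terminal device, and a computer-readable storage medium, to solve the problem that existing EIS methods must sacrifice background anti-shake capability in order to stabilize the foreground and the background at the same time.
In a first aspect, an embodiment of this application provides a video anti-shake method. The method may be applied to a terminal device, for example a mobile phone or a tablet, and may include: acquiring an image to be processed, where the image to be processed is a frame of image in a video to be processed; performing foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image; performing anti-shake processing on the first foreground image to obtain a first stabilized foreground image; performing anti-shake processing on the first background image to obtain a first stabilized background image; and fusing the first stabilized foreground image and the first stabilized background image to obtain a first stabilized image of the image to be processed.
In the embodiments of this application, foreground and background segmentation is performed on a video image to obtain a foreground image and a background image, and anti-shake processing is then performed on the foreground image and the background image separately. That is, the anti-shake processing of the foreground and that of the background are separate and decoupled, so the foreground and the background can be stabilized at the same time without sacrificing background anti-shake capability.
In some possible implementations of the first aspect, the process of performing anti-shake processing on the first foreground image to obtain the first stabilized foreground image may include: extracting feature points of the first foreground image; obtaining a first motion trajectory curve according to the feature points of the first foreground image and feature points of a second foreground image, where the second foreground image is a foreground image of a first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; performing path smoothing on the first motion trajectory curve according to the foreground anti-shake strength of the image to be processed, to obtain a path-smoothed second motion trajectory curve; obtaining a stabilized first foreground image according to the second motion trajectory curve and the first foreground image; and performing edge compensation on the stabilized first foreground image to obtain the first stabilized foreground image.
In the stabilized first foreground image, meaningless pixels are generated at the image edges, which hinders fusion of the foreground and the background. In this implementation, edge compensation is performed on the stabilized first foreground image so that the first stabilized foreground image contains no black borders, which further improves the quality of the video image obtained by the subsequent image fusion.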
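In some possible implementations of the first aspect, the process of performing edge compensation on the stabilized first foreground image to obtain the first stabilized foreground image may include:
calculating, by using
Figure PCTCN2021110028-appb-000001
the target pixel value of each pixel to be compensated in the stabilized first foreground image; and
using the target pixel value as the pixel value of the pixel to be compensated, with the pixel values of the other pixels unchanged, to obtain the first stabilized foreground image, where the other pixels are the pixels in the stabilized first foreground image other than the pixels to be compensated;
where
Figure PCTCN2021110028-appb-000002
V(q) is the target pixel value of point q, and point q is the pixel to be compensated; P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.
In this implementation, the black borders at the image edges are repaired through Gaussian-weighted interpolation.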
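In some possible implementations of the first aspect, the method further includes: extracting first edge band information of the image to be processed, where the first edge band information is pixel information of the edge band between the foreground and the background in the image to be processed; extracting second edge band information of the first stabilized image, where the second edge band information is pixel information of the edge band between the foreground and the background in the first stabilized image; and adjusting, according to the first edge band information and the second edge band information, the anti-shake strength of the next frame of image to be processed or of the image to be processed, where the anti-shake strength includes the foreground anti-shake strength and/or the background anti-shake strength.
In this implementation, the anti-shake strength is adjusted according to the edge band information of the original image (that is, the image to be processed) and the edge band information of the fused image (that is, the first stabilized image). This automatically adjusts the intensity of the artifacts generated when the foreground and the background are fused, so as to prevent, as far as possible, excessive anti-shake from producing an obvious isolation band during fusion and to ensure a natural transition at the fused edge of the foreground and the background.
In some possible implementations of the first aspect, the anti-shake strength may be adjusted according to a structural similarity index (Structural SIMilarity, SSIM). That is, the process of adjusting the anti-shake strength of the next frame of image to be processed or of the image to be processed according to the first edge band information and the second edge band information may include: determining a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within the preset similarity threshold range, adjusting the anti-shake strength of the next frame of image to be processed or of the image to be processed. If the first structural similarity index falls within the preset similarity threshold range, the anti-shake strength does not need to be adjusted.
In some possible implementations of the first aspect, the process of adjusting the anti-shake strength of the next frame of image to be processed or of the image to be processed when the first structural similarity index does not fall within the preset similarity threshold range may include: if the first structural similarity index does not fall within the preset similarity threshold range, multiplying the anti-shake strength of the image to be processed by a preset value to obtain an adjusted first anti-shake strength, and using the adjusted first anti-shake strength as the anti-shake strength of the next frame of image to be processed or as the anti-shake strength of the image to be processed.
Further, the case in which the first structural similarity index does not fall within the preset similarity threshold range can be divided into two situations. In one situation, the first structural similarity index is above the preset similarity threshold range, that is, greater than the maximum value of the range; in this case, the anti-shake strength may be decreased. In the other situation, the first structural similarity index is below the preset similarity threshold range, that is, less than the minimum value of the range; in this case, the anti-shake strength may be increased.
For example, when the anti-shake strength needs to be decreased, the preset value is 0.9; that is, the anti-shake strength of the image to be processed is multiplied by 0.9, and the product is used as the adjusted first anti-shake strength. When the anti-shake strength needs to be increased, the preset value is 1.1; that is, the anti-shake strength of the image to be processed is multiplied by 1.1, and the product is used as the adjusted first anti-shake strength.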
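The two-sided rule above can be sketched as a single function. The function name `adjust_strength` and the threshold bounds used in the calls are assumptions for illustration; the factors 0.9 and 1.1 are the example preset values given in the text.

```python
def adjust_strength(strength, ssim, low, high, down=0.9, up=1.1):
    """Return the adjusted anti-shake strength for one SSIM reading.

    Above the range -> multiply by `down` (decrease the strength).
    Below the range -> multiply by `up` (increase the strength).
    Inside the range -> leave the strength unchanged.
    """
    if ssim > high:
        return strength * down
    if ssim < low:
        return strength * up
    return strength


decreased = adjust_strength(2.0, ssim=0.99, low=0.6, high=0.9)
increased = adjust_strength(2.0, ssim=0.40, low=0.6, high=0.9)
unchanged = adjust_strength(2.0, ssim=0.75, low=0.6, high=0.9)
```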
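The adjusted first anti-shake strength may be used as the foreground anti-shake strength of the next frame of image and/or the background anti-shake strength of the next frame of image; in this case, the adjusted first anti-shake strength may be used to perform foreground anti-shake or background anti-shake on the next frame of image to be processed. Of course, the adjusted first anti-shake strength may also be used as the foreground anti-shake strength and/or the background anti-shake strength of the image to be processed; in this case, foreground anti-shake or background anti-shake is performed again according to the adjusted first anti-shake strength until the structural similarity index falls within the preset similarity threshold range. Therefore, in some possible implementations of the first aspect, the adjusted first anti-shake strength includes the foreground anti-shake strength of the image to be processed, and the method may further include:
performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake strength, to obtain a path-smoothed third motion trajectory curve, where the first motion trajectory curve is a motion trajectory curve obtained according to the feature points of the first foreground image and the feature points of the second foreground image, the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence;
obtaining a second stabilized foreground image according to the third motion trajectory curve and the first foreground image;
fusing the second stabilized foreground image and the first stabilized background image to obtain a second stabilized image of the image to be processed;
determining a second structural similarity index between the second edge band information and third edge band information, where the third edge band information is pixel information of the edge band between the foreground and the background in the second stabilized image;
if the second structural similarity index does not fall within the preset similarity threshold range, multiplying the adjusted first anti-shake strength by the preset value to obtain an adjusted second anti-shake strength; and
using the adjusted second anti-shake strength as the adjusted first anti-shake strength, and returning to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake strength to obtain a path-smoothed third motion trajectory curve, until the second structural similarity index falls within the preset similarity threshold range.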
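In some possible implementations of the first aspect, the process of performing anti-shake processing on the first background image to obtain the first stabilized background image may include: extracting feature points of the first background image; obtaining a fourth motion trajectory curve according to the feature points of the first background image and feature points of a second background image, where the second background image is a background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in the image sequence; performing path smoothing on the fourth motion trajectory curve according to the background anti-shake strength of the image to be processed, to obtain a path-smoothed motion trajectory curve; and obtaining the first stabilized background image according to the path-smoothed motion trajectory curve and the first background image.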
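In a second aspect, an embodiment of this application provides a video anti-shake apparatus, applied to a terminal device. The apparatus may include:
an image acquisition module, configured to acquire an image to be processed, where the image to be processed is a frame of image in a video to be processed;
a foreground and background segmentation module, configured to perform foreground and background segmentation on the image to be processed to obtain a first background image and a first foreground image;
a foreground anti-shake module, configured to perform anti-shake processing on the first foreground image to obtain a first stabilized foreground image;
a background anti-shake module, configured to perform anti-shake processing on the first background image to obtain a first stabilized background image; and
an image fusion module, configured to fuse the first stabilized foreground image and the first stabilized background image to obtain a first stabilized image of the image to be processed.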
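In some possible implementations of the second aspect, the foreground anti-shake module is specifically configured to: extract feature points of the first foreground image; obtain a first motion trajectory curve according to the feature points of the first foreground image and feature points of a second foreground image, where the second foreground image is a foreground image of a first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the first motion trajectory curve according to the foreground anti-shake strength of the image to be processed, to obtain a path-smoothed second motion trajectory curve; obtain a stabilized first foreground image according to the second motion trajectory curve and the first foreground image; and perform edge compensation on the stabilized first foreground image to obtain the first stabilized foreground image.
In some possible implementations of the second aspect, the foreground anti-shake module is specifically configured to: by using
Figure PCTCN2021110028-appb-000003
stabilized foreground image, where the other pixels are the pixels in the stabilized first foreground image other than the pixels to be compensated; where
Figure PCTCN2021110028-appb-000004
V(q) is the target pixel value of point q, and point q is the pixel to be compensated; P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.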
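In some possible implementations of the second aspect, the apparatus further includes an anti-shake strength adjustment module, configured to: extract first edge band information of the image to be processed, where the first edge band information is pixel information of the edge band between the foreground and the background in the image to be processed; extract second edge band information of the first stabilized image, where the second edge band information is pixel information of the edge band between the foreground and the background in the first stabilized image; and adjust, according to the first edge band information and the second edge band information, the anti-shake strength of the next frame of image to be processed or of the image to be processed, where the anti-shake strength includes the foreground anti-shake strength and/or the background anti-shake strength.
In some possible implementations of the second aspect, the anti-shake strength adjustment module is specifically configured to: determine a first structural similarity index between the first edge band information and the second edge band information; and if the first structural similarity index does not fall within the preset similarity threshold range, adjust the anti-shake strength of the next frame of image to be processed or of the image to be processed.
In some possible implementations of the second aspect, the anti-shake strength adjustment module is specifically configured to: if the first structural similarity index does not fall within the preset similarity threshold range, multiply the anti-shake strength of the image to be processed by a preset value to obtain an adjusted first anti-shake strength, and use the adjusted first anti-shake strength as the anti-shake strength of the next frame of image to be processed or as the anti-shake strength of the image to be processed.
In some possible implementations of the second aspect, the adjusted first anti-shake strength includes the foreground anti-shake strength of the image to be processed; the apparatus further includes a current frame anti-shake result adjustment module, configured to: perform path smoothing on the first motion trajectory curve according to the adjusted first anti-shake strength, to obtain a path-smoothed third motion trajectory curve, where the first motion trajectory curve is a motion trajectory curve obtained according to the feature points of the first foreground image and the feature points of the second foreground image, the second foreground image is the foreground image of the first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive image frames in the image sequence; obtain a second stabilized foreground image according to the third motion trajectory curve and the first foreground image; fuse the second stabilized foreground image and the first stabilized background image to obtain a second stabilized image of the image to be processed; determine a second structural similarity index between the second edge band information and third edge band information, where the third edge band information is pixel information of the edge band between the foreground and the background in the second stabilized image; if the second structural similarity index does not fall within the preset similarity threshold range, multiply the adjusted first anti-shake strength by the preset value to obtain an adjusted second anti-shake strength; and use the adjusted second anti-shake strength as the adjusted first anti-shake strength and return to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first anti-shake strength to obtain a path-smoothed third motion trajectory curve, until the second structural similarity index falls within the preset similarity threshold range.
In some possible implementations of the second aspect, the background anti-shake module is specifically configured to: extract feature points of the first background image; obtain a fourth motion trajectory curve according to the feature points of the first background image and feature points of a second background image, where the second background image is a background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive image frames in the image sequence; perform path smoothing on the fourth motion trajectory curve according to the background anti-shake strength of the image to be processed, to obtain a path-smoothed motion trajectory curve; and obtain the first stabilized background image according to the path-smoothed motion trajectory curve and the first background image.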
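The above video anti-shake apparatus has the function of implementing the video anti-shake method of the first aspect. This function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, and a module may be software and/or hardware.
In a third aspect, an embodiment of this application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the computer program, implements the video anti-shake method according to any one of the implementations of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the video anti-shake method according to any one of the implementations of the first aspect.
In a fifth aspect, an embodiment of this application provides a chip system. The chip system includes a processor coupled to a memory, and the processor executes a computer program stored in the memory to implement the video anti-shake method according to any one of the implementations of the first aspect. The chip system may be a single chip or a chip module composed of multiple chips.
In a sixth aspect, an embodiment of this application provides a computer program product that, when run on a terminal device, causes the terminal device to perform the video anti-shake method according to any one of the implementations of the first aspect.
It can be understood that, for the beneficial effects of the second aspect to the sixth aspect, reference may be made to the related description in the first aspect, and details are not repeated here.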
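Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a terminal device 100 according to an embodiment of this application;
FIG. 2 is a schematic block diagram of the software structure of the terminal device 100 according to an embodiment of this application;
FIG. 3 is a schematic diagram of a video recording interface of a mobile phone according to an embodiment of this application;
FIG. 4 is a schematic block diagram of a video anti-shake process according to an embodiment of this application;
FIG. 5 is a schematic diagram of the anti-shake processing of a video image frame according to an embodiment of this application;
FIG. 6 is another schematic block diagram of the video anti-shake process according to an embodiment of this application;
FIG. 7 is yet another schematic diagram of the video anti-shake process according to an embodiment of this application;
FIG. 8 is yet another schematic diagram of the video anti-shake process according to an embodiment of this application;
FIG. 9 is a schematic diagram of a background anti-shake process according to an embodiment of this application;
FIG. 10 is a specific schematic flowchart of the video anti-shake process according to an embodiment of this application;
FIG. 11 is another specific schematic flowchart of the video anti-shake process according to an embodiment of this application;
FIG. 12 is a schematic diagram of edge compensation according to an embodiment of this application;
FIG. 13 is yet another specific schematic flowchart of the video anti-shake process according to an embodiment of this application;
FIG. 14 is yet another specific schematic flowchart of the video anti-shake process according to an embodiment of this application;
FIG. 15 is yet another specific schematic flowchart of the video anti-shake process according to an embodiment of this application;
FIG. 16 is a schematic flowchart block diagram of the video anti-shake method according to an embodiment of this application;
FIG. 17 is a schematic block diagram of the video anti-shake apparatus according to an embodiment of this application.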
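Detailed Description
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth to provide a thorough understanding of the embodiments of this application.
The terminal device provided in the embodiments of this application is first described by way of example.
The video anti-shake solution provided in the embodiments of this application may be applied to a terminal device. The terminal device may be one that has both an image capturing function and data processing capability; such a terminal device generally includes a camera, through which it can capture video images. Alternatively, the terminal device may be one that has no image capturing function but has data processing capability; in this case, the terminal device may receive video images captured by another device.
The terminal device may be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or another terminal device. The embodiments of this application place no limitation on the specific type of the terminal device.
For example, FIG. 1 shows a schematic structural diagram of the terminal device 100 according to an embodiment of this application.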
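As shown in FIG. 1, the terminal device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the terminal device 100. In other embodiments of this application, the terminal device 100 may include more or fewer components than shown, or combine some components, or split some components, or use a different component arrangement. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
The controller may generate operation control signals according to instruction operation codes and timing signals, to control instruction fetching and instruction execution.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. The memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving system efficiency.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.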
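The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, a charger, a flash, the camera 193, and the like through different I2C bus interfaces. For example, the processor 110 may be coupled to the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to implement the touch function of the terminal device 100.
The I2S interface may be used for audio communication. In some embodiments, the processor 110 may include multiple sets of I2S buses. The processor 110 may be coupled to the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 may transmit audio signals to the wireless communication module 160 through the I2S interface, to implement the function of answering calls through a Bluetooth headset.
The PCM interface may also be used for audio communication, to sample, quantize, and encode analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface. In some embodiments, the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, to implement the function of answering calls through a Bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communication. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel form. In some embodiments, the UART interface is typically used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function. In some embodiments, the audio module 170 may transmit audio signals to the wireless communication module 160 through the UART interface, to implement the function of playing music through a Bluetooth headset.
The MIPI interface may be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193. The MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and the like. In some embodiments, the processor 110 and the camera 193 communicate through the CSI interface to implement the shooting function of the terminal device 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the terminal device 100.
The GPIO interface may be configured through software. The GPIO interface may be configured to carry control signals or data signals. In some embodiments, the GPIO interface may be used to connect the processor 110 to the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, or the like.
The USB interface 130 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the terminal device 100, or to transfer data between the terminal device 100 and peripheral devices. It may also be used to connect a headset and play audio through the headset. The interface may further be used to connect other terminal devices, such as AR devices.
It can be understood that the interface connection relationships between the modules illustrated in the embodiments of this application are merely illustrative and do not constitute a structural limitation on the terminal device 100. In other embodiments of this application, the terminal device 100 may also use interface connection manners different from those in the foregoing embodiments, or a combination of multiple interface connection manners.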
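The charging management module 140 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive wireless charging input through a wireless charging coil of the terminal device 100. While charging the battery 142, the charging management module 140 may also supply power to the terminal device through the power management module 141.
The power management module 141 is configured to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, and battery health status (leakage, impedance). In some other embodiments, the power management module 141 may also be provided in the processor 110. In still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same device.
The wireless communication function of the terminal device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the terminal device 100 may be used to cover one or more communication frequency bands. Different antennas may also be multiplexed to improve antenna utilization. For example, the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In some other embodiments, an antenna may be used in combination with a tuning switch.
The mobile communication module 150 may provide wireless communication solutions applied to the terminal device 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves through the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 and at least some modules of the processor 110 may be provided in the same device.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate a low-frequency baseband signal to be sent into a medium or high frequency signal. The demodulator is used to demodulate a received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, and the like), or displays an image or video through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
The wireless communication module 160 may provide wireless communication solutions applied to the terminal device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technologies. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive signals to be sent from the processor 110, perform frequency modulation and amplification on them, and convert them into electromagnetic waves for radiation through the antenna 2.
In some embodiments, the antenna 1 of the terminal device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the terminal device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).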
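The terminal device 100 implements the display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flex light-emitting diode (FLED), a Miniled, a MicroLed, a Micro-oLed, quantum dot light emitting diodes (QLED), or the like. In some embodiments, the terminal device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The terminal device 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the photosensitive element of the camera, and the optical signal is converted into an electrical signal; the photosensitive element passes the electrical signal to the ISP for processing, where it is converted into an image visible to the naked eye. The ISP may also apply algorithmic optimization to the noise, brightness, and skin tone of the image. The ISP may also optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. An object is projected through the lens to form an optical image on the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the terminal device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
The digital signal processor is used to process digital signals; in addition to digital image signals, it can process other digital signals. For example, when the terminal device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform or the like on the frequency point energy.
The video codec is used to compress or decompress digital video. The terminal device 100 may support one or more video codecs. In this way, the terminal device 100 can play or record videos in multiple encoding formats, for example, moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information quickly and can also learn continuously. Applications such as intelligent cognition of the terminal device 100 can be implemented through the NPU, for example image recognition, face recognition, speech recognition, and text understanding.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving files such as music and videos in the external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function and an image playback function). The data storage area may store data created during use of the terminal device 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 110 executes the various functional applications and data processing of the terminal device 100 by running instructions stored in the internal memory 121 and/or instructions stored in the memory provided in the processor.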
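The terminal device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and so on.
The audio module 170 converts digital audio information into an analog audio signal for output and converts analog audio input into a digital audio signal. It may also encode and decode audio signals. In some embodiments, the audio module 170 may be located in the processor 110, or some of its functional modules may be located in the processor 110.
The speaker 170A, also called the "loudspeaker", converts an audio electrical signal into a sound signal. The terminal device 100 can play music or a hands-free call through the speaker 170A.
The receiver 170B, also called the "earpiece", converts an audio electrical signal into a sound signal. When the terminal device 100 answers a call or plays a voice message, the user can listen by holding the receiver 170B close to the ear.
The microphone 170C, also called the "mic" or "sound transducer", converts a sound signal into an electrical signal. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal. The terminal device 100 may have at least one microphone 170C. In other embodiments, the terminal device 100 may have two microphones 170C, which provide noise reduction in addition to sound capture. In still other embodiments, the terminal device 100 may have three, four, or more microphones 170C to capture sound, reduce noise, identify the sound source, implement directional recording, and so on.
The headset jack 170D connects a wired headset. The headset jack 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.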
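The pressure sensor 180A senses a pressure signal and can convert it into an electrical signal. In some embodiments, the pressure sensor 180A may be located in the display 194. There are many types of pressure sensor, such as resistive, inductive, and capacitive. A capacitive pressure sensor may include at least two parallel plates made of conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the terminal device 100 determines the pressure intensity from that change. When a touch operation acts on the display 194, the terminal device 100 detects the intensity of the touch operation through the pressure sensor 180A. The terminal device 100 may also calculate the touch position from the detection signal of the pressure sensor 180A. In some embodiments, touch operations at the same position but with different intensities may correspond to different instructions. For example, when a touch operation whose intensity is below a first pressure threshold acts on the SMS application icon, an instruction to view the message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the icon, an instruction to create a new message is executed.
The gyroscope sensor 180B may be used to determine the motion attitude of the terminal device 100. In some embodiments, the angular velocity of the terminal device 100 around three axes (x, y, and z) can be determined through the gyroscope sensor 180B. The gyroscope sensor 180B may be used for image stabilization during shooting. For example, when the shutter or the record button is pressed, the gyroscope sensor 180B detects the angle by which the terminal device 100 shakes, the distance the lens module needs to compensate is calculated from that angle, and the lens counteracts the shake of the terminal device 100 by moving in the opposite direction. The gyroscope sensor 180B may also be used for navigation and motion-sensing games.
The barometric pressure sensor 180C measures air pressure. In some embodiments, the terminal device 100 calculates altitude from the pressure measured by the barometric pressure sensor 180C to assist positioning and navigation.
The magnetic sensor 180D includes a Hall sensor. The terminal device 100 may use the magnetic sensor 180D to detect the opening and closing of a flip cover. In some embodiments, when the terminal device 100 is a flip phone, it may detect the opening and closing of the flip with the magnetic sensor 180D. Features such as automatic unlocking on flip-open are then set according to the detected state of the cover or the flip.
The acceleration sensor 180E can detect the magnitude of the acceleration of the terminal device 100 in each direction (generally three axes). When the terminal device 100 is stationary, it can detect the magnitude and direction of gravity. It can also be used to recognize the attitude of the terminal device for applications such as landscape/portrait switching and pedometers.
The distance sensor 180F measures distance. The terminal device 100 may measure distance by infrared or laser. In some embodiments, when shooting a scene, the terminal device 100 may use the distance sensor 180F to measure distance for fast focusing.
The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a light detector such as a photodiode. The LED may be an infrared LED. The terminal device 100 emits infrared light through the LED and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the terminal device 100; when insufficient reflected light is detected, the terminal device 100 can determine that there is no object nearby. The terminal device 100 may use the proximity light sensor 180G to detect that the user is holding the device close to the ear during a call, so that the screen is turned off automatically to save power. The proximity light sensor 180G may also be used for automatic unlocking and locking in cover mode and pocket mode.
The ambient light sensor 180L senses ambient brightness. The terminal device 100 may adaptively adjust the brightness of the display 194 according to the sensed ambient brightness. The ambient light sensor 180L may also be used to adjust white balance automatically when taking photos, and may cooperate with the proximity light sensor 180G to detect whether the terminal device 100 is in a pocket, to prevent accidental touches.
The fingerprint sensor 180H collects fingerprints. The terminal device 100 may use the collected fingerprint characteristics for fingerprint unlocking, application lock access, fingerprint-triggered photography, fingerprint call answering, and so on.
The temperature sensor 180J detects temperature. In some embodiments, the terminal device 100 executes a temperature handling policy based on the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the terminal device 100 reduces the performance of a processor located near the temperature sensor 180J to lower power consumption and provide thermal protection. In other embodiments, when the temperature is below another threshold, the terminal device 100 heats the battery 142 to avoid an abnormal shutdown caused by low temperature. In still other embodiments, when the temperature is below yet another threshold, the terminal device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
The touch sensor 180K is also called a "touch device". The touch sensor 180K may be located in the display 194; together they form a touchscreen, also called a "touch-control screen". The touch sensor 180K detects touch operations on or near it and may pass a detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation may be provided through the display 194. In other embodiments, the touch sensor 180K may be located on the surface of the terminal device 100 at a position different from that of the display 194.
The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the bone mass vibrated by the human vocal part. The bone conduction sensor 180M may also contact the human pulse and receive the blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may be located in a headset to form a bone conduction headset. The audio module 170 may parse a speech signal from the vibration signal of the vocal-part bone mass acquired by the bone conduction sensor 180M to implement a voice function. The application processor may parse heart rate information from the blood pressure pulsation signal acquired by the bone conduction sensor 180M to implement heart rate detection.
The keys 190 include a power key, volume keys, and so on. The keys 190 may be mechanical keys or touch keys. The terminal device 100 may receive key input and generate key signal input related to user settings and function control of the terminal device 100.
The motor 191 can generate vibration alerts. It may be used for incoming-call vibration alerts and for touch vibration feedback. For example, touch operations on different applications (such as photography and audio playback) may correspond to different vibration feedback effects. Touch operations on different areas of the display 194 may also correspond to different vibration feedback effects of the motor 191. Different application scenarios (for example time reminders, incoming messages, alarms, and games) may also correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized.
The indicator 192 may be an indicator light. It may be used to indicate charging status and battery level changes, and also messages, missed calls, notifications, and so on.
The SIM card interface 195 connects a SIM card. A SIM card can be inserted into or removed from the SIM card interface 195 to make contact with or separate from the terminal device 100. The terminal device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 may support Nano SIM cards, Micro SIM cards, SIM cards, and so on. Multiple cards may be inserted into the same SIM card interface 195 at the same time; these cards may be of the same type or of different types. The SIM card interface 195 may also be compatible with different types of SIM card and with external memory cards. The terminal device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the terminal device 100 uses an eSIM, that is, an embedded SIM card. The eSIM card may be embedded in the terminal device 100 and cannot be separated from it.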
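The software system of the terminal device 100 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of this application use the layered Android system as an example to describe the software structure of the terminal device 100.
Fig. 2 is a schematic block diagram of the software structure of the terminal device 100 according to an embodiment of this application.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers, from top to bottom: the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages.
As shown in Fig. 2, the application packages may include applications such as Camera, Gallery, Calendar, Phone, Maps, Navigation, WLAN, Bluetooth, Music, Video, and Messages.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in Fig. 2, the application framework layer may include a window manager, content providers, a view system, a telephony manager, a resource manager, a notification manager, and so on.
The window manager manages window programs. It can obtain the display size, determine whether there is a status bar, lock the screen, take screenshots, and so on.
Content providers store and retrieve data and make the data accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.
The view system includes visual controls, such as controls that display text and controls that display pictures. The view system can be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes an SMS notification icon may include a view that displays text and a view that displays pictures.
The telephony manager provides the communication functions of the terminal device 100, for example call state management (including connecting and hanging up).
The resource manager provides applications with various resources, such as localized strings, icons, pictures, layout files, and video files.
The notification manager lets an application display notification information in the status bar. It can convey informational messages that disappear automatically after a short stay without user interaction, for example to announce a completed download or a message reminder. A notification may also appear in the status bar at the top of the system in the form of a chart or scrolling text, such as a notification from an application running in the background, or appear on the screen in the form of a dialog window. Examples include prompting text in the status bar, sounding an alert tone, vibrating the terminal device, and flashing the indicator light.
The Android runtime includes the core libraries and the virtual machine. The Android runtime is responsible for scheduling and managing the Android system.
The core libraries consist of two parts: the functions that the Java language needs to call, and the Android core libraries.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system libraries may include multiple functional modules, for example the surface manager, the media libraries, a 3D graphics processing library (for example OpenGL ES), and a 2D graphics engine (for example SGL).
The surface manager manages the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of many common audio and video formats, as well as still image files. They can support multiple audio and video encoding formats, for example MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The 3D graphics processing library implements 3D graphics drawing, image rendering, compositing, layer processing, and so on.
The 2D graphics engine is the drawing engine for 2D drawing.
The kernel layer is the layer between hardware and software. The kernel layer contains at least the display driver, the camera driver, the audio driver, and the sensor driver.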
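The following describes, by way of example, the software and hardware workflow of the terminal device 100 in a capture scenario.
Referring to the schematic diagram of the mobile phone recording interface shown in Fig. 3, when the touch sensor 180K of the terminal device 100 receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including the touch coordinates, the timestamp of the touch operation, and other information). The raw input event is stored in the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event. Taking the case where the touch operation is a single tap and the corresponding control is the camera application icon control 31 as an example, the camera application calls the interface of the application framework layer to start the camera application, which in turn starts the camera driver through the kernel layer and captures a still image or video through the camera 193.
When the terminal device 100 receives a touch operation by the user on control 32, it starts recording in response and displays the captured video image 33 on the display 194. While capturing video images, the terminal device 100 may also collect background motion information of the video images through the gyroscope sensor 180B.
After capturing the video images, the terminal device 100 may apply the video stabilization method provided by the embodiments of this application to every frame, yielding a video in which the foreground and the background are stabilized simultaneously. The video stabilization process provided by the embodiments of this application is described below.
Referring to the schematic block diagram of the video stabilization process shown in Fig. 4, the terminal device 100 first obtains an input video. The input video may be captured by the camera 193 in response to a recording operation by the user. The camera 193 may be a front camera or a rear camera; that is, the terminal device 100 can stabilize video shot by either. The input video contains multiple frames.
In other embodiments, the terminal device 100 may also obtain the input video by receiving a video already shot by another device, for example by another mobile phone.
For each frame of the input video, the terminal device 100 performs foreground/background segmentation to obtain a foreground image and a background image. It then stabilizes the foreground image to obtain a stabilized foreground image and stabilizes the background image to obtain a stabilized background image. Finally, it fuses the stabilized foreground image and the stabilized background image to obtain the stabilized image.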
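The per-frame flow just described (segment, stabilize each layer independently, fuse) can be illustrated with a toy sketch. Everything here is an illustrative assumption rather than the applicant's implementation: the frame is a small 2D list, the segmentation mask is given, and "stabilization" is reduced to a horizontal shift so that the independent treatment of the two layers is visible. The function names `split_layers`, `shift_layer`, `fuse_layers`, and `stabilize_frame` are hypothetical.

```python
def split_layers(frame, mask):
    """Split a frame into foreground and background layers using a binary mask.

    Pixels that do not belong to a layer are marked None.
    """
    fg = [[p if m else None for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, mask)]
    bg = [[None if m else p for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, mask)]
    return fg, bg


def shift_layer(layer, dx):
    """Toy stand-in for stabilization: shift a layer horizontally by dx pixels."""
    width = len(layer[0])
    shifted = []
    for row in layer:
        new_row = [None] * width
        for x, p in enumerate(row):
            if 0 <= x + dx < width:
                new_row[x + dx] = p
        shifted.append(new_row)
    return shifted


def fuse_layers(fg, bg, fill=0):
    """Foreground wins where present, then background, otherwise a fill value."""
    return [[f if f is not None else (b if b is not None else fill)
             for f, b in zip(frow, brow)]
            for frow, brow in zip(fg, bg)]


def stabilize_frame(frame, mask, fg_dx, bg_dx):
    """Segment, correct each layer with its own shift, then fuse."""
    fg, bg = split_layers(frame, mask)
    return fuse_layers(shift_layer(fg, fg_dx), shift_layer(bg, bg_dx))
```

Note that shifting the layers by different amounts leaves holes (filled with `fill` here); these correspond to the seam and black-edge artifacts that the edge compensation and strength feedback described later are meant to limit.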
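As an example, Fig. 5 is a schematic diagram of the stabilization processing of a video frame according to an embodiment of this application. As shown in Fig. 5, a frame of the input video obtained by the terminal device 100 is image 51. After foreground/background segmentation of image 51, foreground image 52 and background image 53 are obtained. For foreground image 52, features are extracted first, and the stabilized foreground image is then obtained through stabilization module 54. For background image 53, the stabilized background image 56 is obtained through stabilization module 55. The stabilized background image 56 and the stabilized foreground image are fused to output video image 57, an image in which the foreground and the background are stabilized simultaneously.
The terminal device 100 stabilizes every input video frame in this way, producing stabilized frames one after another. The stabilized frames make up a video in which the foreground and the background are stabilized simultaneously.
In some embodiments, after obtaining the stabilized video image, the terminal device 100 may also adjust the stabilization strength according to the edge band information of the input video image and the edge band information of the output video image. The edge band information of the input video image may be the pixel information of the edge band extracted during foreground/background segmentation, and the edge band information of the output video image may be the pixel information of the edge band between the foreground and the background in the fused image. The edge band is the boundary region between the foreground and the background in an image.
Fig. 6 is another schematic block diagram of the video stabilization process according to an embodiment of this application. As in Fig. 4, the terminal device 100 first obtains the input video; then, for each video frame, it performs foreground/background segmentation to obtain a foreground image and a background image; next, it stabilizes the background image to obtain a stabilized background image and stabilizes the foreground image to obtain a stabilized foreground image; finally, it fuses the stabilized background image and the stabilized foreground image to obtain the stabilized video image.
Unlike Fig. 4, in Fig. 6 the pixel information of the edge band between the foreground and the background of the image to be processed is also extracted during segmentation and recorded as the first edge band information. In addition, the pixel information of the edge band between the foreground and the background is extracted from the stabilized video image (the fused image) and recorded as the second edge band information. Based on the first and second edge band information, feedback control is applied to the stabilization range constraint; that is, the stabilization strength is adjusted. The stabilization range here means the magnitude of the stabilization strength. The stabilization strength may include a foreground stabilization strength and a background stabilization strength, both of which can be adjusted according to the first and second edge band information.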
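The text does not specify how the edge band is delimited. One simple possibility, shown below purely as an assumption, is to treat every pixel that has a pixel of the opposite class within a small radius as part of the band, and to collect the image values at those positions as the "edge band information". The names `edge_band` and `band_pixels` and the `radius` parameter are hypothetical.

```python
def edge_band(mask, radius=1):
    """Mark pixels lying within `radius` of the foreground/background boundary.

    `mask` is a 2D list of 0/1 values (1 = foreground).
    """
    h, w = len(mask), len(mask[0])
    band = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w and mask[ni][nj] != mask[i][j]:
                        band[i][j] = True
    return band


def band_pixels(image, band):
    """Collect, in row-major order, the image values inside the edge band."""
    return [p for row, brow in zip(image, band)
            for p, b in zip(row, brow) if b]
```

Extracting the band with the same mask from the input frame and from the fused frame gives two equally sized pixel lists that can be compared directly.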
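In other embodiments, either the foreground stabilization strength or the background stabilization strength is adjusted according to the first and second edge band information. Figs. 7 and 8 are further schematic diagrams of the video stabilization process according to embodiments of this application. In Fig. 7, the foreground stabilization strength is adjusted according to the first and second edge band information, and the adjusted foreground stabilization strength is applied to the foreground stabilization process. In Fig. 8, the background stabilization strength is adjusted according to the first and second edge band information, and the adjusted background stabilization strength is applied to the background stabilization process.
Fig. 7 also shows, as an example, that the foreground is stabilized by full-frame stabilization. The full-frame stabilization process may include foreground feature point extraction, path smoothing, and edge compensation. Any feature point extraction method may be used for the foreground image, for example the optical flow method. Path smoothing means smoothing the motion trajectory curve of the foreground feature points according to the stabilization strength. Edge compensation means compensating pixels at the image edges; it removes or reduces the black borders of the stabilized foreground image and thereby facilitates the later fusion of the foreground and background images. Any edge compensation method may be used, for example repairing the black borders by Gaussian-weighted interpolation.
It is worth noting that any video stabilization method may be used for both the foreground image and the background image. For example, the background image may be stabilized using gyroscope data. Specifically, while recording, the terminal device 100 may read the data output by the gyroscope sensor 180B and integrate the angular velocity to obtain the background motion information of the video image. It then applies shake compensation to the background image according to the background motion information to stabilize the background. Referring to the schematic diagram of the background stabilization process shown in Fig. 9, the gyroscope output goes through 3D rotation vector estimation, 3D rotation vector smoothing, motion compensation calculation, and image affine transformation (warp) output to obtain the stabilized background image. In this process, the background stabilization strength can act on the 3D rotation vector smoothing step.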
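As a rough illustration of the gyroscope branch, the sketch below integrates angular velocity samples into an angle path, smooths that path, and derives the per-sample correction. It is a simplified assumption, not the pipeline of Fig. 9: it handles a single rotation axis instead of 3D rotation vectors, uses a plain moving average as the smoother, and omits the final warp. The names `integrate_gyro`, `moving_average`, and `compensation` are hypothetical.

```python
def integrate_gyro(angular_velocity, dt):
    """Integrate angular velocity samples (rad/s) into a cumulative angle path (rad)."""
    angles, total = [], 0.0
    for w in angular_velocity:
        total += w * dt
        angles.append(total)
    return angles


def moving_average(path, radius):
    """Smooth a path with a centered moving average, truncated at the ends."""
    smoothed = []
    for i in range(len(path)):
        lo, hi = max(0, i - radius), min(len(path), i + radius + 1)
        smoothed.append(sum(path[lo:hi]) / (hi - lo))
    return smoothed


def compensation(path, smoothed):
    """Correction to apply at each sample: smoothed angle minus measured angle."""
    return [s - p for p, s in zip(path, smoothed)]
```

In this simplified picture, a larger smoothing radius plays the role of a larger background stabilization strength: the smoothed path deviates more from the measured one, so the warp has to correct by a larger angle.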
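To better explain the video stabilization scheme provided by the embodiments of this application, it is described below with reference to Fig. 10.
Fig. 10 is a detailed flowchart of the video stabilization process according to an embodiment of this application. As shown in Fig. 10, the foreground image is stabilized by full-frame stabilization to obtain the stabilized foreground image. The full-frame stabilization process in Fig. 10 includes foreground feature point extraction, path smoothing, and edge compensation. Specifically, feature points are extracted from multiple foreground images, and the motion trajectory curve of the foreground feature points (also called the foreground feature point path) is obtained from them. The multiple foreground images are the foreground images of consecutive frames. For example, the terminal device 100 segments the current frame into its foreground image and background image. At that moment, the terminal device 100 has 10 consecutive frames cached, including the current frame. Taking the current frame as the center, it takes n frames before and n frames after, obtains the foreground images of these consecutive frames, extracts the feature points of each foreground image, and derives the foreground feature point path from them.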
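How the feature points of consecutive foreground images are turned into a path is not spelled out. A common simplification, used here only as an assumption, is to average the displacement of matched feature points between neighboring frames and accumulate those displacements. The sketch presumes the points are already matched across frames (for example by optical flow); `mean_displacement` and `build_path` are hypothetical names.

```python
def mean_displacement(points_prev, points_curr):
    """Average (dx, dy) between matched feature points of two frames."""
    if not points_prev or len(points_prev) != len(points_curr):
        raise ValueError("feature point lists must be non-empty and matched")
    n = len(points_prev)
    dx = sum(c[0] - p[0] for p, c in zip(points_prev, points_curr)) / n
    dy = sum(c[1] - p[1] for p, c in zip(points_prev, points_curr)) / n
    return dx, dy


def build_path(tracked_frames):
    """Accumulate per-frame mean displacements into a motion trajectory.

    `tracked_frames` is a list with one entry per frame; each entry is the
    list of (x, y) positions of the same tracked feature points.
    """
    path = [(0.0, 0.0)]
    for prev, curr in zip(tracked_frames, tracked_frames[1:]):
        dx, dy = mean_displacement(prev, curr)
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path
```

The resulting list has one (x, y) entry per frame of the window and is the curve that the path smoothing step operates on.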
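After the foreground feature point path is obtained, it is smoothed using the foreground stabilization strength of the current frame to obtain the smoothed foreground feature point path. The stabilized foreground image is then obtained from the smoothed foreground feature point path and the foreground image of the current frame.
For the background image, path smoothing is performed using the background stabilization strength of the current frame to obtain the stabilized background image. The motion trajectory curve of the background feature points may be obtained by feature point extraction. For example, the feature points of the background image of the current frame are extracted and, together with the feature points of the background images of consecutive frames in the image sequence, yield the feature point path of the background image, which is then smoothed using the background stabilization strength. The motion trajectory curve of the background image may also be obtained in other ways, for example from gyroscope data.
During foreground/background segmentation of the video image, the pixel information of the edge band can be extracted to obtain the first edge band information. Fusing the stabilized foreground image and the stabilized background image yields a video image in which the foreground and the background are stabilized simultaneously. The pixel information of the edge band in the fused video image can also be extracted to obtain the second edge band information.
The similarity between the first and second edge band information is computed and used to evaluate the difference between the foreground/background boundary of the input video image and that of the fused video image. Depending on the size of the difference, the stabilization strength of the current frame or of the next frame is adjusted.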
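The similarity measure used in the following paragraphs is a structural similarity index. As a hedged illustration, the function below evaluates the standard SSIM expression once over two whole edge bands, each given as a flat list of pixel values; practical SSIM implementations usually average the index over local windows, so this single-window variant is a simplification. The name `ssim_index` is hypothetical, and the constants 0.01 and 0.03 are the conventional defaults, not values taken from this application.

```python
def ssim_index(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally sized lists of pixel values."""
    if len(x) != len(y) or not x:
        raise ValueError("edge bands must be non-empty and equally sized")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that keep the ratio stable when means or variances are near zero.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical bands give an index of 1; the index drops as the structure of the fused boundary departs from that of the input boundary.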
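In a specific application, the structural similarity index between the first and second edge band information is computed, and the stabilization strength is adjusted according to it. After the structural similarity index is computed, it is checked against a preset similarity threshold interval. If the index lies above the maximum of the interval, the stabilization strength is reduced until the index falls within the interval; if the index lies below the minimum of the interval, the stabilization strength is increased until the index falls within the interval.
The stabilization strength may be increased or decreased in any way. In some embodiments, the current stabilization strength is multiplied by a coefficient. In general, the coefficient is greater than 1 when the strength needs to be increased and less than 1 when it needs to be decreased. For example, with a current stabilization strength of A, the strength is multiplied by 1.1 when it needs to be increased and by 0.9 when it needs to be decreased, and the product is used as the adjusted stabilization strength.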
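The decision rule and the multiplicative update described above fit in a few lines. The sketch below uses the example coefficients 0.9 and 1.1 from the text as defaults; the function name `adjust_strength` and the parameter names are hypothetical.

```python
def adjust_strength(strength, ssim_value, low, high, decrease=0.9, increase=1.1):
    """Return the stabilization strength after one feedback step.

    Above the interval [low, high] the strength is reduced, below it the
    strength is increased, and inside it the strength is left unchanged.
    """
    if ssim_value > high:
        return strength * decrease
    if ssim_value < low:
        return strength * increase
    return strength
```

For example, with an interval of [0.6, 0.9], an index of 0.95 lowers a strength of 10 to about 9, while an index of 0.5 raises it to about 11.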
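Fig. 10 shows that the adjusted stabilization strength can act on the path smoothing in the foreground stabilization process and on the path smoothing in the background stabilization process. In practice, the adjusted stabilization strength may include the foreground and/or background stabilization strength of the current frame, or the foreground and/or background stabilization strength of the next frame. Accordingly, it may act on the foreground and/or background stabilization process of the current frame, or on the foreground and/or background stabilization process of the next frame. Each case is described below with reference to the drawings.
Fig. 11 is another detailed flowchart of the video stabilization process according to an embodiment of this application. As shown in Fig. 11, a deep neural network (DNN) model is used for foreground/background segmentation of the input video image, the optical flow method is used to extract the feature points of the foreground image, and Gaussian-weighted interpolation is used to repair the black borders for edge compensation. In addition, the structural similarity index between the two pieces of edge band information is computed and used to adjust the foreground stabilization strength of the next frame; that is, the adjusted foreground stabilization strength acts on the foreground feature point path smoothing of the next frame.
The segmentation method, the foreground feature point extraction method, and the edge compensation method shown in Fig. 11 are all examples. The Gaussian-weighted edge compensation method is described below with reference to the edge compensation diagram in Fig. 12.
As shown in Fig. 12, in the stabilized foreground image, edge 121 is the boundary between meaningful and meaningless pixels: the pixels to the left of edge 121 in Fig. 12 are meaningful, and the pixels to the right are meaningless. Meaningless pixels are the pixels that form the black border. Stabilizing the foreground image produces meaningless pixels at the edges, which hinder the fusion of foreground and background. To further improve the fused video image, edge compensation can be applied to the meaningless pixels; that is, pixels at the image edges are compensated.
Point q in Fig. 12 is a meaningless pixel that needs to be filled in. Its pixel value can be computed from the surrounding pixels as a Gaussian-weighted sum. Taking region 122 in Fig. 12 as the neighborhood of q, the pixel value of q is obtained from the pixel values of the pixels in region 122. Specifically, the pixel value of q is computed by
Figure PCTCN2021110028-appb-000005
where V(q) is the target pixel value of point q, P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.
Every meaningless pixel is traversed in this way and its pixel value is computed. The computed values are used as the pixel values of the meaningless pixels, while the pixel values of the meaningful pixels are left unchanged, which gives the edge-compensated stabilized foreground image.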
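The exact compensation formula is only available as a figure, so the sketch below is an assumed form built from the symbols the text defines: each meaningless pixel q receives the normalized Gaussian-weighted average of the meaningful pixels in a (2N+1) x (2N+1) window around it, with the weight w(i, j) = exp(-((i - q_i)^2 + (j - q_j)^2) / (2σ^2)). The normalization, the window shape, and the restriction to meaningful pixels are assumptions; `compensate_pixel` and `compensate_edges` are hypothetical names.

```python
import math


def compensate_pixel(image, valid, qi, qj, n, sigma):
    """Gaussian-weighted average of the valid pixels around (qi, qj)."""
    num = den = 0.0
    for i in range(qi - n, qi + n + 1):
        for j in range(qj - n, qj + n + 1):
            if 0 <= i < len(image) and 0 <= j < len(image[0]) and valid[i][j]:
                w = math.exp(-((i - qi) ** 2 + (j - qj) ** 2) / (2.0 * sigma ** 2))
                num += w * image[i][j]
                den += w
    # With no valid neighbor in the window, keep the original value.
    return num / den if den > 0 else image[qi][qj]


def compensate_edges(image, valid, n=1, sigma=1.0):
    """Fill every meaningless pixel; meaningful pixels are left unchanged."""
    out = [row[:] for row in image]
    for i in range(len(image)):
        for j in range(len(image[0])):
            if not valid[i][j]:
                out[i][j] = compensate_pixel(image, valid, i, j, n, sigma)
    return out
```

Here `valid[i][j]` is True for meaningful pixels. The filled values are always read from the original image, so the result does not depend on the order in which the meaningless pixels are visited.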
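In Fig. 11, after the structural similarity index is computed, it is checked against the preset similarity threshold interval. If the index does not fall within the interval, the foreground stabilization strength is adjusted until it does.
For example, suppose the foreground stabilization strength of the current frame is A and the background stabilization strength is B. After the foreground and background images are stabilized with these strengths, the structural similarity index of the two pieces of edge band information is computed. Suppose the index of the current frame does not fall within the preset similarity threshold interval and lies above its maximum. The foreground stabilization strength A of the current frame is multiplied by 0.9, giving an adjusted foreground stabilization strength of 0.9A, which becomes the foreground stabilization strength of the next frame.
The stabilized foreground image and the stabilized background image of the current frame are fused to obtain the stabilized image of the current frame, which is output.
The next frame is then obtained and segmented into a foreground image and a background image. The background image is stabilized with a background stabilization strength of B to obtain the stabilized background image of the next frame. The foreground image is stabilized with a foreground stabilization strength of 0.9A to obtain the stabilized foreground image of the next frame. The two are fused to obtain the stabilized image of the next frame, which is output. The structural similarity index of the next frame is computed from its first and second edge band information. If this index falls within the preset similarity threshold interval, the foreground stabilization strength is not adjusted, and the foreground stabilization strength of the frame after next remains 0.9A. If it does not, the foreground stabilization strength of the frame after next is derived from 0.9A.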
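The next-frame variant can be pictured as a running update: each frame is output with the strength decided by the previous frame, and its own similarity index only influences the following frame. The sketch below replays that logic over a list of already measured indices. This is a simplification for illustration, since in a real pipeline each index depends on the strength actually applied; `next_frame_strengths` is a hypothetical name.

```python
def next_frame_strengths(ssim_per_frame, initial_strength, low, high,
                         decrease=0.9, increase=1.1):
    """Return the strength applied to each frame under next-frame feedback."""
    applied = []
    strength = initial_strength
    for ssim_value in ssim_per_frame:
        applied.append(strength)      # this frame is output with the current strength
        if ssim_value > high:         # its index only changes the next frame's strength
            strength *= decrease
        elif ssim_value < low:
            strength *= increase
    return applied
```

Because every frame is output immediately, this variant adds no per-frame recomputation, at the cost of reacting one frame late.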
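In some embodiments, the background stabilization strength of the next frame may also be adjusted according to the structural similarity index of the current frame, as described below with reference to Fig. 13, which shows a further detailed flowchart of the video stabilization process according to an embodiment of this application.
As shown in Fig. 13, after the structural similarity index between the edge band information is computed for the current frame, the background stabilization strength of the next frame is adjusted according to it; that is, the adjusted background stabilization strength acts on the background stabilization process of the next frame. For the parts of Fig. 13 that are similar to Fig. 11, see the description of Fig. 11 above.
For example, suppose the foreground stabilization strength of the current frame is A and the background stabilization strength is B. After the foreground and background images are stabilized with these strengths, the structural similarity index of the two pieces of edge band information is computed. Suppose the index of the current frame does not fall within the preset similarity threshold interval and lies above its maximum. The background stabilization strength B of the current frame is multiplied by 0.8, giving an adjusted background stabilization strength of 0.8B, which becomes the background stabilization strength of the next frame.
The stabilized foreground image and the stabilized background image of the current frame are fused to obtain the stabilized image of the current frame, which is output.
The next frame is then obtained and segmented into a foreground image and a background image. The background image is stabilized with a background stabilization strength of 0.8B to obtain the stabilized background image of the next frame. The foreground image is stabilized with a foreground stabilization strength of A to obtain the stabilized foreground image of the next frame. The two are fused to obtain the stabilized image of the next frame, which is output. The structural similarity index of the next frame is computed from its first and second edge band information. If this index falls within the preset similarity threshold interval, the background stabilization strength is not adjusted, and the background stabilization strength of the frame after next remains 0.8B. If it does not, the background stabilization strength of the frame after next is derived from 0.8B.
In some embodiments, both the background and the foreground stabilization strength of the next frame may be adjusted according to the structural similarity index of the current frame, as described below with reference to Fig. 14, which shows a further detailed flowchart of the video stabilization process according to an embodiment of this application.
As shown in Fig. 14, the background and foreground stabilization strengths of the next frame are adjusted according to the structural similarity index of the current frame; that is, the adjusted background stabilization strength acts on the background stabilization process of the next frame, and the adjusted foreground stabilization strength acts on the foreground stabilization process of the next frame.
For example, suppose the foreground stabilization strength of the current frame is A and the background stabilization strength is B. After the foreground and background images are stabilized with these strengths, the structural similarity index of the two pieces of edge band information is computed. Suppose the index of the current frame does not fall within the preset similarity threshold interval and lies above its maximum. The background stabilization strength B is multiplied by 0.8, giving 0.8B as the background stabilization strength of the next frame. The foreground stabilization strength A is multiplied by 0.9, giving 0.9A as the foreground stabilization strength of the next frame.
The stabilized foreground image and the stabilized background image of the current frame are fused to obtain the stabilized image of the current frame, which is output.
The next frame is then obtained and segmented into a foreground image and a background image. The background image is stabilized with a background stabilization strength of 0.8B to obtain the stabilized background image of the next frame. The foreground image is stabilized with a foreground stabilization strength of 0.9A to obtain the stabilized foreground image of the next frame. The two are fused to obtain the stabilized image of the next frame, which is output. The structural similarity index of the next frame is computed from its first and second edge band information. If this index falls within the preset similarity threshold interval, neither strength is adjusted: the background stabilization strength of the frame after next remains 0.8B and the foreground stabilization strength remains 0.9A. If it does not, the background stabilization strength 0.8B and the foreground stabilization strength 0.9A are each adjusted to obtain the background and foreground stabilization strengths of the frame after next.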
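In some embodiments, the foreground and/or background stabilization strength of the current frame itself may be adjusted; see Fig. 15, which shows a further detailed flowchart of the video stabilization process according to an embodiment of this application. The parts of Fig. 15 that are similar to Fig. 11 are not repeated here.
When the stabilization strength of the current frame is adjusted according to its own structural similarity index, and the index does not fall within the preset similarity threshold interval, the fused video image is not output. Instead, the foreground and/or background stabilization strength is adjusted, and the fused video image is output only once the structural similarity index of the current frame falls within the interval.
As a first example, the foreground stabilization strength of the current frame is adjusted. In the first pass, the foreground stabilization strength of the current frame is A and the background stabilization strength is B.
After the foreground and background images are stabilized with these strengths, the structural similarity index of the two pieces of edge band information is computed. Suppose the index does not fall within the preset similarity threshold interval and lies above its maximum. The foreground stabilization strength A is multiplied by 0.9, giving an adjusted foreground stabilization strength of 0.9A, which is used in the next foreground stabilization pass.
The stabilized foreground image and the stabilized background image of the current frame are fused to obtain the stabilized image of the first pass, which is not output.
The second pass then uses the adjusted foreground stabilization strength and need not repeat every step of the first pass. The foreground image is stabilized again with the adjusted strength; that is, the foreground feature point path is smoothed with the adjusted strength to obtain the stabilized foreground image of the second pass. The stabilized foreground image of the second pass is then fused with the stabilized background image of the first pass to obtain the stabilized image of the second pass. The structural similarity index is computed from the second edge band information of the second pass and the first edge band information obtained in the first pass. If this index falls within the preset similarity threshold interval, the foreground stabilization strength is not adjusted further and the stabilized image of the second pass is output as the output video image of the current frame. If it does not, the foreground stabilization strength is adjusted again and a third pass is performed with the adjusted strength. This repeats until the structural similarity index of some pass falls within the interval, and the stabilized image of that pass is used as the output video image of the current frame.
As a second example, the background stabilization strength of the current frame is adjusted. In the first pass, the foreground stabilization strength of the current frame is A and the background stabilization strength is B. The structural similarity index of the first pass does not fall within the preset similarity threshold interval, so the background stabilization strength B is multiplied by 0.8, giving an adjusted background stabilization strength of 0.8B, which is used in the next background stabilization pass.
The adjusted strength is applied to the path smoothing of the background image again to obtain the stabilized background image of the second pass, which is fused with the stabilized foreground image of the first pass to obtain the stabilized image of the second pass. The structural similarity index between the second edge band information of the second pass and the first edge band information of the first pass is computed. If it falls within the preset similarity threshold interval, the stabilized image of the second pass is used as the output video image of the current frame. If it does not, the background stabilization strength is adjusted again and the next pass is performed with the adjusted strength, until the structural similarity index of some pass falls within the interval.
As a third example, both the background and the foreground stabilization strength of the current frame are adjusted. In the first pass, the foreground stabilization strength of the current frame is A and the background stabilization strength is B. The structural similarity index of the first pass does not fall within the preset similarity threshold interval, so B is multiplied by 0.8 to give 0.8B as the background stabilization strength of the next pass, and A is multiplied by 0.9 to give 0.9A as the foreground stabilization strength of the next pass.
The second pass uses the adjusted strengths. Specifically, the foreground feature point path is smoothed with 0.9A to obtain the stabilized foreground image of the second pass, and the motion trajectory curve of the background image is smoothed with 0.8B to obtain the stabilized background image of the second pass. The two are fused to obtain the stabilized image of the second pass. The structural similarity index of the second pass is computed from the second edge band information of the second pass and the first edge band information of the first pass. If it falls within the preset similarity threshold interval, the stabilized image of the second pass is used as the output video image of the current frame. If it does not, the stabilization strengths are adjusted again and the next pass is performed with the adjusted strengths, until the structural similarity index of some pass falls within the interval.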
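The current-frame variant is an iterate-until-accepted loop. In the sketch below, the stabilization and scoring of one pass are abstracted into a caller-supplied function, and a pass limit is added so that the loop always terminates. Both the callback interface and `max_passes` are illustrative assumptions (the text itself places no bound on the number of passes), and `refine_current_frame` is a hypothetical name.

```python
def refine_current_frame(stabilize_and_score, strength, low, high,
                         decrease=0.9, increase=1.1, max_passes=20):
    """Repeat stabilization of one frame until its index falls within [low, high].

    `stabilize_and_score(strength)` must return (stabilized_image, ssim_value).
    Returns the image of the last pass and the strength that produced it.
    """
    image = None
    for _ in range(max_passes):
        image, score = stabilize_and_score(strength)
        if low <= score <= high:
            break                      # accepted: this image is output
        strength *= decrease if score > high else increase
    return image, strength
```

Compared with the next-frame variant, each output frame is guaranteed to satisfy the interval (up to the pass limit), but one frame may cost several smoothing and fusion passes.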
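As can be seen from the above, adjusting the foreground and/or background stabilization strength of the current frame, or of the next frame, according to the similarity index automatically regulates the intensity of the artifacts produced when the foreground and background are fused. This prevents excessive stabilization from producing a visible seam during fusion and makes the transition at the fused foreground/background boundary more natural.
The stabilization strength can act on the path smoothing process. The greater the stabilization strength, the stronger the curve smoothing; the smaller the stabilization strength, the weaker the curve smoothing.
Specifically, when the similarity index is below the preset similarity threshold interval, the stabilization strength is increased, which strengthens the curve smoothing (also called path smoothing). When the similarity index is above the interval, the stabilization strength is decreased, which weakens the curve smoothing.
The path smoothing weights can be set by Gaussian weights. From the Gaussian distribution, the larger the Gaussian sigma, the smoother the curve. The stabilization strength can be equated with the Gaussian sigma.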
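To make the link between strength and sigma concrete, the sketch below smooths a one-dimensional path with a normalized Gaussian kernel whose sigma is the stabilization strength. The kernel radius of 3 sigma and the truncated, renormalized handling of the path ends are assumptions, and `gaussian_smooth` is a hypothetical name.

```python
import math


def gaussian_smooth(path, sigma, radius=None):
    """Smooth a 1D path with a normalized Gaussian kernel of width sigma."""
    if sigma <= 0:
        return list(path)              # zero strength: no smoothing
    if radius is None:
        radius = max(1, int(3 * sigma))
    smoothed = []
    for t in range(len(path)):
        num = den = 0.0
        for k in range(max(0, t - radius), min(len(path), t + radius + 1)):
            w = math.exp(-((k - t) ** 2) / (2.0 * sigma ** 2))
            num += w * path[k]
            den += w
        smoothed.append(num / den)
    return smoothed
```

On a jittery path such as [0, 1, 0, 1, 0, 1, 0], a sigma of 2 flattens the oscillation far more than a sigma of 0.5, which matches the statement that a larger sigma gives a smoother curve. A two-dimensional path can be smoothed by applying the function to each coordinate separately.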
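The greater the stabilization strength, the better the stabilization effect, but the more meaningless pixels appear at the edges of the video image and the smaller the field of view. Conversely, the smaller the stabilization strength, the worse the stabilization effect, but the fewer meaningless pixels appear at the edges and the larger the field of view. The stabilization strength is adjusted so that the field of view and the number of meaningless pixels reach a suitable range.
To better explain the video stabilization scheme, it is described below with reference to a flowchart.
Fig. 16 is a schematic flowchart of the video stabilization method according to an embodiment of this application. The method may include the following steps:
Step S1601: the terminal device obtains an image to be processed, which is one frame of the video to be processed.
In a specific application, the terminal device may obtain the image to be processed in any way. For example, referring to Fig. 3, the terminal device is mobile phone 100. When mobile phone 100 receives a trigger operation by the user on control 32, which instructs it to record video, it captures video images in response and thereby obtains the image to be processed. The image to be processed is stabilized and then shown on the display of mobile phone 100.
Step S1602: the terminal device performs foreground/background segmentation on the image to be processed to obtain a first background image and a first foreground image.
A frame may be segmented into foreground and background in any way; the segmentation method is not limited here.
Note that the foreground image in the embodiments of this application may or may not be a face; for example, the foreground image may be the foreground shown in Fig. 5.
Step S1603: the terminal device stabilizes the first foreground image to obtain a first foreground stabilized image.
The first foreground image may be stabilized in any way. In some embodiments, to remove the black borders from the stabilized image, full-frame stabilization may be used for the foreground image. Specifically, the feature points of the first foreground image are extracted first, and a first motion trajectory curve is then obtained from the feature points of the first foreground image and the feature points of a second foreground image. The first motion trajectory curve may be the feature point path of the foreground image, or the shake trajectory of the foreground image. The second foreground image is the foreground image of a first target image; the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive frames in the image sequence. The first target image generally consists of multiple frames.
For example, suppose the image sequence contains, in chronological order, the frames image 1, image 2, image 3, image 4, ..., image n, where n is a positive integer. At some moment the image to be processed is image 5. The first target image may then include images 1, 2, 3, and 4 as well as images 6, 7, 8, and 9. The foreground images of images 1 to 4 and 6 to 9 are extracted, and the feature points of these foreground images are then extracted. The shake trajectory of the foreground image is obtained from the foreground feature points of the image to be processed and those of the first target image.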
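The choice of target frames around the image to be processed amounts to a clipped symmetric window. The helper below returns the indices of up to n frames before and n frames after the current one, using 0-based indices; the function name `select_window` and the clipping at the ends of the sequence are assumptions.

```python
def select_window(num_frames, current, n):
    """Indices of up to n frames before and after `current`, excluding it."""
    lo = max(0, current - n)
    hi = min(num_frames - 1, current + n)
    return [i for i in range(lo, hi + 1) if i != current]
```

With nine frames, a current index of 4 (image 5), and n = 4, the helper returns indices 0 to 3 and 5 to 8, that is, images 1 to 4 and 6 to 9 as in the example above. Near the start or end of the video the window is simply shorter.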
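After the first motion trajectory curve is obtained, it is smoothed according to the foreground stabilization strength of the image to be processed to obtain the smoothed second motion trajectory curve. The stabilized first foreground image is then obtained from the second motion trajectory curve and the first foreground image. Finally, edge compensation is applied to the stabilized first foreground image to obtain the first foreground stabilized image.
The edge compensation may use the Gaussian-weighted interpolation described with Fig. 12; see the corresponding content above.
In other embodiments, full-frame stabilization may be performed without edge compensation. In that case, the stabilized first foreground image is used directly as the first foreground stabilized image.
Step S1604: the terminal device stabilizes the first background image to obtain a first background stabilized image.
In a specific application, the background image may also be stabilized in any way. For example, the feature points of the first background image are extracted first. A fourth motion trajectory curve is then obtained from the feature points of the first background image and the feature points of a second background image. The second background image is the background image of a second target image; the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive frames in the image sequence. Like the first target image, the second target image consists of multiple consecutive frames in the image sequence. Finally, the fourth motion trajectory curve is smoothed according to the background stabilization strength of the image to be processed to obtain a smoothed motion trajectory curve, and the first background stabilized image is obtained from the smoothed curve and the first background image.
Steps S1603 and S1604 may be performed in any order or at the same time; their order of execution is not limited here.
Step S1605: the terminal device fuses the first foreground stabilized image and the first background stabilized image to obtain a first stabilized image of the image to be processed.
The fused first stabilized image may be used as the output video image of the image to be processed (the stabilized video image). In other embodiments, if the stabilization strength of the current frame needs to be adjusted, the first stabilized image may not be the output video image of the image to be processed.
Further, to prevent excessive stabilization from producing a visible seam during image fusion and degrading the user experience, the stabilization strength may be adjusted according to a similarity index.
In a specific application, first edge band information of the image to be processed is extracted first. The first edge band information is the pixel information of the edge band between the foreground and the background in the image to be processed, and may be extracted during foreground/background segmentation.
Second edge band information of the first stabilized image is then extracted. The second edge band information is the pixel information of the edge band between the foreground and the background in the first stabilized image, and may be obtained from the fused image.
Next, the stabilization strength of the next frame to be processed, or of the image to be processed itself, is adjusted according to the first and second edge band information. The stabilization strength includes the foreground stabilization strength and/or the background stabilization strength.
More specifically, a first structural similarity index between the first and second edge band information is computed and used to decide whether to adjust the stabilization strength. If the first structural similarity index does not fall within the preset similarity threshold interval, the stabilization strength of the next frame to be processed, or of the image to be processed (the current frame), is adjusted. For the adjustment process, see the corresponding content above.
When the foreground stabilization strength of the current frame is adjusted, several passes may be needed. Only when the structural similarity index of some pass falls within the preset similarity threshold interval is the stabilized image of that pass used as the output video image of the current frame.
Specifically, the first motion trajectory curve is smoothed according to the adjusted first stabilization strength to obtain a smoothed third motion trajectory curve. For the first motion trajectory curve, see the corresponding content above. A second foreground stabilized image is then obtained from the third motion trajectory curve and the first foreground image. Next, the second foreground stabilized image and the first background stabilized image are fused to obtain a second stabilized image of the image to be processed. A second structural similarity index is determined between the second edge band information and third edge band information, where the third edge band information is the pixel information of the edge band between the foreground and the background in the second stabilized image.
If the second structural similarity index does not fall within the preset similarity threshold interval, the adjusted first stabilization strength is multiplied by a preset value to obtain an adjusted second stabilization strength. The adjusted second stabilization strength is then taken as the adjusted first stabilization strength, and the process returns to the step of smoothing the first motion trajectory curve according to the adjusted first stabilization strength to obtain the smoothed third motion trajectory curve. This repeats until the second structural similarity index falls within the preset similarity threshold interval, at which point the corresponding second stabilized image is used as the output video image of the image to be processed.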
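Corresponding to the method embodiments above, an embodiment of this application provides a video stabilization apparatus applied to a terminal device. Referring to the schematic block diagram of the video stabilization apparatus shown in Fig. 17, the apparatus may include:
an image acquisition module 171, configured to obtain an image to be processed, the image to be processed being one frame of a video to be processed;
a foreground/background segmentation module 172, configured to perform foreground/background segmentation on the image to be processed to obtain a first background image and a first foreground image;
a foreground stabilization module 173, configured to stabilize the first foreground image to obtain a first foreground stabilized image;
a background stabilization module 174, configured to stabilize the first background image to obtain a first background stabilized image;
an image fusion module 175, configured to fuse the first foreground stabilized image and the first background stabilized image to obtain a first stabilized image of the image to be processed.
In some possible implementations, the foreground stabilization module is specifically configured to: extract the feature points of the first foreground image; obtain a first motion trajectory curve from the feature points of the first foreground image and the feature points of a second foreground image, where the second foreground image is the foreground image of a first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive frames in the image sequence; smooth the first motion trajectory curve according to the foreground stabilization strength of the image to be processed to obtain a smoothed second motion trajectory curve; obtain a stabilized first foreground image from the second motion trajectory curve and the first foreground image; and apply edge compensation to the stabilized first foreground image to obtain the first foreground stabilized image.
In some possible implementations, the foreground stabilization module is specifically configured to: by
Figure PCTCN2021110028-appb-000006
compute the target pixel value of each pixel to be compensated in the stabilized first foreground image;
use the target pixel value as the pixel value of the pixel to be compensated while leaving the pixel values of the other pixels unchanged, to obtain the first foreground stabilized image, where the other pixels are the pixels of the stabilized first foreground image other than the pixels to be compensated; where
Figure PCTCN2021110028-appb-000007
V(q) is the target pixel value of point q, point q is a pixel to be compensated, P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.
In some possible implementations, the apparatus further includes a stabilization strength adjustment module, configured to: extract first edge band information of the image to be processed, where the first edge band information is the pixel information of the edge band between the foreground and the background in the image to be processed; extract second edge band information of the first stabilized image, where the second edge band information is the pixel information of the edge band between the foreground and the background in the first stabilized image; and adjust the stabilization strength of the next frame to be processed or of the image to be processed according to the first and second edge band information, where the stabilization strength includes the foreground stabilization strength and/or the background stabilization strength.
In some possible implementations, the stabilization strength adjustment module is specifically configured to: determine a first structural similarity index between the first and second edge band information; and, if the first structural similarity index does not fall within a preset similarity threshold interval, adjust the stabilization strength of the next frame to be processed or of the image to be processed.
In some possible implementations, the stabilization strength adjustment module is specifically configured to: if the first structural similarity index does not fall within the preset similarity threshold interval, multiply the stabilization strength of the image to be processed by a preset value to obtain an adjusted first stabilization strength, and use the adjusted first stabilization strength as the stabilization strength of the next frame to be processed or of the image to be processed.
In some possible implementations, the adjusted first stabilization strength includes the foreground stabilization strength of the image to be processed, and the apparatus further includes a current-frame stabilization result adjustment module, configured to: smooth a first motion trajectory curve according to the adjusted first stabilization strength to obtain a smoothed third motion trajectory curve, where the first motion trajectory curve is obtained from the feature points of the first foreground image and the feature points of a second foreground image, the second foreground image is the foreground image of a first target image, the video to be processed includes the first target image, and the first target image and the image to be processed are consecutive frames in the image sequence; obtain a second foreground stabilized image from the third motion trajectory curve and the first foreground image; fuse the second foreground stabilized image and the first background stabilized image to obtain a second stabilized image of the image to be processed; determine a second structural similarity index between the second edge band information and third edge band information, where the third edge band information is the pixel information of the edge band between the foreground and the background in the second stabilized image; if the second structural similarity index does not fall within the preset similarity threshold interval, multiply the adjusted first stabilization strength by the preset value to obtain an adjusted second stabilization strength; and take the adjusted second stabilization strength as the adjusted first stabilization strength and return to the step of smoothing the first motion trajectory curve according to the adjusted first stabilization strength to obtain the smoothed third motion trajectory curve, until the second structural similarity index falls within the preset similarity threshold interval.
In some possible implementations, the background stabilization module is specifically configured to: extract the feature points of the first background image; obtain a fourth motion trajectory curve from the feature points of the first background image and the feature points of a second background image, where the second background image is the background image of a second target image, the video to be processed includes the second target image, and the second target image and the image to be processed are consecutive frames in the image sequence; smooth the fourth motion trajectory curve according to the background stabilization strength of the image to be processed to obtain a smoothed motion trajectory curve; and obtain the first background stabilized image from the smoothed curve and the first background image.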
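The video stabilization apparatus described above has the function of implementing the video stabilization method described above. This function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, and a module may be software and/or hardware.
An embodiment of this application further provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the video stabilization method according to any of the above.
An embodiment of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method embodiments above.
An embodiment of this application provides a computer program product which, when run on a terminal device, causes the terminal device to implement the steps of the method embodiments above.
An embodiment of this application further provides a chip system. The chip system includes a processor coupled to a memory, and the processor executes a computer program stored in the memory to implement the methods of the method embodiments above. The chip system may be a single chip or a chip module composed of multiple chips.
In the embodiments above, each embodiment is described with its own emphasis. For parts not detailed or recorded in one embodiment, see the related descriptions of the other embodiments. It should be understood that the sequence numbers of the steps in the embodiments above do not imply an order of execution; the order of execution of each process should be determined by its function and internal logic and should not limit the implementation of the embodiments of this application in any way. In addition, in the description of this specification and the appended claims, the terms "first", "second", "third", and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance. References in this specification to "one embodiment" or "some embodiments" mean that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of this application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise.
Finally, it should be noted that the above are only specific implementations of this application, and the scope of protection of this application is not limited to them. Any change or replacement within the technical scope disclosed in this application shall fall within the scope of protection of this application. The scope of protection of this application shall therefore be subject to the scope of protection of the claims.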

Claims (17)

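  1. A video stabilization method, applied to a terminal device, characterized by comprising:
    obtaining an image to be processed, the image to be processed being one frame of a video to be processed;
    performing foreground/background segmentation on the image to be processed to obtain a first background image and a first foreground image;
    stabilizing the first foreground image to obtain a first foreground stabilized image;
    stabilizing the first background image to obtain a first background stabilized image;
    fusing the first foreground stabilized image and the first background stabilized image to obtain a first stabilized image of the image to be processed.
  2. The method according to claim 1, characterized in that stabilizing the first foreground image to obtain a first foreground stabilized image comprises:
    extracting the feature points of the first foreground image;
    obtaining a first motion trajectory curve from the feature points of the first foreground image and the feature points of a second foreground image, the second foreground image being the foreground image of a first target image, the video to be processed including the first target image, and the first target image and the image to be processed being consecutive frames in the image sequence;
    performing path smoothing on the first motion trajectory curve according to the foreground stabilization strength of the image to be processed to obtain a smoothed second motion trajectory curve;
    obtaining a stabilized first foreground image from the second motion trajectory curve and the first foreground image;
    performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image.
  3. The method according to claim 2, characterized in that performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image comprises:
    by
    Figure PCTCN2021110028-appb-100001
    computing the target pixel value of each pixel to be compensated in the stabilized first foreground image;
    using the target pixel value as the pixel value of the pixel to be compensated while leaving the pixel values of the other pixels unchanged, to obtain the first foreground stabilized image, the other pixels being the pixels of the stabilized first foreground image other than the pixels to be compensated;
    where
    Figure PCTCN2021110028-appb-100002
    V(q) is the target pixel value of point q, point q is the pixel to be compensated, P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.
  4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
    extracting first edge band information of the image to be processed, the first edge band information being the pixel information of the edge band between the foreground and the background in the image to be processed;
    extracting second edge band information of the first stabilized image, the second edge band information being the pixel information of the edge band between the foreground and the background in the first stabilized image;
    adjusting the stabilization strength of the next frame or of the image to be processed according to the first edge band information and the second edge band information, the stabilization strength including the foreground stabilization strength and/or the background stabilization strength.
  5. The method according to claim 4, characterized in that adjusting the stabilization strength of the next frame or of the image to be processed according to the first edge band information and the second edge band information comprises:
    determining a first structural similarity index between the first edge band information and the second edge band information;
    if the first structural similarity index does not fall within a preset similarity threshold interval, adjusting the stabilization strength of the next frame or of the image to be processed.
  6. The method according to claim 5, characterized in that, if the first structural similarity index does not fall within the preset similarity threshold interval, adjusting the stabilization strength of the next frame or of the image to be processed comprises:
    if the first structural similarity index does not fall within the preset similarity threshold interval, multiplying the stabilization strength of the image to be processed by a preset value to obtain an adjusted first stabilization strength, and using the adjusted first stabilization strength as the stabilization strength of the next frame or of the image to be processed.
  7. The method according to claim 6, characterized in that the adjusted first stabilization strength includes the foreground stabilization strength of the image to be processed, and the method further comprises:
    performing path smoothing on a first motion trajectory curve according to the adjusted first stabilization strength to obtain a smoothed third motion trajectory curve, the first motion trajectory curve being obtained from the feature points of the first foreground image and the feature points of a second foreground image, the second foreground image being the foreground image of a first target image, the video to be processed including the first target image, and the first target image and the image to be processed being consecutive frames in the image sequence;
    obtaining a second foreground stabilized image from the third motion trajectory curve and the first foreground image;
    fusing the second foreground stabilized image and the first background stabilized image to obtain a second stabilized image of the image to be processed;
    determining a second structural similarity index between the second edge band information and third edge band information, the third edge band information being the pixel information of the edge band between the foreground and the background in the second stabilized image;
    if the second structural similarity index does not fall within the preset similarity threshold interval, multiplying the adjusted first stabilization strength by the preset value to obtain an adjusted second stabilization strength;
    taking the adjusted second stabilization strength as the adjusted first stabilization strength and returning to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first stabilization strength to obtain the smoothed third motion trajectory curve, until the second structural similarity index falls within the preset similarity threshold interval.
  8. The method according to claim 1, characterized in that stabilizing the first background image to obtain a first background stabilized image comprises:
    extracting the feature points of the first background image;
    obtaining a fourth motion trajectory curve from the feature points of the first background image and the feature points of a second background image, the second background image being the background image of a second target image, the video to be processed including the second target image, and the second target image and the image to be processed being consecutive frames in the image sequence;
    performing path smoothing on the fourth motion trajectory curve according to the background stabilization strength of the image to be processed to obtain a smoothed second motion trajectory curve;
    obtaining the first foreground stabilized image from the second motion trajectory curve and the first foreground image.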
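  9. A terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the following steps:
    obtaining an image to be processed, the image to be processed being one frame of a video to be processed;
    performing foreground/background segmentation on the image to be processed to obtain a first background image and a first foreground image;
    stabilizing the first foreground image to obtain a first foreground stabilized image;
    stabilizing the first background image to obtain a first background stabilized image;
    fusing the first foreground stabilized image and the first background stabilized image to obtain a first stabilized image of the image to be processed.
  10. The terminal device according to claim 9, characterized in that the processor, when executing the computer program, specifically implements the following steps:
    extracting the feature points of the first foreground image;
    obtaining a first motion trajectory curve from the feature points of the first foreground image and the feature points of a second foreground image, the second foreground image being the foreground image of a first target image, the video to be processed including the first target image, and the first target image and the image to be processed being consecutive frames in the image sequence;
    performing path smoothing on the first motion trajectory curve according to the foreground stabilization strength of the image to be processed to obtain a smoothed second motion trajectory curve;
    obtaining a stabilized first foreground image from the second motion trajectory curve and the first foreground image;
    performing edge compensation on the stabilized first foreground image to obtain the first foreground stabilized image.
  11. The terminal device according to claim 10, characterized in that the processor, when executing the computer program, specifically implements the following steps:
    by
    Figure PCTCN2021110028-appb-100003
    computing the target pixel value of each pixel to be compensated in the stabilized first foreground image;
    using the target pixel value as the pixel value of the pixel to be compensated while leaving the pixel values of the other pixels unchanged, to obtain the first foreground stabilized image, the other pixels being the pixels of the stabilized first foreground image other than the pixels to be compensated;
    where
    Figure PCTCN2021110028-appb-100004
    V(q) is the target pixel value of point q, point q is the pixel to be compensated, P(i,j) is the pixel value of the pixel in row i and column j, w(i,j) is the Gaussian weight of the pixel in row i and column j, σ is a hyperparameter, and N is a positive integer.
  12. The terminal device according to any one of claims 9 to 11, characterized in that the processor, when executing the computer program, further implements the following steps:
    extracting first edge band information of the image to be processed, the first edge band information being the pixel information of the edge band between the foreground and the background in the image to be processed;
    extracting second edge band information of the first stabilized image, the second edge band information being the pixel information of the edge band between the foreground and the background in the first stabilized image;
    adjusting the stabilization strength of the next frame or of the image to be processed according to the first edge band information and the second edge band information, the stabilization strength including the foreground stabilization strength and/or the background stabilization strength.
  13. The terminal device according to claim 12, characterized in that the processor, when executing the computer program, specifically implements the following steps:
    determining a first structural similarity index between the first edge band information and the second edge band information;
    if the first structural similarity index does not fall within a preset similarity threshold interval, adjusting the stabilization strength of the next frame or of the image to be processed.
  14. The terminal device according to claim 13, characterized in that the processor, when executing the computer program, specifically implements the following steps:
    if the first structural similarity index does not fall within the preset similarity threshold interval, multiplying the stabilization strength of the image to be processed by a preset value to obtain an adjusted first stabilization strength, and using the adjusted first stabilization strength as the stabilization strength of the next frame or of the image to be processed.
  15. The terminal device according to claim 13, characterized in that the processor, when executing the computer program, further implements the following steps:
    performing path smoothing on a first motion trajectory curve according to the adjusted first stabilization strength to obtain a smoothed third motion trajectory curve, the first motion trajectory curve being obtained from the feature points of the first foreground image and the feature points of a second foreground image, the second foreground image being the foreground image of a first target image, the video to be processed including the first target image, and the first target image and the image to be processed being consecutive frames in the image sequence;
    obtaining a second foreground stabilized image from the third motion trajectory curve and the first foreground image;
    fusing the second foreground stabilized image and the first background stabilized image to obtain a second stabilized image of the image to be processed;
    determining a second structural similarity index between the second edge band information and third edge band information, the third edge band information being the pixel information of the edge band between the foreground and the background in the second stabilized image;
    if the second structural similarity index does not fall within the preset similarity threshold interval, multiplying the adjusted first stabilization strength by the preset value to obtain an adjusted second stabilization strength;
    taking the adjusted second stabilization strength as the adjusted first stabilization strength and returning to the step of performing path smoothing on the first motion trajectory curve according to the adjusted first stabilization strength to obtain the smoothed third motion trajectory curve, until the second structural similarity index falls within the preset similarity threshold interval.
  16. The terminal device according to claim 9, characterized in that the processor, when executing the computer program, specifically implements the following steps:
    extracting the feature points of the first background image;
    obtaining a fourth motion trajectory curve from the feature points of the first background image and the feature points of a second background image, the second background image being the background image of a second target image, the video to be processed including the second target image, and the second target image and the image to be processed being consecutive frames in the image sequence;
    performing path smoothing on the fourth motion trajectory curve according to the background stabilization strength of the image to be processed to obtain a smoothed second motion trajectory curve;
    obtaining the first foreground stabilized image from the second motion trajectory curve and the first foreground image.
  17. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the video stabilization method according to any one of claims 1 to 8.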
PCT/CN2021/110028 2020-08-13 2021-08-02 视频防抖方法、终端设备和计算机可读存储介质 WO2022033344A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010811800.7 2020-08-13
CN202010811800.7A CN114079725B (zh) 2020-08-13 2020-08-13 视频防抖方法、终端设备和计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2022033344A1 true WO2022033344A1 (zh) 2022-02-17

Family

ID=80247706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/110028 WO2022033344A1 (zh) 2020-08-13 2021-08-02 视频防抖方法、终端设备和计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN114079725B (zh)
WO (1) WO2022033344A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116193275A (zh) * 2022-12-15 2023-05-30 荣耀终端有限公司 视频处理方法及相关设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104093014A (zh) * 2014-07-21 2014-10-08 宇龙计算机通信科技(深圳)有限公司 图像处理方法和图像处理装置
US20160269526A1 (en) * 2014-08-25 2016-09-15 John G. Posa Portable electronic devices with integrated image/video compositing
CN109688329A (zh) * 2018-12-24 2019-04-26 天津天地伟业信息系统集成有限公司 一种针对高精度全景视频的防抖方法
CN112637500A (zh) * 2020-12-22 2021-04-09 维沃移动通信有限公司 图像处理方法及装置
CN112738398A (zh) * 2020-12-29 2021-04-30 维沃移动通信(杭州)有限公司 一种图像防抖方法、装置和电子设备
CN112995678A (zh) * 2021-02-22 2021-06-18 深圳创维-Rgb电子有限公司 一种视频运动补偿方法、装置及计算机设备

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4783252B2 (ja) * 2006-04-18 2011-09-28 富士通株式会社 手ぶれ補正機能付き撮像装置、手ぶれ補正方法、手ぶれ補正処理の前処理プログラム、および、保存画像決定プログラム
JPWO2010100677A1 (ja) * 2009-03-05 2012-09-06 富士通株式会社 画像処理装置およびぶれ量算出方法
CN104408743A (zh) * 2014-11-05 2015-03-11 百度在线网络技术(北京)有限公司 图像分割方法和装置
CN107094230A (zh) * 2016-02-17 2017-08-25 北京金迈捷科技有限公司 一种利用多空域数据融合技术获取图像和视频的方法
US10506248B2 (en) * 2016-06-30 2019-12-10 Facebook, Inc. Foreground detection for video stabilization
WO2018201097A2 (en) * 2017-04-28 2018-11-01 FLIR Belgium BVBA Video and image chart fusion systems and methods
CN107370958B (zh) * 2017-08-29 2019-03-29 Oppo广东移动通信有限公司 图像虚化处理方法、装置及拍摄终端
CN108900769B (zh) * 2018-07-16 2020-01-10 Oppo广东移动通信有限公司 图像处理方法、装置、移动终端及计算机可读存储介质
CN110035141B (zh) * 2019-02-22 2021-07-09 华为技术有限公司 一种拍摄方法及设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116193275A (zh) * 2022-12-15 2023-05-30 荣耀终端有限公司 视频处理方法及相关设备
CN116193275B (zh) * 2022-12-15 2023-10-20 荣耀终端有限公司 视频处理方法及相关设备

Also Published As

Publication number Publication date
CN114079725B (zh) 2023-02-07
CN114079725A (zh) 2022-02-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21855397; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21855397; Country of ref document: EP; Kind code of ref document: A1)