WO2022089341A1 - Image processing method and related apparatus - Google Patents

Image processing method and related apparatus

Info

Publication number
WO2022089341A1
WO2022089341A1 PCT/CN2021/125974 CN2021125974W
Authority
WO
WIPO (PCT)
Prior art keywords
images
image
terminal
acquisition device
further configured
Application number
PCT/CN2021/125974
Other languages
English (en)
French (fr)
Inventor
任津雄
赖昌材
杨长久
郑士胜
胡红旗
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022089341A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/68Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations

Definitions

  • the present application relates to the technical field of image processing, and in particular, to an image processing method and related apparatus.
  • a cropping ratio is usually set in advance, the input video is trimmed, and a video of a smaller size is output, thereby realizing video anti-shake.
  • In some cases, the degree of shaking of the video picture exceeds the range that the cropping ratio can handle, resulting in an imageless area in the video picture, which degrades the quality of the video.
  • the present application provides an image processing method and related device.
  • N images are selected as output from the M images according to the shaking amplitude of each of the M images, where N is a positive integer less than M, so as to filter out images with large jitter, ensure as far as possible that the jitter of the video picture does not exceed the range that the cropping ratio can handle, avoid imageless areas in the video picture, and improve the quality of the video.
  • a first aspect of the present application provides an image processing method, including: a terminal acquiring a first image sequence, where the first image sequence includes M images and M is a positive integer; that is, the first image sequence may be a set of consecutive images acquired by an image acquisition device within a period of time, such as 0.5 seconds or 1 second.
  • the terminal determines a jitter amplitude corresponding to each of the M images, where the jitter amplitude is used to represent the offset of a pixel in the image compared to the corresponding pixel in the reference image.
  • the reference image may be an image captured when the image capturing device does not shake, and the reference image is an image captured by the image capturing device before the M images are captured.
  • the terminal determines N images among the M images, where N is a positive integer less than M; that is, the N images with smaller jitter amplitudes are determined among the M images, so as to screen out the images with larger jitter amplitudes.
  • the terminal outputs a second image sequence, where the second image sequence includes the N images.
  • N images are selected from the M images as output according to the jitter amplitude corresponding to each of the M images, where N is a positive integer less than M, so that images with larger jitter amplitudes are filtered out, ensuring that the shaking of the video picture does not exceed the range that the cropping ratio can handle, avoiding imageless areas in the video picture, and improving the quality of the video.
  • the determining N images among the M images according to the shaking amplitude corresponding to each of the M images includes: determining N images from the M images in order of shaking amplitude from smallest to largest, where the value of N is a first threshold. That is, the terminal selects N images among the M images, leaving M-N unselected images.
  • the N images determined by the terminal are the N images with the smallest shaking amplitudes among the M images, and the shaking amplitude of any one of the N images is smaller than that of any of the M-N unselected images.
  • By determining N images from the M images in order of jitter amplitude from smallest to largest, the one or more images with the largest jitter amplitudes in the input image sequence can be filtered out, thereby ensuring the image quality of the images used for subsequent anti-shake processing.
  • In this way, the degree of jitter does not exceed the range that the cropping ratio can handle, avoiding imageless areas in the video picture and improving the quality of the video.
  • the determining N images among the M images according to the shaking amplitude corresponding to each of the M images includes: according to the shaking amplitude and a constraint condition, determining N images among the M images in order of jitter amplitude from smallest to largest, where the value of N is the first threshold; the constraint condition is that the interval, in the first image sequence, between any two adjacent images among the obtained N images is less than a second threshold. That is, when selecting N images among the M images, in addition to selecting images in order of jitter amplitude from smallest to largest, the terminal also ensures that the interval in the first image sequence between two adjacent selected images is less than the second threshold.
  • the determining N images among the M images according to the shaking amplitude corresponding to each of the M images includes: determining, among the M images according to the shaking amplitude, N images whose jitter amplitudes are less than a third threshold.
  • a third threshold may be preset in the terminal, and the terminal may determine the images to be selected, i.e., those whose jitter amplitudes are less than the third threshold, according to the magnitude relationship between the jitter amplitude of each of the M images and the third threshold.
  • the third threshold may be a threshold determined according to a cropping ratio, where the cropping ratio is preset in the terminal and is the ratio used for cropping an image during image stabilization processing. By determining the third threshold according to the cropping ratio, it can be ensured that no imageless area appears when an image whose shaking amplitude is smaller than the third threshold is processed using the cropping ratio.
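A minimal sketch of this thresholding variant follows. The text says only that the third threshold is determined from the cropping ratio; deriving it here as the per-side cropping margin in pixels, `(1 - crop_ratio) * width / 2`, is an assumed concrete rule, and the function name is hypothetical:

```python
def frames_below_threshold(jitter, crop_ratio, width):
    """Keep the indices of frames whose jitter amplitude (in pixels) is
    below a third threshold derived from the cropping ratio, taken here
    as the per-side cropping margin (an assumed derivation)."""
    third_threshold = (1.0 - crop_ratio) * width / 2.0
    return [i for i, j in enumerate(jitter) if j < third_threshold]
```

For a 400-pixel-wide frame with a 0.9 crop ratio, the margin is about 20 pixels, so frames with 10- and 5-pixel jitter pass while a 40-pixel frame is screened out.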
  • the method further includes: when it is determined that the image capture device shakes, sending an instruction to the image capture device, where the instruction instructs the image capture device to capture images at a first frame rate; the image capture device captures images at a second frame rate when no shaking occurs, and the second frame rate is lower than the first frame rate.
  • the terminal instructs the image acquisition device to increase the frame rate of the acquired images, which ensures that the terminal acquires more input images than the terminal outputs and makes it convenient for the terminal to filter out images with large jitter. Only when the image capture device shakes does it need to increase the frame rate of the captured images, so that the image capture device is prevented from always capturing images at a higher frame rate, reducing its energy consumption.
  • the method further includes: acquiring angular velocity information of the image acquisition device at S moments in a first time period, where S is an integer greater than 1; determining the variance of the angular velocity information at the S moments; when the variance is greater than a fourth threshold, determining that the image acquisition device shakes; when the variance is less than or equal to the fourth threshold, determining that the image acquisition device does not shake.
  • the variance refers to the average of the squared differences between each angular velocity sample and the mean of all the angular velocity samples, and is used to measure how much each angular velocity sample deviates from the overall mean.
  • When the variance is large, it can be considered that the differences between the angular velocity samples and the overall mean are large, that is, the angular velocity fluctuates greatly around its mean, so it can be considered that the image acquisition device shakes.
  • the jitter amplitudes corresponding to the M images include offsets corresponding to the M images; and the terminal determining the jitter amplitudes corresponding to the M images includes: the terminal acquiring the angular velocity information of the image acquisition device at P moments in a second time period, where P is an integer greater than 1 and the image acquisition device is used to acquire the first image sequence; the terminal determining, according to the angular velocity information at the P moments, the pose information of the image acquisition device when collecting the M images; and the terminal determining the offset corresponding to each of the M images according to the pose information.
  • the terminal determining the pose information of the image acquisition device when collecting the M images according to the angular velocity information at the P moments includes: the terminal determining, through linear interpolation according to the angular velocity information at the P moments and the acquisition moments of the M images, the pose information of the image acquisition device when the M images are acquired.
  • the terminal determining the offsets corresponding to the M images according to the pose information includes: the terminal determining the rotation matrix corresponding to each of the M images according to the pose information of the image acquisition device when collecting the M images; and the terminal determining the offsets corresponding to the M images according to the rotation matrices.
  • the method further includes: the terminal acquiring an image selection ratio, where the image selection ratio is the ratio between the number of input images and the number of output images; and the terminal determining the value of N according to the M images and the image selection ratio, wherein the ratio between M and N equals the image selection ratio.
  • before the terminal outputs the second image sequence, the method further includes: the terminal performing anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images; the terminal then outputs the second image sequence, where the second image sequence includes the N processed images.
  • a second aspect of the present application provides a terminal, including an acquisition unit and a processing unit; the acquisition unit is configured to acquire a first image sequence, where the first image sequence includes M images and M is a positive integer; the processing unit is configured to determine the jitter amplitude corresponding to each of the M images, where the jitter amplitude represents the offset of the pixels in the image compared to the reference image; the processing unit is further configured to determine, according to the jitter amplitudes, N images among the M images, where N is a positive integer less than M; the processing unit is further configured to output a second image sequence, where the second image sequence includes the N images.
  • the processing unit is further configured to determine, according to the jitter amplitude, N images from the M images in order of jitter amplitude from smallest to largest, where the value of N is the first threshold.
  • the processing unit is further configured to determine, according to the jitter amplitude and a constraint condition, N images from the M images in order of jitter amplitude from smallest to largest, where the value of N is the first threshold; the constraint condition is that the interval, in the first image sequence, between two adjacent images among the obtained N images is smaller than the second threshold.
  • the processing unit is further configured to, according to the shaking amplitude, determine, among the M images, N images whose shaking amplitude is less than a third threshold.
  • the processing unit is further configured to send an instruction to the image acquisition device when it is determined that the image acquisition device shakes, where the instruction instructs the image acquisition device to collect images at a first frame rate; the image collection device collects images at a second frame rate when no shaking occurs, and the second frame rate is lower than the first frame rate.
  • the acquisition unit is further configured to acquire the angular velocity information of the image acquisition device at S moments in the first time period, where S is an integer greater than 1; the processing unit is further configured to determine the variance of the angular velocity information at the S moments, and to determine that the image acquisition device shakes when the variance is greater than a fourth threshold.
  • the jitter amplitudes corresponding to the M images include offsets corresponding to the M images; the acquisition unit is further configured to acquire the angular velocity information of the image acquisition device at P moments in the second time period, where P is an integer greater than 1 and the image acquisition device is configured to collect the first image sequence; the processing unit is further configured to determine, according to the angular velocity information at the P moments, the pose information of the image acquisition device when collecting the M images; the processing unit is further configured to determine the offset corresponding to each of the M images according to the pose information.
  • the processing unit is further configured to determine, through linear interpolation according to the angular velocity information at the P moments and the acquisition moments of the M images, the pose information of the image acquisition device when the M images are acquired.
  • the processing unit is further configured to determine a rotation matrix corresponding to each of the M images according to the pose information of the image acquisition device when acquiring the M images;
  • the processing unit is further configured to determine the offset corresponding to the M images according to the rotation matrices corresponding to the M images.
  • the acquisition unit is further configured to acquire an image selection ratio, where the image selection ratio is the ratio between the number of input images and the number of output images; the processing unit is further configured to determine the value of N according to the M images and the image selection ratio, wherein the ratio between M and N equals the image selection ratio.
  • the processing unit is further configured to perform anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images; the processing unit is further configured to output the second image sequence, where the second image sequence includes the N processed images.
  • a third aspect of an embodiment of the present application provides a terminal, including one or more processors and a memory, where computer-readable instructions are stored in the memory; the one or more processors read the computer-readable instructions in the memory to cause the terminal to implement the method according to any one of the first aspect and its various possible implementations.
  • the terminal may for example comprise a headset.
  • a fourth aspect of the embodiments of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to execute the method according to the first aspect and any one of its various possible implementations.
  • a fifth aspect of the embodiments of the present application provides a computer-readable storage medium including instructions which, when executed on a computer, cause the computer to execute the method according to the first aspect and any one of its various possible implementations.
  • a sixth aspect of the embodiments of the present application provides a chip, including a processor.
  • the processor is configured to read and execute the computer program stored in the memory to perform the method in any possible implementation manner of any of the above aspects.
  • the chip includes a memory, and the processor is connected to the memory through a circuit or a wire.
  • the chip further includes a communication interface, and the processor is connected to the communication interface.
  • the communication interface is used for receiving data and/or information to be processed, the processor obtains the data and/or information from the communication interface, processes the data and/or information, and outputs the processing result through the communication interface.
  • the communication interface may be an input-output interface.
  • FIG. 1 is a schematic diagram of cropping a video picture provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image processing method 300 provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an image processing method 400 provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a rotation model when a terminal shakes according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of selecting an image according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an image selection provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of image data comparison before path smoothing provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a terminal 100 according to an embodiment of the present application.
  • the video anti-shake algorithm includes two steps: motion path smoothing and motion compensation.
  • Motion path smoothing refers to smoothing the original motion path of the terminal by using a low-pass filter or algorithm, eliminating the jittering part that occurs during the motion process, and obtaining a smoothed motion path.
  • Motion compensation refers to obtaining motion compensation information according to the mapping relationship between the original motion path of the terminal and the smoothed motion path, so as to correct the current video frame and obtain a new stable video frame.
  • a cropping ratio needs to be set in advance, and the processed image is cropped to ensure the stability of the video image.
  • FIG. 1 is a schematic diagram of cropping a video picture provided by an embodiment of the present application.
  • a cropping window with a fixed size and proportion is required to crop the image, and the cropped image is used as the output.
  • Due to shaking, the actual position at which the terminal captures each frame of image may differ, so the position of the main object (such as the person in FIG. 1) may differ in each frame of image.
  • the position of the cropping window can be adjusted based on the compensation information of the motion, so that a relatively stable video can be obtained by cropping.
  • the position adjustment amount of the cropping window may be relatively large, which may cause part of the cropping window to fall outside the image, as with the fourth frame of image in FIG. 1. In this case, imageless areas, i.e., black borders, appear in the cropped image, thereby degrading the quality of the video.
  • the embodiment of the present application provides an image processing method applied to video anti-shake: according to the shaking amplitude corresponding to each of the M acquired images, N images are selected among the M images, where N is a positive integer less than M, so as to filter out images with large jitter, ensure that the jitter of the video picture does not exceed the range that the cropping ratio can handle, avoid imageless areas in the video picture, and improve the quality of the video.
  • FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • the image processing method provided by the embodiment of the present application may be applied to a terminal, where an image acquisition device capable of shooting video is installed.
  • When the terminal vibrates, the video picture captured by the image capture device changes, and the amount of change in the video picture is related to the jitter amplitude of the terminal.
  • the terminal is also called user equipment (UE), mobile station (MS), mobile terminal (MT), etc., and is a device equipped with an image capture device capable of shooting videos.
  • some examples of terminals are: a mobile phone, a tablet computer, a notebook computer, a PDA, a surveillance camera, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and so on.
  • the image acquisition device in the terminal is used to convert optical signals into electrical signals to generate image signals.
  • the image acquisition device may be, for example, an image sensor, such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor.
  • a device for measuring the motion of the terminal, such as an inertial measurement unit (IMU), may also be installed in the terminal.
  • IMU is a device that measures the angular rate and acceleration of an object in three axes.
  • an IMU contains three single-axis accelerometers and three single-axis gyroscopes; the accelerometers detect the acceleration signals of the object along the three independent axes of the carrier coordinate system, and the gyroscopes detect the angular velocity signals of the carrier relative to the navigation coordinate system. By measuring the angular velocity and acceleration of the object in three-dimensional space with the IMU, the attitude of the object can be calculated.
  • FIG. 3 is a schematic flowchart of an image processing method 300 provided by an embodiment of the present application. As shown in Figure 3, the image processing method 300 includes the following steps:
  • Step 301 Acquire a first image sequence, where the first image sequence includes M images, where M is a positive integer.
  • the image acquisition device in the terminal continues to acquire images, and the terminal can acquire the first image sequence acquired by the image acquisition device.
  • the first image sequence may be a group of consecutive images acquired by the image acquisition device within a period of time, eg, 0.5 seconds or 1 second.
  • the first image sequence includes M images collected by the image collection device, and the size of M is related to the frame rate at which the image collection device collects images and to the collection time corresponding to the first image sequence. For example, when the image acquisition device collects 60 images per second and the acquisition time corresponding to the first image sequence is 0.2 seconds, M is 12; when the image acquisition device collects 30 images per second and the acquisition time corresponding to the first image sequence is 0.5 seconds, M is 15.
  • the terminal may send an instruction to the image acquisition device when it is determined that the image acquisition device shakes, where the instruction instructs the image acquisition device to acquire images at a first frame rate; the image acquisition device acquires images at a second frame rate when no shaking occurs, and the second frame rate is lower than the first frame rate.
  • Initially, the image acquisition device collects images at the second frame rate, which may be, for example, 30 images per second; if the terminal determines that the image acquisition device shakes, the terminal sends the image acquisition device an instruction instructing it to acquire images at the first frame rate, so that the image acquisition device increases the frame rate of the acquired images. For example, when the first frame rate is 60 images per second, the image acquisition device raises its capture frame rate from 30 frames per second to 60 frames per second.
  • When the frame rate of the video output by the terminal is fixed and the terminal determines that the image acquisition device is shaking, instructing the image acquisition device to increase the frame rate of the acquired images ensures that the terminal obtains more input images than the terminal outputs, which makes it convenient for the terminal to filter out and eliminate images with large jitter.
  • When no shaking occurs, the frame rate of the images captured by the image capture device may be the same as the frame rate of the images output by the terminal. Only when the image capture device shakes does it need to increase the frame rate of the captured images; this prevents the image capture device from always capturing images at a higher frame rate and reduces its energy consumption.
  • the process in which the terminal determines that the image acquisition device shakes may include: the terminal acquiring the angular velocity information of the image acquisition device at S moments in the first time period, where S is an integer greater than 1; the terminal determining the variance of the angular velocity information at the S moments; when the variance is greater than a fourth threshold, determining that the image acquisition device shakes, and when the variance is less than or equal to the fourth threshold, determining that the image acquisition device does not shake.
  • the time length of the first time period can be determined according to the frequency at which the IMU collects the angular velocity.
  • the first time period can be 0.1 second.
  • When the frequency at which the IMU collects the angular velocity is fixed, the S moments in the first time period are also fixed.
  • For example, the S moments may be 10 moments.
  • the variance refers to the average of the squared differences between each angular velocity sample and the mean of all the angular velocity samples, and is used to measure how much each angular velocity sample deviates from the overall mean.
  • When the variance is large, the differences between the angular velocity samples and the overall mean are large, that is, the angular velocity fluctuates greatly around its mean, so it can be considered that the image acquisition device shakes.
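The shake test and the frame-rate decision above reduce to a population-variance check. A minimal 1-D sketch follows; the function name and the threshold value are hypothetical, while the 30/60 images-per-second rates are the example values from the text:

```python
def choose_frame_rate(angular_velocities, fourth_threshold,
                      first_rate=60, second_rate=30):
    """Detect shake via the variance of recent angular-velocity samples
    and pick the capture frame rate accordingly: the higher first rate
    while shaking, the lower second rate otherwise."""
    s = len(angular_velocities)
    mean = sum(angular_velocities) / s
    # population variance: mean squared deviation from the mean
    variance = sum((w - mean) ** 2 for w in angular_velocities) / s
    return first_rate if variance > fourth_threshold else second_rate
```

Steady samples keep the device at the second rate; a strongly fluctuating sequence (variance above the fourth threshold) switches it to the first rate.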
  • Step 302 Determine a shaking amplitude corresponding to each of the M images, where the shaking amplitude is used to represent the offset of a pixel in the image compared to the reference image.
  • each image collected by the image acquisition device has a corresponding shaking amplitude, and the shaking amplitude is used to indicate the offset of a pixel in the image compared with the corresponding pixel in the reference image. The reference image may be an image captured when the image capture device does not shake, captured by the image capture device before the M images.
  • When the image capture device does not shake, the images it captures at multiple moments are essentially the same, that is, the positions of the pixels representing the same object remain unchanged across images; when the image capture device shakes, its position changes relative to the no-shake case, so the position of each object in the scene also changes in the captured images, that is, the positions of the pixels representing the same object differ across images.
  • the terminal may acquire one or more images collected by the image acquisition device before the shake occurs, and select one of them as the reference image. Then, for any one of the M images, a pixel in the image (for example, the pixel at the center point) can find a corresponding pixel in the reference image (i.e., a pixel representing the same object). The terminal determines the shaking amplitude of an image by determining the offset of its pixels compared to the corresponding pixels in the reference image.
  • the process of determining the jitter amplitude corresponding to the M images by the terminal may include:
  • the terminal obtains, through the IMU, the angular velocity information of the image acquisition device at P moments within a certain period of time, where P is an integer greater than 1, and the time intervals between adjacent moments among the P moments may be equal. For example, when the frequency of the IMU is 100 Hz, the terminal can obtain the angular velocity information of the image acquisition device at 10 moments within 0.1 second, and the time interval between every two adjacent moments is 0.01 second.
  • the terminal may determine the pose information of the image acquisition device when collecting the M images according to the angular velocity information at the P times.
  • Specifically, the pose change of the image acquisition device during each time interval can be determined from the angular velocity information at each moment and the time interval between adjacent moments. By superimposing the pose changes over the successive time intervals, the pose information of the image acquisition device at the P moments can be obtained.
• for example, the pose information of the image acquisition device at time 1 can be obtained based on the angular velocity information at time 1 and the time interval t; the pose information at time 1, the angular velocity information at time 2, and the time interval t can be superimposed to obtain the pose information of the image acquisition device at time 2; similarly, the pose information of the image acquisition device at time 3 can be obtained by superimposing the pose information corresponding to time 2, the angular velocity information at time 3, and the time interval t.
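• The superposition described above can be sketched as follows. This is an illustrative one-dimensional simplification (a single rotation axis with a fixed sampling interval); the function and variable names are not from the source.

```python
# Accumulate gyroscope angular-velocity samples into per-moment pose estimates.
# One-dimensional simplification: the pose is a single angle in radians.

def integrate_poses(angular_velocities, dt):
    """Given angular velocities (rad/s) at P consecutive moments spaced dt
    seconds apart, return the accumulated pose (angle, rad) at each moment."""
    poses = []
    pose = 0.0
    for w in angular_velocities:
        pose += w * dt          # pose change over this time interval
        poses.append(pose)
    return poses

# Example: a constant rate of 0.1 rad/s sampled every 0.01 s (100 Hz IMU)
print(integrate_poses([0.1, 0.1, 0.1], 0.01))
```

• With a real IMU each axis would be integrated separately (or a quaternion or rotation-matrix update would be used), but the superposition of per-interval pose changes is the same.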
• since the moments at which the image acquisition device captures images generally do not coincide with the moments at which the IMU samples angular velocity, the pose information corresponding to the moment at which the image acquisition device collects an image can be determined based on the pose information corresponding to the moments at which the IMU collects the angular velocity information.
  • the terminal may determine the pose information of the image acquisition device when acquiring the M images by using a linear interpolation method according to the angular velocity information of the P moments and the acquisition moments of the M images.
• the linear interpolation method approximates the original function by the straight line passing through two interpolation nodes, so that the value corresponding to any point on that line can be determined.
• in this way, for each moment at which the image acquisition device collects an image, the corresponding pose information is obtained.
  • the terminal may determine the offset corresponding to the M images according to the pose information corresponding to the M images, and the offset may actually be the offset of the pixels in the images.
• the terminal may perform motion estimation on the image acquisition device, for example based on the Rodrigues formula, and determine the rotation matrix corresponding to each of the M images according to its pose information.
• the terminal then performs transformation processing on a coordinate point in each of the M images according to the corresponding rotation matrix to obtain M transformed coordinate points, and calculates the offset between the transformed coordinate point and the coordinate point before transformation in each image to determine the offsets corresponding to the M images; the offset is the shaking amplitude of the image.
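• The rotation-matrix construction and offset computation can be sketched as follows. The Rodrigues formula is as named in the text; the axis, angle, and coordinate point below are illustrative values, not parameters from the source.

```python
import math

def rodrigues(axis, theta):
    """Rotation matrix from an axis-angle pose via the Rodrigues formula.
    axis: unit 3-vector; theta: rotation angle in radians."""
    x, y, z = axis
    c, s = math.cos(theta), math.sin(theta)
    C = 1.0 - c
    return [
        [c + x * x * C,     x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C,     y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]

def point_offset(R, point):
    """Euclidean distance between a coordinate point and its rotated image;
    this distance serves as the shaking amplitude of the image."""
    px, py, pz = point
    qx = R[0][0] * px + R[0][1] * py + R[0][2] * pz
    qy = R[1][0] * px + R[1][1] * py + R[1][2] * pz
    qz = R[2][0] * px + R[2][1] * py + R[2][2] * pz
    return math.sqrt((qx - px) ** 2 + (qy - py) ** 2 + (qz - pz) ** 2)

# A small 0.01 rad rotation about the z-axis applied to the point (100, 0, 1)
R = rodrigues((0.0, 0.0, 1.0), 0.01)
print(point_offset(R, (100.0, 0.0, 1.0)))
```

• A zero-angle rotation yields the identity matrix and an offset of zero, matching the intuition that an unshaken image has no shaking amplitude.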
  • the terminal may also acquire the angular velocity information of the image acquisition device when each image is acquired through the IMU, and then calculate the pose information of the image acquisition device when each image is acquired.
• Step 303 Determine N images from the M images according to the jitter amplitude, where N is smaller than M, and N is a positive integer.
• N images with smaller shaking amplitudes may be determined among the M images according to the shaking amplitude of each of the M images, in order to filter out images with large jitter.
• in a possible implementation, the terminal may determine the N images among the M images in any of the following modes.
• Mode 1: The terminal determines N images from the M images in ascending order of shaking amplitude according to the shaking amplitudes corresponding to the M images, where the value of N is the first threshold.
• that is, the terminal selects N images among the M images, leaving M-N unselected images. The N images determined by the terminal are the N images with the smallest shaking amplitudes among the M images, and the shaking amplitude corresponding to any one of the N images is smaller than the shaking amplitude corresponding to any of the M-N unselected images.
• the value of N is the first threshold, and the first threshold may be determined by the terminal before selecting images. For example, the terminal may determine N based on a preset proportional relationship between M and N and the number of acquired images (i.e., M). Exemplarily, when M is 60 and the ratio of M to N is 2 to 1, the terminal may determine that N is 30.
• for example, suppose the terminal needs to determine 3 of 5 images (e.g., image A1, image A2, image A3, image A4, and image A5), and the jitter amplitudes corresponding to the 5 images are 1, 2, 3, 4, and 5, respectively. The terminal can then determine the 3 images with the smallest shaking amplitudes among the 5 images, that is, image A1 with shaking amplitude 1, image A2 with shaking amplitude 2, and image A3 with shaking amplitude 3.
• by determining N images from the M images in ascending order of jitter amplitude, the one or more images with the largest jitter amplitudes in the input image sequence can be filtered out, thereby ensuring the image quality of the images used for subsequent anti-shake processing. In this way, the degree of jitter will not exceed the range that the cropping ratio can handle, which avoids imageless areas in the video picture and improves the quality of the video.
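• A minimal sketch of Mode 1, assuming the jitter amplitudes have already been computed; the function name and the index-based representation are illustrative, not from the source.

```python
def select_n_smallest(amplitudes, n):
    """Mode 1: return the indices (in original capture order) of the n
    images whose jitter amplitudes are smallest."""
    ranked = sorted(range(len(amplitudes)), key=lambda i: amplitudes[i])
    return sorted(ranked[:n])   # restore capture order for output

# The example from the text: 5 images A1..A5 with amplitudes 1..5, keep 3
print(select_n_smallest([1, 2, 3, 4, 5], 3))   # indices of A1, A2, A3
```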
• Mode 2: The terminal determines N images among the M images in ascending order of jitter amplitude according to the shaking amplitudes corresponding to the M images and a constraint condition, where N is the first threshold; the constraint condition is that the interval in the first image sequence between two adjacent images among the selected N images is not greater than the second threshold.
• that is to say, in the process of selecting N images among the M images, the terminal not only selects images in ascending order of jitter amplitude, but also ensures that the interval in the first image sequence between any two adjacent selected images is not greater than the second threshold.
• the value of N is the first threshold, and the first threshold may be determined by the terminal before selecting images. For example, the terminal may determine N based on a preset proportional relationship between M and N and the number of acquired images (i.e., M). Exemplarily, when M is 60 and the ratio of M to N is 2 to 1, the terminal may determine that N is 30.
• the M images in the first image sequence are sequentially acquired by the image acquisition device in time order, and the time interval between every two adjacent images is fixed. Therefore, among the selected N images, if the interval in the first image sequence between two adjacent selected images is relatively large, the time interval between those two images is also relatively large. In that case, when moving objects in the scene move at a relatively high speed, their positions in the two images may deviate greatly, resulting in a stuttering phenomenon in the video composed of such images, which affects the viewing experience.
• the value of the second threshold may be determined according to the time interval at which the image acquisition device acquires images. For example, when the time interval at which the image acquisition device collects images is relatively large, the second threshold may take a relatively small value to ensure that the time interval between two selected images stays within a certain range; when the time interval at which the image acquisition device collects images is relatively small, the second threshold may take a relatively large value. Exemplarily, when the time interval at which the image capturing device collects images is 0.02 seconds, the second threshold may be 2 or 3; when the time interval is 0.01 seconds, the second threshold may be 4 or 5.
• for example, suppose the jitter amplitudes corresponding to the six images B1 to B6 in the first image sequence are {1, 5, 4, 3, 2, 1} respectively, the terminal needs to select three images from the first image sequence (that is, M is 6 and N is 3), and the constraint condition is that the interval in the first image sequence between two adjacent selected images is not greater than 2. It can be seen that if the terminal selected the 3 images purely in ascending order of jitter amplitude, it would select image B1, image B5, and image B6, with jitter amplitudes of 1, 2, and 1 respectively.
• when the terminal instead selects 3 images from the 6 images according to both the corresponding shaking amplitudes and the constraint condition, the terminal selects images B1, B4, and B6, which satisfy the condition: the interval between image B1 and image B4 is not greater than 2, and the interval between image B4 and image B6 is likewise not greater than 2.
• in this way, the one or more images with the largest jitter amplitudes in the input image sequence can be filtered out while the interval between adjacent selected images is bounded, which prevents imageless areas from appearing in the video picture, ensures that the video does not stutter, and improves the quality of the video.
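• A brute-force sketch of Mode 2 on the B1..B6 example, assuming the "interval" between two adjacent selected images is the number of images lying between them in the first image sequence (an interpretation consistent with the example above). For long queues, the dynamic-programming approach described for method 400 scales better.

```python
from itertools import combinations

def best_constrained_selection(amplitudes, n, max_interval):
    """Exhaustively pick n images minimising total jitter amplitude,
    subject to at most max_interval images lying between any two
    adjacent selected images in the sequence."""
    best, best_sum = None, float("inf")
    for combo in combinations(range(len(amplitudes)), n):
        if all(b - a - 1 <= max_interval for a, b in zip(combo, combo[1:])):
            total = sum(amplitudes[i] for i in combo)
            if total < best_sum:
                best, best_sum = combo, total
    return best

# The text's example: amplitudes of B1..B6 are {1, 5, 4, 3, 2, 1}; pick 3
# with at most 2 images between adjacent selections -> B1, B4, B6
print(best_constrained_selection([1, 5, 4, 3, 2, 1], 3, 2))
```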
• Mode 3: A third threshold may be preset in the terminal, and the terminal may determine, according to the relationship between the jitter amplitude of each of the M images and the third threshold, the images whose jitter amplitudes are smaller than the third threshold as the images to be selected.
• the third threshold may be a threshold determined according to a cropping ratio, where the cropping ratio is preset in the terminal and is the ratio by which an image is cropped during image stabilization processing. By determining the third threshold according to the cropping ratio, it can be ensured that no imageless area will appear when an image whose shaking amplitude is smaller than the third threshold is processed using the cropping ratio.
• the third threshold may be, for example, 5 pixels (i.e., an offset of a distance of 5 pixels).
• the terminal may acquire an image selection ratio, where the image selection ratio is the ratio between the number of images input and the number of images output; the image selection ratio may, for example, be preset in the terminal. After acquiring the M images in the first image sequence, the terminal determines, according to the image selection ratio, to select and output N images from the M images; the ratio of M to N is the same as the image selection ratio.
• for example, if the image selection ratio preset in the terminal is 2 to 1, then after acquiring 10 images in the first image sequence, the terminal may determine according to the image selection ratio to output 5 images.
  • Step 304 Output a second image sequence, where the second image sequence includes the N images.
• the N images can constitute a new image sequence, that is, the second image sequence, and the terminal outputs the second image sequence to realize video output.
  • the ordering of the N images in the second image sequence is the same as the ordering of the N images in the first image sequence. That is to say, the second image sequence can be understood as an image sequence obtained by removing M-N images from the first image sequence.
  • the terminal may further perform anti-shake processing on the second image sequence.
• the terminal may perform anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images; the terminal then outputs the second image sequence, where the second image sequence includes the N processed images.
  • the manner in which the terminal performs anti-shake processing on the image may include, for example, performing motion path smoothing on the image.
• the terminal may perform smoothing processing (e.g., Gaussian smoothing) on the N images according to the rotation matrix corresponding to each of the N images, thereby obtaining a stable video.
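• A sketch of what such smoothing could look like. The text applies smoothing to per-image rotations; this simplification smooths a scalar motion path instead, and the kernel width and values are illustrative assumptions.

```python
import math

def gaussian_smooth(path, sigma=1.0, radius=2):
    """Smooth a 1-D motion path (e.g. per-frame camera angles) with a
    truncated Gaussian kernel; weights are renormalised at the edges."""
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    out = []
    for i in range(len(path)):
        acc, norm = 0.0, 0.0
        for k in range(-radius, radius + 1):
            j = i + k
            if 0 <= j < len(path):
                w = kernel[k + radius]
                acc += w * path[j]
                norm += w
        out.append(acc / norm)
    return out

# A jittery sequence of deflection angles becomes noticeably flatter
print(gaussian_smooth([10, 12, 8, 11, 9, 10]))
```

• Because every output value is a convex combination of neighbouring inputs, the smoothed path always stays within the range of the original path.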
• the terminal determines to output N images from among the M images, so the images output by the terminal have a certain delay, and the delay time is related to the value of M.
• the value of M can therefore be adjusted according to the delay requirements. For example, when real-time requirements are high, the value of M can be a relatively small value so that the image output delay is small; when real-time requirements are low, the value of M can be a relatively large value.
  • FIG. 4 is a schematic flowchart of an image processing method 400 provided by an embodiment of the present application. As shown in Figure 4, the image processing method includes the following steps:
  • Step 401 detecting the motion state of the terminal.
  • the terminal can acquire the angular velocity information measured by the gyroscope installed in the terminal in real time. Then, the terminal performs variance calculation based on the acquired angular velocity information to determine the motion state of the terminal.
• the rate of the terminal can be calculated by Equation 1 below:
• g_t = sqrt(w_x(t)^2 + w_y(t)^2 + w_z(t)^2)  (Equation 1)
• where w_x(t), w_y(t), and w_z(t) are the three axis components of the angular velocity measured by the gyroscope at time t.
• the terminal can calculate the historical rate sequence {g_t0, g_t1, ..., g_tN} of the gyroscope in the time period (t_0, t_N), and the terminal can further calculate the variance of the historical rate sequence to determine whether the terminal jitters.
• the terminal can calculate the variance of the historical rate sequence. If the variance is greater than the threshold g_thre, it can be determined that the terminal is shaking at the current time t_N; if the variance is not greater than the threshold g_thre, it can be determined that the terminal is not shaking at the current time t_N.
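• The variance test can be sketched as follows; the window of rate magnitudes and the threshold value are illustrative, and g_thre corresponds to the threshold named in the text.

```python
def is_shaking(rate_window, g_thre):
    """Decide shake from the variance of a window of gyroscope rate
    magnitudes: a variance above the threshold means the terminal shakes."""
    n = len(rate_window)
    mean = sum(rate_window) / n
    variance = sum((g - mean) ** 2 for g in rate_window) / n
    return variance > g_thre

steady = [0.010, 0.012, 0.011, 0.009, 0.010]   # near-constant rates
shaky = [0.010, 0.500, 0.020, 0.450, 0.050]    # strongly varying rates
print(is_shaking(steady, 0.001), is_shaking(shaky, 0.001))
```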
  • the terminal may send an instruction to the image capture device to instruct the image capture device to increase the frame rate of the captured image, thereby enabling the high frame rate mode.
• the image acquisition device can increase the frame rate of the acquired images to acquire 60 images per second, in which case the current frame rate mode is 60 frames per second.
  • the terminal can acquire the images acquired by the image acquisition device in real time, and the terminal can determine motion information corresponding to each image based on the angular velocity information measured in real time by the gyroscope.
• the gyroscope data sequence {w_t0, w_t1, ..., w_tN} corresponding to the time period (t_0, t_N) can be obtained based on the gyroscope in the terminal.
• the terminal pose information θ_tN at time t_N can then be obtained by accumulating the angular velocity samples over the successive sampling intervals, as shown in Equation 2:
• θ_tN = Σ_{i=0..N-1} w_ti · (t_{i+1} - t_i)  (Equation 2)
• time synchronization between the gyroscope data and the image data can then be performed to obtain the terminal pose information corresponding to the moment at which the image acquisition device collects each image.
• suppose the acquisition moment of an image is t_f, t_a and t_b are the two adjacent acquisition moments of the gyroscope data such that t_a ≤ t_f ≤ t_b, and t_b - t_a = t_d. The terminal pose information corresponding to time t_a and time t_b obtained based on Equation 2 is expressed as θ_ta and θ_tb respectively. Then, based on the above linear interpolation method, the terminal pose information θ_tf corresponding to time t_f can be determined as shown in Equation 3:
• θ_tf = θ_ta + ((t_f - t_a) / t_d) · (θ_tb - θ_ta)  (Equation 3)
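• The linear interpolation step can be sketched directly; the numeric values below are illustrative, and the pose is again simplified to a scalar angle.

```python
def interp_pose(theta_a, theta_b, t_a, t_b, t_f):
    """Linearly interpolate the pose at image-capture time t_f from the
    poses at the two neighbouring gyroscope sample times t_a and t_b."""
    t_d = t_b - t_a
    return theta_a + (t_f - t_a) / t_d * (theta_b - theta_a)

# An image captured exactly midway between two gyroscope samples takes
# the midpoint of the two poses.
print(interp_pose(1.0, 3.0, 0.00, 0.01, 0.005))
```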
  • FIG. 5 is a schematic diagram of a rotation model when a terminal shakes according to an embodiment of the present application.
• each image collected by the image collection device lies on a different plane, and the planes can be related to one another through a rotation matrix R. Therefore, motion estimation of the terminal can be performed according to the Rodrigues formula; that is, the rotation matrix R corresponding to an image is obtained based on the pose information corresponding to the image at time t_f.
  • the images can be sent to the buffer queue one by one, so that the images in the buffer queue can be uniformly processed after the number of images in the buffer queue reaches the set number.
  • the length of the buffer queue determines the delay degree of the image output. The shorter the buffer queue, the lower the delay; the longer the buffer queue, the higher the delay.
  • the terminal needs to select N images according to the jitter amplitude of the images among the M images in the buffer queue.
• the interval between two adjacent images among the selected N images in the buffer queue must be less than the maximum frame interval X, where X may be a preset value.
  • a rotation matrix can be used to convert all the images into the same coordinate system to calculate the degree of their deviation.
• the terminal can determine the center point (x_m, y_m) of image m in the buffer queue (image m can be any one of the M images), and then use the rotation matrix R_m corresponding to image m to perform coordinate transformation to obtain the transformed coordinates (x'_m, y'_m). The Euclidean distance between the original coordinates (x_m, y_m) and the transformed coordinates (x'_m, y'_m) is then calculated and recorded as the offset c_m.
• the unit of the offset c_m is the number of pixels; that is, the offset c_m can indicate by how many pixels a certain pixel in the current image (namely, the pixel at the center point) is offset relative to the corresponding pixel in an image captured without shaking.
• for example, suppose the pixel at the center point of image 1 is pixel 1, and image 2 contains a pixel 2 corresponding to pixel 1; that is, pixel 1 and pixel 2 both represent the same object in the same scene. Image 1 is the image collected by the image acquisition device when shaking occurs, and image 2 is the image collected when the image acquisition device does not shake. The offset of image 1 can then be obtained by calculating by how many pixels pixel 1 is offset relative to pixel 2 (that is, by how many pixels the position of pixel 1 differs from the position of pixel 2).
• in this way, the jitter amplitude corresponding to each image is represented by the offset c_m.
• in addition, the terminal may also acquire, through the IMU, the angular velocity information of the image acquisition device when each image is acquired, and then calculate the pose information of the image acquisition device when each image is acquired. By comparing the pose information of any image with the pose information of the image acquisition device when the reference image was collected, and calculating the pose change amount between the two, the shaking amplitude of each of the M images can be determined.
• the jitter amplitude c_-L corresponding to the image f_-L can be set to 0, so that this image is always selected.
• in addition, the jitter amplitudes {c_-L+1, ..., c_-1} corresponding to the image sequence {f_-L+1, ..., f_-1} can be set to positive infinity to ensure that the image sequence {f_-L+1, ..., f_-1} will not be selected.
  • FIG. 6 is a schematic diagram of selecting an image according to an embodiment of the present application.
• the image sequence S_new = {f_-3, ..., f_-1, f_1, ..., f_M} can thus be constructed.
• the terminal may select N+1 images from S_new.
• the first image selected by the terminal is the last image selected in the previous round of the buffer queue and belongs to the output sequence of the previous round; the last N selected images serve as the output sequence of the current round of the buffer queue.
• the terminal can use a dynamic programming algorithm to solve this selection problem. That is to say, the terminal can select N+1 images from the S_new queue based on the dynamic programming algorithm according to the jitter amplitude c_m of each image, such that the sum of the jitter amplitudes of the N+1 images is smallest while the distance between any two adjacent selected images is not greater than X. In this way, the last N of the N+1 selected images are the selected images corresponding to the current round of the buffer queue.
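• One possible dynamic-programming formulation is sketched below. The state layout and the convention that the carried-over image sits at index 0 with jitter amplitude 0 are illustrative assumptions, not the patent's exact algorithm.

```python
def dp_select(amplitudes, n, max_gap):
    """Choose n images minimising total jitter amplitude, with any two
    adjacent chosen indices at most max_gap apart; index 0 (the image
    carried over from the previous round) is always chosen.
    cost[k][i] is the minimum total jitter of a chain of k chosen
    images ending at index i."""
    INF = float("inf")
    m = len(amplitudes)
    cost = [[INF] * m for _ in range(n + 1)]
    prev = [[-1] * m for _ in range(n + 1)]
    cost[1][0] = amplitudes[0]
    for k in range(2, n + 1):
        for i in range(1, m):
            for j in range(max(0, i - max_gap), i):
                if cost[k - 1][j] + amplitudes[i] < cost[k][i]:
                    cost[k][i] = cost[k - 1][j] + amplitudes[i]
                    prev[k][i] = j
    # cheapest chain of length n; walk predecessor links back to index 0
    end = min(range(m), key=lambda i: cost[n][i])
    chain, k, i = [], n, end
    while i != -1:
        chain.append(i)
        i = prev[k][i]
        k -= 1
    return chain[::-1]

# Carried-over image (amplitude forced to 0) plus six candidates; choose
# 4 images in total with adjacent picks at most 3 indices apart.
print(dp_select([0, 1, 5, 4, 3, 2, 1], 4, 3))
```

• Dropping the first element of the returned chain yields the N output images of the current round.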
  • FIG. 7 is a schematic diagram of an image selection provided by an embodiment of the present application.
• in the example of FIG. 7, the buffer queue contains four images V_0 to V_3, and the maximum frame interval is 2.
• the jitter amplitudes of V_0 and V_1 are both smaller than those of V_2 and V_3, so the terminal selects V_0 and V_1, which have the smaller jitter amplitudes, from the four images as the output images of this round of the buffer queue.
  • the motion path of the image acquisition device needs to be smoothed, and the image is corrected according to the pose corresponding to the image acquisition device on the smoothed motion path.
  • FIG. 8 is a schematic diagram of image data comparison before path smoothing provided by an embodiment of the present application.
• in FIG. 8, the abscissa is time (in milliseconds) and the ordinate is the deflection angle corresponding to the image acquisition device.
• the wavy solid line is the motion path of the original image acquisition device, along which the video image shakes.
• by smoothing the original poses y_i (i = 1, ..., n), the smoothed poses y'_i are obtained, thereby forming a virtual motion path.
• the virtual camera route is the smooth dashed line segment in the middle, in which path noise and jitter are basically eliminated.
• by comparing the two paths, the angle that should be corrected for each image can be obtained, that is, the rotation from pose y_i to pose y'_i.
• for each image, its corrected image can be obtained from Equation 4:
• f'_i = K · R'_i · R_i^{-1} · K^{-1} · f_i  (Equation 4)
• where f_i and f'_i are the pixel coordinates (in homogeneous form) of the image before and after correction, R'_i is the rotation matrix of the pose of the image acquisition device after correction, R_i is the rotation matrix of the corresponding pose of the image acquisition device before correction, and K is the internal parameter matrix of the image acquisition device.
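• The per-pixel correction can be sketched as a homography. The intrinsic matrix values below are illustrative assumptions; with identity rotations the correction reduces (numerically) to the identity, which serves as a sanity check.

```python
def matmul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def warp_pixel(H, x, y):
    """Apply a 3x3 homography to pixel (x, y) in homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

# Illustrative intrinsics: focal length 800, principal point (320, 240)
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
K_inv = [[1 / 800, 0.0, -320 / 800], [0.0, 1 / 800, -240 / 800], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# H = K . R'_i . R_i^{-1} . K^{-1}; identity rotations here for the check
H = matmul(matmul(K, matmul(I3, I3)), K_inv)
print(warp_pixel(H, 100.0, 50.0))
```

• In a real pipeline R'_i and R_i would come from the smoothed and original poses, and the homography would be applied to every pixel (or to the image as a whole by a warping routine).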
• a video frame interpolation algorithm or a video repair method can be used for repairing.
• for example, the image can be repaired through the video frame interpolation algorithm, that is, interpolation processing is performed using the adjacent previous complete image and next complete image to obtain the repaired image; alternatively, the image can be repaired through a video repair method, that is, a plurality of adjacent images are used to predict and fill in the image picture to obtain the repaired image.
  • FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
• the terminal includes an acquisition unit 901 and a processing unit 902. The acquisition unit 901 is configured to acquire a first image sequence, where the first image sequence includes M images and M is a positive integer. The processing unit 902 is configured to determine the jitter amplitude corresponding to each of the M images, where the jitter amplitude represents the offset of the pixels in an image compared to a reference image. The processing unit 902 is further configured to determine N images among the M images according to the jitter amplitude, where N is smaller than M and N is a positive integer. The processing unit 902 is further configured to output a second image sequence, where the second image sequence includes the N images.
• the processing unit 902 is further configured to determine, according to the jitter amplitude, N images from the M images in ascending order of jitter amplitude, where the value of N is the first threshold.
• the processing unit 902 is further configured to determine, according to the jitter amplitude and the constraint condition, N images from the M images in ascending order of jitter amplitude, where the value of N is the first threshold; the constraint condition is that the interval in the first image sequence between two adjacent images among the obtained N images is smaller than the second threshold.
  • the processing unit 902 is further configured to, according to the shaking amplitude, determine, among the M images, N images whose shaking amplitude is less than a third threshold.
• the processing unit 902 is further configured to, when it is determined that the image capture device shakes, send an instruction to the image capture device, where the instruction is used to instruct the image capture device to collect images at the first frame rate; the image capture device collects images at a second frame rate when no shaking occurs, and the second frame rate is smaller than the first frame rate.
• the acquiring unit 901 is further configured to acquire the angular velocity information of the image acquisition device at S moments in the first time period, where S is an integer greater than 1; the processing unit 902 is further configured to determine the variance of the angular velocity information at the S moments, to determine that the image acquisition device shakes when the variance is greater than a fourth threshold, and to determine that the image acquisition device does not shake when the variance is less than or equal to the fourth threshold.
• the jitter amplitudes corresponding to the M images include the offsets corresponding to the M images; the acquiring unit 901 is further configured to acquire the angular velocity information of the image acquisition device at P moments in the second time period, where P is an integer greater than 1 and the image acquisition device is used to collect the first image sequence; the processing unit 902 is further configured to determine, according to the angular velocity information at the P moments, the pose information of the image acquisition device when collecting the M images; the processing unit 902 is further configured to determine, according to the pose information, the offset corresponding to each of the M images.
• the processing unit 902 is further configured to determine, by linear interpolation according to the angular velocity information at the P moments and the acquisition moments of the M images, the pose information of the image acquisition device when collecting the M images.
• the processing unit 902 is further configured to determine a rotation matrix corresponding to each of the M images according to the pose information of the image acquisition device when acquiring the M images; the processing unit 902 is further configured to determine the offsets corresponding to the M images according to the rotation matrices corresponding to the M images.
• the obtaining unit 901 is further configured to obtain an image selection ratio, where the image selection ratio is the ratio between the number of images input and the number of images output; the processing unit 902 is further configured to determine the value of N according to the M images and the image selection ratio, where the ratio between M and N is the same as the image selection ratio.
• the processing unit 902 is further configured to perform anti-shake processing on the N images according to the anti-shake algorithm to obtain N processed images; the processing unit 902 is further configured to output the second image sequence, where the second image sequence includes the N processed images.
  • FIG. 10 is a schematic structural diagram of a terminal 100 according to an embodiment of the present application.
  • the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, Antenna 1, Antenna 2, Mobile Communication Module 150, Wireless Communication Module 160, Audio Module 170, Speaker 170A, Receiver 170B, Microphone 170C, Headphone Interface 170D, Sensor Module 180, Key 190, Motor 191, Indicator 192, Camera 193, Display screen 194, and subscriber identification module (subscriber identification module, SIM) card interface 195 and so on.
• the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
• the terminal 100 may include more or fewer components than shown, some components may be combined or separated, or a different component arrangement may be used.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
• the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices or may be integrated in one or more processors.
  • the controller may be the nerve center and command center of the terminal 100 .
  • the controller can generate an operation control signal according to the instruction operation code and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is cache memory. This memory may hold instructions or data that have just been used or recycled by the processor 110 . If the processor 110 needs to use the instruction or data again, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby increasing the efficiency of the system.
  • the processor 110 may include one or more interfaces.
• the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic illustration, and does not constitute a structural limitation of the terminal 100 .
  • the terminal 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110 , the internal memory 121 , the external memory, the display screen 194 , the camera 193 , and the wireless communication module 160 .
  • the wireless communication function of the terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • the terminal 100 may communicate with other devices using a wireless communication function.
  • the terminal 100 may communicate with the second electronic device, the terminal 100 establishes a screen projection connection with the second electronic device, and the terminal 100 outputs the screen projection data to the second electronic device.
  • the screen projection data output by the terminal 100 may be audio and video data.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in terminal 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 may provide a wireless communication solution including 2G/3G/4G/5G, etc. applied on the terminal 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves from the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modulation and demodulation processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modulation and demodulation processor, and then convert it into electromagnetic waves and radiate them out through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the same device as at least part of the modules of the processor 110 .
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low frequency baseband signal is processed by the baseband processor and passed to the application processor.
  • the application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110, and may be provided in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
  • WLAN wireless local area network
  • Wi-Fi wireless fidelity
  • BT bluetooth
  • GNSS global navigation satellite system
  • FM frequency modulation
  • NFC near field communication
  • IR infrared
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2 .
  • the antenna 1 of the terminal 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the terminal 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • GPS global positioning system
  • GLONASS global navigation satellite system
  • BDS Beidou navigation satellite system
  • QZSS quasi-zenith satellite system
  • SBAS satellite based augmentation systems
  • the terminal 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • Display screen 194 is used to display images, videos, and the like.
  • Display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Miniled, a MicroLed, a Micro-oLed, a quantum dot light-emitting diode (QLED), and so on.
  • the terminal 100 may include one or N display screens 194 , where N is a positive integer greater than one.
  • the display screen 194 may be used to display various interfaces output by the system of the terminal 100 .
  • for each interface output by the terminal 100, reference may be made to related descriptions in subsequent embodiments.
  • the terminal 100 can realize the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used to process the data fed back by the camera 193 .
  • when the shutter is opened, light is transmitted through the lens to the camera photosensitive element, which converts the optical signal into an electrical signal and transmits the electrical signal to the ISP for processing, converting it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin tone.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • the object is projected through the lens to generate an optical image onto the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • CMOS complementary metal-oxide-semiconductor
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the terminal 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
  • a digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals.
  • Video codecs are used to compress or decompress digital video.
  • Terminal 100 may support one or more video codecs.
  • the terminal 100 can play or record videos in various encoding formats, such as: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • MPEG moving picture experts group
  • the NPU is a neural-network (NN) computing processor.
  • NN neural-network
  • Applications such as intelligent cognition of the terminal 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, to save files such as music and videos in the external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the terminal 100 by executing the instructions stored in the internal memory 121 .
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the like.
  • the storage data area may store data (such as audio data, phone book, etc.) created during the use of the terminal 100 and the like.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the terminal 100 may implement audio functions, such as music playback and recording, through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like.
  • the audio module 170 can be used to play the sound corresponding to the video. For example, when the display screen 194 displays a video playing screen, the audio module 170 outputs the sound of the video playing.
  • the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal.
  • the speaker 170A, also referred to as a “loudspeaker”, is used to convert audio electrical signals into sound signals.
  • the receiver 170B, also referred to as an “earpiece”, is used to convert audio electrical signals into sound signals.
  • the microphone 170C, also called a “mike” or “mic”, is used to convert sound signals into electrical signals.
  • the earphone jack 170D is used to connect wired earphones.
  • the earphone interface 170D can be the USB interface 130, or can be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • OMTP open mobile terminal platform
  • CTIA cellular telecommunications industry association of the USA
  • the pressure sensor 180A is used to sense pressure signals, and can convert the pressure signals into electrical signals.
  • the pressure sensor 180A may be provided on the display screen 194 .
  • the gyro sensor 180B may be used to determine the motion attitude of the terminal 100 .
  • the air pressure sensor 180C is used to measure air pressure.
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the terminal 100 in various directions (including three axes or six axes). When the terminal 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the terminal posture, and can be applied in landscape/portrait switching, pedometers, and other applications.
  • Distance sensor 180F for measuring distance.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the temperature sensor 180J is used to detect the temperature.
  • Touch sensor 180K also called “touch panel”.
  • the touch sensor 180K may be disposed on the display screen 194 , and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to touch operations may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the terminal 100 , which is different from the position where the display screen 194 is located.
  • the keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys.
  • the terminal 100 may receive key input and generate key signal input related to user settings and function control of the terminal 100 .
  • Motor 191 can generate vibrating cues.
  • the indicator 192 can be an indicator light, which can be used to indicate the charging state, the change of the power, and can also be used to indicate a message, a missed call, a notification, and the like.
  • the SIM card interface 195 is used to connect a SIM card.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division manners.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.


Abstract

Embodiments of this application disclose an image processing method, relating to the field of image processing technologies. The method in the embodiments of this application includes: obtaining a first image sequence, where the first image sequence includes M images and M is a positive integer; determining a jitter amplitude for each of the M images, where the jitter amplitude represents the offset of pixels in an image relative to a reference image; determining, based on the jitter amplitudes, N images among the M images, where N is less than M and N is a positive integer; and outputting a second image sequence including the N images. The method prevents image-free regions from appearing in the video picture and improves video quality.

Description

Image processing method and related apparatus
This application claims priority to Chinese Patent Application No. 202011193237.8, filed with the China National Intellectual Property Administration on October 30, 2020 and entitled "Image processing method and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of image processing technologies, and in particular, to an image processing method and a related apparatus.
Background
With the development of society, people increasingly use various terminals to shoot video, in both the consumer field and the video surveillance field. While a terminal is shooting video, motion of the photographer, or vibration of the pole on which the terminal is mounted caused by an external force, makes the video picture shake, so satisfactory video quality cannot be obtained. Video stabilization techniques have therefore been proposed and applied during video capture or video post-processing to mitigate the effect of shake on video quality.
At present, video stabilization algorithms usually preset a cropping ratio, crop the input video, and output video of a smaller size, thereby achieving stabilization. However, when the terminal shooting the video shakes strongly, the degree of shake of the video picture exceeds the range the cropping ratio can handle, so image-free regions appear in the video picture, degrading video quality.
Summary
This application provides an image processing method and a related apparatus. When M images are obtained, N images are selected from the M images for output based on the jitter amplitude of each of the M images, where N is a positive integer less than M. Images with large jitter amplitudes are thereby filtered out, ensuring as far as possible that the degree of shake of the video picture does not exceed the range the cropping ratio can handle, avoiding image-free regions in the video picture and improving video quality.
A first aspect of this application provides an image processing method, including: a terminal obtains a first image sequence including M images, where M is a positive integer; that is, the first image sequence may be a group of consecutive images captured by an image acquisition device within a time period, for example 0.5 s or 1 s. The terminal determines a jitter amplitude for each of the M images, where the jitter amplitude represents the offset of a pixel in an image relative to the corresponding pixel in a reference image. The reference image may be an image captured when the image acquisition device was not shaking, and is an image captured by the image acquisition device before the M images. Based on the jitter amplitude of each of the M images, the terminal determines N images among the M images, where N is a positive integer less than M; that is, N images with small jitter amplitudes are determined among the M images, so that images with large jitter amplitudes are filtered out. The terminal outputs a second image sequence including the N images.
In this solution, when M images are obtained, N images are selected among the M images for output based on the jitter amplitude of each of the M images, where N is a positive integer less than M. Images with large jitter amplitudes are thereby filtered out, ensuring that the degree of shake of the video picture does not exceed the range the cropping ratio can handle, avoiding image-free regions in the video picture and improving video quality.
In a possible implementation, determining the N images among the M images based on the jitter amplitudes includes: determining the N images among the M images in ascending order of jitter amplitude, where the value of N is a first threshold. That is, the terminal selects N images among the M images, leaving M−N unselected images. The N images determined by the terminal are the N images with the smallest jitter amplitudes among the M images: the jitter amplitude of any one of the N images is smaller than the jitter amplitude of any of the M−N unselected images.
By determining N images among the M images in ascending order of jitter amplitude, the one or more images with the largest jitter amplitudes in the input image sequence are filtered out, ensuring that the shake of the images used for subsequent stabilization processing does not exceed the range the cropping ratio can handle, avoiding image-free regions in the video picture and improving video quality.
In a possible implementation, determining the N images among the M images based on the jitter amplitudes includes: determining the N images among the M images in ascending order of jitter amplitude according to the jitter amplitudes and a constraint, where the value of N is a first threshold, and the constraint is that the interval in the first image sequence between two adjacent images of the obtained N images is smaller than a second threshold. That is, in the process of selecting N images among the M images, besides selecting images in ascending order of jitter amplitude, the terminal must also ensure that the interval in the first image sequence between two adjacent selected images does not exceed the second threshold.
By setting a constraint on the terminal's image selection process, the time interval between any two adjacent images among the selected images can be kept within a certain range, avoiding visible stutter in the video picture.
In a possible implementation, determining the N images among the M images based on the jitter amplitudes includes: determining, among the M images according to the jitter amplitudes, N images whose jitter amplitudes are smaller than a third threshold.
That is, a third threshold may be preset in the terminal, and the terminal can determine, from the relationship between the jitter amplitude of each of the M images and the third threshold, that the images to be selected are those whose jitter amplitude is smaller than the third threshold. The third threshold may be a threshold determined from the cropping ratio, which is preset in the terminal and is the ratio used to crop images during stabilization processing. Determining the third threshold from the cropping ratio ensures that no image-free region appears when images whose jitter amplitude is smaller than the third threshold are processed with that cropping ratio.
In a possible implementation, the method further includes: when determining that the image acquisition device is shaking, sending an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate, where the image acquisition device captures images at a second frame rate when it is not shaking, and the second frame rate is lower than the first frame rate.
When the frame rate of the video output by the terminal is fixed, instructing the image acquisition device to raise its capture frame rate only once shake is detected guarantees that the terminal obtains more input images than it outputs, making it easy for the terminal to filter out and discard images with large jitter amplitudes. Only when the image acquisition device is shaking does it need to raise its capture frame rate, which avoids capturing images at a high frame rate all the time and reduces the power consumption of the image acquisition device.
In a possible implementation, the method further includes: obtaining angular velocity information of the image acquisition device at S moments within a first time period, where S is an integer greater than 1; determining the variance of the angular velocity information at the S moments; when the variance is greater than a fourth threshold, determining that the image acquisition device is shaking; and when the variance is less than or equal to the fourth threshold, determining that the image acquisition device is not shaking.
The variance is the mean of the squared differences between each angular velocity sample and the mean of all angular velocity samples, and measures how much each sample deviates from the overall mean. When the variance is large, the difference between the angular velocity samples and their overall mean can be considered large, that is, the angular velocity fluctuates strongly around its mean, so the image acquisition device can be considered to be shaking.
In a possible implementation, the jitter amplitudes of the M images include the offsets of the M images. The terminal determining the jitter amplitudes of the M images includes: the terminal obtains angular velocity information of the image acquisition device at P moments within a second time period, where P is an integer greater than 1 and the image acquisition device is configured to capture the first image sequence; the terminal determines, from the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images; and the terminal determines the offset of each of the M images from the pose information.
In a possible implementation, the terminal determining the pose information of the image acquisition device when capturing the M images from the angular velocity information at the P moments includes: the terminal determines, by linear interpolation from the angular velocity information at the P moments and the capture moments of the M images, the pose information of the image acquisition device when capturing the M images.
In a possible implementation, the terminal determining the offsets of the M images from the pose information includes: the terminal determines a rotation matrix for each of the M images from the pose information of the image acquisition device when capturing the M images, and determines the offsets of the M images from the rotation matrices of the M images.
In a possible implementation, before the terminal determines the N images among the M images, the method further includes: the terminal obtains an image selection ratio, the ratio of the number of input images to the number of output images, and determines the value of N from the M images and the image selection ratio, where the ratio of M to N equals the image selection ratio.
In a possible implementation, before the terminal outputs the second image sequence, the method further includes: the terminal performs stabilization processing on the N images according to a stabilization algorithm to obtain N processed images, and the terminal outputs the second image sequence, the second image sequence including the N processed images.
A second aspect of this application provides a terminal, including an obtaining unit and a processing unit. The obtaining unit is configured to obtain a first image sequence including M images, where M is a positive integer. The processing unit is configured to determine a jitter amplitude for each of the M images, the jitter amplitude representing the offset of pixels in an image relative to a reference image; the processing unit is further configured to determine, based on the jitter amplitudes, N images among the M images, where N is less than M and N is a positive integer; and the processing unit is further configured to output a second image sequence including the N images.
In a possible implementation, the processing unit is further configured to determine, based on the jitter amplitudes, the N images among the M images in ascending order of jitter amplitude, where the value of N is a first threshold.
In a possible implementation, the processing unit is further configured to determine, based on the jitter amplitudes and a constraint, the N images among the M images in ascending order of jitter amplitude, where the value of N is a first threshold, and the constraint is that the interval in the first image sequence between two adjacent images of the obtained N images is smaller than a second threshold.
In a possible implementation, the processing unit is further configured to determine, among the M images based on the jitter amplitudes, N images whose jitter amplitudes are smaller than a third threshold.
In a possible implementation, the processing unit is further configured to: when determining that the image acquisition device is shaking, send an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate, where the image acquisition device captures images at a second frame rate when not shaking, and the second frame rate is lower than the first frame rate.
In a possible implementation, the obtaining unit is further configured to obtain angular velocity information of the image acquisition device at S moments within a first time period, where S is an integer greater than 1; the processing unit is further configured to determine the variance of the angular velocity information at the S moments, and to determine, when the variance is greater than a fourth threshold, that the image acquisition device is shaking.
In a possible implementation, the jitter amplitudes of the M images include the offsets of the M images; the obtaining unit is further configured to obtain angular velocity information of the image acquisition device at P moments within a second time period, where P is an integer greater than 1 and the image acquisition device is configured to capture the first image sequence; the processing unit is further configured to determine, from the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images; and the processing unit is further configured to determine the offset of each of the M images from the pose information.
In a possible implementation, the processing unit is further configured to determine, by linear interpolation from the angular velocity information at the P moments and the capture moments of the M images, the pose information of the image acquisition device when capturing the M images.
In a possible implementation, the processing unit is further configured to determine a rotation matrix for each of the M images from the pose information of the image acquisition device when capturing the M images; the processing unit is further configured to determine the offsets of the M images from the rotation matrices of the M images.
In a possible implementation, the obtaining unit is further configured to obtain an image selection ratio, the ratio of the number of input images to the number of output images; the processing unit is further configured to determine the value of N from the M images and the image selection ratio, where the ratio of M to N equals the image selection ratio.
In a possible implementation, the processing unit is further configured to perform stabilization processing on the N images according to a stabilization algorithm to obtain N processed images; the processing unit is further configured to output the second image sequence, the second image sequence including the N processed images.
A third aspect of the embodiments of this application provides a terminal, including one or more processors and a memory, where the memory stores computer-readable instructions, and the one or more processors read the computer-readable instructions in the memory so that the terminal implements the method according to the first aspect or any of its possible implementations. The terminal may, for example, include an earphone.
A fourth aspect of the embodiments of this application provides a computer program product containing instructions that, when run on a computer, cause the computer to execute the method according to the first aspect or any of its possible implementations.
A fifth aspect of the embodiments of this application provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to execute the method according to the first aspect or any of its possible implementations.
A sixth aspect of the embodiments of this application provides a chip including a processor. The processor is configured to read and execute a computer program stored in a memory to perform the method in any possible implementation of any of the foregoing aspects. Optionally, the chip includes the memory, and the memory is connected to the processor through a circuit or a wire. Further optionally, the chip further includes a communication interface connected to the processor. The communication interface is configured to receive data and/or information to be processed; the processor obtains the data and/or information from the communication interface, processes the data and/or information, and outputs the processing result through the communication interface. The communication interface may be an input/output interface.
For the technical effects brought by any implementation of the second to sixth aspects, reference may be made to the technical effects of the corresponding implementation of the first aspect; details are not repeated here.
Brief Description of Drawings
Fig. 1 is a schematic diagram of cropping a video picture according to an embodiment of this application;
Fig. 2 is a schematic diagram of an application scenario according to an embodiment of this application;
Fig. 3 is a schematic flowchart of an image processing method 300 according to an embodiment of this application;
Fig. 4 is a schematic flowchart of an image processing method 400 according to an embodiment of this application;
Fig. 5 is a schematic diagram of a rotation model when a terminal shakes according to an embodiment of this application;
Fig. 6 is a schematic diagram of selecting images according to an embodiment of this application;
Fig. 7 is a schematic diagram of image selection according to an embodiment of this application;
Fig. 8 is a schematic comparison diagram of image data before path smoothing according to an embodiment of this application;
Fig. 9 is a schematic structural diagram of a terminal according to an embodiment of this application;
Fig. 10 is a schematic structural diagram of a terminal 100 according to an embodiment of this application.
Description of Embodiments
The following describes the embodiments of this application with reference to the accompanying drawings. Clearly, the described embodiments are only some rather than all of the embodiments of this application. A person of ordinary skill in the art knows that, as technologies develop and new scenarios emerge, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any of their variants are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device including a series of steps or modules is not necessarily limited to the steps or modules clearly listed, but may include other steps or modules not clearly listed or inherent to the process, method, product, or device. The naming or numbering of steps in this application does not mean that the steps in the method flow must be performed in the temporal or logical order indicated by the naming or numbering; the named or numbered steps may be performed in a different order to achieve the intended technical purpose, as long as the same or a similar technical effect is achieved.
With the continuous development of terminal technologies, electronic devices such as mobile phones and tablet computers already possess powerful processing capabilities and are deeply integrated into people's work and life. Nowadays people increasingly use electronic devices to shoot video in daily life. However, while a terminal is shooting video, motion of the photographer, or vibration of the pole on which the terminal is mounted caused by an external force, makes the video picture shake, so satisfactory video quality cannot be obtained. Video stabilization techniques have therefore been proposed and applied during video capture or video post-processing to mitigate the effect of shake on video quality.
At present, in the related art, a video stabilization algorithm includes two steps: motion path smoothing and motion compensation. Motion path smoothing uses a low-pass filter or algorithm to smooth the original motion path of the terminal and remove the shake appearing during motion, obtaining a smoothed motion path. Motion compensation obtains motion compensation information from the mapping between the terminal's original motion path and the smoothed motion path, so that the current video frame is corrected to obtain a new, stable video frame. This method requires a preset cropping ratio for cropping the processed images to keep the video picture stable.
For example, see Fig. 1, a schematic diagram of cropping a video picture according to an embodiment of this application. As shown in Fig. 1, every input frame is cropped with a cropping window of fixed size and proportion, and the cropped image is taken as the output. Because the shake differs from frame to frame, the actual position at which the terminal shoots each frame may differ, so the position of the main subject (for example, the person in Fig. 1) may differ in every frame. To keep the video picture stable, that is, to keep the main subject's position in the video picture relatively stable, the position of the cropping window can be adjusted based on the motion compensation information, so that a relatively stable video is cropped out. However, when the terminal shakes strongly, the adjustment of the cropping window's position is correspondingly large, which may cause part of the cropping window to fall outside the image, as in the fourth frame in Fig. 1. The cropped image then contains an image-free region, that is, a black border, degrading video quality.
In view of this, an embodiment of this application provides an image processing method applied to video stabilization. When M images are obtained, N images are selected among the M images for output based on the jitter amplitude of each of the M images, where N is a positive integer less than M. Images with large jitter amplitudes are thereby filtered out, ensuring that the degree of shake of the video picture does not exceed the range the cropping ratio can handle, avoiding image-free regions in the video picture and improving video quality.
See Fig. 2, a schematic diagram of an application scenario according to an embodiment of this application. As shown in Fig. 2, the image processing method provided in this embodiment can be applied to a terminal on which an image acquisition device capable of shooting video is installed. While the image acquisition device is shooting video, the terminal shakes, the video picture captured by the image acquisition device changes, and the amount of change in the video picture is related to the terminal's jitter amplitude.
The terminal, also called user equipment (UE), a mobile station (MS), a mobile terminal (MT), etc., is a device on which an image acquisition device capable of shooting video is installed, for example a handheld device with a shooting function or a surveillance camera. Examples of terminals at present include: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a surveillance camera, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and the like.
The image acquisition device in the terminal is used to convert an optical signal into an electrical signal to generate an image signal. The image acquisition device may be, for example, an image sensor, such as a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS).
The terminal may also be equipped with a device for measuring the terminal's motion, for example an inertial measurement unit (IMU). An IMU is a device that measures the three-axis angular rate and acceleration of an object. Generally, an IMU contains three single-axis accelerometers and three single-axis gyroscopes; the accelerometers detect the acceleration signals of the object on the three independent axes of the carrier coordinate system, and the gyroscopes detect the angular velocity signals of the carrier relative to the navigation coordinate system. By measuring the angular velocity and acceleration of the object in three-dimensional space with the IMU, the attitude of the object can be computed.
See Fig. 3, a schematic flowchart of an image processing method 300 according to an embodiment of this application. As shown in Fig. 3, the image processing method 300 includes the following steps.
Step 301: obtain a first image sequence, where the first image sequence includes M images and M is a positive integer.
In this embodiment, while the terminal is shooting video, the image acquisition device in the terminal captures images continuously, and the terminal can obtain the first image sequence captured by the image acquisition device. The first image sequence may be a group of consecutive images captured by the image acquisition device within a time period, for example 0.5 s or 1 s.
The first image sequence includes M images captured by the image acquisition device; M depends on the device's capture frame rate and the capture duration of the first image sequence. For example, when the capture frame rate is 60 images per second and the capture duration of the first image sequence is 0.2 s, M is 12; when the capture frame rate is 30 images per second and the capture duration is 0.5 s, M is 15.
In a possible embodiment, when determining that the image acquisition device is shaking, the terminal may send an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate, where the image acquisition device captures images at a second frame rate when it is not shaking, and the second frame rate is lower than the first frame rate. For example, before the terminal obtains the first image sequence, images are captured at the second frame rate, e.g. 30 images per second; if the terminal determines that the image acquisition device is shaking, the terminal sends the device an instruction to capture at the first frame rate, so that the device raises its capture frame rate: for example, when the first frame rate is 60 images per second, the device raises its capture frame rate from 30 to 60 images per second.
It can be understood that, when the frame rate of the video output by the terminal is fixed, instructing the image acquisition device to raise its capture frame rate only once shake is detected guarantees that the terminal obtains more input images than it outputs, making it easy for the terminal to filter out and discard images with large jitter amplitudes. Moreover, when the image acquisition device is steady, its capture frame rate can equal the terminal's output frame rate. Only when the device is shaking does it need to raise the capture frame rate, which avoids capturing images at a high frame rate all the time and reduces the device's power consumption.
In a possible embodiment, the process by which the terminal determines that the image acquisition device is shaking may include: the terminal obtains angular velocity information of the image acquisition device at S moments within a first time period, where S is an integer greater than 1; the terminal determines the variance of the angular velocity information at the S moments, determines that the image acquisition device is shaking when the variance is greater than a fourth threshold, and determines that the image acquisition device is not shaking when the variance is less than or equal to the fourth threshold. The length of the first time period can be determined from the frequency at which the IMU samples angular velocity; for example, when the IMU samples angular velocity at 100 Hz (i.e. 100 samples per second), the first time period may be 0.1 s. Thus, with a fixed IMU sampling interval, the S moments within the first time period are also fixed; for example, when the first time period is 0.1 s, the S moments may be 10 moments. The variance is the mean of the squared differences between each angular velocity sample and the mean of all angular velocity samples, and measures how much each sample deviates from the overall mean. When the variance is large, the difference between the angular velocity samples and their overall mean can be considered large, that is, the angular velocity fluctuates strongly around its mean, so the image acquisition device can be considered to be shaking.
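The variance test described above can be sketched as follows. This is an illustrative sketch only, not the patent's implementation; the function name and the threshold value passed in are assumptions:

```python
import math

def is_shaking(gyro_samples, threshold):
    """Decide whether the device is shaking from S angular-velocity samples.

    gyro_samples: list of (wx, wy, wz) angular-velocity tuples.
    threshold: the "fourth threshold" on the variance (assumed value).
    Returns True when the variance of the speed magnitudes exceeds threshold.
    """
    # Magnitude of each angular-velocity sample.
    speeds = [math.sqrt(wx * wx + wy * wy + wz * wz) for wx, wy, wz in gyro_samples]
    mean = sum(speeds) / len(speeds)
    # Population variance of the magnitudes.
    variance = sum((v - mean) ** 2 for v in speeds) / len(speeds)
    return variance > threshold
```

With near-constant samples the variance is close to zero and the device is judged steady; strongly fluctuating samples exceed the threshold and are judged as shake.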
Step 302: determine a jitter amplitude for each of the M images, where the jitter amplitude represents the offset of pixels in an image relative to a reference image.
It can be understood that if the terminal shakes while shooting video, the image acquisition device installed on the terminal shakes in the same way. Thus, when the image acquisition device shakes while continuously capturing images, every captured image has a corresponding jitter amplitude, which represents the offset of a pixel in the image relative to the corresponding pixel in the reference image. The reference image may be an image captured when the image acquisition device was not shaking, and is an image captured before the M images. For example, in a static scene, when the image acquisition device is stationary, the images it captures at multiple moments are effectively identical: the positions of pixels representing the same object do not change across images. When the image acquisition device shakes, its position changes relative to when it was not shaking, so the position of every object in the scene changes in the captured images, that is, the positions of pixels representing the same object change across images.
For example, if the image acquisition device was not shaking before capturing the M images, the terminal can obtain one or more images captured before the shake began and select one of them as the reference image. Then, for any one of the M images, a pixel in that image (for example, the pixel at the centre point) has a corresponding pixel in the reference image (the pixel in the reference image representing the same object). The terminal determines the jitter amplitude of the image by determining the offset of a pixel in that image relative to its corresponding pixel in the reference image.
In a possible embodiment, the process by which the terminal determines the jitter amplitudes of the M images may include the following.
The terminal obtains, through the IMU, angular velocity information of the image acquisition device at P moments within a certain time period, where P is an integer greater than 1 and the intervals between adjacent moments among the P moments may be equal. For example, when the IMU's frequency is 100 Hz, the terminal can obtain through the IMU the angular velocity information of the image acquisition device at 10 moments within 0.1 s, with an interval of 0.01 s between every two moments.
After obtaining the angular velocity information at the P moments, the terminal can determine from it the pose information of the image acquisition device when capturing the M images. Given the angular velocity information at each of the P moments and the interval between adjacent moments, the pose change of the image acquisition device within each interval can be determined from the angular velocity at that moment and the interval length; by accumulating the pose changes over the intervals, the pose of the image acquisition device at each of the P moments is obtained. For example, given the angular velocity information at moments 1, 2, and 3 and a fixed interval t: the device's pose at moment 1 is obtained from the angular velocity at moment 1 and interval t; the pose at moment 2 is obtained by accumulating, onto the pose at moment 1, the angular velocity at moment 2 over interval t; similarly, the pose at moment 3 is obtained by accumulating, onto the pose at moment 2, the angular velocity at moment 3 over interval t.
Because the moments at which the IMU samples angular velocity may not coincide with the moments at which the image acquisition device captures images, the pose at each image capture moment can further be determined from the poses at the IMU sampling moments. For example, the terminal can determine, by linear interpolation from the angular velocity information at the P moments and the capture moments of the M images, the pose information of the image acquisition device when capturing the M images. Linear interpolation approximates the original function by the straight line through two interpolation nodes, so that the value at any point on the line can be determined. That is, for any moment at which the image acquisition device captures an image, the pose at that moment can be determined by linear interpolation from the poses at the two nearest sampling moments (the moments at which the IMU sampled angular velocity), yielding the pose information corresponding to each image.
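The linear interpolation of pose between the two nearest gyroscope sampling moments can be sketched as follows. This is an illustrative sketch; the function name and the representation of a pose as a simple displacement tuple are assumptions:

```python
def interpolate_pose(t_a, pose_a, t_b, pose_b, t_f):
    """Linearly interpolate the pose at image-capture time t_f from the poses
    at the two nearest gyro sampling times, with t_a <= t_f <= t_b.

    Poses are (x, y, z) displacement tuples; this simplified sketch treats
    each component independently (no rotation-specific interpolation).
    """
    alpha = (t_f - t_a) / (t_b - t_a)  # fractional position of t_f in [t_a, t_b]
    return tuple(a + alpha * (b - a) for a, b in zip(pose_a, pose_b))
```

For example, a capture moment exactly halfway between two sampling moments receives the component-wise midpoint of the two poses.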
Finally, the terminal can determine the offsets of the M images from the pose information of the M images; the offset is in fact the offset of pixels in the image. For example, the terminal can perform motion estimation on the image acquisition device, e.g. based on the Rodrigues formula, determining a rotation matrix for each of the M images from its pose information. The terminal then transforms coordinate points in the M images with the rotation matrices to obtain M transformed coordinate points, and determines the offset of each of the M images by computing the offset between the transformed and original coordinate points in each image; this offset is the image's jitter amplitude.
In another possible example, the terminal may also obtain, through the IMU, the angular velocity information at the time each image is captured, then compute the pose information of the image acquisition device at each capture time, and determine the jitter amplitude of each of the M images by computing the pose change between the pose when capturing any one of the M images and the pose when capturing the reference image.
Step 303: determine, based on the jitter amplitudes, N images among the M images, where N is less than M and N is a positive integer.
In this embodiment, after the jitter amplitude of each of the M images is determined, N images with small jitter amplitudes can be determined among the M images based on the jitter amplitude of each image, so that images with large jitter amplitudes are filtered out.
The terminal can determine the N images among the M images in several ways.
Way 1: the terminal determines, based on the jitter amplitudes of the M images, the N images among the M images in ascending order of jitter amplitude, where the value of N is a first threshold.
That is, the terminal selects N images among the M images, leaving M−N unselected images. The N images determined by the terminal are the N images with the smallest jitter amplitudes among the M images: the jitter amplitude of any one of the N images is smaller than the jitter amplitude of any of the M−N unselected images. The value of N is a first threshold, which may be determined by the terminal before selection; for example, the terminal may determine N from the number of images in the first image sequence (i.e. M) based on a preset ratio between M and N. For example, when M is 60 and the ratio of M to N is 2:1, the terminal can determine N to be 30.
For example, suppose M is 5 and N is 3, i.e. the terminal needs to determine 3 of 5 images (say image A1, image A2, image A3, image A4, image A5), and the jitter amplitudes of the 5 images are 1, 2, 3, 4, and 5, respectively. The terminal can then determine the 3 images with the smallest jitter amplitudes among the 5: image A1 with amplitude 1, image A2 with amplitude 2, and image A3 with amplitude 3.
By determining N images among the M images in ascending order of jitter amplitude, the one or more images with the largest jitter amplitudes in the input image sequence are filtered out, ensuring that the shake of the images used for subsequent stabilization processing does not exceed the range the cropping ratio can handle, avoiding image-free regions in the video picture and improving video quality.
Way 2: the terminal determines, based on the jitter amplitudes of the M images and a constraint, the N images among the M images in ascending order of jitter amplitude, where N is a first threshold, and the constraint is that the interval in the first image sequence between two adjacent images of the obtained N images is smaller than a second threshold.
Compared with way 1, in the process of selecting N images among the M images, besides selecting images in ascending order of jitter amplitude, the terminal must also ensure that the interval in the first image sequence between two adjacent selected images does not exceed the second threshold. The value of N is a first threshold, which may be determined by the terminal before selection; for example, the terminal may determine N from the number of images in the first image sequence (i.e. M) based on a preset ratio between M and N. For example, when M is 60 and the ratio of M to N is 2:1, the terminal can determine N to be 30.
It can be understood that the M images in the first image sequence are captured by the image acquisition device sequentially in time order, with a fixed time interval between every two adjacent images. Therefore, among the selected N images, if two adjacent images are far apart in the first image sequence, the time interval between them is also large. Consequently, when an object in the images moves quickly, its position may differ considerably between the two images, causing visible stutter in the video composed of these images and degrading the viewing experience.
Therefore, by setting a constraint on the terminal's image selection process, the time interval between any two adjacent images among the selected images can be kept within a certain range, avoiding stutter in the video picture. In practice, the value of the second threshold can be determined from the image acquisition device's capture interval. For example, when the capture interval is large, the second threshold can be set small to keep the time interval between two selected images within range; when the capture interval is small, the second threshold can be set larger. For example, when the capture interval is 0.02 s, the second threshold may be 2 or 3; when the capture interval is 0.01 s, the second threshold may be 4 or 5.
For example, suppose the first image sequence is {B1, B2, B3, B4, B5, B6} with jitter amplitudes {1, 5, 4, 3, 2, 1}, the terminal needs to select 3 images from the first image sequence (i.e. M is 6 and N is 3), and the constraint is that the interval in the first image sequence between two adjacent selected images is not greater than 2. If the terminal selected 3 images purely in ascending order of jitter amplitude, it would select images B1, B5, and B6 with amplitudes 1, 2, and 1. However, there are 3 images between image B1 and image B5, i.e. the interval between B1 and B5 is greater than 2, which does not satisfy the constraint. Therefore, when selecting 3 images among the 6 according to both the jitter amplitudes and the constraint, the terminal can select the qualifying images B1, B4, and B6, where the interval between B1 and B4 is not greater than 2 and the interval between B4 and B6 is not greater than 2.
By determining N images among the M images in ascending order of jitter amplitude under the preset constraint, the one or more images with the largest jitter amplitudes in the input image sequence are filtered out while the interval between two adjacent selected images is bounded, so that image-free regions are avoided and the video picture does not stutter, improving video quality.
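One way to realise the constrained selection described above is dynamic programming over the buffered frames. The patent does not prescribe an algorithm, so the following is only an illustrative sketch (function name and structure are assumptions). With the example above, jitter amplitudes {1, 5, 4, 3, 2, 1}, N = 3 and a maximum interval of 2, it picks indices 0, 3 and 5, i.e. B1, B4 and B6:

```python
def select_frames(jitter, n, max_gap):
    """Pick n frame indices minimising total jitter, subject to the constraint
    that at most max_gap frames lie between two adjacent picked frames.
    Assumes a feasible selection exists."""
    m = len(jitter)
    INF = float("inf")
    # best[k][i]: minimal jitter sum when choosing k frames with the last at i.
    best = [[INF] * m for _ in range(n + 1)]
    prev = [[-1] * m for _ in range(n + 1)]
    for i in range(m):
        best[1][i] = jitter[i]
    for k in range(2, n + 1):
        for i in range(m):
            # Previous chosen frame j must satisfy: frames skipped (i-j-1) <= max_gap.
            for j in range(max(0, i - max_gap - 1), i):
                if best[k - 1][j] + jitter[i] < best[k][i]:
                    best[k][i] = best[k - 1][j] + jitter[i]
                    prev[k][i] = j
    end = min(range(m), key=lambda i: best[n][i])
    # Walk back through predecessors to recover the chosen indices.
    chosen, k = [], n
    while end != -1:
        chosen.append(end)
        end, k = prev[k][end], k - 1
    return chosen[::-1]
```

A greedy pick of the n smallest amplitudes can violate the interval constraint (B1, B5, B6 in the example above); the dynamic program enforces it while still minimising the total jitter.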
Way 3: determine, among the M images according to their jitter amplitudes, N images whose jitter amplitudes are smaller than a third threshold.
That is, a third threshold may be preset in the terminal, and the terminal can determine, from the relationship between the jitter amplitude of each of the M images and the third threshold, that the images to be selected are those whose jitter amplitude is smaller than the third threshold. The third threshold may be a threshold determined from the cropping ratio, which is preset in the terminal and is the ratio used to crop images during stabilization processing. Determining the third threshold from the cropping ratio ensures that no image-free region appears when images whose jitter amplitude is smaller than the third threshold are processed with that cropping ratio. For example, when the jitter amplitude is the offset of an image relative to the reference image, the third threshold may be, e.g., 5 pixels (i.e. an offset of a distance of 5 pixels).
In a possible embodiment, the terminal can obtain an image selection ratio, the ratio of the number of input images to the number of output images, which may for example be preset in the terminal. After obtaining the M images of the first image sequence, the terminal determines from the image selection ratio that N images are selected and output among the M images, where the ratio of M to N equals the image selection ratio.
For example, with an image selection ratio of 2:1 preset in the terminal, after obtaining 10 images of the first image sequence the terminal can determine from the image selection ratio that 5 images are output.
Step 304: output a second image sequence including the N images.
After the terminal determines N images among the M images of the first image sequence, the N images constitute a new image sequence, the second image sequence, and the terminal outputs the second image sequence to realize the video output. The order of the N images in the second image sequence is the same as their order in the first image sequence; that is, the second image sequence can be understood as the image sequence obtained by removing M−N images from the first image sequence.
In a possible embodiment, before outputting the second image sequence, the terminal may further perform stabilization processing on it. For example, the terminal may perform stabilization processing on the N images according to a stabilization algorithm to obtain N processed images, and output the second image sequence, the second image sequence including the N processed images.
The terminal's manner of performing stabilization processing on images may include, for example, motion path smoothing on the images. Specifically, the terminal can smooth the N images (e.g. with Gaussian smoothing) according to the rotation matrix corresponding to each of the N images, thereby obtaining a stable video.
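The Gaussian path-smoothing step can be sketched as follows over a 1-D motion path (for example, one per-frame rotation component). This is a minimal illustrative sketch, not the patent's implementation; the sigma and radius values are assumptions:

```python
import math

def gaussian_smooth_path(path, sigma=2.0, radius=4):
    """Smooth a 1-D camera motion path with a Gaussian kernel.

    path: list of per-frame values (e.g. accumulated rotation angles).
    Returns a list of the same length; the kernel is renormalised at the
    borders so edge frames are smoothed over the available neighbours only.
    """
    kernel = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    smoothed = []
    for t in range(len(path)):
        acc, norm = 0.0, 0.0
        for i, w in zip(range(-radius, radius + 1), kernel):
            if 0 <= t + i < len(path):
                acc += w * path[t + i]
                norm += w
        smoothed.append(acc / norm)  # renormalise at the borders
    return smoothed
```

A constant path is left unchanged, while high-frequency jitter around the path is attenuated; the compensation applied to each frame is then derived from the difference between the original and smoothed paths.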
It should be understood that, in this embodiment, the terminal determines the N output images only after obtaining the M images, so the images output by the terminal have a certain delay, and the delay depends on the value of M. In practice, the value of M can be adjusted according to the required delay: when real-time requirements are high, M can take a small value so that the output delay is small; when real-time requirements are low, M can take a larger value.
The flow of the image processing method provided in the embodiments of this application has been described in detail above. For ease of understanding, the image processing method is described below with specific examples.
See Fig. 4, a schematic flowchart of an image processing method 400 according to an embodiment of this application. As shown in Fig. 4, the image processing method includes the following steps.
Step 401: detect the motion state of the terminal.
While the terminal is shooting video, it can obtain in real time the angular velocity information measured by the gyroscope installed in the terminal. The terminal then computes the variance of the obtained angular velocity information to determine the terminal's motion state.
For example, suppose the three angular-velocity components measured by the gyroscope at time $t$ are $(\omega_x^t, \omega_y^t, \omega_z^t)$. The speed of the terminal can then be computed by Formula 1:

$$v_t = \sqrt{(\omega_x^t)^2 + (\omega_y^t)^2 + (\omega_z^t)^2} \tag{1}$$

where $v_t$ denotes the speed of the terminal at time $t$.

For the current time $t_N$, the terminal can compute the gyroscope's historical speed sequence over the period $(t_0, t_N)$, $\{v_{t_0}, v_{t_1}, \ldots, v_{t_N}\}$, and can further compute the variance of the historical speed sequence to judge whether the terminal is shaking.
Step 402: judge whether the terminal is shaking.
After obtaining the gyroscope's historical speed sequence $\{v_{t_0}, \ldots, v_{t_N}\}$, the terminal can compute the variance of this sequence. If the variance is greater than the threshold $g_{thre}$, it can be determined that the terminal is shaking at the current time $t_N$; if the variance is not greater than $g_{thre}$, it can be determined that the terminal is not shaking at the current time $t_N$.
Step 403: when shaking occurs, enable the high-frame-rate mode.
When determining that the terminal is shaking, the terminal can send an instruction to the image acquisition device instructing it to raise its capture frame rate, thereby enabling the high-frame-rate mode. For example, the image acquisition device can raise its capture frame rate to 60 images per second, so that the current frame-rate mode is 60 frames/s.
404,获取每个图像的运动信息,并将图像送入缓冲队列。
在图像采集装置采集图像的过程中,终端可以实时获取到图像采集装置所采集的图像,且终端可以基于陀螺仪实时测量的角速度信息,确定每个图像对应的运动信息。
For example, the gyroscope in the terminal measures a gyroscope data sequence G = {g^{t_0}, ..., g^{t_N}} corresponding to the period (t_0, t_N). Assuming the gyroscope samples angular velocity at a constant interval, the interval between adjacent gyroscope samples is fixed and denoted t_d. The terminal pose information p^{t_N} = (p_x^{t_N}, p_y^{t_N}, p_z^{t_N}) at time t_N can then be obtained by accumulating the samples, as in Formula 2:

p_x^{t_N} = Σ_{i=0..N} g_x^{t_i} · t_d,  p_y^{t_N} = Σ_{i=0..N} g_y^{t_i} · t_d,  p_z^{t_N} = Σ_{i=0..N} g_z^{t_i} · t_d    (Formula 2)

where p_x^{t_N} denotes the terminal's displacement on the x axis at time t_N, p_y^{t_N} denotes the terminal's displacement on the y axis at time t_N, and p_z^{t_N} denotes the terminal's displacement on the z axis at time t_N.
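The accumulation in Formula 2 can be sketched as follows. Treating the pose as a simple per-axis sum of angular-velocity samples times the fixed interval t_d is an assumption consistent with the description above; real implementations typically integrate more carefully:

```python
def integrate_pose(gyro_samples, t_d):
    """Cumulative per-axis pose from gyroscope samples (sketch of Formula 2).

    Each angular-velocity component is accumulated over the fixed sample
    interval t_d; returns the (x, y, z) displacement at the last sample.
    """
    px = py = pz = 0.0
    for gx, gy, gz in gyro_samples:
        px += gx * t_d
        py += gy * t_d
        pz += gz * t_d
    return (px, py, pz)

# four samples taken 10 ms apart
print(integrate_pose([(1.0, 0.0, 2.0)] * 4, 0.01))
```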
Because the times at which the gyroscope measures angular velocity may not be synchronized with the times at which the image acquisition device captures images, the gyroscope data and the image data can be synchronized in time to obtain the terminal pose corresponding to each captured image.
For an image captured by the image acquisition device, suppose the image is captured at time t_f, that t_a and t_b are gyroscope sampling times with t_a < t_f < t_b and t_b - t_a = t_d, and that the terminal poses at times t_a and t_b obtained from Formula 2 are p^{t_a} and p^{t_b}, respectively. Then, by linear interpolation, the terminal pose information p^{t_f} corresponding to time t_f can be expressed as Formula 3:

p^{t_f} = p^{t_a} + (p^{t_b} - p^{t_a}) · (t_f - t_a) / t_d    (Formula 3)

where p_x^{t_f} denotes the terminal's displacement on the x axis at time t_f, p_y^{t_f} denotes the terminal's displacement on the y axis at time t_f, and p_z^{t_f} denotes the terminal's displacement on the z axis at time t_f.
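The interpolation of Formula 3 is a one-liner per axis; a sketch (function name hypothetical):

```python
def interpolate_pose(pose_a, pose_b, t_a, t_f, t_d):
    """Linearly interpolate the terminal pose at capture time t_f between two
    gyroscope sampling times t_a and t_b = t_a + t_d (sketch of Formula 3)."""
    w = (t_f - t_a) / t_d  # fraction of the sampling interval elapsed at t_f
    return tuple(a + (b - a) * w for a, b in zip(pose_a, pose_b))

# capture time exactly halfway between the two gyroscope samples
print(interpolate_pose((0.0, 0.0, 0.0), (0.04, 0.0, 0.08), 0.0, 0.005, 0.01))
```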
Referring to FIG. 5, FIG. 5 is a schematic diagram of a rotation model when the terminal shakes, provided by an embodiment of this application. As shown in FIG. 5, when the terminal shakes, the images captured by the image acquisition device lie on different planes and can be related to one another by a rotation matrix R. Therefore, according to the Rodrigues formula, motion estimation can be performed for the terminal: based on the pose information p^{t_f} corresponding to an image at time t_f, the corresponding rotation matrix R is obtained.
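The Rodrigues formula referenced here, R = I + sin(θ)·K + (1 - cos(θ))·K², with K the skew-symmetric matrix of the unit rotation axis, can be sketched directly:

```python
import math

def rodrigues(axis, theta):
    """Rotation matrix from an axis-angle representation via the Rodrigues
    formula R = I + sin(theta)*K + (1 - cos(theta))*K^2, where K is the
    skew-symmetric cross-product matrix of the unit rotation axis."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return [[I[i][j] + s * K[i][j] + c * K2[i][j] for j in range(3)] for i in range(3)]

# a 90-degree rotation about the z axis maps (1, 0, 0) to (0, 1, 0)
R = rodrigues((0.0, 0.0, 1.0), math.pi / 2)
print([round(sum(R[i][j] * v for j, v in enumerate((1.0, 0.0, 0.0))), 6) for i in range(3)])
```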
After the rotation matrix R corresponding to each image is obtained, the images can be pushed one by one into a buffer queue so that, once the number of images in the buffer queue reaches a set count, the images in the queue are processed together. The length of the buffer queue determines the output delay: the shorter the queue, the lower the delay; the longer the queue, the higher the delay.
Step 405: select images according to the shake amplitudes of the images in the buffer queue.
In this embodiment, assume the buffer queue holds M images; the terminal needs to select N of the M images in the buffer queue according to their shake amplitudes.
For example, suppose the buffer queue is S = {f_1, ..., f_M} and the corresponding rotation matrices are R = {R_1, ..., R_M}. The terminal may select from the buffer queue a subset S_output = {f_1, ..., f_N} of N images such that, for any two adjacent images of the subset, their interval in the buffer queue is smaller than a maximum frame interval X. The maximum frame interval X may, for example, be a preset value determined from M and N.
Specifically, since the terminal poses at which the image acquisition device captures different images are related by rotations, the rotation matrices can be used to transform all the images into the same coordinate system so as to measure how far each one deviates. The terminal can take the center point (x_m, y_m) of an image m in the buffer queue (image m may be any one of the M images), apply the coordinate transform given by the image's rotation matrix R_m to obtain the transformed coordinates (x'_m, y'_m), and compute the Euclidean distance between the original coordinates (x_m, y_m) and the transformed coordinates (x'_m, y'_m), denoted the offset c_m. The unit of the offset c_m is a number of pixels; that is, c_m expresses by how many pixels a given pixel of the current image (the pixel at the center point) is displaced relative to the corresponding pixel of an image captured without shaking. For example, suppose the pixel at the center point of image 1 is pixel 1 and image 2 contains a corresponding pixel 2, i.e., pixel 1 and pixel 2 both depict the same part of the same object in the same scene; and suppose image 1 was captured while the image acquisition device was shaking, whereas image 2 was captured while it was not. The offset of image 1 can then be computed as the number of pixels by which pixel 1 is displaced relative to pixel 2 (i.e., the pixel distance between the position of pixel 1 and the position of pixel 2). With the offset c_m representing the shake amplitude of an image, the shake amplitudes of the images in the buffer queue can be written as C = {c_1, c_2, ..., c_M}.
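The center-point offset can be sketched as below. Treating the center as a homogeneous point (x, y, 1) and omitting the camera intrinsic matrix is a simplifying assumption; a full implementation would transform by K·R·K⁻¹:

```python
import math

def center_offset(R, center):
    """Shake amplitude c_m: the pixel offset of the image center (x_m, y_m)
    under the image's rotation matrix R_m (simplified sketch, intrinsics
    omitted; the center is treated as a homogeneous point (x, y, 1))."""
    x, y = center
    v = (x, y, 1.0)
    tx = sum(R[0][k] * v[k] for k in range(3))
    ty = sum(R[1][k] * v[k] for k in range(3))
    tz = sum(R[2][k] * v[k] for k in range(3))
    xp, yp = tx / tz, ty / tz  # back to pixel coordinates
    return math.hypot(xp - x, yp - y)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(center_offset(identity, (320.0, 240.0)))  # no rotation -> zero offset
```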
Alternatively, the terminal may obtain, via an IMU, the angular velocity information of the image acquisition device at the capture time of each image, compute the pose information of the image acquisition device at each capture time, and determine the shake amplitude of each of the M images by computing, for any one of the M images, the pose change between the pose at its capture time and the pose at the capture time of the reference image.
In addition, to ensure that the interval between the last image selected from the previous round's buffer queue and the first image selected from the current buffer queue is likewise smaller than the maximum frame interval X, the last image selected in the previous round and all images after it can be added to the current round's buffer queue when selecting images in the current round, forming a new sequence S_new = {f_{-L}, ..., f_{-1}, f_1, ..., f_M}, in which L < X.
It should be understood that, since image f_{-L} was selected in the previous round, its shake amplitude c_{-L} can be set to 0 to guarantee that f_{-L} is always selected again in the current round. In addition, the shake amplitudes {c_{-L+1}, ..., c_{-1}} corresponding to the image sequence {f_{-L+1}, ..., f_{-1}} can be set to positive infinity to guarantee that these images are never selected.
Specifically, referring to FIG. 6, FIG. 6 is a schematic diagram of image selection provided by an embodiment of this application. As shown in FIG. 6, the previous round's buffer queue is S_last = {f_{-M}, f_{-M+1}, ..., f_{-1}}, and the terminal selected S_output_last = {f_{-M}, f_{-M+1}, ..., f_{-3}} from it as output. When processing the current buffer queue, the previous round's image sequence {f_{-3}, f_{-2}, f_{-1}} and the current round's sequence S = {f_1, ..., f_M} form a new sequence S_new = {f_{-3}, ..., f_{-1}, f_1, ..., f_M}.
After the new sequence S_new is obtained, the terminal can select N+1 images from S_new. The first selected image is exactly the last image selected from the previous round's buffer queue and belongs to the previous round's output sequence; the subsequent N selected images serve as the current round's output sequence.
For the latter N images, the terminal can solve the selection with a dynamic programming algorithm. That is, based on the shake amplitude c_m of each image, the terminal selects N+1 images from the S_new queue by dynamic programming such that the sum of the shake amplitudes of the N+1 images is minimal, while the interval between any two adjacent selected images is no greater than X. The last N of the N+1 images thus selected are the images selected for the current buffer queue.
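The dynamic program described above can be sketched as follows. The boundary condition that the final pick must fall near the end of the queue is an assumption added so the next round can still chain within the maximum interval; the patent does not spell it out:

```python
def select_frames(amplitudes, n_select, max_gap):
    """DP frame selection: pick n_select indices with minimal total shake
    amplitude, consecutive picks at most max_gap apart, the first pick forced
    to index 0 (the carried-over frame f_-L with amplitude 0), and the last
    pick within max_gap of the end of the sequence (assumption)."""
    m = len(amplitudes)
    INF = float("inf")
    # best[k][i]: minimal amplitude sum when the (k+1)-th pick is index i
    best = [[INF] * m for _ in range(n_select)]
    prev = [[-1] * m for _ in range(n_select)]
    best[0][0] = amplitudes[0]
    for k in range(1, n_select):
        for i in range(m):
            for j in range(max(0, i - max_gap), i):
                if best[k - 1][j] + amplitudes[i] < best[k][i]:
                    best[k][i] = best[k - 1][j] + amplitudes[i]
                    prev[k][i] = j
    end = min(range(max(0, m - max_gap), m), key=lambda i: best[n_select - 1][i])
    picks = [end]
    for k in range(n_select - 1, 0, -1):
        picks.append(prev[k][picks[-1]])
    return picks[::-1]

# carried-over frame (amplitude 0) plus two low-shake frames, gap <= 2
print(select_frames([0.0, 5.0, 1.0, 3.0, 0.5], 3, 2))
```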
Referring to FIG. 7, FIG. 7 is a schematic diagram of image selection provided by an embodiment of this application. As shown in FIG. 7, assume M is 4, N is 2, and the maximum frame interval is 2. Of the four images V_0, V_1, V_2, and V_3 in the buffer queue, the shake amplitudes of V_0 and V_1 are both smaller than those of V_2 and V_3; the terminal therefore selects, among the four images V_0, V_1, V_2, and V_3, the images V_0 and V_1 with the smaller shake amplitudes as the output images of the current buffer queue.
Step 406: smooth the path of the selected images and correct the images based on the smoothed rotation matrices.
For the selected images, the corresponding motion path of the image acquisition device contains noise and shake. To obtain a stable video stream, the motion path of the image acquisition device needs to be smoothed, and the images need to be corrected according to the poses of the image acquisition device on the smoothed motion path.
Taking a single direction as an example, n consecutive images form a sequence F = {f_i | i = 1, ..., n}, with corresponding deflection angles Y = {y_i | i = 1, ..., n}. Referring to FIG. 8, FIG. 8 is a comparison diagram of image data before path smoothing provided by an embodiment of this application. As shown in FIG. 8 (taking the deflection angle in a single direction as an example), the horizontal axis is time (in milliseconds) and the vertical axis is the deflection angle of the image acquisition device. The wavy solid line is the original motion path of the image acquisition device; the video picture shakes. Path smoothing, based on techniques such as Gaussian smoothing, can then be applied to the deflection angles Y to obtain smoothed deflection angles Y' = {y'_i | i = 1, ..., n}, forming a virtual motion path. A Gaussian sliding window is slid over the deflection angle sequence Y to produce the smoothed sequence Y', written Y' = GaussianSmooth(Y). After smoothing, the virtual camera path is the smooth dashed segment in the middle; the path noise and shake are essentially eliminated.
From the path smoothing above, the correction angle of each image is obtained, i.e., the correction from pose y_i to pose y'_i. For image f_i, the corrected image can be derived from Formula 4:

f'_i = K R'_i R_i^{-1} K^{-1} f_i    (Formula 4)

where R'_i is the rotation matrix of the corrected image acquisition device pose, R_i is the rotation matrix of the corresponding pose before correction, and K is the intrinsic matrix of the image acquisition device. A rotation matrix is related to its rotation angle by the Rodrigues formula, R = I + sinθ·K + (1 - cosθ)·K², where K here denotes the skew-symmetric matrix of the rotation axis rather than the intrinsic matrix.
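The GaussianSmooth step above can be sketched as a windowed weighted average; the window radius and sigma used here are tunable assumptions, not values from the patent:

```python
import math

def gaussian_smooth(angles, radius=2, sigma=1.0):
    """Gaussian sliding-window smoothing of a deflection-angle sequence
    Y -> Y' (sketch). The window is renormalized at the sequence edges."""
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    smoothed = []
    for i in range(len(angles)):
        acc = norm = 0.0
        for k in range(-radius, radius + 1):
            j = i + k
            if 0 <= j < len(angles):
                w = kernel[k + radius]
                acc += w * angles[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed

path = [0.0, 2.0, -1.0, 3.0, 0.0, 2.5, -0.5]  # jittery deflection angles
print(gaussian_smooth(path))
```

Each output sample is a convex combination of its neighbors, so the smoothed path stays within the range of the original angles while the high-frequency jitter is attenuated.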
Step 407: apply motion compensation to the images.
For severely shaking video, a corrected image obtained from the smoothed rotation matrix may still contain an imageless area. In this embodiment, such areas can therefore be repaired by a video frame interpolation algorithm or a video inpainting method. As shown in FIG. 1, after the last image is corrected, an imageless area appears. The image can be repaired by a frame interpolation algorithm, i.e., the adjacent preceding complete image and following complete image are used for interpolation to obtain the repaired image; alternatively, a video inpainting method can be used, i.e., multiple adjacent images are used to predict and fill in the picture, thereby obtaining the repaired image.
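As a toy illustration of the frame-interpolation idea, the simplest possible interpolator averages the two adjacent complete frames per pixel; practical interpolation algorithms additionally estimate motion between the frames, which this sketch deliberately omits:

```python
def repair_by_interpolation(prev_frame, next_frame):
    """Naive frame-interpolation repair (sketch): the frame with the missing
    region is approximated as the per-pixel average of the adjacent complete
    frames; frames are plain 2-D lists of pixel intensities."""
    return [
        [(p + n) / 2.0 for p, n in zip(prev_row, next_row)]
        for prev_row, next_row in zip(prev_frame, next_frame)
    ]

prev_frame = [[0.0, 10.0], [20.0, 30.0]]
next_frame = [[10.0, 20.0], [30.0, 40.0]]
print(repair_by_interpolation(prev_frame, next_frame))
```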
The data processing method provided by this application has been described above; the entity that performs the data processing method is described below. Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a terminal provided by an embodiment of this application.
As shown in FIG. 9, the terminal includes an obtaining unit 901 and a processing unit 902. The obtaining unit 901 is configured to obtain a first image sequence, the first image sequence including M images, M being a positive integer. The processing unit 902 is configured to determine a shake amplitude corresponding to each of the M images, the shake amplitude representing the offset of a pixel in an image relative to a reference image. The processing unit 902 is further configured to determine, according to the shake amplitudes, N images among the M images, N being a positive integer smaller than M. The processing unit 902 is further configured to output a second image sequence, the second image sequence including the N images.
In a possible implementation, the processing unit 902 is further configured to determine, according to the shake amplitudes and in ascending order of shake amplitude, N images among the M images, the value of N being a first threshold.
In a possible implementation, the processing unit 902 is further configured to determine, according to the shake amplitudes and a constraint condition, and in ascending order of shake amplitude, N images among the M images, the value of N being a first threshold; the constraint condition is that the interval, in the first image sequence, between any two adjacent images of the obtained N images is smaller than a second threshold.
In a possible implementation, the processing unit 902 is further configured to determine, according to the shake amplitudes, N images among the M images whose shake amplitudes are smaller than a third threshold.
In a possible implementation, the processing unit 902 is further configured to, upon determining that the image acquisition device is shaking, send an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate; the image acquisition device captures images at a second frame rate when not shaking, the second frame rate being lower than the first frame rate.
In a possible implementation, the obtaining unit 901 is further configured to obtain angular velocity information of the image acquisition device at S moments within a first time period, S being an integer greater than 1; the processing unit 902 is further configured to determine the variance of the angular velocity information at the S moments; when the variance is greater than a fourth threshold, it is determined that the image acquisition device is shaking; when the variance is less than or equal to the fourth threshold, it is determined that the image acquisition device is not shaking.
In a possible implementation, the shake amplitudes corresponding to the M images include offsets corresponding to the M images; the obtaining unit 901 is further configured to obtain angular velocity information of an image acquisition device at P moments within a second time period, P being an integer greater than 1, the image acquisition device being configured to capture the first image sequence; the processing unit 902 is further configured to determine, according to the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images; the processing unit 902 is further configured to determine, according to the pose information, the offset corresponding to each of the M images.
In a possible implementation, the processing unit 902 is further configured to determine, by linear interpolation according to the angular velocity information at the P moments and the capture times of the M images, the pose information of the image acquisition device when capturing the M images.
In a possible implementation, the processing unit 902 is further configured to determine, according to the pose information of the image acquisition device when capturing the M images, the rotation matrix corresponding to each of the M images; and the processing unit 902 is further configured to determine, according to the rotation matrices corresponding to the M images, the offsets corresponding to the M images.
In a possible implementation, the obtaining unit 901 is further configured to obtain an image selection ratio, the image selection ratio being the ratio between the number of input images and the number of output images; the processing unit 902 is further configured to determine the value of N according to the M images and the image selection ratio, the ratio of M to N being equal to the image selection ratio.
In a possible implementation, the processing unit 902 is further configured to perform anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images; the processing unit 902 is further configured to output the second image sequence, the second image sequence including the N processed images.
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a terminal 100 provided by an embodiment of this application.
As shown in FIG. 10, the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in this embodiment of the application does not constitute a specific limitation on the terminal 100. In other embodiments of this application, the terminal 100 may include more or fewer components than shown, or combine some components, or split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the terminal 100. The controller may generate operation control signals according to instruction operation codes and timing signals to control instruction fetching and execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache, which may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can call them directly from the memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
It can be understood that the interface connection relationships between the modules illustrated in this embodiment are merely illustrative and do not constitute a structural limitation on the terminal 100. In other embodiments of this application, the terminal 100 may also adopt interface connection manners different from those of the above embodiments, or a combination of multiple interface connection manners.
The charging management module 140 is configured to receive charging input from a charger, which may be a wireless or wired charger. In some wired-charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130.
The power management module 141 is configured to connect the battery 142 and the charging management module 140 to the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like.
The wireless communication function of the terminal 100 may be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
In some feasible implementations, the terminal 100 may use the wireless communication function to communicate with other devices. For example, the terminal 100 may communicate with a second electronic device, establish a screen-projection connection with the second electronic device, and output screen-projection data to the second electronic device, where the screen-projection data output by the terminal 100 may be audio and video data.
The antenna 1 and the antenna 2 are configured to transmit and receive electromagnetic wave signals. Each antenna in the terminal 100 may be used to cover a single communication band or multiple communication bands. Different antennas may also be multiplexed to improve antenna utilization; for example, the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, an antenna may be used in combination with a tuning switch.
The mobile communication module 150 may provide wireless communication solutions applied to the terminal 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves via the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 may also amplify signals modulated by the modem processor and convert them into electromagnetic waves for radiation via the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 and at least some modules of the processor 110 may be provided in the same device.
The modem processor may include a modulator and a demodulator. The modulator is configured to modulate low-frequency baseband signals to be sent into medium- and high-frequency signals. The demodulator is configured to demodulate received electromagnetic wave signals into low-frequency baseband signals, and then transmits the demodulated low-frequency baseband signals to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and provided in the same device as the mobile communication module 150 or other functional modules.
The wireless communication module 160 may provide wireless communication solutions applied to the terminal 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive signals to be sent from the processor 110, perform frequency modulation and amplification on them, and convert them into electromagnetic waves for radiation via the antenna 2.
In some embodiments, the antenna 1 of the terminal 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160, so that the terminal 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The terminal 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display 194 and the application processor. The GPU is configured to perform mathematical and geometric computation for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 194 is configured to display images, videos, and the like. The display 194 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light emitting diodes (QLED), and the like. In some embodiments, the terminal 100 may include 1 or N displays 194, N being a positive integer greater than 1.
In some feasible implementations, the display 194 may be used to display the interfaces output by the system of the terminal 100. For the interfaces output by the terminal 100, reference may be made to the related descriptions of subsequent embodiments.
The terminal 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The ISP is configured to process the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the camera's photosensitive element, the light signal is converted into an electrical signal, and the camera's photosensitive element transmits the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP may also perform algorithm optimization on image noise, brightness, and skin tone, and may optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is configured to capture still images or video. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal and then transmits the electrical signal to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the terminal 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is configured to process digital signals; in addition to digital image signals, it can also process other digital signals.
The video codec is configured to compress or decompress digital video. The terminal 100 may support one or more video codecs, so that the terminal 100 can play or record videos in multiple coding formats, for example, moving picture experts group (MPEG)-1, MPEG-2, MPEG-3, and MPEG-4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information rapidly and can also learn continuously by itself. Applications such as intelligent cognition of the terminal 100, for example image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capability of the terminal 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving files such as music and videos on the external memory card.
The internal memory 121 may be used to store computer-executable program code, the executable program code including instructions. By running the instructions stored in the internal memory 121, the processor 110 executes the various functional applications and data processing of the terminal 100. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the applications required by at least one function (such as a sound playback function and an image playback function). The data storage area may store data created during use of the terminal 100 (such as audio data and a phone book). In addition, the internal memory 121 may include a high-speed random access memory and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).
The terminal 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like. In some feasible implementations, the audio module 170 may be used to play the sound corresponding to a video; for example, when the display 194 displays a video playback picture, the audio module 170 outputs the sound of the video playback.
The audio module 170 is configured to convert digital audio information into an analog audio signal for output, and also to convert an analog audio input into a digital audio signal.
The speaker 170A, also called a "horn", is configured to convert audio electrical signals into sound signals.
The receiver 170B, also called an "earpiece", is configured to convert audio electrical signals into sound signals.
The microphone 170C, also called a "mike" or "mic", is configured to convert sound signals into electrical signals.
The headset jack 170D is configured to connect a wired headset. The headset jack 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The pressure sensor 180A is configured to sense pressure signals and can convert pressure signals into electrical signals. In some embodiments, the pressure sensor 180A may be provided on the display 194. The gyroscope sensor 180B may be used to determine the motion posture of the terminal 100. The barometric pressure sensor 180C is configured to measure air pressure.
The acceleration sensor 180E can detect the magnitude of the acceleration of the terminal 100 in various directions (including three or six axes). When the terminal 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to recognize the terminal's posture and is applied in landscape/portrait switching, pedometers, and other applications.
The distance sensor 180F is configured to measure distance.
The ambient light sensor 180L is configured to sense ambient light brightness.
The fingerprint sensor 180H is configured to collect fingerprints.
The temperature sensor 180J is configured to detect temperature.
The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be provided on the display 194; the touch sensor 180K and the display 194 form a touchscreen, also called a "touch screen". The touch sensor 180K is configured to detect touch operations acting on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the type of the touch event. Visual output related to the touch operation may be provided through the display 194. In other embodiments, the touch sensor 180K may also be provided on the surface of the terminal 100 at a position different from that of the display 194.
The keys 190 include a power key, volume keys, and the like. The keys 190 may be mechanical keys or touch keys. The terminal 100 may receive key input and generate key signal input related to the user settings and function control of the terminal 100.
The motor 191 can generate a vibration prompt.
The indicator 192 may be an indicator light and may be used to indicate the charging state and battery level change, and may also be used to indicate messages, missed calls, notifications, and the like.
The SIM card interface 195 is configured to connect a SIM card.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a logical functional division, and there may be other division manners in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of this application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.

Claims (25)

  1. An image processing method, characterized by comprising:
    obtaining a first image sequence, the first image sequence comprising M images, wherein M is a positive integer;
    determining a shake amplitude corresponding to each of the M images, wherein the shake amplitude represents an offset of a pixel in an image relative to a reference image;
    determining, according to the shake amplitudes, N images among the M images, wherein N is smaller than M and N is a positive integer; and
    outputting a second image sequence, the second image sequence comprising the N images.
  2. The image processing method according to claim 1, characterized in that the determining, according to the shake amplitude corresponding to each of the M images, N images among the M images comprises:
    determining, according to the shake amplitudes and in ascending order of shake amplitude, N images among the M images, wherein a value of N is a first threshold.
  3. The image processing method according to claim 1, characterized in that the determining, according to the shake amplitude corresponding to each of the M images, N images among the M images comprises:
    determining, according to the shake amplitudes and a constraint condition, and in ascending order of shake amplitude, N images among the M images, wherein a value of N is a first threshold;
    wherein the constraint condition is that an interval, in the first image sequence, between two adjacent images of the obtained N images is smaller than a second threshold.
  4. The image processing method according to claim 1, characterized in that the determining, according to the shake amplitude corresponding to each of the M images, N images among the M images comprises:
    determining, according to the shake amplitudes, N images among the M images whose shake amplitudes are smaller than a third threshold.
  5. The image processing method according to any one of claims 1 to 4, characterized in that the method further comprises:
    when it is determined that the image acquisition device shakes, sending an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate;
    wherein the image acquisition device captures images at a second frame rate when no shaking occurs, the second frame rate being lower than the first frame rate.
  6. The image processing method according to claim 5, characterized in that the method further comprises: obtaining angular velocity information of the image acquisition device at S moments within a first time period, wherein S is an integer greater than 1;
    determining a variance of the angular velocity information at the S moments;
    when the variance is greater than a fourth threshold, determining that the image acquisition device shakes;
    when the variance is less than or equal to the fourth threshold, determining that the image acquisition device does not shake.
  7. The image processing method according to any one of claims 1 to 6, characterized in that the shake amplitudes corresponding to the M images comprise offsets corresponding to the M images;
    the determining the shake amplitudes corresponding to the M images comprises:
    obtaining angular velocity information of an image acquisition device at P moments within a second time period, wherein P is an integer greater than 1 and the image acquisition device is configured to capture the first image sequence;
    determining, according to the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images; and
    determining, according to the pose information, an offset corresponding to each of the M images.
  8. The image processing method according to claim 7, characterized in that the determining, according to the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images comprises:
    determining, by linear interpolation according to the angular velocity information at the P moments and capture times of the M images, the pose information of the image acquisition device when capturing the M images.
  9. The image processing method according to claim 7 or 8, characterized in that the determining, according to the pose information, the offsets corresponding to the M images comprises:
    determining, according to the pose information of the image acquisition device when capturing the M images, a rotation matrix corresponding to each of the M images; and
    determining, according to the rotation matrices corresponding to the M images, the offsets corresponding to the M images.
  10. The image processing method according to any one of claims 1 to 3, characterized in that, before the determining N images among the M images, the method further comprises:
    obtaining an image selection ratio, the image selection ratio being a ratio between a number of input images and a number of output images; and
    determining a value of N according to the M images and the image selection ratio;
    wherein a ratio of M to N is equal to the image selection ratio.
  11. The image processing method according to any one of claims 1 to 10, characterized in that, before the outputting a second image sequence, the method further comprises:
    performing anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images; and
    outputting the second image sequence, the second image sequence comprising the N processed images.
  12. A terminal, characterized by comprising an obtaining unit and a processing unit;
    the obtaining unit is configured to obtain a first image sequence, the first image sequence comprising M images, wherein M is a positive integer;
    the processing unit is configured to determine a shake amplitude corresponding to each of the M images, wherein the shake amplitude represents an offset of a pixel in an image relative to a reference image;
    the processing unit is further configured to determine, according to the shake amplitudes, N images among the M images, wherein N is smaller than M and N is a positive integer;
    the processing unit is further configured to output a second image sequence, the second image sequence comprising the N images.
  13. The terminal according to claim 12, characterized in that the processing unit is further configured to determine, according to the shake amplitudes and in ascending order of shake amplitude, N images among the M images, wherein a value of N is a first threshold.
  14. The terminal according to claim 12, characterized in that the processing unit is further configured to determine, according to the shake amplitudes and a constraint condition, and in ascending order of shake amplitude, N images among the M images, wherein a value of N is a first threshold;
    wherein the constraint condition is that an interval, in the first image sequence, between two adjacent images of the obtained N images is smaller than a second threshold.
  15. The terminal according to claim 12, characterized in that the processing unit is further configured to determine, according to the shake amplitudes, N images among the M images whose shake amplitudes are smaller than a third threshold.
  16. The terminal according to any one of claims 12 to 15, characterized in that the processing unit is further configured to, when it is determined that the image acquisition device shakes, send an instruction to the image acquisition device, the instruction instructing the image acquisition device to capture images at a first frame rate;
    wherein the image acquisition device captures images at a second frame rate when no shaking occurs, the second frame rate being lower than the first frame rate.
  17. The terminal according to claim 16, characterized in that the obtaining unit is further configured to obtain angular velocity information of the image acquisition device at S moments within a first time period, wherein S is an integer greater than 1;
    the processing unit is further configured to determine a variance of the angular velocity information at the S moments;
    when the variance is greater than a fourth threshold, it is determined that the image acquisition device shakes;
    when the variance is less than or equal to the fourth threshold, it is determined that the image acquisition device does not shake.
  18. The terminal according to any one of claims 12 to 17, characterized in that the shake amplitudes corresponding to the M images comprise offsets corresponding to the M images;
    the obtaining unit is further configured to obtain angular velocity information of an image acquisition device at P moments within a second time period, wherein P is an integer greater than 1 and the image acquisition device is configured to capture the first image sequence;
    the processing unit is further configured to determine, according to the angular velocity information at the P moments, pose information of the image acquisition device when capturing the M images;
    the processing unit is further configured to determine, according to the pose information, an offset corresponding to each of the M images.
  19. The terminal according to claim 18, characterized in that the processing unit is further configured to determine, by linear interpolation according to the angular velocity information at the P moments and capture times of the M images, the pose information of the image acquisition device when capturing the M images.
  20. The terminal according to claim 18 or 19, characterized in that the processing unit is further configured to determine, according to the pose information of the image acquisition device when capturing the M images, a rotation matrix corresponding to each of the M images;
    the processing unit is further configured to determine, according to the rotation matrices corresponding to the M images, the offsets corresponding to the M images.
  21. The terminal according to any one of claims 12 to 14, characterized in that the obtaining unit is further configured to obtain an image selection ratio, the image selection ratio being a ratio between a number of input images and a number of output images;
    the processing unit is further configured to determine a value of N according to the M images and the image selection ratio;
    wherein a ratio of M to N is equal to the image selection ratio.
  22. The terminal according to any one of claims 12 to 21, characterized in that the processing unit is further configured to perform anti-shake processing on the N images according to an anti-shake algorithm to obtain N processed images;
    the processing unit is further configured to output the second image sequence, the second image sequence comprising the N processed images.
  23. A terminal, characterized by comprising one or more processors and a memory, wherein:
    the memory stores computer-readable instructions;
    the one or more processors are configured to read the computer-readable instructions so that the terminal implements the method according to any one of claims 1 to 11.
  24. A computer program product, characterized in that, when the computer program product runs on a computer, the computer is caused to execute the method according to any one of claims 1 to 11.
  25. A computer-readable storage medium, characterized by comprising computer-readable instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 1 to 11.
PCT/CN2021/125974 2020-10-30 2021-10-25 An image processing method and related apparatus WO2022089341A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011193237.8A CN114449151B (zh) 2020-10-30 2020-10-30 An image processing method and related apparatus
CN202011193237.8 2020-10-30

Publications (1)

Publication Number Publication Date
WO2022089341A1 true WO2022089341A1 (zh) 2022-05-05

Family

ID=81357318

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/125974 WO2022089341A1 (zh) 2020-10-30 2021-10-25 An image processing method and related apparatus

Country Status (2)

Country Link
CN (1) CN114449151B (zh)
WO (1) WO2022089341A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1777238A (zh) * 2004-11-15 2006-05-24 佳能株式会社 Image processing apparatus and image processing method
US20070098291A1 (en) * 2005-11-02 2007-05-03 Kentarou Niikura Image stabilization apparatus, method thereof, and program product thereof
CN101272455A (zh) * 2007-03-23 2008-09-24 富士胶片株式会社 Image acquisition apparatus
CN104618674A (zh) * 2015-02-28 2015-05-13 广东欧珀移动通信有限公司 Video recording method and apparatus for a mobile terminal
WO2017075788A1 (zh) * 2015-11-05 2017-05-11 华为技术有限公司 Anti-shake photographing method and apparatus, and photographing device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7688352B2 (en) * 2005-11-25 2010-03-30 Seiko Epson Corporation Shake correction device, filming device, moving image display device, shake correction method and recording medium
CN107509034B (zh) * 2017-09-22 2019-11-26 维沃移动通信有限公司 A photographing method and mobile terminal
CN108737734B (zh) * 2018-06-15 2020-12-01 Oppo广东移动通信有限公司 Image compensation method and apparatus, computer-readable storage medium, and electronic device


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278183A (zh) * 2022-06-23 2022-11-01 广州市恒众车联网科技股份有限公司 HUD picture display method and system
CN115278183B (zh) * 2022-06-23 2023-03-14 广州市恒众车联网科技股份有限公司 HUD picture display method and system
CN116434128A (zh) * 2023-06-15 2023-07-14 安徽科大擎天科技有限公司 Method for removing unfilled regions in electronic image stabilization based on buffered frames
CN116434128B (zh) * 2023-06-15 2023-08-22 安徽科大擎天科技有限公司 Method for removing unfilled regions in electronic image stabilization based on buffered frames

Also Published As

Publication number Publication date
CN114449151A (zh) 2022-05-06
CN114449151B (zh) 2023-06-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21885068

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21885068

Country of ref document: EP

Kind code of ref document: A1