WO2022242569A1 - Delay calibration method, apparatus, computer device, and storage medium - Google Patents

Delay calibration method, apparatus, computer device, and storage medium

Info

Publication number
WO2022242569A1
WO2022242569A1 (PCT/CN2022/092757; CN2022092757W)
Authority
WO
WIPO (PCT)
Prior art keywords
video
amplitude
frequency
shake
frequency characteristic
Prior art date
Application number
PCT/CN2022/092757
Other languages
English (en)
French (fr)
Inventor
门泽华
Original Assignee
影石创新科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 影石创新科技股份有限公司
Publication of WO2022242569A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/68Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/681Motion detection
    • H04N23/6812Motion detection based on additional sensors, e.g. acceleration sensors
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/68Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682Vibration or motion blur correction

Definitions

  • the present application relates to the technical field of image processing, in particular to a delay calibration method, device, computer equipment and storage medium.
  • In electronic anti-shake, the attitude of the camera is usually calculated from the shake signal detected by the IMU (Inertial Measurement Unit, an inertial sensor), and the image captured by the vision system is then compensated according to the calculated attitude of the camera.
  • There is a delay between when the vision system captures an image and when the IMU detects the jitter signal. For example, the vision system captures a certain frame, but the jitter detected by the IMU corresponds to the time of the previous frame; the system may nevertheless treat the two as matched at the same moment. In other words, it is difficult to guarantee that the vision system captures an image at the exact moment the IMU detects the shake.
  • Therefore, the delay between the IMU and the vision system needs to be calibrated in practical applications. That is, taking either the IMU's clock or the vision system's clock as the standard, the time deviation of the other clock must be determined.
  • In the related art, two sets of motions are generally estimated through the IMU and the vision system respectively, and a nonlinear optimization algorithm then minimizes the error between the two sets of motions, using that error as a cost value, to estimate the delay between the two. Because the motion estimates themselves contain errors, the delay estimated by this method has low precision and cannot meet the demand for a high-precision delay. In addition, if periodically repeated motions are present in the two sets of motions, this method may also produce estimation errors.
  • a delay calibration method comprising:
  • The inertial sensor and the vision system are coupled to the same shooting device; each video in the video group is obtained via the vision system; the anti-shake processing is completed through the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • obtaining the video group includes:
  • According to the attitude data of the shooting device acquired during the shooting time period corresponding to each of the multiple videos, the multiple videos are screened, and the videos obtained after screening form the video group; wherein the attitude data of the shooting device is obtained based on the inertial sensor.
  • the multiple videos are screened according to the posture data of the shooting device acquired during the shooting time period corresponding to each of the multiple videos, including:
  • the frequency domain score corresponding to each video is obtained, including:
  • the frequency domain score corresponding to each amplitude-frequency characteristic curve is obtained;
  • the frequency domain score corresponding to each amplitude-frequency characteristic curve is obtained, including:
  • The score corresponding to the frequency of each amplitude-frequency characteristic curve is obtained; the product of that score and the amplitude is computed; and the product is used as the frequency domain score corresponding to each amplitude-frequency characteristic curve.
  • the frequency domain score corresponding to any video is obtained, including:
  • multiple videos are screened according to the frequency domain score corresponding to each video, including:
  • The frequency domain scores corresponding to the multiple videos are sorted in descending order, and a preset number of videos is selected and used as the videos obtained after screening.
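As an illustrative sketch (not part of the disclosed embodiments), the screening step above, sorting frequency-domain scores in descending order and keeping a preset number of videos, might look as follows; the names `screen_videos`, `freq_scores`, and `preset_count` are assumptions:

```python
def screen_videos(freq_scores, preset_count):
    """Return the IDs of the top-scoring videos.

    freq_scores: dict mapping video id -> frequency-domain score.
    preset_count: how many videos to keep after screening.
    """
    # Sort video ids by their score, largest first, and keep the top ones.
    ranked = sorted(freq_scores, key=freq_scores.get, reverse=True)
    return ranked[:preset_count]

video_group = screen_videos({"a": 0.8, "b": 1.5, "c": 0.3, "d": 1.1}, 2)
# video_group -> ["b", "d"]
```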
  • Video is single-channel video or multi-channel video.
  • a delay calibration device comprising:
  • an acquisition module, configured to acquire a video group, where the video group includes at least one video;
  • an update module, configured to update the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtain the anti-shake performance score corresponding to the video group; the above delay-value update and score-acquisition processes are repeated until the obtained anti-shake performance score meets a preset condition, whereupon the delay value corresponding to that anti-shake performance score is obtained;
  • The inertial sensor and the vision system are coupled to the same shooting device; each video in the video group is obtained via the vision system; the anti-shake processing is completed through the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • a computer device including a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • The inertial sensor and the vision system are coupled to the same shooting device; each video in the video group is obtained via the vision system; the anti-shake processing is completed through the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • The inertial sensor and the vision system are coupled to the same shooting device; each video in the video group is obtained via the vision system; the anti-shake processing is completed through the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • In the above delay calibration method, apparatus, computer device, and storage medium, a video group is acquired, the delay value between the inertial sensor and the vision system is updated, and the anti-shake performance score corresponding to the video group is obtained based on the updated delay value; the update and scoring are repeated until the obtained anti-shake performance score satisfies the preset condition, whereupon the delay value corresponding to that score is obtained. Since there is no need to estimate two sets of motions separately through the IMU and the vision system and then minimize the error between them as a cost value to estimate the delay, the error introduced by the motion estimation itself is avoided, which improves the accuracy of the calibrated delay.
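A minimal sketch of the iterative calibration loop described above: the delay value is repeatedly updated and the group score re-evaluated until the score stops improving. The step size, the bidirectional update, and the stopping rule are illustrative assumptions; the patent leaves the update strategy and preset condition open:

```python
def calibrate_delay(score_fn, delay=0.0, step=0.1, eps=1e-3, max_iter=100):
    """Hill-climb the delay value to maximize the anti-shake score.

    score_fn: maps a candidate delay to the video group's score
              (in the patent, this means re-running anti-shake
              processing with that delay and scoring the result).
    """
    best_score = score_fn(delay)
    for _ in range(max_iter):
        # Try updating in both directions and keep the better candidate.
        candidates = [delay + step, delay - step]
        cand = max(candidates, key=score_fn)
        cand_score = score_fn(cand)
        if cand_score <= best_score + eps:  # score no longer improves
            break
        delay, best_score = cand, cand_score
    return delay, best_score

# Toy score function whose maximum sits at a delay of 0.3 s.
delay, score = calibrate_delay(lambda d: -(d - 0.3) ** 2, delay=0.0)
```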
  • Fig. 1 is a schematic flow chart of a delay calibration method in an embodiment
  • FIG. 2 is a schematic flow chart of a delay calibration method in another embodiment
  • FIG. 3 is a structural block diagram of a delay calibration device in an embodiment
  • Figure 4 is an internal block diagram of a computer device in one embodiment.
  • the terms “first” and “second” used in this application may be used to describe various technical terms herein, but unless otherwise specified, these technical terms are not limited by these terms. These terms are only used to distinguish one term from another.
  • the third preset threshold and the fourth preset threshold may be the same or different.
  • EIS (Electronic Image Stabilization).
  • The sensor in the shooting device detects slight jitter during the image capture process, so that the captured image can be compensated according to the signal corresponding to that jitter, overcoming the image blur caused by shaking of the shooting device.
  • the sensor mainly used is an IMU.
  • The shake signal detected by the IMU is mainly used to calculate the attitude of the camera, and the image captured by the vision system is then compensated according to the calculated attitude.
  • There is a delay between when the vision system captures an image and when the IMU detects the jitter signal. For example, the vision system captures a certain frame, but the jitter detected by the IMU corresponds to the time of the previous frame; the system may nevertheless treat the two as matched at the same moment. In other words, it is difficult to guarantee that the vision system captures an image at the exact moment the IMU detects the shake. Therefore, the delay between the IMU and the vision system needs to be calibrated in practical applications. That is, taking either the IMU's clock or the vision system's clock as the standard, the time deviation of the other clock must be determined.
  • In the related art, two sets of motions are generally estimated through the IMU and the vision system respectively, and a nonlinear optimization algorithm then minimizes the error between the two sets of motions, using that error as a cost value, to estimate the delay between the two. Because the motion estimates themselves contain errors, the delay estimated by this method has low precision and cannot meet the demand for a high-precision delay. In addition, if periodically repeated motions are present in the two sets of motions, this method may also produce estimation errors.
  • The embodiment of the present invention provides a delay calibration method that can be applied to terminals, including but not limited to personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. It can be understood that the delay calibration method may also be applied to a server, with the server as the corresponding execution subject; alternatively, according to actual needs and feasibility, the method may be applied to a terminal and a server simultaneously, that is, part of the steps of the delay calibration method may be executed by the terminal and the remaining steps by the server, which is not specifically limited in this embodiment of the present invention.
  • For example, step 101 in the method flow corresponding to Figure 1 can be executed by the terminal, which then sends the video group to the server so that step 102 is executed by the server; after obtaining the delay value between the IMU and the vision system, the server can send it to the terminal.
  • Quantities such as "multiple" mentioned in various embodiments of the present application refer to "at least two".
  • the delay calibration method in this application is mainly used to calibrate the delay value between the IMU and the vision system, so that the subsequent IMU and the vision system can implement electronic anti-shake based on the delay value between the two.
  • In one embodiment, a delay calibration method is provided. Taking application of the method to a terminal as an example, the method includes the following steps:
  • The inertial sensor and the vision system are coupled to the same shooting device; each video in the video group is obtained via the vision system; the anti-shake processing is completed through the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • The reason the inertial sensor and the vision system need to be coupled to the same shooting device is that the embodiment of the present invention mainly calibrates the delay value between the inertial sensor and the vision system based on the imaging quality of the vision system.
  • The inertial sensor needs to capture the shaking of the shooting device, and the vision system needs to capture images while the device is shaking, so that imaging quality can be determined on that basis.
  • the inertial sensor and the vision system need to be coupled to the same shooting device.
  • the video group may include only one video, or may include multiple videos, which is not specifically limited in this embodiment of the present invention.
  • the anti-shake performance score corresponding to the video group is obtained based on the anti-shake performance score of each video in the video group.
  • The embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score corresponding to the video group, including but not limited to: adding the anti-shake performance scores of the videos in the video group and using the sum as the video group's score; or adding the scores, taking the average of the sum, and using the average as the video group's score.
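The two group-scoring options mentioned above (sum of per-video scores, or their mean) can be sketched as follows; the names `group_score` and `video_scores` are illustrative:

```python
def group_score(video_scores, use_mean=False):
    """Combine per-video anti-shake scores into one group score."""
    total = sum(video_scores)
    return total / len(video_scores) if use_mean else total

s1 = group_score([0.9, 0.7, 0.8])                 # sum variant
s2 = group_score([0.9, 0.7, 0.8], use_mean=True)  # mean variant
```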
  • the delay value may have an initial value, such as an initial value of 0. Updating the delay value for the first time may refer to updating an initial value of the delay value.
  • Alternatively, the delay value may not be updated; that is, the anti-shake performance score corresponding to the video group is not obtained based on an updated delay value but directly based on the initial value of the delay value, which is not specifically limited in this embodiment of the present invention.
  • the delay value may be updated in a direction of increasing the delay value, or may be updated in a direction of decreasing the delay value, which is not specifically limited in this embodiment of the present invention.
  • the previous delay value can be 0.2 seconds, and after the update, it can be increased to 0.3 seconds.
  • the delay value can be reduced from 0.3 seconds to 0.2 seconds after the update.
  • The preset condition can be set according to requirements. The anti-shake performance score obtained in step 102 that satisfies the preset condition is actually the score obtained after the last update of the delay value, that is, the last obtained anti-shake performance score.
  • For example, the preset condition may be that the difference between the last obtained anti-shake performance score and the previously obtained anti-shake performance score is less than the first preset threshold.
  • In this case, the delay value corresponding to the anti-shake performance score that meets the preset condition may be the delay value corresponding to the last obtained anti-shake performance score.
  • the preset condition may be that the last obtained anti-shake performance score is greater than the second preset threshold.
  • Likewise, the delay value corresponding to the anti-shake performance score that meets the preset condition may be the delay value corresponding to the last obtained anti-shake performance score.
  • The preset condition may also be that the anti-shake performance scores obtained for n consecutive times are all greater than the third preset threshold, and the difference between every two adjacent scores among those n consecutive scores is less than or equal to the fourth preset threshold.
  • n is a positive integer not less than 2.
  • the delay value corresponding to the anti-shake performance score that satisfies the preset condition may be the delay value corresponding to the last obtained anti-shake performance score.
  • the preset condition may also be other content in the actual implementation process, which is not specifically limited in this embodiment of the present invention.
  • the first preset threshold to the fourth preset threshold can be obtained according to actual measurement or experience, which is not specifically limited in this embodiment of the present invention.
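The third preset condition above (n consecutive scores all above one threshold, with adjacent scores differing by no more than another) can be sketched as a small check; the function name and the concrete threshold values are illustrative assumptions:

```python
def meets_preset_condition(scores, n, third_threshold, fourth_threshold):
    """True if the last n scores all exceed third_threshold and every
    pair of adjacent scores among them differs by <= fourth_threshold."""
    if len(scores) < n:
        return False
    last_n = scores[-n:]
    if any(s <= third_threshold for s in last_n):
        return False
    return all(abs(a - b) <= fourth_threshold
               for a, b in zip(last_n, last_n[1:]))

ok = meets_preset_condition([0.2, 0.8, 0.82, 0.81], n=3,
                            third_threshold=0.5, fourth_threshold=0.05)
# ok -> True: the last three scores exceed 0.5 and are mutually close
```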
  • the IMU may include an accelerometer and a gyroscope, which are not specifically limited in this embodiment of the present invention.
  • In the method above, a video group is acquired, the delay value between the inertial sensor and the vision system is updated, the anti-shake performance score corresponding to the video group is obtained based on the updated delay value, and the above update and scoring process is repeated.
  • The embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score of a video, including but not limited to: obtaining the anti-shake performance score of the video according to the image frame parameters corresponding to the video.
  • the image frame parameters may include a degree of difference and/or similarity between image frames, and the image frame parameters may be calculated based on image parameters between image frames in the video.
  • the image parameters may include brightness and/or contrast, etc., which are not specifically limited in this embodiment of the present invention.
  • the image frame parameter may include similarity and/or difference in brightness between image frames.
  • the image frame parameter may include similarity and/or difference of contrast between image frames.
  • image frame parameters may include brightness similarity and/or difference, and contrast similarity and/or difference.
  • The degree of difference can be obtained by computing differences between image parameters, and the degree of similarity can be obtained by a similarity algorithm.
  • For example, the brightness difference between two image frames can be obtained by subtracting their brightness values, and the brightness similarity between two image frames can be calculated by a similarity algorithm; for instance, the similarity between the two frames' brightness feature vectors can be used as the brightness similarity between the two image frames.
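The patent only states that the similarity between two brightness feature vectors can serve as the brightness similarity; as one illustrative choice (an assumption, not specified in the disclosure), cosine similarity between per-frame brightness feature vectors could be used:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Identical brightness feature vectors -> similarity 1.0.
brightness_sim = cosine_similarity([10, 20, 30], [10, 20, 30])
```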
  • the image frame parameters can be mainly used to represent the degree of difference and/or similarity between image frames in the video.
  • Which image frames in the video the degree of difference and/or similarity is computed between can be set according to requirements, which is not specifically limited in this embodiment of the present invention.
  • The image frame parameters may consist only of the difference and/or similarity between the start frame and the middle frame in the video, or only of the difference and/or similarity between the middle frame and the end frame; alternatively, the difference and/or similarity between the start frame and the middle frame, together with the difference and/or similarity between the middle frame and the end frame, may jointly constitute the image frame parameters.
  • the video is composed of frames of images.
  • some image parameters will be deformed due to shaking between image frames in the video.
  • The deformation of these image parameters combines and is reflected in the visual effect, potentially presenting a bad shooting effect; for example, it can cause shaking and blurring in the video. Anti-shake processing can eliminate this deformation as much as possible to improve the shooting effect.
  • the deformation of these image parameters will be reflected in the calculation results corresponding to the image parameters between image frames, that is, it can be reflected in the image frame parameters. Therefore, image frame parameters, as an external quantification of the visual effect presented by the video after anti-shake processing, can represent the anti-shake performance of the video after anti-shake processing, so that image frame parameters can be used to evaluate Video stabilization performance.
  • the embodiment of the present invention does not specifically limit the manner in which the terminal 101 obtains the anti-shake performance score of the video according to the image frame parameters corresponding to the video.
  • the ways to obtain the anti-shake performance score can be divided into the following ways:
  • Image frame parameters include the degree of difference between image frames.
  • Which image frames in the video the degree of difference is computed between can be set according to requirements. Each degree of difference is computed over a group of two frames in the video and is the degree of difference between the two frames in that group. Therefore, the image frame parameters may actually include several degrees of difference, each determined by a certain group of two frames in the video, where "several" may refer to one or more.
  • the difference degree can be directly used as the anti-shake performance score of the video. If the image frame parameters include multiple degrees of difference, the average value of the multiple degrees of difference can be taken, and the average value can be used as the anti-shake performance score of the video.
  • Image frame parameters include the similarity between image frames.
  • the similarity can be directly used as the anti-shake performance score of the video.
  • the image frame parameters include multiple similarities, the multiple similarities may be averaged, and the average value may be used as the anti-shake performance score of the video.
  • Image frame parameters include similarity and difference between image frames.
  • the image frame parameters may actually include several degrees of similarity and degrees of difference, and each degree of similarity or degree of difference is determined by a certain group of two frames of images in the video. Wherein, “several” may refer to one or more.
  • When obtaining the anti-shake performance score of the video according to the image frame parameters, the several degrees of difference in the image frame parameters can first be averaged to obtain the average difference, and the several degrees of similarity can then be averaged to obtain the average similarity. The average difference and the average similarity are weighted and summed, and the weighted sum is used as the anti-shake performance score of the video. If the above "several" is essentially one, the averaging may be skipped and the single similarity or difference used directly in the weighted sum.
  • the difference degree may be directly used as the anti-shake performance score.
  • Taking image frame parameters that include the difference between the start frame and the middle frame of the video and the difference between the middle frame and the end frame as an example, the average of the two differences can be taken and used as the anti-shake performance score.
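The third scoring way described above (averaging the differences, averaging the similarities, then combining them with weights) can be sketched as follows. The weight values and their sign convention are illustrative assumptions; since a lower difference and a higher similarity both indicate better anti-shake, the difference term is given a negative weight here:

```python
def score_from_frame_params(differences, similarities,
                            w_diff=-0.5, w_sim=0.5):
    """Weighted sum of average difference and average similarity."""
    avg_diff = sum(differences) / len(differences)
    avg_sim = sum(similarities) / len(similarities)
    return w_diff * avg_diff + w_sim * avg_sim

score = score_from_frame_params([0.2, 0.4], [0.9, 0.7])
# avg_diff = 0.3, avg_sim = 0.8 -> score = -0.15 + 0.40 = 0.25
```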
  • The method provided by the embodiment of the present invention acquires a video formed through anti-shake processing and obtains the video's anti-shake performance score according to the image frame parameters corresponding to the video. Since the anti-shake performance score is a relatively objective evaluation basis derived from the image frame parameters, it is more accurate as an evaluation result than the human visual system. In addition, because the score is obtained directly from the image frame parameters rather than through lengthy visual inspection, the evaluation takes less time and is more efficient.
  • In one embodiment, the image frame parameters include image similarity. Correspondingly, the embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score of the video according to the image frame parameters, including but not limited to: for each group of two frames at an adjacent preset interval in the video, obtain the image similarity between the earlier frame and the later frame in the group and use it as the image similarity corresponding to that group; then obtain the anti-shake performance score of the video according to the image similarities corresponding to all such groups in the video.
  • the preset interval may be represented by m, and m represents an interval of m frames.
  • m can be 1 or 2, but cannot be greater than the value obtained by subtracting 1 from the total number of frames.
  • m should not be too large. If it is too large, the total amount of image similarity will be too small, which will lead to inaccurate subsequent anti-shake performance scores.
  • the embodiment of the present invention takes the preset interval as 1 as an example to explain the subsequent process.
  • When the preset interval is 1, the groups of two frames at an adjacent preset interval are: the first frame and the second frame form a group, the second frame and the third frame form a group, the third frame and the fourth frame form a group, and so on, until the (M-1)th frame and the Mth frame form a group, where M is the total number of frames; in total, M-1 groups of adjacent frames can be formed.
  • The anti-shake performance score of the video can then be obtained according to the image similarity corresponding to each group of two frames at an adjacent preset interval.
  • The embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score of the video from the image similarities of the groups of two frames at adjacent preset intervals, including but not limited to: summing the image similarities corresponding to all such groups in the video and using the sum as the anti-shake performance score; or averaging the sum and using the average as the anti-shake performance score.
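The pairing and scoring described above, frames at a preset interval m forming groups (frame i with frame i+m) whose similarities are summed or averaged, can be sketched as follows; the per-pair similarity function is left as a parameter, and the toy brightness-based similarity is an assumption:

```python
def antishake_score(frames, similarity, m=1, use_mean=False):
    """Sum (or average) of per-group similarities at preset interval m."""
    pairs = list(zip(frames, frames[m:]))        # (i, i+m) groups
    sims = [similarity(a, b) for a, b in pairs]
    return sum(sims) / len(sims) if use_mean else sum(sims)

# Toy frames represented by a single brightness value each; the
# similarity falls as the brightness difference grows.
frames = [100, 102, 101, 99]
score = antishake_score(frames, lambda a, b: 1.0 - abs(a - b) / 255.0)
```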
  • the anti-shake performance score of the video may be further obtained based on multiple image similarities.
  • the image similarity is calculated based on image parameters between two adjacent frames of images in the video, and the image parameters may include brightness and/or contrast.
  • The image similarity can include two items: one obtained from the brightness image parameter, recorded as the brightness similarity, and the other obtained from the contrast image parameter, recorded as the contrast similarity.
  • Accordingly, obtaining the anti-shake performance score of the video may further be: for each kind of image similarity, obtain the summation result over all groups of two frames at adjacent preset intervals in the video; the summation results for the individual similarities are then summed again, and the final summation result is used as the anti-shake performance score of the video.
  • a method of weighted summation of multiple image similarities can also be adopted to obtain the anti-shake performance score of the video.
  • Taking an image similarity that includes a brightness similarity obtained from the brightness image parameter and a contrast similarity obtained from the contrast image parameter as an example, each image similarity corresponding to the two frames in each group of adjacent preset intervals in the video can be multiplied by its corresponding weight and summed, and the resulting weighted sum used as the anti-shake performance score of the video.
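The weighted-sum variant above can be sketched as follows: each group of adjacent frames contributes a brightness similarity and a contrast similarity, combined with per-similarity weights. The weight values (0.6 / 0.4) are illustrative assumptions, not values from the disclosure:

```python
def weighted_similarity_score(pair_similarities, w_brightness=0.6,
                              w_contrast=0.4):
    """pair_similarities: list of (brightness_sim, contrast_sim),
    one tuple per group of adjacent frames."""
    return sum(w_brightness * b + w_contrast * c
               for b, c in pair_similarities)

score = weighted_similarity_score([(0.9, 0.8), (1.0, 0.6)])
# 0.6*0.9 + 0.4*0.8 + 0.6*1.0 + 0.4*0.6 = 1.70
```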
  • the improvement brought by anti-shake processing shows up in the comparison between the two frames of each group separated by the preset interval in the video, so the image similarity corresponding to each such pair of frames reflects the actual improvement; accordingly, basing the score on the image similarities of these pairs
  • yields an anti-shake performance score that can serve as a relatively objective evaluation basis, and the evaluation result is more accurate.
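  • The two-level summation described above can be sketched as follows (a minimal illustration; the function name, the dictionary layout, and the sample values are assumptions, not from the patent):

```python
def anti_shake_score(pair_similarities):
    """Sum the image similarities within each pair of frames separated
    by the preset interval, then sum the per-pair results; the final
    sum is used as the video's anti-shake performance score."""
    return sum(sum(pair.values()) for pair in pair_similarities)

# One dict per frame pair; items are e.g. brightness and contrast similarity.
pairs = [{"brightness": 0.9, "contrast": 0.8},
         {"brightness": 0.7, "contrast": 0.6}]
score = anti_shake_score(pairs)  # (0.9 + 0.8) + (0.7 + 0.6) = 3.0
```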
  • suppose the preset interval is 1; for any group of two frames separated by the preset interval in the video, the two frames are recorded as the q-th frame image and the (q-1)-th
  • frame image. Correspondingly, the embodiment of the present invention does not specifically limit the method of obtaining the image similarity between the earlier frame and the later frame of each such pair, including but not limited to the following two methods:
  • the first way to obtain image similarity: obtain the image similarity between the first sub-region in the q-th frame image and the second sub-region in the (q-1)-th frame image, and use it as the image similarity between the q-th frame image and the (q-1)-th
  • frame image; the first sub-region and the second sub-region are divided according to the same division method and are located at the same position in their respective images; or,
  • the second way to obtain image similarity: obtain the image similarity between the third sub-region and the fourth sub-region in each sub-region group, and obtain the image similarity between the q-th frame image and the (q-1)-th frame image according to the image similarities corresponding to the multiple sub-region groups; each sub-region group is composed of a third sub-region in the q-th frame image and a fourth sub-region in the (q-1)-th frame image, the third sub-regions and the fourth sub-regions are obtained according to the same division method, and the third sub-region and the fourth sub-region in each group are at the same position in their respective images.
  • for example, the q-th frame image and the (q-1)-th frame image are each divided into 4 parts in a 2*2 grid according to the same division method; the first sub-region is the upper-left part of the 4 parts of the q-th frame image, and the second sub-region is the upper-left part of the 4 parts of the (q-1)-th frame image.
  • the image similarity between the first sub-region and the second sub-region can then be obtained according to the method of calculating image similarity in the above example.
  • for example, the average luminance value of all pixels in the first sub-region can be obtained first, then the average luminance value of all pixels in the second sub-region, and the difference between the two averages used as the image similarity between the first sub-region and the second sub-region.
  • the upper-right part of the q-th frame image can also be used as the first sub-region, with the upper-right part of the (q-1)-th frame image as the second sub-region;
  • or the lower-left part of the q-th frame image can be used as the first sub-region, with the lower-left part of the (q-1)-th frame image as the second sub-region, so as to obtain the image similarity between the first
  • sub-region and the second sub-region; this is not specifically limited in this embodiment of the present invention.
  • both the qth frame image and the q-1th frame image are divided into 4 parts of 2*2 according to the same division method.
  • the qth frame of image includes 4 third sub-regions
  • the q-1 th frame of image includes 4 fourth sub-regions, and thus 4 sub-region groups can be formed.
  • the third sub-region in the upper left corner of the q-th frame image and the fourth sub-region in the upper left corner of the (q-1)-th frame image can form the first sub-region group, the third sub-region in the upper right corner of the q-th frame image
  • and the fourth sub-region in the upper right corner of the (q-1)-th frame image can form the second sub-region group, the third sub-region in the lower left corner of the q-th frame image and the fourth sub-region in the lower left corner of the (q-1)-th frame image
  • can form the third sub-region group, and the third sub-region in the lower right corner of the q-th frame image and the fourth sub-region in the lower right corner of the (q-1)-th frame image can form the fourth sub-region group.
  • the image similarity corresponding to each sub-region group in the four sub-region groups can be obtained respectively.
  • the image similarity between the qth frame image and the q-1th frame image can be obtained.
  • the embodiment of the present invention does not specifically limit the method of obtaining the image similarity between the q-th frame image and the (q-1)-th frame image according to the image similarities corresponding to the multiple sub-region groups, including but not limited to: summing the image similarities corresponding to the sub-region groups and taking the summation result as the image similarity between the q-th frame image and the (q-1)-th frame image;
  • or dividing the summation result by the number of sub-region groups and taking the average value as the image similarity between the q-th frame image and the (q-1)-th frame image.
  • the improvement brought by anti-shake processing shows up in the comparison between the two frames of each group separated by the preset interval in the video, and the image similarity corresponding to such a pair of frames reflects the actual improvement; therefore, after dividing both frames with the same division method, the image similarity obtained either from a single co-located region or from all the divided regions taken together can serve as a relatively objective evaluation basis, and the evaluation results obtained on this basis are more accurate.
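  • The 2*2 sub-region comparison described above can be sketched as follows (grayscale frames as 2-D arrays; the brightness-difference measure follows the earlier example, and all names are illustrative):

```python
import numpy as np

def split_2x2(img):
    """Divide a grayscale frame into 4 sub-regions on a 2x2 grid."""
    h, w = img.shape
    return [img[:h // 2, :w // 2], img[:h // 2, w // 2:],
            img[h // 2:, :w // 2], img[h // 2:, w // 2:]]

def region_similarity(region_q, region_q1):
    """Difference of average luminance between two co-located regions,
    used as the brightness-based image similarity from the example."""
    return float(abs(region_q.mean() - region_q1.mean()))

def frame_similarity(frame_q, frame_q1):
    """Second method: average the similarities of all four
    sub-region groups formed by the two frames."""
    sims = [region_similarity(a, b)
            for a, b in zip(split_2x2(frame_q), split_2x2(frame_q1))]
    return sum(sims) / len(sims)
```

  • The first method corresponds to calling `region_similarity` on a single pair of co-located regions; the second averages over all four groups.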
  • the embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score of the video according to the image similarities corresponding to the pairs of frames separated by the preset interval,
  • including but not limited to: according to each image similarity corresponding to each pair of frames separated by the preset interval in the video, and the weight corresponding to each image similarity, obtain the similarity score corresponding to each pair; then obtain the anti-shake performance score of the video according to the similarity scores corresponding to all such pairs.
  • the method of obtaining the similarity score corresponding to a pair of frames is not specifically limited in this embodiment of the present invention, including but not limited to the following two ways:
  • the first way to obtain the similarity score: perform a weighted summation of the image similarities corresponding to each pair of frames separated by the preset interval, using the weight corresponding to each image similarity, and use the weighted summation result as the similarity score corresponding to that pair.
  • the second way to obtain the similarity score: take each image similarity corresponding to a pair of frames separated by the preset interval as the base and the weight corresponding to that image similarity as the exponent, obtain the power of each image similarity, and then obtain the similarity score corresponding to the pair from these power results.
  • the embodiment of the present invention does not specifically limit the method of obtaining the similarity score corresponding to a pair of frames from the power results of the image similarities,
  • including but not limited to: summing the power results of the image similarities corresponding to the pair and using the summation result as the similarity score; or multiplying the power results of the image similarities corresponding to the pair and using the product as the similarity score corresponding to the pair.
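  • Both ways of combining the per-item similarities can be sketched as follows (function names and the sample weights are assumptions for illustration):

```python
def weighted_sum_score(sims, weights):
    """First way: multiply each image similarity by its weight
    and sum the products."""
    return sum(s * w for s, w in zip(sims, weights))

def weighted_product_score(sims, weights):
    """Second way: raise each image similarity to the power of its
    weight and multiply the results."""
    score = 1.0
    for s, w in zip(sims, weights):
        score *= s ** w
    return score
```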
  • the first item of image similarity corresponding to the two frames of the t-th group of adjacent preset intervals in the video is recorded as L t,
  • the second item of image similarity corresponding to the two frames of the t-th group of adjacent preset intervals in the video is recorded as C t,
  • and the third item of image similarity corresponding to the two frames of the t-th group of adjacent preset intervals in the video is recorded as S t.
  • the weight corresponding to the first image similarity is denoted as a
  • the weight corresponding to the second image similarity is denoted as b
  • the weight corresponding to the third image similarity is denoted as c.
  • P t represents the similarity score corresponding to the t-th group of adjacent preset intervals of two frames of images.
  • the weight corresponding to each item of image similarity can be set according to actual needs. For example, if there are two image similarities, one calculated based on brightness and the other based on contrast, and the ambient brightness in the video is dark, then the error caused by the dark environment should be minimized; the weight corresponding to the brightness-based image similarity can therefore be appropriately reduced, and the weight corresponding to the contrast-based image similarity appropriately increased.
  • the anti-shake performance score of the video can then be obtained according to the similarity scores corresponding to the pairs of frames separated by the preset interval in the video.
  • the embodiment of the present invention does not specifically limit the method of obtaining the anti-shake performance score of the video according to the similarity scores corresponding to the pairs of frames separated by the preset interval, including but not limited to: obtaining the accumulation result of the similarity scores, where the accumulation result is obtained by accumulating the similarity scores corresponding to the pairs of frames, and using it as the anti-shake performance score.
  • the method provided by the embodiment of the present invention obtains the similarity score of a pair of frames separated by the preset interval from the multiple items of image similarity corresponding to that pair; compared with obtaining the similarity score from a single item of image similarity, the result obtained is more accurate.
  • since the weight of each image similarity can be set according to actual needs, the similarity score can emphasize the more reliable items and reduce the error contributed by the items given low weight;
  • the anti-shake performance score is determined by the similarity scores and weights, which in turn makes the subsequently obtained anti-shake performance score more accurate.
  • the image similarity includes at least one of the following three items of similarity, and the following three items of similarity are brightness similarity, contrast similarity and structure similarity.
  • the brightness similarity corresponding to the two frames of the t-th group of adjacent preset intervals, that is, the brightness similarity between the t-th frame image and the (t-1)-th frame
  • image, can be calculated with reference to the following formula (3):
  • μ_t represents the average brightness value of the t-th frame image,
  • μ_{t-1} represents the average brightness value of the (t-1)-th frame image.
  • μ_t can be calculated by the following formula (4):
  • N represents the total number of pixels in the t-th frame image,
  • i indexes the i-th pixel in the t-th frame image,
  • t_i represents the brightness value of the i-th pixel.
  • σ_t represents the brightness standard deviation of the t-th frame image, that is, the contrast of the t-th frame image,
  • σ_{t-1} represents the contrast of the (t-1)-th frame image,
  • σ_{t,t-1} represents the luminance covariance between the t-th frame image and the (t-1)-th frame image.
  • σ_{t,t-1} can be calculated by the following formula (8):
  • (t-1)_i represents the brightness value of the i-th pixel in the (t-1)-th frame image,
  • μ_{t-1} represents the average brightness value of the (t-1)-th frame image.
  • the method provided by the embodiment of the present invention obtains the similarity score of a pair of frames separated by the preset interval from the brightness similarity, contrast similarity and structure similarity corresponding to that pair; compared with obtaining the similarity score from a single item of image similarity, the result obtained is more accurate, and since the anti-shake performance score is determined by the similarity scores, the subsequently obtained anti-shake performance score is also more accurate.
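  • The three similarity items can be computed from the means, standard deviations and covariance defined above. The sketch below follows the standard SSIM-style terms; the stabilising constants c1 and c2 are a common convention and an assumption here, not values given by the patent:

```python
import numpy as np

def similarity_items(img_t, img_t1, c1=1e-4, c2=9e-4):
    """Return (brightness, contrast, structure) similarity between two
    grayscale frames, built from the per-frame brightness means (mu),
    standard deviations (sigma) and the luminance covariance."""
    mu_t, mu_t1 = img_t.mean(), img_t1.mean()
    sigma_t, sigma_t1 = img_t.std(), img_t1.std()
    cov = ((img_t - mu_t) * (img_t1 - mu_t1)).mean()
    c3 = c2 / 2.0
    brightness = (2 * mu_t * mu_t1 + c1) / (mu_t**2 + mu_t1**2 + c1)
    contrast = (2 * sigma_t * sigma_t1 + c2) / (sigma_t**2 + sigma_t1**2 + c2)
    structure = (cov + c3) / (sigma_t * sigma_t1 + c3)
    return brightness, contrast, structure
```

  • For two identical frames all three items evaluate to 1, matching the intuition that a perfectly stabilised pair of frames is maximally similar.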
  • the video is a single-channel video or a multi-channel video.
  • the single-channel video is a grayscale video
  • the multi-channel video is a color video. It should be noted that, if the video is a grayscale video, the anti-shake performance score of the grayscale video may be obtained directly according to the manner provided in the foregoing embodiment.
  • when the video is a color video,
  • the same-type image similarities can first be obtained per channel according to the method provided in the above embodiment: for a given type of image similarity, obtain it for each pair of frames separated by the preset interval under each channel, then add the same-type similarities of that pair across the channels,
  • and use the summation result as that type of image similarity corresponding to the pair of frames in the video.
  • the method provided by the embodiment of the present invention can be applied to single-channel video or multi-channel video at the same time, so it can be applied to a wider range of scenarios.
  • a delay calibration method comprising the following steps:
  • according to the posture data of the shooting device acquired in the shooting time period corresponding to each of the multiple videos, filter the multiple videos, and form a video group from the filtered videos; wherein the posture data of the shooting device
  • is acquired based on inertial sensors;
  • that the video is shot on the premise that the shooting device is shaken means that the shooting environment of the shooting device may involve shake; for example, the video can be shot during exercise, such as the user shooting handheld while running, riding a mountain bike, or doing other high-frequency sports. Owing to such movement, the shooting device keeps shaking as the user moves, so a video captured during these movements can be considered to be shot on the premise that the shooting device is shaken. It should be noted that, when actually obtaining multiple videos, say n of them, it is not necessary to shoot n times; instead, one video can be shot first, and multiple video segments can then be intercepted from it with a sliding window to obtain the multiple videos.
  • the embodiment of the present invention needs videos with "jitter" and uses them as the evaluation objects for the anti-shake performance score;
  • the more severe the "jitter" in a video, the better it serves as an evaluation object;
  • for this reason, the video is shot on the premise that the shooting device shakes.
  • alternatively, the shooting device need not shoot in an environment with jitter, that is, it can shoot in a general environment, but compared with the former case it is more difficult to obtain a video with severe "jitter" as an evaluation object.
  • the length of the sliding window itself may be set according to requirements, which is not specifically limited in this embodiment of the present invention.
  • the sliding step length of each sliding window can also be set according to requirements, and the sliding step length of each sliding can be the same or different, which is not specifically limited in this embodiment of the present invention.
  • take the case where the length of the sliding window is fixed at 100 frames and the sliding step is fixed at 10 frames as an example:
  • the 1st frame to the 100th frame can be intercepted as the first video;
  • then, sliding forward by 10 frames, the 11th frame to the 110th frame can be intercepted as the second video, and so on, until the required number of videos are intercepted.
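  • The interception of clips with a sliding window can be sketched as follows (1-based inclusive frame ranges; the window length and step follow the example above, and the function name is illustrative):

```python
def sliding_window_clips(total_frames, window=100, step=10):
    """Enumerate (start, end) frame ranges of the clips cut from one
    long video by sliding a fixed-length window with a fixed step."""
    clips = []
    start = 1
    while start + window - 1 <= total_frames:
        clips.append((start, start + window - 1))
        start += step
    return clips
```

  • For a 120-frame video this yields (1, 100), (11, 110) and (21, 120).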
  • the attitude data of the photographing device is used to describe the attitude of the photographing device, which can be expressed in different ways such as attitude angle or quaternion, which is not specifically limited in this embodiment.
  • the acquisition frequency may be consistent or inconsistent with the frame-rate frequency when shooting the video, which is not specifically limited in this embodiment of the present invention.
  • the embodiment of the present invention does not specifically limit the manner of obtaining the pose data of the shooting device, including but not limited to: estimating the attitude of the shooting device through an IMU-based preset algorithm to obtain the attitude data of the shooting device.
  • the preset algorithm can be AKF (Adaptive Kalman Filter, adaptive Kalman filter) algorithm, UKF (Unscented Kalman Filter, unscented Kalman filter), complementary filtering algorithm or other filtering algorithms. This is not specifically limited.
  • the delay value affects the anti-shake performance score because the score is obtained from image-frame parameters, the image-frame parameters are obtained from image frames after anti-shake processing, and anti-shake processing is done by the vision system and the inertial sensors based on the delay value between the two. Therefore, for the clock corresponding to the IMU and the clock corresponding to the vision system, the more accurate the delay value, the more accurate the result of taking one clock as the standard, adding the delay value, and indexing the corresponding data on the other clock. It should also be noted that, taking one of the clocks as reference, the delay value between the other clock and this clock can be positive or negative; for example, relative to the clock corresponding to the vision system, the clock corresponding to the IMU may be slow or fast, so the delay value may be positive or negative.
  • take as an example that, under the clock corresponding to the IMU, the attitude data of the shooting device are acquired at 10 moments: 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09 and 0.10 seconds;
  • and that, under the clock corresponding to the vision system, image frames are captured at the same 10 moments, 0.01 through 0.10 seconds.
  • suppose the clock corresponding to the IMU is 0.03 seconds slower, that is, the delay value between the clock corresponding to the IMU and the clock corresponding to the vision system
  • is -0.03; then the image frame captured by the vision system at 0.04 seconds corresponds to the attitude data of the shooting device acquired by the IMU at 0.01 seconds, and the frame captured at 0.04 seconds
  • will subsequently use the attitude data of the shooting device acquired by the IMU at 0.01 seconds.
  • if, however, the real delay value is -0.01 seconds, that is, the IMU clock is actually only 0.01 seconds slower, then the image frame captured by the vision system at 0.04 seconds should correspond to the attitude data of the shooting device acquired by the IMU at 0.03 seconds,
  • and the attitude data of the shooting device acquired by the IMU at 0.03 seconds should be used.
  • the greater the difference between the estimated delay value and the real delay value, the less able the system is to index the correct attitude data of the shooting device, and thus the greater the error in subsequent electronic anti-shake processing.
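  • The indexing described above can be sketched as follows. The convention assumed here is that a frame timestamp plus the delay value gives the matching IMU timestamp (so a delay of -0.03 s means the IMU clock runs 0.03 s slow); the linear nearest-neighbour search and all names are illustrative:

```python
def index_pose(imu_times, frame_time, delay):
    """Return the index of the IMU attitude sample matching a video
    frame. Under the convention above, the matching IMU timestamp is
    frame_time + delay; the nearest sample is picked."""
    target = frame_time + delay
    return min(range(len(imu_times)),
               key=lambda i: abs(imu_times[i] - target))

imu_times = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
# With delay -0.03, the frame at 0.04 s indexes the 0.01 s sample;
# with delay -0.01, it indexes the 0.03 s sample instead.
```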
  • multiple videos can be screened according to the posture data of the shooting device acquired within the shooting time period corresponding to each video.
  • the screening process can be: calculate the variance of the attitude data of the shooting device acquired in the shooting time period corresponding to each video, sort the videos by variance from large to small, and select the preset number of videos; since a larger variance indicates more unstable data, the videos with more intense jitter are selected as the videos obtained after screening.
  • in this way, the multiple videos are screened according to the posture data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos.
  • before calculating the anti-shake performance score corresponding to a video, the videos can thus be screened to select those with severe jitter as the videos obtained after screening; the more intense the jitter, the higher the requirements on anti-shake processing, the more the anti-shake performance score reflects the real effect of anti-shake processing, and the higher the requirement on the accuracy of the delay value. Therefore, the screened videos are used as the basis for testing the effect of anti-shake processing,
  • and through the process of continuously updating the delay value and obtaining the anti-shake performance score, a more accurate delay value is finally obtained.
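  • The variance-based screening described above can be sketched as follows (pose samples reduced to 1-D arrays for simplicity; names are illustrative):

```python
import numpy as np

def screen_by_variance(pose_per_video, keep):
    """Rank videos by the variance of the pose data acquired during
    each video's shooting period (larger variance = more unstable
    data, i.e. stronger shake) and return the indices of the top
    `keep` videos."""
    variances = [float(np.var(p)) for p in pose_per_video]
    order = sorted(range(len(variances)),
                   key=lambda i: variances[i], reverse=True)
    return order[:keep]
```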
  • although the steps in the flow charts of FIG. 1 and FIG. 2 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in that order. Unless otherwise specified herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in FIG. 1 and FIG. 2 may include multiple sub-steps or stages; these sub-steps or stages are not necessarily executed at the same time and may be executed at different times, and their execution order is not necessarily sequential: they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.
  • the embodiment of the present invention does not specifically limit the manner of filtering the multiple videos according to the posture data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos, including but not limited to: converting the attitude data of the shooting device acquired during the shooting time period corresponding to each video to the frequency domain space to obtain the set of amplitude-frequency characteristic curves corresponding to each video; obtaining the frequency domain score corresponding to each video according to that set; and screening the multiple videos according to the frequency domain score corresponding to each video.
  • the attitude data of the shooting device acquired during the shooting time period corresponding to each video can be a sequence of axis-angle values, that is, a series of discrete values,
  • which form a time-domain curve in the corresponding coordinate system. Through a fast Fourier transform, this curve can be decomposed into multiple sine-wave curves, that is, multiple amplitude-frequency characteristic curves, which together form the set of amplitude-frequency characteristic curves.
  • each amplitude-frequency characteristic curve in the set can be regarded as a point in a coordinate system with frequency as the abscissa and amplitude as the ordinate.
  • the frequency domain score corresponding to each video can be used to represent the intensity of shaking when each video is shot.
  • for example, the maximum frequency and the maximum amplitude can be determined from the frequencies and amplitudes corresponding to the amplitude-frequency characteristic curves in the set, and the product of the two taken as the frequency domain score corresponding to the video;
  • or the frequency average and the amplitude average can be obtained from the frequencies and amplitudes corresponding to the amplitude-frequency characteristic curves in the set, and the product of the two averages used as the frequency domain score corresponding to the video.
  • the reason why the frequency domain score corresponding to the video can be used to indicate the intensity of the jitter during video shooting is because the amplitude can indicate the intensity of the jitter during video shooting.
  • in these schemes, a value associated with the amplitude is used as one multiplication factor and a value associated with the frequency as the other;
  • the frequency domain score obtained by multiplying the two factors can likewise represent the degree of shaking during shooting. After the frequency domain score corresponding to each video is obtained, the multiple videos may be screened according to these scores; specifically, the videos whose frequency domain score is greater than a preset threshold may be screened out.
  • in this way, the attitude data of the shooting device acquired in the shooting time period corresponding to each video is converted into the frequency domain space to obtain the set of amplitude-frequency characteristic curves corresponding to each video,
  • the frequency domain score corresponding to each video is obtained according to that set, and the multiple videos are screened according to the frequency domain score corresponding to each video.
  • before calculating the anti-shake performance score corresponding to a video, the videos can thus be screened to select those with severe jitter as the videos obtained after screening; the more intense the jitter, the higher the requirements on anti-shake processing, the more the anti-shake performance score reflects the real effect of anti-shake processing, and the higher the requirement on the accuracy of the delay value. The videos are therefore screened based on their frequency domain scores and used as the basis for testing the anti-shake processing effect; through the process of continuously updating the delay value and obtaining the anti-shake performance score, the delay value finally obtained is more accurate.
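  • The frequency-domain scoring can be sketched as follows, using the mean-frequency times mean-amplitude variant mentioned above (the sampling rate, the DC-bin handling and all names are assumptions for illustration):

```python
import numpy as np

def frequency_domain_score(pose_samples, sample_rate):
    """FFT the pose-data sequence of one video, read off the amplitude
    at each frequency (the amplitude-frequency characteristic set),
    and score the video as mean frequency times mean amplitude."""
    centered = np.asarray(pose_samples, dtype=float)
    centered = centered - centered.mean()
    amps = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(len(centered), d=1.0 / sample_rate)
    # Skip the DC bin: a constant offset carries no shake information.
    return float(freqs[1:].mean() * amps[1:].mean())
```

  • A strongly shaking video then scores higher than a mildly shaking one, so keeping the highest-scoring videos keeps the most jittery ones.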
  • the embodiment of the present invention does not specifically limit the method of obtaining the frequency domain score corresponding to each video according to the set of amplitude-frequency characteristic curves corresponding to each video, including but not limited to: for the set corresponding to any video, obtaining the frequency domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to that curve, and then obtaining the frequency domain score corresponding to the video from the frequency domain scores corresponding to the curves.
  • for example, the frequency and amplitude corresponding to an amplitude-frequency characteristic curve can be weighted and summed, and the weighted summation result used as the frequency domain score corresponding to that curve.
  • then, for example, the maximum value and the minimum value can be selected from the frequency domain scores corresponding to all the amplitude-frequency characteristic curves, and the average of the two used as the frequency domain score corresponding to the curve set, that is, as the frequency domain score corresponding to the video.
  • in this way, the frequency domain score corresponding to each amplitude-frequency characteristic curve is obtained, and the frequency domain score corresponding to the video is obtained from the scores of the curves.
  • before calculating the anti-shake performance score corresponding to a video, the videos can thus be screened to select those with severe jitter as the videos obtained after screening; the more intense the jitter, the higher the requirements on anti-shake processing, and the more the anti-shake performance score reflects the real effect of anti-shake processing.
  • the embodiment of the present invention does not specifically limit the way of obtaining the frequency domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to each curve in the set,
  • including but not limited to: obtaining the product of the frequency and amplitude corresponding to each amplitude-frequency characteristic curve and using the product as the frequency domain score corresponding to that curve; or obtaining the score corresponding to the frequency of each amplitude-frequency characteristic curve, obtaining the product of that score and the amplitude, and using the product as the frequency domain score corresponding to the curve.
  • the embodiment of the present invention does not specifically limit the way of obtaining the score corresponding to the frequency of each amplitude-frequency characteristic curve, including but not limited to: determining, according to the frequency corresponding to each curve, the number of cycles within a preset time period, and using that number as the score corresponding to each amplitude-frequency characteristic curve.
  • the preset time period may be 1 second, which is not specifically limited in this embodiment of the present invention.
  • the frequency corresponding to each amplitude-frequency characteristic curve is converted into a score because the frequencies of the curves differ; converting them into scores under the same standard guarantees the consistency of the data, so that the frequency domain scores obtained in subsequent calculations are all based on the same calculation standard.
  • The embodiment of the present invention does not specifically limit the method of obtaining the frequency-domain score corresponding to any video according to the frequency-domain score corresponding to each amplitude-frequency characteristic curve, including but not limited to: performing a weighted summation of the frequency-domain scores corresponding to all the curves in the amplitude-frequency characteristic curve set, and using the resulting sum as the frequency-domain score corresponding to the video.
  • Before the anti-shake performance score corresponding to a video is calculated, the videos can be screened so that those with more intense shake are selected as the screened videos. The more intense the shake, the higher the demands on the anti-shake processing, the better the anti-shake performance score reflects the true effect of that processing, and the higher the accuracy required of the delay value. The videos are therefore screened based on their frequency-domain scores, and the screened videos serve as the basis for testing the anti-shake processing effect; through the process of continually updating the delay value and obtaining the anti-shake performance score, the finally obtained delay value will be more accurate.
  • The embodiment of the present invention does not specifically limit the manner of screening multiple videos according to the frequency-domain score corresponding to each video, including but not limited to: sorting the frequency-domain scores corresponding to each of the multiple videos in descending order, selecting a preset number of top-ranked videos, and using them as the videos obtained after screening.
  • The higher a video's frequency-domain score, the more intense the shaking was when the video was shot. Therefore, to select videos shot under more intense shaking, the frequency-domain scores can be sorted in descending order and a preset number of videos filtered out of the sorted results.
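The screening flow described above — score each curve as frequency × amplitude, take a (weighted) sum per video, sort descending, keep the top N — can be sketched as follows. This is a minimal illustration; the function names and the equal default weights are assumptions, not part of the patent.

```python
def curve_score(freq, amp):
    # Per-curve frequency-domain score: product of frequency and amplitude.
    return freq * amp

def video_score(curves, weights=None):
    # Weighted sum of the per-curve scores; equal weights by default.
    if weights is None:
        weights = [1.0] * len(curves)
    return sum(w * curve_score(f, a) for w, (f, a) in zip(weights, curves))

def screen_videos(videos, top_n):
    # videos: list of (video_id, curves) pairs, where curves is a list of
    # (frequency, amplitude) points; keep the top_n highest-scoring videos.
    ranked = sorted(videos, key=lambda v: video_score(v[1]), reverse=True)
    return [vid for vid, _ in ranked[:top_n]]
```

Videos with higher scores (more intense shake) survive the screen and form the video group used in the later scoring loop.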
  • the video can be screened to select the video with severe jitter as the video obtained after screening, and the more intense the jitter, the higher the requirements for anti-shake processing, and the anti-shake performance
  • the score can reflect the real effect of anti-shake processing, the higher the requirement for the accuracy of the delay value, so the video is screened based on the frequency domain score of the video, and based on the above screened video
  • the final delay value obtained will be more accurate.
  • A delay calibration device is provided, including an acquisition module 301 and an update module 302, wherein:
  • the acquisition module 301 is configured to acquire a video group, the video group including at least one video;
  • the update module 302 is used to update the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtain the anti-shake performance score corresponding to the video group; the above process of updating the delay value and obtaining the anti-shake performance score is repeated until the obtained score satisfies a preset condition, and the delay value corresponding to the score that satisfies the preset condition is then obtained;
  • wherein the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
  • the acquisition module 301 includes:
  • The acquisition sub-module is used to acquire multiple videos, the videos being shot while the shooting device is shaking;
  • The screening sub-module is used to screen the multiple videos according to the attitude data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos, and to form a video group from the screened videos; wherein the attitude data of the shooting device is acquired by the inertial sensor.
  • the screening submodule includes:
  • a conversion unit configured to convert the attitude data of the shooting device acquired during the shooting time period corresponding to each video into a frequency domain space, so as to obtain a set of amplitude-frequency characteristic curves corresponding to each video;
  • An acquisition unit configured to acquire a frequency-domain score corresponding to each video according to a set of amplitude-frequency characteristic curves corresponding to each video;
  • the filtering unit is configured to filter multiple videos according to the frequency domain score corresponding to each video.
  • the acquisition unit includes:
  • The first acquisition subunit is used to, for the set of amplitude-frequency characteristic curves corresponding to any video, obtain the frequency-domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to each curve in the set;
  • the second acquiring subunit is configured to acquire the frequency domain score corresponding to the video according to the frequency domain score corresponding to each amplitude-frequency characteristic curve.
  • The first acquisition subunit is configured to obtain the product of the frequency and the amplitude corresponding to each amplitude-frequency characteristic curve and use the product as that curve's frequency-domain score; or to obtain the score corresponding to the frequency of each curve, obtain the product of that score and the amplitude, and use the product as that curve's frequency-domain score.
  • The second acquisition subunit is configured to perform a weighted summation of the frequency-domain scores corresponding to all the amplitude-frequency characteristic curves in the set, and to use the resulting sum as the frequency-domain score corresponding to the video.
  • the screening unit is configured to sort the frequency domain scores corresponding to each of the plurality of videos in descending order, select a preset number of videos, and use them as the videos obtained after screening.
  • Each module in the above-mentioned delay calibration device can be realized fully or partially by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure may be as shown in FIG. 4 .
  • The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus, wherein the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • The communication interface of the computer device is used for wired or wireless communication with external terminals; the wireless mode can be realized through Wi-Fi, a carrier network, NFC (Near Field Communication), or other technologies.
  • When the computer program is executed by the processor, an image processing method is implemented.
  • The display screen of the computer device may be a liquid crystal display or an electronic ink display.
  • The input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the casing of the computer device, or an external keyboard, touchpad, or mouse.
  • FIG. 4 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution is applied.
  • A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
  • A computer device is provided, including a memory and a processor, the memory storing a computer program; when executing the computer program, the processor implements the steps of the delay calibration method described above:
  • the inertial sensor and the visual system are coupled on the same shooting device, each video in the video group is obtained based on the visual system, and the anti-shake processing is through the visual system and the inertial sensor, and based on the delay value between the two Completed, the anti-shake performance score is used to evaluate the anti-shake effect of the video after anti-shake processing.
  • When the processor executes the computer program, the following steps are also implemented: the multiple videos are screened according to the attitude data of the shooting device acquired within the shooting time period corresponding to each video, and the screened videos form a video group; wherein the attitude data of the shooting device is acquired by the inertial sensor.
  • When the processor executes the computer program, the following steps are also implemented: for the set of amplitude-frequency characteristic curves corresponding to any video, obtaining the frequency-domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to each curve in the set; and obtaining the frequency-domain score corresponding to the video according to the frequency-domain score of each curve.
  • When the processor executes the computer program, the following steps are also implemented: obtaining the product of the frequency and the amplitude corresponding to each amplitude-frequency characteristic curve, and using the product as that curve's frequency-domain score; or obtaining the score corresponding to the frequency of each curve, obtaining the product of that score and the amplitude, and using the product as that curve's frequency-domain score.
  • When the processor executes the computer program, the following steps are also implemented: performing a weighted summation of the frequency-domain scores corresponding to all the amplitude-frequency characteristic curves in the set, and using the resulting sum as the frequency-domain score corresponding to the video.
  • When the processor executes the computer program, the following steps are also implemented: sorting the frequency-domain scores corresponding to each of the multiple videos in descending order, selecting a preset number of top-ranked videos, and using them as the videos obtained after screening.
  • A computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the following steps are implemented:
  • the inertial sensor and the visual system are coupled on the same shooting device, each video in the video group is obtained based on the visual system, and the anti-shake processing is through the visual system and the inertial sensor, and based on the delay value between the two Completed, the anti-shake performance score is used to evaluate the anti-shake effect of the video after anti-shake processing.
  • When the computer program is executed by the processor, the following steps are also implemented: the multiple videos are screened according to the attitude data of the shooting device acquired within the shooting time period corresponding to each video, and the screened videos form a video group; wherein the attitude data of the shooting device is acquired by the inertial sensor.
  • When the computer program is executed by the processor, the following steps are also implemented: for the set of amplitude-frequency characteristic curves corresponding to any video, the frequency-domain score corresponding to each amplitude-frequency characteristic curve is obtained according to the frequency and amplitude of each curve; and the frequency-domain score corresponding to the video is obtained according to the per-curve frequency-domain scores.
  • the score corresponding to the frequency of each amplitude-frequency characteristic curve is obtained, and the product of the score corresponding to each amplitude-frequency characteristic curve and the amplitude is obtained, and the product is used as a frequency domain score corresponding to each amplitude-frequency characteristic curve.
  • the frequency domain scores corresponding to each of the plurality of videos are sorted from large to small, and a preset number of videos are selected, and used as the obtained videos after screening.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).


Abstract

This application relates to a delay calibration method and apparatus, a computer device, and a storage medium. The method includes: acquiring a video group, the video group including at least one video; updating the delay value between an inertial sensor and a vision system and, based on the updated delay value, obtaining the anti-shake performance score corresponding to the video group; and repeating the above process of updating the delay value and obtaining the anti-shake performance score until the obtained anti-shake performance score satisfies a preset condition, at which point the delay value corresponding to that score is obtained. Since there is no need for the IMU and the vision system to each estimate a set of motions and then estimate the delay between them by minimizing the error between the two motion sets as a cost value, the errors inherent in the two motion estimates themselves can be avoided, and the accuracy of the delay calibration can thus be improved.

Description

Delay Calibration Method and Apparatus, Computer Device, and Storage Medium — Technical Field
This application relates to the technical field of image processing, and in particular to a delay calibration method and apparatus, a computer device, and a storage medium.
Background Art
At present, the camera pose is usually computed from the shake signal detected by an IMU (Inertial Measurement Unit), and the images captured by the vision system are then compensated according to the computed camera pose to achieve electronic image stabilization. There is a delay between the moment the vision system captures an image and the moment the IMU detects the shake signal: for example, the vision system may capture a certain frame while the shake detected by the IMU corresponds to the moment of the previous frame, yet the system may consider the two to be matched at the same moment. In other words, it is difficult for the vision system to capture an image at exactly the instant the IMU detects the shake. In view of this delay, practical applications require the delay between the IMU and the vision system to be calibrated; that is, for the clock of the IMU and the clock of the vision system, the time offset of one clock relative to the other must be determined.
Technical Problem
In the related art, two sets of motions are generally estimated separately by the IMU and the vision system, and a nonlinear optimization algorithm then takes the error between the two motion sets as a cost value and minimizes it to estimate the delay between the two. Because the two motion estimates themselves contain errors, the delay estimated by this method has low accuracy and cannot meet the demand for a high-precision delay. In addition, if the two motion sets contain periodically repeating motion, this method may also produce incorrect estimates.
Summary of the Invention
On this basis, it is necessary, in view of the above technical problems, to provide a delay calibration method and apparatus, a computer device, and a storage medium capable of accurately calibrating the delay between the IMU and the vision system.
A delay calibration method, the method including:
acquiring a video group, the video group including at least one video;
updating the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtaining the anti-shake performance score corresponding to the video group; repeating the above process of updating the delay value and obtaining the anti-shake performance score until the obtained anti-shake performance score satisfies a preset condition, and then obtaining the delay value corresponding to the anti-shake performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
In one embodiment, acquiring the video group includes:
acquiring multiple videos, the videos being shot while the shooting device is shaking;
screening the multiple videos according to the attitude data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos, and forming a video group from the screened videos; wherein the attitude data of the shooting device is acquired by the inertial sensor.
In one embodiment, screening the multiple videos according to the attitude data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos includes:
converting the attitude data of the shooting device acquired within the shooting time period corresponding to each video into the frequency domain, to obtain the set of amplitude-frequency characteristic curves corresponding to each video;
obtaining the frequency-domain score corresponding to each video according to the set of amplitude-frequency characteristic curves corresponding to each video;
screening the multiple videos according to the frequency-domain score corresponding to each video.
In one embodiment, obtaining the frequency-domain score corresponding to each video according to the set of amplitude-frequency characteristic curves corresponding to each video includes:
for the set of amplitude-frequency characteristic curves corresponding to any video, obtaining the frequency-domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to each curve in the set;
obtaining the frequency-domain score corresponding to the video according to the frequency-domain score corresponding to each amplitude-frequency characteristic curve.
In one embodiment, obtaining the frequency-domain score corresponding to each amplitude-frequency characteristic curve according to the frequency and amplitude corresponding to each curve in the set includes:
obtaining the product of the frequency and the amplitude corresponding to each amplitude-frequency characteristic curve, and using the product as the frequency-domain score corresponding to that curve; or,
obtaining the score corresponding to the frequency of each amplitude-frequency characteristic curve, obtaining the product of that score and the amplitude, and using the product as the frequency-domain score corresponding to that curve.
In one embodiment, obtaining the frequency-domain score corresponding to any video according to the frequency-domain score corresponding to each amplitude-frequency characteristic curve includes:
performing a weighted summation of the frequency-domain scores corresponding to all the amplitude-frequency characteristic curves in the set, and using the resulting sum as the frequency-domain score corresponding to the video.
In one embodiment, screening the multiple videos according to the frequency-domain score corresponding to each video includes:
sorting the frequency-domain scores corresponding to each of the multiple videos in descending order, selecting a preset number of top-ranked videos, and using them as the videos obtained after screening. A video is a single-channel video or a multi-channel video.
A delay calibration apparatus, the apparatus including:
an acquisition module, configured to acquire a video group, the video group including at least one video;
an update module, configured to update the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtain the anti-shake performance score corresponding to the video group; the above process of updating the delay value and obtaining the anti-shake performance score is repeated until the obtained anti-shake performance score satisfies a preset condition, and the delay value corresponding to the anti-shake performance score that satisfies the preset condition is then obtained;
wherein the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
A computer device, including a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a video group, the video group including at least one video;
updating the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtaining the anti-shake performance score corresponding to the video group; repeating the above process of updating the delay value and obtaining the anti-shake performance score until the obtained anti-shake performance score satisfies a preset condition, and then obtaining the delay value corresponding to the anti-shake performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
A computer-readable storage medium, on which a computer program is stored, the computer program implementing the following steps when executed by a processor:
acquiring a video group, the video group including at least one video;
updating the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtaining the anti-shake performance score corresponding to the video group; repeating the above process of updating the delay value and obtaining the anti-shake performance score until the obtained anti-shake performance score satisfies a preset condition, and then obtaining the delay value corresponding to the anti-shake performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing.
Technical Effect
With the above delay calibration method and apparatus, computer device, and storage medium, a video group is acquired, the delay value between the inertial sensor and the vision system is updated and, based on the updated delay value, the anti-shake performance score corresponding to the video group is obtained; the above process of updating the delay value and obtaining the anti-shake performance score is repeated until the obtained anti-shake performance score satisfies a preset condition, and the delay value corresponding to that score is then obtained. Since there is no need for the IMU and the vision system to each estimate a set of motions and then estimate the delay between them by minimizing the error between the two motion sets as a cost value, the errors inherent in the two motion estimates themselves can be avoided, and the accuracy of the delay calibration can thus be improved.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a delay calibration method in one embodiment;
FIG. 2 is a schematic flowchart of a delay calibration method in another embodiment;
FIG. 3 is a structural block diagram of a delay calibration apparatus in one embodiment;
FIG. 4 is a diagram of the internal structure of a computer device in one embodiment.
Embodiments of the Invention
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain this application, not to limit it.
It can be understood that the terms "first", "second", and the like used in this application may be used herein to describe various technical terms, but unless otherwise specified, those technical terms are not limited by these words, which serve only to distinguish one technical term from another. For example, without departing from the scope of this application, a third preset threshold and a fourth preset threshold may be the same or different.
At present, mobile terminals perform better and better at taking photos and videos and have gradually replaced traditional compact cameras; more and more mobile terminals cover ultra-wide-angle, telephoto, and portrait scenarios through multi-camera combinations to deliver a better imaging experience. Among these, one topic cannot be avoided: image stabilization. Stabilization is not only applied to video; excellent stabilization in still photography allows a longer safe shutter speed and raises the keeper rate, so a good stabilization effect is a goal pursued by many mobile terminal manufacturers.
Electronic image stabilization emerged from this demand. EIS (Electronic Image Stabilization) works mainly after the image is captured: based on the small shakes detected by sensors inside the shooting device during capture, the signal corresponding to those small shakes is used to compensate with the image at the edges, thereby overcoming the image blur caused by the shaking of the shooting device. In the related art, the sensor mainly used is the IMU. Accordingly, when implementing electronic image stabilization, the camera pose is first computed from the shake signal detected by the IMU, and the image captured by the vision system is then compensated according to the computed camera pose.
There is a delay between the moment the vision system captures an image and the moment the IMU detects the shake signal: for example, the vision system may capture a certain frame while the shake detected by the IMU corresponds to the moment of the previous frame, yet the system may consider the two to be matched at the same moment. In other words, it is difficult for the vision system to capture an image at exactly the instant the IMU detects the shake. In view of this delay, practical applications require the delay between the IMU and the vision system to be calibrated; that is, for the clock of the IMU and the clock of the vision system, the time offset of one clock relative to the other must be determined. In the related art, two sets of motions are generally estimated separately by the IMU and the vision system, and a nonlinear optimization algorithm then takes the error between the two motion sets as a cost value and minimizes it to estimate the delay between the two. Because the two motion estimates themselves contain errors, the delay estimated by this method has low accuracy and cannot meet the demand for a high-precision delay. In addition, if the two motion sets contain periodically repeating motion, this method may also produce incorrect estimates.
In view of the problems in the above related art, an embodiment of the present invention provides a delay calibration method. The method can be applied to a terminal, which may be, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. It can be understood that the delay calibration method can also be applied to a server, with the server as the executing entity; or, according to actual needs and feasibility, the method can be applied to both the terminal and the server, i.e., some of its steps are executed by the terminal while others are executed by the server, which the embodiment of the present invention does not specifically limit. For example, step 101 of the method flow corresponding to FIG. 1 may be executed by the terminal, which then sends the video group to the server, so that step 102 is executed by the server; after obtaining the delay value between the IMU and the vision system, the server may send it back to the terminal. It should be noted that quantities such as "multiple" mentioned in the embodiments of this application all refer to "at least two"; for example, "multiple" means "at least two".
Before the specific implementations of this application are described, its main application scenario is explained. The delay calibration method in this application is mainly used to calibrate the delay value between the IMU and the vision system, so that the IMU and the vision system can subsequently implement electronic image stabilization based on that delay value. In combination with the content of the above embodiments, in one embodiment, referring to FIG. 1, a delay calibration method is provided. Taking the method as applied to a terminal, with the terminal as the executing entity, as an example, the method includes the following steps:
101. Acquire a video group, the video group including at least one video.
102. Update the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtain the anti-shake performance score corresponding to the video group; repeat the above process of updating the delay value and obtaining the anti-shake performance score until the obtained anti-shake performance score satisfies a preset condition, and then obtain the delay value corresponding to the anti-shake performance score that satisfies the preset condition.
Here, the inertial sensor and the vision system are coupled on the same shooting device; each video in the video group is acquired by the vision system; the anti-shake processing is performed by the vision system and the inertial sensor based on the delay value between the two; and the anti-shake performance score is used to evaluate the anti-shake effect of a video after anti-shake processing. The inertial sensor and the vision system need to be coupled on the same shooting device because the embodiment of the present invention calibrates the delay value between them mainly according to the imaging quality of the vision system: the inertial sensor must capture the shake of the shooting device, while the vision system must capture images while the device is shaking, from which the imaging quality is subsequently determined. To satisfy this premise, the two must be coupled on the same shooting device.
In step 101 above, the video group may include only one video or multiple videos, which the embodiment of the present invention does not specifically limit. In step 102 above, the anti-shake performance score corresponding to the video group is obtained from the anti-shake performance score of each video in the group. The embodiment of the present invention does not specifically limit the way this is done, including but not limited to: adding up the anti-shake performance scores of all the videos in the group and using the sum as the group's score; or adding them up, averaging the sum, and using the average as the group's score.
In addition, in step 102, the delay value may have an initial value, e.g. 0. Updating the delay value for the first time may mean updating this initial value. Of course, in actual implementation, the first anti-shake performance score of the video group may be obtained without updating the delay value, i.e. directly from the initial value rather than from an updated value; the embodiment of the present invention does not specifically limit this.
The delay value may be updated in the increasing direction or in the decreasing direction, which the embodiment of the present invention does not specifically limit. For example, updating in the increasing direction, a previous delay value of 0.2 seconds may be increased to 0.3 seconds after the update; updating in the decreasing direction, a previous delay value of 0.3 seconds may be decreased to 0.2 seconds.
In step 102 above, the preset condition can be set according to requirements. For example, the anti-shake performance score that satisfies the preset condition in step 102 is in fact the score obtained after the last update of the delay value, i.e. the last score obtained. On this basis, the preset condition may be that the difference between the last obtained score and the previously obtained score is smaller than a first preset threshold; in this case, the delay value corresponding to the score that satisfies the condition may be the delay value corresponding to the last obtained score. Alternatively, the preset condition may be that the last obtained score is greater than a second preset threshold; in this case too, the delay value corresponding to the score that satisfies the condition may be the delay value corresponding to the last obtained score.
Alternatively again, considering that as the updated delay value gradually approaches its true value, the anti-shake performance score may keep improving while the magnitude of improvement gradually shrinks, the preset condition may, based on this principle, be that the scores obtained in n consecutive rounds are all greater than a third preset threshold and that the difference between every two adjacent scores among those n consecutive scores is smaller than a fourth preset threshold, where n is a positive integer not less than 2. In this case, the delay value corresponding to the score that satisfies the condition may be the delay value corresponding to the last obtained score. Of course, the preset condition may also be something else in actual implementation, which the embodiment of the present invention does not specifically limit. It should be noted that the first to fourth preset thresholds can all be obtained from actual measurement or experience, which the embodiment of the present invention does not specifically limit. In addition, the IMU may include an accelerometer and a gyroscope, which is likewise not specifically limited.
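A minimal sketch of the update-and-score loop described above, assuming a caller-supplied `score_fn(delay)` (hypothetical) that returns the anti-shake performance score for a candidate delay, and using the first kind of preset condition (score improvement below a threshold) as the stopping rule. The hill-climbing update direction is one possible choice, not the patent's prescribed method.

```python
def calibrate_delay(score_fn, init_delay=0.0, step=0.01, eps=1e-6, max_iter=1000):
    """Hill-climb the delay value: try one step up and one step down, move to
    whichever candidate improves the anti-shake performance score, and stop
    once the gain between consecutive scores drops below eps."""
    delay = init_delay
    best = score_fn(delay)
    for _ in range(max_iter):
        candidates = [delay + step, delay - step]
        cand_scores = [score_fn(c) for c in candidates]
        new_best = max(cand_scores)
        if new_best - best < eps:
            break  # score no longer improving: preset condition met
        delay = candidates[cand_scores.index(new_best)]
        best = new_best
    return delay, best
```

Because the score may move in either direction relative to the true delay, the loop probes both an increased and a decreased candidate at each round, matching the text's note that the delay value can be updated upward or downward.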
In the method provided by the embodiment of the present invention, a video group is acquired, the delay value between the inertial sensor and the vision system is updated and, based on the updated delay value, the anti-shake performance score corresponding to the video group is obtained; this process of updating the delay value and obtaining the score is repeated until the obtained score satisfies a preset condition, at which point the corresponding delay value is obtained. Since there is no need for the IMU and the vision system to each estimate a set of motions and then estimate the delay between them by minimizing the inter-motion error as a cost value, the errors inherent in the two motion estimates can be avoided, improving the accuracy of the delay calibration.
In combination with the content of the above embodiments, in one embodiment, for any video in the video group, the embodiment of the present invention does not specifically limit the way its anti-shake performance score is obtained, including but not limited to: obtaining the video's anti-shake performance score according to the image frame parameters corresponding to the video.
The image frame parameters may include the degree of difference and/or similarity between image frames and may be computed from the image parameters of the frames in the video. The image parameters may include luminance and/or contrast, etc., which the embodiment of the present invention does not specifically limit. Taking luminance as the image parameter, the image frame parameters may include the similarity and/or difference in luminance between frames; taking contrast, they may include the similarity and/or difference in contrast; taking both luminance and contrast, they may include the similarity and/or difference of each. The difference can be computed as a numerical difference, and the similarity by a similarity algorithm. For example, the luminance difference between two frames can be obtained by computing the difference of their luminance values; the luminance similarity between two frames can be computed by a similarity algorithm, e.g. for the luminance feature vectors of the two frames, the similarity between the two vectors can be computed and used as the luminance similarity between the frames.
From the above, the image frame parameters are mainly used to represent the difference and/or similarity between image frames in the video. Which frames are compared can be set according to requirements and is not specifically limited in the embodiment of the present invention. For example, the image frame parameters may consist only of the difference and/or similarity between the first frame and the middle frame, or only of that between the middle frame and the last frame, or of both together.
It should be noted that a video is composed of successive image frames; when the video is shot by a shooting device in motion, shake causes slight deformations in the image parameters between frames. These deformations combine and show up in the visual effect, possibly yielding poor results such as shake-induced motion blur, while anti-shake processing tries to eliminate them to improve the result. From a data-processing perspective, these deformations show up in the computed results of the image parameters between frames, i.e. in the image frame parameters. The image frame parameters are therefore an external quantification of the visual effect of a video after anti-shake processing and can represent how good its anti-shake performance is, so they can be used to evaluate that performance.
In addition, in combination with the above examples, the embodiment of the present invention does not specifically limit how the terminal obtains the anti-shake performance score of a video according to its image frame parameters. Based on what the image frame parameters contain, the score can be obtained in the following ways:
(1) The image frame parameters include differences between image frames.
As the above examples show, which frames' differences are used can be set according to requirements. Whichever frames are used, each difference is in fact determined by a group of two frames in the video and is the difference between those two frames. The image frame parameters may therefore include several differences, each determined by some group of two frames, where "several" may mean one or more. Accordingly, if the image frame parameters contain one difference, that difference can be used directly as the video's anti-shake performance score; if they contain multiple differences, the differences can be averaged and the average used as the score.
(2) The image frame parameters include similarities between image frames.
Similarly to case (1), each similarity is determined by a group of two frames in the video, so the image frame parameters may include several similarities, where "several" may mean one or more. Accordingly, if the image frame parameters contain one similarity, it can be used directly as the anti-shake performance score; if they contain multiple similarities, they can be averaged and the average used as the score.
(3) The image frame parameters include both similarities and differences between image frames.
As in cases (1) and (2), each similarity or difference is determined by a group of two frames. The image frame parameters may therefore include several similarities and several differences, where "several" may mean one or more. Accordingly, when obtaining the score, the differences can first be averaged to get a mean difference, and the similarities averaged to get a mean similarity; a weighted sum of the mean difference and the mean similarity is then used as the anti-shake performance score. If "several" is in fact one, no average is needed and the single similarity or difference is used directly in the weighted sum.
For example, taking image frame parameters that include the difference between the first and last frames, that difference can be used directly as the score. Taking parameters that include the difference between the first and middle frames and the difference between the middle and last frames, the two differences can be averaged and the average used as the score. Taking parameters that include both the difference and the similarity between the first and middle frames, weights can first be set for the difference and the similarity according to how important each is for making the video look better, and their weighted sum used as the score.
In the method provided by the embodiment of the present invention, a video formed by anti-shake processing is acquired, and its anti-shake performance score is obtained from its image frame parameters. Since the score is a relatively objective evaluation basis derived from the image frame parameters, it is more accurate as an evaluation result than the human visual system. Moreover, since the score is obtained directly from the image frame parameters to evaluate the anti-shake effect, without spending a long time judging the effect by visual impression, it takes less time and is more efficient.
In combination with the content of the above embodiments, in one embodiment, the image frame parameters include image similarity. Accordingly, the embodiment of the present invention does not specifically limit how the video's anti-shake performance score is obtained from the image frame parameters, including but not limited to: for each group of two frames separated by a preset interval in the video, obtaining the image similarity between the earlier and the later frame of the group as that group's image similarity; and obtaining the video's anti-shake performance score from the image similarity of every such group.
In the above process, the preset interval can be denoted m, meaning an interval of m frames. Specifically, m may be 1 or 2, but cannot exceed the total number of frames minus 1. m should also not be too large: if it is, the total number of image similarities becomes too small, making the subsequent anti-shake performance score insufficiently accurate. For these reasons, and for ease of explanation, the embodiment of the present invention takes a preset interval of 1 as an example in what follows.
Suppose the video contains m frames in total, frame 1, frame 2, ..., frame m. With a preset interval of 1, the groups of two frames mentioned above are: frames 1 and 2 as one group of adjacent frames, frames 2 and 3 as one group, frames 3 and 4 as one group, ..., up to frames m-1 and m as one group, forming m-1 groups in total. The image similarity of each group can be computed with reference to the definitions of image similarity in the above examples.
After the image similarity of each group is obtained, the video's anti-shake performance score can be obtained from them. The embodiment of the present invention does not specifically limit how, including but not limited to: summing the similarities of all groups and using the sum as the score; or, further, averaging the sum over the total number of groups and using the average as the score.
Alternatively, if more than one kind of image similarity is computed, the video's anti-shake performance score can be obtained from the multiple kinds. For instance, as explained in the above examples, image similarity is computed from the image parameters between adjacent frames, and the image parameters may include luminance and/or contrast. Taking luminance and contrast, the image similarity may accordingly include two items: one computed with luminance as the image parameter, denoted the luminance similarity, and one computed with contrast, denoted the contrast similarity.
On this basis, obtaining the score from the per-group image similarities may further be: summing each kind of similarity over all groups, summing those per-kind sums, and using the final sum as the score. Besides this, for multiple kinds of similarity, a weighted summation can also be used. For example, with the similarity including a luminance-based result and a contrast-based result, a weighted sum can be taken over each kind of similarity of every group using per-kind weights, and the weighted result used as the video's anti-shake performance score.
In the method provided by the embodiment of the present invention, since shooting shake is continuous, the improvement brought by anti-shake processing shows up in the comparison between the two frames of every group separated by the preset interval, and each group's image similarity can reflect the actual improvement. The anti-shake performance score obtained from these per-group similarities can therefore serve as a relatively objective evaluation basis, making the evaluation result more accurate.
In combination with the content of the above embodiments, in one embodiment, the preset interval is 1; for any group of two adjacent frames in the video, denote the two frames frame q and frame q-1. Accordingly, the embodiment of the present invention does not specifically limit how the image similarity between the earlier and later frames of each group is obtained, including but not limited to the following two ways:
First way of obtaining the image similarity: obtain the image similarity between a first sub-region of frame q and a second sub-region of frame q-1 and use it as the image similarity between frames q and q-1, where the first and second sub-regions are divided in the same way and occupy the same position in their respective frames; or,
Second way of obtaining the image similarity: obtain the image similarity between the third and fourth sub-regions of each sub-region group, and obtain the image similarity between frames q and q-1 from the similarities of the multiple sub-region groups; each sub-region group consists of a third sub-region of frame q and a fourth sub-region of frame q-1, the third and fourth sub-regions being obtained by the same division and occupying the same position in their respective frames.
In the first way above, suppose frames q and q-1 are both divided in the same way into 2×2 = 4 parts, the first sub-region being the upper-left part of frame q and the second sub-region the upper-left part of frame q-1; the image similarity between the two sub-regions can then be obtained as in the above examples. For instance, the average luminance of all pixels in the first sub-region and that of the second sub-region can be obtained, and the difference between the two averages used as the image similarity between the two sub-regions.
Of course, among the 4 parts formed by this division, the upper-right part of frame q-1 may also serve as the first sub-region with the upper-right part of frame q as the second; likewise, the lower-left part of frame q-1 may serve as the first sub-region with the lower-left part of frame q as the second, to obtain the similarity between them, which the embodiment of the present invention does not specifically limit.
In the second way above, take frames q and q-1 both divided in the same way into 2×2 = 4 parts as an example. Accordingly, frame q contains 4 third sub-regions and frame q-1 contains 4 fourth sub-regions, forming 4 sub-region groups.
Specifically, the upper-left third sub-region of frame q and the upper-left fourth sub-region of frame q-1 form the first sub-region group; the upper-right regions form the second group; the lower-left regions form the third group; and the lower-right regions form the fourth group.
With the same similarity computation as in the above examples, the image similarity of each of the four sub-region groups can be obtained. From the similarities of the multiple groups, the image similarity between frames q and q-1 can then be obtained. The embodiment of the present invention does not specifically limit how, including but not limited to: using the summation result as the similarity between frames q and q-1; or averaging the summation result over the number of sub-region groups and using the average as the similarity, where the summation result is obtained by adding up the similarities of all sub-region groups. It should be noted that the above examples describe the implementation for a preset interval of 1; for other values of the interval, the process in the above examples can likewise be referred to, and is not repeated here.
In the method provided by the embodiment of the present invention, since shooting shake is continuous, the improvement from anti-shake processing shows up in the comparison between the two frames of every group separated by the preset interval, and each group's image similarity can reflect the actual improvement. Thus, for a group of two frames divided in the same way, the image similarity obtained from one co-located region, or from all regions considered globally, can serve as a relatively objective evaluation basis, making the evaluation result more accurate.
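The second sub-region approach above (divide both frames into a 2×2 grid, score each pair of co-located blocks, and average) can be sketched as follows. The per-block similarity used here, an inverse of the mean-luminance difference, is a hypothetical stand-in; the patent leaves the concrete similarity measure open.

```python
def split_2x2(frame):
    # frame: 2-D list (rows of luminance values); returns four equal blocks
    # in the order upper-left, upper-right, lower-left, lower-right.
    h, w = len(frame), len(frame[0])
    hh, hw = h // 2, w // 2
    return [[row[c:c + hw] for row in frame[r:r + hh]]
            for r in (0, hh) for c in (0, hw)]

def block_mean(block):
    vals = [v for row in block for v in row]
    return sum(vals) / len(vals)

def frame_similarity(frame_q, frame_q1):
    """Pair up co-located blocks of the two frames, score each pair, and
    average the per-pair scores (here: 1 / (1 + |mean luminance diff|),
    a hypothetical similarity where higher means more similar)."""
    pairs = zip(split_2x2(frame_q), split_2x2(frame_q1))
    scores = [1.0 / (1.0 + abs(block_mean(a) - block_mean(b))) for a, b in pairs]
    return sum(scores) / len(scores)
```

Identical frames yield a similarity of 1.0; the score falls as the co-located blocks drift apart in mean luminance.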
In combination with the content of the above embodiments, in one embodiment, the embodiment of the present invention does not specifically limit how the video's anti-shake performance score is obtained from the per-group image similarities, including but not limited to: obtaining each group's similarity score from each kind of image similarity of the group and the weight of each kind; and obtaining the video's anti-shake performance score from the similarity scores of all groups.
As to how each group's similarity score is obtained from its per-kind image similarities and their weights, the embodiment of the present invention likewise does not specifically limit this, including but not limited to the following two ways:
First way of obtaining the similarity score: take the weighted sum of each kind of image similarity of the group using the per-kind weights, and use the weighted sum as the group's similarity score.
Second way of obtaining the similarity score: use each kind of image similarity of the group as a base and its weight as an exponent to obtain the power of each kind of similarity, and obtain the group's similarity score from these powers.
The embodiment of the present invention does not specifically limit how the group's similarity score is obtained from the powers, including but not limited to: summing the powers and using the sum as the group's similarity score; or multiplying the powers and using the product as the group's similarity score.
For example, with three kinds of image similarity, denote the first kind for the (t-1)-th group of adjacent frames L_t, the second kind C_t, and the third kind S_t, and denote their weights a, b, and c respectively.
The first way of obtaining the similarity score can be computed with reference to the following formula (1):
P_t = a·L_t + b·C_t + c·S_t; (1)
For the second way, if the group's similarity score is obtained by multiplying the powers, it can be computed with reference to the following formula (2):
P_t = L_t^a · C_t^b · S_t^c; (2)
In formulas (1) and (2), P_t denotes the similarity score of the t-th group of adjacent frames; L_t^a, C_t^b, and S_t^c denote the powers of the first, second, and third kinds of image similarity of the group, respectively.
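Formulas (1) and (2) can be written directly in code; the function names are illustrative only.

```python
def score_weighted_sum(L, C, S, a, b, c):
    # Formula (1): weighted sum of the three image similarities.
    return a * L + b * C + c * S

def score_weighted_product(L, C, S, a, b, c):
    # Formula (2): product of the similarities raised to their weights.
    return (L ** a) * (C ** b) * (S ** c)
```

The product form penalizes a single poor similarity more strongly than the sum form, which is one reason to choose between them depending on how strict the evaluation should be.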
It should be noted that in the above two ways of obtaining the similarity score, the weight of each kind of image similarity can be set according to actual requirements. For example, if there are two kinds of image similarity, one computed from luminance and the other from contrast, and the ambient luminance in the video is dark, then the error brought by the dark environment should be reduced as much as possible for these two kinds: the weight of the luminance-based similarity can be reduced appropriately, and the weight of the contrast-based similarity raised appropriately.
After the similarity score of each group of adjacent frames is obtained, the video's anti-shake performance score can be obtained from them. The embodiment of the present invention does not specifically limit how, including but not limited to: obtaining the accumulation result of the similarity scores, the accumulation result being obtained by adding up the similarity scores of all groups in the video.
In the method provided by the embodiment of the present invention, since the similarity score between two adjacent frames can be obtained from every kind of image similarity of the pair, the result is more accurate than a score obtained from a single kind of similarity. Moreover, since the weight of each kind can be set according to actual requirements, the score computation can emphasize what matters and reduce the error from low-weight similarities; and since the anti-shake performance score is determined by the similarity scores and the weights, the subsequently obtained score is more accurate.
In combination with the content of the above embodiments, in one embodiment, the image similarity includes at least one of the following three similarities: luminance similarity, contrast similarity, and structure similarity.
In combination with the above embodiments, the specific examples, and the definition of similarity, taking a preset interval of 1 as an example, the computation of the three similarities is now explained. Denote the luminance similarity of the (t-1)-th group of adjacent frames L_t, the contrast similarity C_t, and the structure similarity S_t.
The luminance similarity of the (t-1)-th group of adjacent frames, i.e. the luminance similarity between frame t and frame t-1, can be computed with reference to the following formula (3):
L_t = 2·μ_t·μ_{t-1} / (μ_t² + μ_{t-1}²); (3)
In formula (3), μ_t denotes the mean luminance of frame t and μ_{t-1} that of frame t-1, where μ_t can be computed with the following formula (4):
μ_t = (1/N)·Σ_{i=1}^{N} t_i; (4)
In formula (4), N denotes the total number of pixels in frame t, i the i-th pixel of frame t, and t_i the luminance value of the i-th pixel.
The contrast similarity of the (t-1)-th group, i.e. the contrast similarity between frame t and frame t-1, can be computed with reference to the following formula (5):
C_t = 2·δ_t·δ_{t-1} / (δ_t² + δ_{t-1}²); (5)
In formula (5), δ_t denotes the standard deviation of the luminance of frame t, i.e. the contrast of frame t, and δ_{t-1} the contrast of frame t-1, where δ_t can be computed with the following formula (6):
δ_t = ( (1/(N-1))·Σ_{i=1}^{N} (t_i − μ_t)² )^{1/2}; (6)
In formula (6), the parameters are defined as in the formulas above.
The structure similarity of the (t-1)-th group, i.e. the structure similarity between frame t and frame t-1, can be computed with reference to the following formula (7):
S_t = δ_{t,t-1} / (δ_t·δ_{t-1}); (7)
In formula (7), δ_{t,t-1} denotes the luminance covariance between frame t and frame t-1, which can be computed with the following formula (8):
δ_{t,t-1} = (1/(N-1))·Σ_{i=1}^{N} (t_i − μ_t)·((t-1)_i − μ_{t-1}); (8)
In formula (8), (t-1)_i denotes the luminance value of the i-th pixel of frame t-1 and μ_{t-1} the mean luminance of frame t-1.
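Formulas (3)–(8) can be sketched as follows for two flattened luminance sequences. Note that, as reconstructed here, the formulas omit the small stabilizing constants that standard SSIM adds to avoid division by zero; that omission is an assumption about the patent's formulas.

```python
from statistics import mean, stdev

def ssim_components(x, y):
    """Luminance (3), contrast (5), and structure (7) similarities of two
    equal-length luminance sequences; the standard deviation and covariance
    use the (N-1) denominators of formulas (6) and (8)."""
    mu_x, mu_y = mean(x), mean(y)
    sd_x, sd_y = stdev(x), stdev(y)          # stdev divides by N-1
    n = len(x)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    lum = 2 * mu_x * mu_y / (mu_x ** 2 + mu_y ** 2)
    con = 2 * sd_x * sd_y / (sd_x ** 2 + sd_y ** 2)
    struct = cov / (sd_x * sd_y)
    return lum, con, struct
```

For identical inputs all three components equal 1; adding a constant luminance offset lowers only the luminance term, leaving contrast and structure at 1, which matches the intent of separating the three factors.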
In the method provided by the embodiment of the present invention, since the similarity score between two adjacent frames can be obtained from their luminance, contrast, and structure similarities, the result is more accurate than a score obtained from a single kind of image similarity; and since the anti-shake performance score is determined by the similarity scores, the subsequently obtained score is more accurate.
In combination with the content of the above embodiments, in one embodiment, a video is a single-channel video or a multi-channel video, where a single-channel video is a grayscale video and a multi-channel video is a color video. It should be noted that for a grayscale video, its anti-shake performance score can be obtained directly in the way provided by the above embodiments. For a color video, each kind of image similarity of every group of adjacent frames can first be obtained per channel in the way provided by the above embodiments; for a given kind of similarity, the per-channel values of that kind for each group are then summed, and the sum used as that kind of similarity for the group. Through this process, every kind of image similarity of every group is obtained, and the video's anti-shake performance score can then be obtained in the way provided by the above embodiments.
Since the method provided by the embodiment of the present invention is applicable to both single-channel and multi-channel videos, its applicable scenarios are broader.
In combination with the content of the above embodiments, in one embodiment, referring to FIG. 2, a delay calibration method is provided, including the following steps:
201. Acquire multiple videos, the videos being shot while the shooting device is shaking.
202. Screen the multiple videos according to the attitude data of the shooting device acquired within the shooting time period corresponding to each of the multiple videos, and form a video group from the screened videos; the attitude data of the shooting device is acquired by the inertial sensor.
203. Update the delay value between the inertial sensor and the vision system and, based on the updated delay value, obtain the anti-shake performance score corresponding to the video group; repeat the above process of updating the delay value and obtaining the anti-shake performance score until the obtained score satisfies a preset condition, and then obtain the delay value corresponding to the score that satisfies the preset condition.
For the explanation of step 203, refer to the content of the above embodiments, which is not repeated here. In step 201 above, "the videos are shot while the shooting device is shaking" means that the shooting environment of the device may be shaky: videos may be shot during motion, for example hand-held shooting while running, shooting while mountain biking, and other high-frequency motion. Since the device keeps shaking with the user's motion, videos shot during such motion can be considered shot while the device is shaking. It should be noted that when acquiring multiple videos, e.g. n of them, one need not shoot n separate times to obtain n videos; instead, one video can be shot first and multiple video segments cut from it with a sliding window to obtain the multiple videos.
The embodiment of the present invention precisely needs videos with "shake" and uses them as the evaluation objects for the anti-shake performance score; the more severe the "shake" in a video, the better it serves as an evaluation object. Based on this principle, step 201 states that "the videos are shot while the shooting device is shaking". Of course, in actual implementation, as long as a person holds the shooting device by hand there is usually some shake, so the device need not deliberately shoot in a shaky environment; shooting in an ordinary environment is also possible, only it is then harder to obtain severely "shaky" videos as evaluation objects.
The length of the sliding window itself can be set according to requirements, which the embodiment of the present invention does not specifically limit. The sliding step of each slide can also be set according to requirements, and the steps may be the same or different for each slide, which is likewise not specifically limited. For example, with a video of 4800 frames in total, a fixed window length of 100 frames, and a fixed step of 10 frames, sliding the window can first cut frames 1 to 100 as the first video; after one slide, 10 frames can be skipped, and frames 111 to 210 can then be cut as the second video, and so on, until the required number of videos is cut out.
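The sliding-window segmentation in the example above (100-frame window, 10 skipped frames between windows) can be sketched as:

```python
def sliding_windows(total_frames, win_len=100, stride=10, count=None):
    """Yield (start, end) frame ranges (inclusive, 1-based) cut from one
    long recording: a win_len-frame window advanced by skipping stride
    frames between consecutive windows, stopping after count segments."""
    segments = []
    start = 1
    while start + win_len - 1 <= total_frames:
        segments.append((start, start + win_len - 1))
        if count is not None and len(segments) == count:
            break
        start += win_len + stride  # skip `stride` frames between windows
    return segments
```

The defaults reproduce the worked example: segment 1 covers frames 1–100, then 10 frames are skipped, and segment 2 covers frames 111–210.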
In step 202 above, the attitude data of the shooting device describes its attitude and can be represented in different ways such as attitude angles or quaternions, which this embodiment does not specifically limit. In addition, for a given video, when acquiring the attitude data of the device within the video's shooting time period, the acquisition frequency may or may not match the frame rate of the video, which the embodiment of the present invention does not specifically limit. For example, for the one-minute video shot in the period from 17:10 to 17:11 on April 7, 2021, if one second corresponds to 24 frames, the attitude data of the device can be acquired at every moment an image frame is acquired within this period, i.e. 24 attitude samples per second, so that 24 × 60 = 1440 attitude samples of the device can be acquired in that minute.
Taking attitude data represented by attitude angles as an example, the embodiment of the present invention accordingly does not specifically limit how the attitude data is acquired, including but not limited to: estimating the attitude of the shooting device with an IMU-based preset algorithm to obtain the attitude data. The preset algorithm may be the AKF (Adaptive Kalman Filter) algorithm, the UKF (Unscented Kalman Filter), a complementary filter algorithm, or other filtering algorithms, which the embodiment of the present invention does not specifically limit.
It should be noted that the delay value affects the stabilization performance score because the score is obtained from image-frame parameters, the image-frame parameters are obtained from stabilized image frames, and stabilization is performed by the vision system and the inertial sensor based on the delay value between them. Therefore, as between the clock of the IMU and the clock of the vision system, the more accurate the delay value, the more accurately data under one clock can be indexed by taking the other clock as the reference and adding the delay value. It should also be noted that, taking one clock as the reference, the delay of the other clock relative to it may be positive or negative; for example, taking the vision system's clock as the reference, the IMU's clock may run slow or fast, so the delay value may be a positive or a negative value.
For example, suppose the true delay between the IMU and the vision system is 0.01 s; under the IMU's clock, attitude data of the shooting device is acquired at the ten instants 0.01 s, 0.02 s, 0.03 s, 0.04 s, 0.05 s, 0.06 s, 0.07 s, 0.08 s, 0.09 s and 0.10 s; and under the vision system's clock, image frames are captured at the same ten instants 0.01 s through 0.10 s.
Suppose the estimated delay between the IMU and the vision system is 0.03 s, with the vision system's clock taken as the reference and the IMU's clock running 0.03 s slow, i.e. the delay of the IMU's clock relative to the vision system's clock is -0.03. Under this delay value, the image frame captured by the vision system at 0.04 s is paired with the attitude data acquired by the IMU at 0.01 s, and the subsequent electronic stabilization of the frame captured at 0.04 s will use the attitude data acquired by the IMU at 0.01 s. The true delay, however, is 0.01 s, meaning the frame captured by the vision system at 0.04 s should be paired with the attitude data acquired by the IMU at 0.03 s, and its electronic stabilization should use that attitude data. The larger the gap between the estimated and true delay values, the less likely the correct attitude data is indexed, and the larger the error of the subsequent electronic stabilization.
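The pairing in the numeric example above can be expressed as a one-line mapping; the helper name is illustrative, and the rounding to 0.01 s simply matches the sample spacing used in the example.

```python
def imu_time_for_frame(frame_time, delay_estimate):
    """Timestamp, on the IMU clock, of the attitude sample paired with a
    frame captured at frame_time on the vision-system clock.

    A negative delay means the IMU clock lags the vision clock, as in
    the example above. Rounded to the 0.01 s sample grid.
    """
    return round(frame_time + delay_estimate, 2)

# Frame captured at t = 0.04 s on the vision-system clock:
with_estimate = imu_time_for_frame(0.04, delay_estimate=-0.03)  # pairs with the 0.01 s sample
with_truth = imu_time_for_frame(0.04, delay_estimate=-0.01)     # pairs with the 0.03 s sample
```

The 0.02 s error in the estimated delay thus makes stabilization use attitude data two samples too old.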
Through the above process, once the attitude data of the shooting device acquired during each video's shooting period has been obtained, the plurality of videos can be filtered. The filtering may compute the variance of the attitude data acquired during each video's shooting period, sort the videos by variance in descending order, and select the first preset number of videos. Since a larger variance indicates less stable data, this selects the videos with the more violent shake as the filtered result.
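The variance-based filtering can be sketched as follows; the (video_id, samples) input format and the population-variance formula are illustrative choices.

```python
def filter_by_attitude_variance(videos, top_k):
    """videos: list of (video_id, attitude_samples) pairs, where
    attitude_samples is the sequence of attitude angles recorded
    during that video's shooting period.

    Returns the ids of the top_k videos whose attitude data has the
    largest variance, i.e. the most violent shake.
    """
    def variance(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    ranked = sorted(videos, key=lambda v: variance(v[1]), reverse=True)
    return [vid for vid, _ in ranked[:top_k]]

selected = filter_by_attitude_variance(
    [("a", [0.0, 0.0, 0.1]),    # nearly still
     ("b", [0.0, 5.0, -5.0]),   # violent shake
     ("c", [1.0, 1.1, 0.9])],   # mild shake
    top_k=2,
)
```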
In the method provided by this embodiment of the present invention, a plurality of videos is acquired and filtered according to the attitude data of the shooting device acquired during each video's shooting period. Because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, using the filtered videos as the basis for testing the stabilization effect and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
It should be understood that, although the steps in the flowcharts of FIG. 1 and FIG. 2 are shown sequentially in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on their execution, and the steps may be executed in other orders. Moreover, at least some of the steps in FIG. 1 and FIG. 2 may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and whose execution order is not necessarily sequential; they may instead be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
In combination with the content of the foregoing embodiments, in one embodiment, this embodiment of the present invention places no specific limitation on how the plurality of videos is filtered according to the attitude data of the shooting device acquired during each video's shooting period, which includes but is not limited to: converting the attitude data acquired during each video's shooting period into the frequency domain, to obtain each video's set of amplitude-frequency characteristic curves; obtaining each video's frequency-domain score from its set of amplitude-frequency characteristic curves; and filtering the plurality of videos according to each video's frequency-domain score.
The attitude data acquired during each video's shooting period may be continuous axis angles, i.e. a sequence of discrete values that form a time-domain curve in a coordinate system whose horizontal axis is time and whose vertical axis is the axis-angle magnitude. By the fast Fourier transform, this curve can be decomposed into multiple sinusoids, i.e. multiple amplitude-frequency characteristic curves, which together form the set of amplitude-frequency characteristic curves. Each amplitude-frequency characteristic curve in this set can then be regarded as a single point in a coordinate system whose horizontal axis is frequency and whose vertical axis is amplitude.
Each video's frequency-domain score may represent how violently the video shook when it was shot. For the set of amplitude-frequency characteristic curves of a given video, the frequency-domain score may be obtained by determining the maximum frequency and the maximum amplitude among the frequencies and amplitudes of the curves in the set, and taking the product of the two as the video's frequency-domain score. Alternatively, the average frequency and average amplitude may be determined from the frequencies and amplitudes of the curves in the set, and the product of the two averages taken as the video's frequency-domain score. In view of this computation, the reason the frequency-domain score can represent the violence of the shake is that the amplitude itself represents the violence of the shake; when an amplitude-related value is one multiplication factor and a frequency-related value is the other, the resulting frequency-domain score can correspondingly also represent the violence of the shake. After each video's frequency-domain score is obtained, the plurality of videos can be filtered according to those scores; specifically, the videos whose frequency-domain scores exceed a preset threshold may be selected.
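A minimal sketch of the FFT decomposition and the max-times-max scoring variant, assuming NumPy; the amplitude normalization and the threshold used to ignore numerically-zero bins are implementation choices, and the averaging variant would use means instead of maxima.

```python
import numpy as np

def frequency_domain_score(attitude_angles, sample_rate):
    """Decompose an attitude-angle time series into sinusoids with the
    FFT, then score the shake as (largest frequency present) times
    (largest amplitude)."""
    n = len(attitude_angles)
    spectrum = np.fft.rfft(attitude_angles)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    amps = 2.0 * np.abs(spectrum) / n  # single-sided amplitudes
    amps[0] /= 2.0                     # the DC component is not doubled
    present = amps > 1e-6              # ignore numerically-zero bins
    return freqs[present].max() * amps.max()

# A pure 5 Hz shake of amplitude 2, sampled at 100 Hz for 1 s:
t = np.arange(100) / 100.0
score = frequency_domain_score(2.0 * np.sin(2 * np.pi * 5.0 * t), 100)
```

For the pure sinusoid above, the only significant component is the point (5 Hz, amplitude 2), so the score is their product.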
In the method provided by this embodiment of the present invention, the attitude data of the shooting device acquired during each video's shooting period is converted into the frequency domain by the fast Fourier transform, to obtain each video's set of amplitude-frequency characteristic curves; each video's frequency-domain score is obtained from that set; and the plurality of videos is filtered according to those scores. Because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, filtering the videos by their frequency-domain scores, using the filtered videos as the basis for testing the stabilization effect, and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
In combination with the content of the foregoing embodiments, in one embodiment, this embodiment of the present invention places no specific limitation on how each video's frequency-domain score is obtained from its set of amplitude-frequency characteristic curves, which includes but is not limited to: for the set of amplitude-frequency characteristic curves of any video, obtaining the frequency-domain score of each amplitude-frequency characteristic curve from that curve's frequency and amplitude; and obtaining the video's frequency-domain score from the per-curve frequency-domain scores.
For a given amplitude-frequency characteristic curve, a weighted sum of the curve's frequency and amplitude may be computed and taken as the curve's frequency-domain score. For a given video, after the frequency-domain score of each curve in the video's set has been obtained, the maximum and minimum of all the per-curve scores may be selected and their average taken as the frequency-domain score of the set, i.e. as the video's frequency-domain score.
In the method provided by this embodiment of the present invention, for a given video, the frequency-domain score of each amplitude-frequency characteristic curve is obtained from the frequency and amplitude of each curve in the video's set of amplitude-frequency characteristic curves, and the video's frequency-domain score is obtained from the per-curve scores. Because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, obtaining each video's frequency-domain score from the per-curve scores of its set of amplitude-frequency characteristic curves, filtering the videos by those scores, using the filtered videos as the basis for testing the stabilization effect, and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
In combination with the content of the foregoing embodiments, in one embodiment, this embodiment of the present invention places no specific limitation on how the frequency-domain score of each amplitude-frequency characteristic curve is obtained from the frequency and amplitude of each curve in the set, which includes but is not limited to: obtaining the product of each curve's frequency and amplitude, and taking the product as the curve's frequency-domain score; or obtaining a score for each curve's frequency, obtaining the product of that score and the curve's amplitude, and taking the product as the curve's frequency-domain score.
In the above process, this embodiment of the present invention places no specific limitation on how the score for each curve's frequency is obtained, which includes but is not limited to: determining, from each curve's frequency, the number of cycles occurring within a preset period, and taking that count as the curve's score. The preset period may be 1 second; this embodiment places no specific limitation on this.
In addition, the reason the second manner converts each curve's frequency into a score is that the curves' frequencies all differ; converting them into scores under a common standard ensures uniformity of the data, so that all subsequently computed frequency-domain scores rest on the same computational standard.
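The two per-curve scoring variants can be sketched as follows; the function names and the default 1 s preset period are illustrative.

```python
def curve_score_by_product(freq_hz, amplitude):
    """First variant: the curve's score is frequency times amplitude."""
    return freq_hz * amplitude

def curve_score_by_count(freq_hz, amplitude, preset_period_s=1.0):
    """Second variant: convert the frequency to a cycle count within a
    preset period, so that all curves are scored on the same standard,
    then multiply by the amplitude. With a 1 s period the count equals
    the frequency in Hz, and the two variants coincide.
    """
    count = freq_hz * preset_period_s
    return count * amplitude
```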
In the method provided by this embodiment of the present invention, because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, filtering the videos by their frequency-domain scores, using the filtered videos as the basis for testing the stabilization effect, and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
In combination with the content of the foregoing embodiments, in one embodiment, this embodiment of the present invention places no specific limitation on how a video's frequency-domain score is obtained from the per-curve frequency-domain scores, which includes but is not limited to: computing a weighted sum of the frequency-domain scores of all curves in the set of amplitude-frequency characteristic curves, and taking the resulting sum as the video's frequency-domain score.
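A minimal sketch of the weighted sum over per-curve scores; uniform default weights are an assumption, since the embodiment leaves the weighting unspecified.

```python
def video_score(curve_scores, weights=None):
    """Weighted sum of all per-curve frequency-domain scores in a
    video's set of amplitude-frequency characteristic curves.
    With no weights given, all curves count equally.
    """
    if weights is None:
        weights = [1.0] * len(curve_scores)
    return sum(w * s for w, s in zip(weights, curve_scores))
```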
In the method provided by this embodiment of the present invention, for a given video, a weighted sum of the frequency-domain scores of all curves in the video's set of amplitude-frequency characteristic curves is computed and taken as the video's frequency-domain score. Because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, filtering the videos by their frequency-domain scores, using the filtered videos as the basis for testing the stabilization effect, and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
In combination with the content of the foregoing embodiments, in one embodiment, this embodiment of the present invention places no specific limitation on how the plurality of videos is filtered according to each video's frequency-domain score, which includes but is not limited to: sorting the frequency-domain scores of the plurality of videos in descending order, and selecting the first preset number of videos as the filtered result.
As the foregoing embodiments show, the larger a video's frequency-domain score, the more violently the video shook when shot; hence, to select the videos with the most violent shake, the frequency-domain scores may be sorted in descending order and a preset number of videos selected from the sorted result.
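The descending-sort selection can be sketched as follows; the dict input format is an illustrative choice.

```python
def select_top_videos(scores_by_video, preset_count):
    """Sort videos by frequency-domain score, descending, and keep the
    first preset_count of them as the filtered videos.

    scores_by_video: {video_id: frequency_domain_score}
    """
    ranked = sorted(scores_by_video, key=scores_by_video.get, reverse=True)
    return ranked[:preset_count]

kept = select_top_videos({"v1": 0.8, "v2": 3.1, "v3": 1.7, "v4": 0.2}, 2)
```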
In the method provided by this embodiment of the present invention, the frequency-domain scores of the plurality of videos are sorted in descending order and the first preset number of videos selected as the filtered result. Because the videos can be filtered before their stabilization performance scores are computed, so that the videos with the more violent shake are selected, and because more violent shake places higher demands on stabilization, makes the score better reflect the true stabilization effect, and requires higher accuracy of the delay value, filtering the videos by their frequency-domain scores, using the filtered videos as the basis for testing the stabilization effect, and repeatedly performing the delay-value update and score acquisition yields a more accurate final delay value.
It should be noted that the technical solutions set out above may, in practice, be implemented as independent embodiments or combined with one another as combined embodiments. In addition, in describing the content of the above embodiments of the present invention, the different embodiments are presented in an order chosen merely for convenience of exposition, such as the order of the data flow; this does not limit the execution order between the different embodiments. Accordingly, in practice, if multiple embodiments provided by the present invention are to be implemented, they need not follow the order in which the embodiments are presented herein; the execution order between embodiments may be arranged as required.
In combination with the content of the foregoing embodiments, in one embodiment, as shown in FIG. 3, a delay calibration apparatus is provided, comprising an acquisition module 301 and an update module 302, wherein:
the acquisition module 301 is configured to acquire a video group, the video group comprising at least one video;
the update module 302 is configured to update the delay value between an inertial sensor and a vision system, obtain the stabilization performance score of the video group based on the updated delay value, and repeat the delay-value update and score acquisition until the obtained stabilization performance score satisfies a preset condition, then take the delay value corresponding to the stabilization performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device, each video in the video group is acquired by the vision system, stabilization is performed by the vision system and the inertial sensor based on the delay value between them, and the stabilization performance score is used to evaluate the stabilization effect after a video is stabilized.
In one embodiment, the acquisition module 301 comprises:
an acquisition submodule configured to acquire a plurality of videos, the videos being shot while the shooting device is subject to shake;
a filtering submodule configured to filter the plurality of videos according to the attitude data of the shooting device acquired during each video's shooting period, and to form the video group from the filtered videos, wherein the attitude data of the shooting device is acquired by the inertial sensor.
In one embodiment, the filtering submodule comprises:
a conversion unit configured to convert the attitude data acquired during each video's shooting period into the frequency domain, to obtain each video's set of amplitude-frequency characteristic curves;
an acquisition unit configured to obtain each video's frequency-domain score from its set of amplitude-frequency characteristic curves;
a filtering unit configured to filter the plurality of videos according to each video's frequency-domain score.
In one embodiment, the acquisition unit comprises:
a first acquisition subunit configured, for the set of amplitude-frequency characteristic curves of any video, to obtain the frequency-domain score of each amplitude-frequency characteristic curve from that curve's frequency and amplitude;
a second acquisition subunit configured to obtain the video's frequency-domain score from the per-curve frequency-domain scores.
In one embodiment, the first acquisition subunit is configured to obtain the product of each curve's frequency and amplitude and take the product as the curve's frequency-domain score; or to obtain a score for each curve's frequency, obtain the product of that score and the curve's amplitude, and take the product as the curve's frequency-domain score.
In one embodiment, the second acquisition subunit is configured to compute a weighted sum of the frequency-domain scores of all curves in the set of amplitude-frequency characteristic curves, and to take the resulting sum as the video's frequency-domain score.
In one embodiment, the filtering unit is configured to sort the frequency-domain scores of the plurality of videos in descending order, and to select the first preset number of videos as the filtered result.
For specific limitations on the delay calibration apparatus, refer to the limitations on the delay calibration method above, which are not repeated here. Each module of the delay calibration apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 4. The computer device comprises a processor, a memory, a communication interface, a display screen and an input apparatus connected by a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device comprises a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with external terminals, the wireless mode being implementable by Wi-Fi, a carrier network, NFC (near-field communication) or other technologies. The computer program, when executed by the processor, implements a delay calibration method. The display screen of the computer device may be a liquid-crystal display or an electronic-ink display, and the input apparatus may be a touch layer covering the display screen, a key, trackball or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
Those skilled in the art will understand that the structure shown in FIG. 4 is merely a block diagram of part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may comprise more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a video group, the video group comprising at least one video;
updating the delay value between an inertial sensor and a vision system, obtaining the stabilization performance score of the video group based on the updated delay value, and repeating the delay-value update and score acquisition until the obtained stabilization performance score satisfies a preset condition, then taking the delay value corresponding to the stabilization performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device, each video in the video group is acquired by the vision system, stabilization is performed by the vision system and the inertial sensor based on the delay value between them, and the stabilization performance score is used to evaluate the stabilization effect after a video is stabilized.
In one embodiment, the processor further implements the following steps when executing the computer program:
acquiring a plurality of videos, the videos being shot while the shooting device is subject to shake;
filtering the plurality of videos according to the attitude data of the shooting device acquired during each video's shooting period, and forming the video group from the filtered videos, wherein the attitude data of the shooting device is acquired by the inertial sensor.
In one embodiment, the processor further implements the following steps when executing the computer program:
converting the attitude data acquired during each video's shooting period into the frequency domain, to obtain each video's set of amplitude-frequency characteristic curves;
obtaining each video's frequency-domain score from its set of amplitude-frequency characteristic curves;
filtering the plurality of videos according to each video's frequency-domain score.
In one embodiment, the processor further implements the following steps when executing the computer program: for the set of amplitude-frequency characteristic curves of any video, obtaining the frequency-domain score of each amplitude-frequency characteristic curve from that curve's frequency and amplitude; and obtaining the video's frequency-domain score from the per-curve frequency-domain scores.
In one embodiment, the processor further implements the following steps when executing the computer program: obtaining the product of each curve's frequency and amplitude, and taking the product as the curve's frequency-domain score; or obtaining a score for each curve's frequency, obtaining the product of that score and the curve's amplitude, and taking the product as the curve's frequency-domain score.
In one embodiment, the processor further implements the following step when executing the computer program: computing a weighted sum of the frequency-domain scores of all curves in the set of amplitude-frequency characteristic curves, and taking the resulting sum as the video's frequency-domain score.
In one embodiment, the processor further implements the following step when executing the computer program: sorting the frequency-domain scores of the plurality of videos in descending order, and selecting the first preset number of videos as the filtered result.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, the computer program implementing the following steps when executed by a processor:
acquiring a video group, the video group comprising at least one video;
updating the delay value between an inertial sensor and a vision system, obtaining the stabilization performance score of the video group based on the updated delay value, and repeating the delay-value update and score acquisition until the obtained stabilization performance score satisfies a preset condition, then taking the delay value corresponding to the stabilization performance score that satisfies the preset condition;
wherein the inertial sensor and the vision system are coupled on the same shooting device, each video in the video group is acquired by the vision system, stabilization is performed by the vision system and the inertial sensor based on the delay value between them, and the stabilization performance score is used to evaluate the stabilization effect after a video is stabilized.
In one embodiment, the computer program further implements the following steps when executed by the processor:
acquiring a plurality of videos, the videos being shot while the shooting device is subject to shake;
filtering the plurality of videos according to the attitude data of the shooting device acquired during each video's shooting period, and forming the video group from the filtered videos, wherein the attitude data of the shooting device is acquired by the inertial sensor.
In one embodiment, the computer program further implements the following steps when executed by the processor:
converting the attitude data acquired during each video's shooting period into the frequency domain, to obtain each video's set of amplitude-frequency characteristic curves;
obtaining each video's frequency-domain score from its set of amplitude-frequency characteristic curves;
filtering the plurality of videos according to each video's frequency-domain score.
In one embodiment, the computer program further implements the following steps when executed by the processor:
for the set of amplitude-frequency characteristic curves of any video, obtaining the frequency-domain score of each amplitude-frequency characteristic curve from that curve's frequency and amplitude;
obtaining the video's frequency-domain score from the per-curve frequency-domain scores.
In one embodiment, the computer program further implements the following steps when executed by the processor:
obtaining the product of each curve's frequency and amplitude, and taking the product as the curve's frequency-domain score; or,
obtaining a score for each curve's frequency, obtaining the product of that score and the curve's amplitude, and taking the product as the curve's frequency-domain score.
In one embodiment, the computer program further implements the following step when executed by the processor:
computing a weighted sum of the frequency-domain scores of all curves in the set of amplitude-frequency characteristic curves, and taking the resulting sum as the video's frequency-domain score.
In one embodiment, the computer program further implements the following step when executed by the processor:
sorting the frequency-domain scores of the plurality of videos in descending order, and selecting the first preset number of videos as the filtered result.
Those of ordinary skill in the art will understand that all or part of the processes of the methods of the above embodiments may be completed by instructing the relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium and which, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database or other media used in the embodiments provided in the present application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments are described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application and are described relatively specifically and in detail, but they are not therefore to be construed as limiting the scope of the invention patent. It should be pointed out that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (10)

  1. A delay calibration method, characterized in that the method comprises:
    acquiring a video group, the video group comprising at least one video;
    updating a delay value between an inertial sensor and a vision system, obtaining a stabilization performance score of the video group based on the updated delay value, and repeating the delay-value update and score acquisition until the obtained stabilization performance score satisfies a preset condition, then taking the delay value corresponding to the stabilization performance score that satisfies the preset condition;
    wherein the inertial sensor and the vision system are coupled on the same shooting device, each video in the video group is acquired by the vision system, stabilization is performed by the vision system and the inertial sensor based on the delay value between them, and the stabilization performance score is used to evaluate the stabilization effect after a video is stabilized.
  2. The method according to claim 1, characterized in that acquiring the video group comprises:
    acquiring a plurality of videos, the videos being shot while the shooting device is subject to shake;
    filtering the plurality of videos according to attitude data of the shooting device acquired during the shooting period of each of the plurality of videos, and forming the video group from the filtered videos; wherein the attitude data of the shooting device is acquired by the inertial sensor.
  3. The method according to claim 2, characterized in that filtering the plurality of videos according to the attitude data of the shooting device acquired during the shooting period of each of the plurality of videos comprises:
    converting the attitude data acquired during each video's shooting period into the frequency domain, to obtain each video's set of amplitude-frequency characteristic curves;
    obtaining each video's frequency-domain score from its set of amplitude-frequency characteristic curves;
    filtering the plurality of videos according to each video's frequency-domain score.
  4. The method according to claim 3, characterized in that obtaining each video's frequency-domain score from its set of amplitude-frequency characteristic curves comprises:
    for the set of amplitude-frequency characteristic curves of any video, obtaining the frequency-domain score of each amplitude-frequency characteristic curve from that curve's frequency and amplitude;
    obtaining the frequency-domain score of said video from the per-curve frequency-domain scores.
  5. The method according to claim 4, characterized in that obtaining the frequency-domain score of each amplitude-frequency characteristic curve from the frequency and amplitude of each curve in the set comprises:
    obtaining the product of each curve's frequency and amplitude, and taking said product as the curve's frequency-domain score; or,
    obtaining a score for each curve's frequency, obtaining the product of that score and the curve's amplitude, and taking said product as the curve's frequency-domain score.
  6. The method according to claim 4, characterized in that obtaining the frequency-domain score of said video from the per-curve frequency-domain scores comprises:
    computing a weighted sum of the frequency-domain scores of all curves in the set of amplitude-frequency characteristic curves, and taking the resulting sum as the frequency-domain score of said video.
  7. The method according to claim 3, characterized in that filtering the plurality of videos according to each video's frequency-domain score comprises:
    sorting the frequency-domain scores of the plurality of videos in descending order, and selecting the first preset number of videos as the filtered result.
  8. A delay calibration apparatus, characterized in that the apparatus comprises:
    an acquisition module configured to acquire a video group, the video group comprising at least one video;
    an update module configured to update a delay value between an inertial sensor and a vision system, obtain a stabilization performance score of the video group based on the updated delay value, and repeat the delay-value update and score acquisition until the obtained stabilization performance score satisfies a preset condition, then take the delay value corresponding to the stabilization performance score that satisfies the preset condition;
    wherein the inertial sensor and the vision system are coupled on the same shooting device, each video in the video group is acquired by the vision system, stabilization is performed by the vision system and the inertial sensor based on the delay value between them, and the stabilization performance score is used to evaluate the stabilization effect after a video is stabilized.
  9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 7 when executed by a processor.
PCT/CN2022/092757 2021-05-18 2022-05-13 Delay calibration method and apparatus, computer device and storage medium WO2022242569A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110541140.XA CN113438409B (zh) 2021-05-18 2021-05-18 Delay calibration method and apparatus, computer device and storage medium
CN202110541140.X 2021-05-18

Publications (1)

Publication Number Publication Date
WO2022242569A1 true WO2022242569A1 (zh) 2022-11-24

Family

ID=77802592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/092757 WO2022242569A1 (zh) 2021-05-18 2022-05-13 Delay calibration method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113438409B (zh)
WO (1) WO2022242569A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113438409B (zh) * 2021-05-18 2022-12-20 影石创新科技股份有限公司 延迟校准方法、装置、计算机设备和存储介质
CN115134525B (zh) * 2022-06-27 2024-05-17 维沃移动通信有限公司 数据传输方法、惯性测量单元及光学防抖单元

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130335554A1 (en) * 2012-06-14 2013-12-19 Qualcomm Incorporated Adaptive estimation of frame time stamp latency
CN110430365A (zh) * 2019-08-26 2019-11-08 Oppo广东移动通信有限公司 防抖方法、装置、计算机设备和存储介质
CN110519513A (zh) * 2019-08-26 2019-11-29 Oppo广东移动通信有限公司 防抖方法和装置、电子设备、计算机可读存储介质
CN111225155A (zh) * 2020-02-21 2020-06-02 Oppo广东移动通信有限公司 视频防抖方法、装置、电子设备、计算机设备和存储介质
CN111526285A (zh) * 2020-04-15 2020-08-11 浙江大华技术股份有限公司 一种图像防抖方法及电子设备、计算机可读存储介质
CN113438409A (zh) * 2021-05-18 2021-09-24 影石创新科技股份有限公司 延迟校准方法、装置、计算机设备和存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4872797B2 (ja) 2007-05-18 2012-02-08 カシオ計算機株式会社 Imaging apparatus, imaging method and imaging program
WO2017075788A1 (zh) 2015-11-05 2017-05-11 华为技术有限公司 Anti-shake photographing method and apparatus, and photographic device
WO2019075617A1 (zh) 2017-10-16 2019-04-25 深圳市大疆创新科技有限公司 Video processing method, control terminal and movable device
CN111246100B (zh) 2020-01-20 2022-03-18 Oppo广东移动通信有限公司 Calibration method and apparatus for anti-shake parameters, and electronic device


Also Published As

Publication number Publication date
CN113438409A (zh) 2021-09-24
CN113438409B (zh) 2022-12-20


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22803891; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22803891; Country of ref document: EP; Kind code of ref document: A1)