WO2021218694A1 - Video processing method and mobile terminal - Google Patents

Video processing method and mobile terminal Download PDF

Info

Publication number
WO2021218694A1
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
processor
video
mobile terminal
motion
Prior art date
Application number
PCT/CN2021/088267
Other languages
French (fr)
Chinese (zh)
Inventor
王宇
朱聪超
敖欢欢
胡斌
李远友
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2021218694A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/76: Television signal recording
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H04N23/63: Control of cameras or camera modules by using electronic viewfinders
    • H04N23/631: Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
    • H04N23/632: Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H04N23/68: Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682: Vibration or motion blur correction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80: Camera processing pipelines; Components thereof
    • H04N23/81: Camera processing pipelines; Components thereof for suppressing or minimising disturbance in the image signal generation

Definitions

  • This application relates to the field of video technology, and in particular to a video processing method and mobile terminal.
  • the electronic image stabilization (EIS) algorithm uses motion information such as gyroscope data to calculate the motion vector of each pixel during video frame imaging, establishes a homography-matrix mapping between the input image and the pixels of the stabilized output image, and smooths the motion path.
  • EIS relies on cropping the image, but in the existing EIS algorithm the cropping ratio is fixed, so the amount of jitter the algorithm can remove is limited. For scenes with large motion, an insufficient cropping area means that the pixel information required by the output image exceeds the range of the input image.
  • the embodiments of the present application provide a video processing method and a mobile terminal that can dynamically adjust the EIS video-frame cropping ratio according to changes in motion intensity, improving picture stability during video shooting and enhancing the user's recording experience.
  • an embodiment of the present application provides a mobile terminal, which includes a camera, a display screen, a motion sensor, and a processor;
  • the camera is used to record video; the display screen is used to display the video recording interface;
  • the motion sensor is used to continuously collect motion data while the camera is recording video;
  • the video recording interface is used to display a prompt to start dynamic anti-shake;
  • the display screen is used to receive a touch operation on the prompt, and the processor is used to perform dynamic anti-shake.
  • when the user is recording a video and the mobile terminal is in a scene of large movement, the mobile terminal can pop up a prompt asking whether to perform dynamic anti-shake, that is, whether to enable dynamic adjustment of the EIS cropping ratio.
  • this scheme gives users the right to choose whether to use the dynamic anti-shake function, because some users may not like the cropping ratio changing constantly.
  • motion sensors include inertial sensors, acceleration sensors, and gyroscopes.
  • the prompt to activate dynamic anti-shake includes: the description part of dynamic anti-shake and the switch control part.
  • the camera generates video frames when recording a video; the motion data is aligned with the video frames by time stamp, where time-stamp alignment establishes the correspondence between the motion data and the video frames according to time; electronic image stabilization is performed, where electronic image stabilization crops the video frame and warps the cropped video frame; the rotation vector of the video frame is calculated; path smoothing is performed according to the motion data, where path smoothing optimizes the curve formed by the motion data; the motion state of the mobile terminal is determined; the number of out-of-bounds events of the warped video frames is counted; if the out-of-bounds count is greater than a first threshold, the cropping ratio is increased; if the out-of-bounds count is less than or equal to the first threshold, the cropping ratio is maintained.
  • out-of-bounds means that some pixels of the warped video frame have no defined value in the video frame before warping.
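The threshold rule in the claim above (increase the cropping ratio when the out-of-bounds count exceeds a first threshold, otherwise keep it) can be sketched as a small function. The step size, cap, and names such as `adjust_crop_ratio` are illustrative assumptions, not values from the patent.

```python
def adjust_crop_ratio(crop_ratio, oob_count, first_threshold,
                      step=0.05, max_ratio=0.5):
    """If the out-of-bounds count exceeds the first threshold, enlarge the
    cropping ratio by one step (capped); otherwise keep the current ratio."""
    if oob_count > first_threshold:
        return min(crop_ratio + step, max_ratio)
    return crop_ratio
```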
  • this embodiment provides a video processing method, including a camera of a mobile terminal collecting video frames;
  • the motion sensor of the mobile terminal collects motion data
  • the processor of the mobile terminal aligns the video frame and the motion data with time stamps
  • the processor performs electronic image stabilization on the video frame.
  • the electronic image stabilization crops the video frame and warps the cropped video frame;
  • the processor recognizes the movement state of the mobile terminal;
  • the processor performs cropping processing on the video frame and counts the number of out-of-bounds events of the cropped video frame;
  • the processor determines whether to adjust the cropping ratio according to the out-of-bounds count;
  • the processor calculates the H matrix corresponding to the video frame and performs image warping according to the H matrix;
  • the processor calculates a new cropping ratio for the video frame and generates an initial video frame at the new cropping ratio;
  • the processor determines the cropping ratio of each video frame according to the change in the motion intensity of the mobile terminal;
  • the processor calculates the H matrix corresponding to the video frame.
  • the video frames collected by the camera of the mobile terminal are stored in the buffer of the memory.
  • the motion data includes: acceleration and angular velocity of the mobile terminal.
  • for time-stamp alignment, the processor uses spline interpolation to turn the discrete motion data into continuous curves; the processor performs nonlinear optimization on the continuous curves to obtain the time difference between them; the processor performs the nonlinear optimization in a loop, and the loop ends when the time difference meets a preset condition.
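The alignment idea above can be sketched with plain Python: interpolate one discretely sampled curve, then search for the time shift that minimizes the mismatch against the other. Piecewise-linear interpolation stands in for the spline interpolation the text describes, and all function names are illustrative.

```python
def interp(ts, vs, t):
    """Piecewise-linear interpolation of samples (ts, vs), clamped at the
    ends; a simple stand-in for spline interpolation."""
    if t <= ts[0]:
        return vs[0]
    if t >= ts[-1]:
        return vs[-1]
    for i in range(1, len(ts)):
        if t <= ts[i]:
            w = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
            return vs[i - 1] * (1 - w) + vs[i] * w

def alignment_error(ts_a, vs_a, ts_b, vs_b, dt):
    """Mean absolute error between curve A shifted by dt and curve B."""
    errs = [abs(interp(ts_a, vs_a, t + dt) - v) for t, v in zip(ts_b, vs_b)]
    return sum(errs) / len(errs)

def find_time_offset(ts_a, vs_a, ts_b, vs_b, search=0.2, step=0.001):
    """Grid-search the time difference that minimizes the alignment error."""
    n = int(search / step)
    return min((k * step for k in range(-n, n + 1)),
               key=lambda dt: alignment_error(ts_a, vs_a, ts_b, vs_b, dt))
```

With a triangle-shaped test signal sampled on two clocks offset by 50 ms, the search recovers the offset to within the step size.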
  • the processor performs path smoothing on the video frame according to the rotation vector.
  • for motion path smoothing, the processor calculates the vector between every two adjacent data points in the motion data, traversing all the data points; the processor removes the middle point of two adjacent segments with the same vector direction; the processor removes the inflection points in the data curve formed by the motion data; the processor removes all the data points between two data points that can be passed directly.
  • out-of-bounds means that some pixels of the cropped video frame have no defined value in the video frame before the cropping process.
  • the display screen of the mobile terminal displays an interface that prompts the user to enable dynamic anti-shake.
  • the display screen receives the user's touch operation and turns on dynamic anti-shake.
  • dynamic anti-shake means that the processor adjusts the cropping ratio according to changes in motion intensity.
  • this embodiment provides a computer-readable storage medium, including instructions, which when executed on a mobile terminal, cause the mobile terminal to perform the above-mentioned video processing method.
  • Figure 1 is a schematic diagram of gyroscope data processed by a path smoothing algorithm
  • Figure 2A is a schematic diagram of a video cropping ratio in a jittery scene
  • Figure 2B is another schematic diagram of the video cropping ratio in that scene
  • Figure 3 is a schematic diagram of the steps of a video processing method
  • Figure 4A is a schematic diagram of a motion state determined by a motion sensor
  • Figure 4B is a schematic diagram of gyroscope data in different motion states
  • Figure 4C is a schematic diagram of transitions between different motion states
  • Figure 5 is a schematic diagram of the steps of a method for adjusting the video cropping ratio
  • Figure 6 is a schematic diagram of a UI interface for enabling dynamic FOV
  • Figure 7A is a schematic diagram of processing video jitter
  • Figure 7B is another schematic diagram of processing video jitter
  • Figure 8A is a schematic diagram of gyroscope data without path smoothing
  • Figure 8B is a schematic diagram of gyroscope data after path smoothing
  • Figure 9 is a schematic diagram of a method for determining the change in cropping ratio according to the movement trend
  • Figure 10A is a schematic diagram of video frames collected by a terminal device
  • Figure 10B is a schematic diagram showing changes in motion intensity and cropping ratio
  • Figure 11 is another schematic diagram showing changes in motion intensity and cropping ratio
  • Figure 12 is a schematic diagram of the steps of a method for adjusting the video cropping ratio
  • Figure 13A is a schematic diagram of the hardware structure of an electronic device
  • Figure 13B is a schematic diagram of the structure of an application framework.
  • references in this specification to "one embodiment" or "some embodiments" mean that one or more embodiments of the present application include a specific feature, structure, or characteristic described in combination with that embodiment. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise.
  • the terms "comprising", "including", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.
  • the curve 101 is a curve of the data collected by the gyroscope of the terminal changing with time when the video shooter holds the terminal to shoot a video.
  • the vertical axis in FIG. 1 represents gyroscope data, and the horizontal axis represents time data.
  • the curve 102 is a curve of the gyroscope data processed by the path smoothing algorithm over time.
  • the function of the path smoothing algorithm is that when the viewing angle changes between frames, noise will not cause large changes in the picture that would affect viewing.
  • the realization of the EIS anti-shake algorithm depends on cropping the image, and the cropping ratio is generally preset based on factors such as the resolution of the complementary metal oxide semiconductor (CMOS) sensor, the processing capability of the image signal processor (ISP), and image texture preservation.
  • a fixed cropping ratio limits the amplitude of jitter that the EIS anti-shake algorithm can remove. For scenes with violent motion, an insufficient cropping area causes the pixel information required by the video output image to exceed the range of the video input image.
  • the existing EIS anti-shake algorithm adopts a solution to increase the cropping area.
  • the disadvantage of this solution is that it will result in reduced compensation motion amplitude and weakened anti-shake effect.
  • the image 201 is a video frame that is cropped with a smaller cropping ratio
  • the arrow 203 indicates the cropping ratio of the video frame in FIG. 2A. It can be seen from FIG. 2A that in this case the amplitude of jitter that the EIS anti-shake algorithm can correct is small.
  • as shown in FIG. 2B, the image 202 is a video frame that is cropped with a larger cropping ratio;
  • the arrow 204 indicates the cropping ratio of the video frame in FIG. 2B. It can be seen from FIG. 2B that in this case the amplitude of jitter that the EIS anti-shake algorithm can correct is relatively large. Therefore, as the cropping ratio of the image increases, the amplitude of jitter that can be corrected by the EIS anti-shake algorithm increases, making it better suited to scenes with severe motion.
  • the cropping ratio can be dynamically adjusted according to different motion amplitudes to adapt to different sports scenes.
  • the EIS algorithm uses a lower crop ratio.
  • when the movement amplitude increases, the terminal automatically enters the dynamic adjustment mode, automatically controls the cropping ratio, and better balances video stability, definition, and field of view (FOV).
  • it is a method of dynamically adjusting the cropping ratio.
  • the video frame metadata collected by the terminal is first time-stamp aligned, because the metadata of a video frame comes from different hardware devices whose collected data differ in time.
  • Some metadata comes from the gyroscope data that has undergone optical image stabilization (OIS), some metadata comes from the angular velocity and acceleration data of the inertial measurement unit (IMU), and some metadata comes from Video frames captured by the image capture unit.
  • there are multiple algorithms for aligning the time stamps of different metadata. The following uses a search algorithm as an example, but the alignment algorithm is not limited to it.
  • the time stamp and the trajectory are recorded discretely, and the optimization method cannot be used directly to obtain the time difference.
  • the trajectory can be transformed into a continuous curve through spline interpolation, so that approximately continuous quantities of time and pose are obtained, and then a nonlinear optimization method is used to obtain the time difference.
  • the search ends after scanning with 0.001 second as the step length; it is essentially a hierarchical search algorithm. To improve the search accuracy, a quadratic curve can also be fitted near the time difference that minimizes the absolute error, with the time difference as the independent variable and the absolute error as the dependent variable; the time difference at the minimum of the quadratic curve is taken as the final result.
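The hierarchical search with a final quadratic fit can be sketched as follows. Here `err` stands in for the absolute-error function between the two interpolated trajectories, and the coarse/fine step sizes mirror the 0.001 s step mentioned above; all names are illustrative.

```python
def quadratic_vertex(x, y):
    """Vertex of the parabola through three points, via divided differences:
    for y = a*x^2 + b*x + c the extremum sits at x = -b / (2a)."""
    d0 = (y[1] - y[0]) / (x[1] - x[0])
    d1 = (y[2] - y[1]) / (x[2] - x[1])
    a = (d1 - d0) / (x[2] - x[0])
    b = d0 - a * (x[0] + x[1])
    return -b / (2 * a)

def hierarchical_search(err, lo, hi, coarse=0.01, fine=0.001):
    """Coarse grid search, a fine search around the coarse minimum, then a
    quadratic fit around the fine minimum, as the text describes."""
    def grid_min(a, b, step):
        n = max(1, int(round((b - a) / step)))
        return min((a + k * step for k in range(n + 1)), key=err)
    t = grid_min(lo, hi, coarse)
    t = grid_min(t - coarse, t + coarse, fine)
    xs = [t - fine, t, t + fine]
    return quadratic_vertex(xs, [err(u) for u in xs])
```

For an error curve that is locally parabolic, the final fit recovers the minimum with sub-step accuracy.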
  • the accuracy of the time search depends on the fitting accuracy of the two trajectories, and the algorithm can be optimized using the idea of alternate optimization.
  • the video frame is time stamp aligned with the data collected by the OIS and IMU, that is, the data collected by the OIS and IMU is established in a one-to-one correspondence with the video frame according to the time stamp.
  • step S302 EIS processing is performed on the above-mentioned video frame according to the video frame aligned by the time stamp of the previous step.
  • the rotation vector of the terminal is calculated and recorded to obtain the motion path.
  • a vector rotating counterclockwise around point O on the plane with angular velocity of frequency ω is called the rotation vector.
  • step S303 path smoothing is performed on the motion path obtained in the previous step.
  • the following takes a Floyd-style path smoothing algorithm as an example, but the method is not limited to this algorithm.
  • first, the Floyd algorithm is used to remove adjacent collinear points in the motion path: traverse all the data points and check whether the directions of two adjacent vectors are the same. Then remove the redundant inflection points: traverse all the data points and remove the points between two points that can be passed directly. After these two steps, a smooth motion path is obtained.
  • step S304 the motion state of the terminal is identified.
  • in step S305, according to the motion state, the number of times the cropped picture goes out of bounds is counted. As shown in FIG. 7A, out-of-bounds means that after the cropped video frame undergoes a warping operation, part of the picture exceeds the original frame of the uncropped video frame.
  • in step S306, it is judged whether to adjust the cropping ratio according to the out-of-bounds count. If there is no need to adjust the cropping ratio, step S307 is entered.
  • in step S307, the H matrix corresponding to each video frame is calculated, and rolling shutter correction is performed.
  • the H matrix can be calculated by the Rodrigues formula, but the calculation is not limited to this method.
  • the rotation matrix is calculated by the Rodrigues formula, and then combined with the camera intrinsic matrix and the OIS translation, the homography matrix H corresponding to each pixel of the input image is calculated.
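The Rodrigues-based homography can be sketched with plain 3x3 arithmetic. The intrinsics tuple and the way the OIS lens shift is folded in as a principal-point offset are simplifying assumptions for illustration, not the patent's exact formulation.

```python
import math

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(axis, theta):
    """Rodrigues' formula: R = I + sin(t)*K + (1 - cos(t))*K^2,
    with K the skew-symmetric matrix of the unit rotation axis."""
    n = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / n for c in axis)
    k = [[0, -z, y], [z, 0, -x], [-y, x, 0]]
    k2 = matmul(k, k)
    s, v = math.sin(theta), 1 - math.cos(theta)
    return [[(i == j) + s * k[i][j] + v * k2[i][j] for j in range(3)]
            for i in range(3)]

def homography(intrinsics, rot, ois_shift=(0.0, 0.0)):
    """H = K' * R * K^-1 for a pure camera rotation; the OIS shift is
    modelled as a principal-point offset (an assumed simplification)."""
    fx, fy, cx, cy = intrinsics
    k_new = [[fx, 0, cx + ois_shift[0]], [0, fy, cy + ois_shift[1]], [0, 0, 1]]
    k_inv = [[1 / fx, 0, -cx / fx], [0, 1 / fy, -cy / fy], [0, 0, 1]]
    return matmul(k_new, matmul(rot, k_inv))
```

With zero rotation and no OIS shift the homography reduces to the identity, which is a quick sanity check.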
  • the terminal's CMOS sensor adopts a progressive scan method when shooting video;
  • when the terminal jitters at high frequency, that is, faster than the refresh time of one frame, the scene within a single frame appears distorted; this is called the rolling shutter effect.
  • the rolling shutter effect is corrected by first dividing the image into M strips, then smoothing the R matrix of the first strip between frames, and finally aligning the R matrices of the other strips in the video frame to the first strip.
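The strip-alignment idea can be illustrated with a scalar 1-D model: rows are exposed top to bottom, so each strip sees the camera at a slightly different angle, and aligning every strip to the first strip's angle removes the intra-frame skew. Treating the rotation as a single angle (instead of an R matrix) is a deliberate simplification.

```python
def strip_corrections(angle_start, angle_end, m_strips):
    """Per-strip rotation correction for rolling shutter: strip i is exposed
    at an interpolated camera angle; returning angles[0] - angles[i] aligns
    all strips to the first one (scalar sketch of the M-strip scheme)."""
    angles = [angle_start + (angle_end - angle_start) * i / (m_strips - 1)
              for i in range(m_strips)]
    return [angles[0] - a for a in angles]
```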
  • step S314 the processing of the H matrix is completed.
  • step S308 the display interface of the terminal prompts the user to enter the dynamic adjustment anti-shake mode, and the specific situation is described in detail below.
  • step S309 the user chooses whether to enter the dynamic adjustment anti-shake mode, and if it chooses to enter the dynamic adjustment anti-shake mode, proceed to the next step. If the user chooses not to enter the dynamic adjustment anti-shake mode, step S307 is entered.
  • step S310 the initial frame is determined, and the cropping ratio of the initial frame is calculated.
  • step S311 the trend of the movement of the terminal is calculated, and the cropping ratio of the frames other than the initial frame that has been collected is determined according to the trend of the movement.
  • step S312 a motion path smoothing is performed on the image cropped in step S305.
  • in step S313, crop boundary protection is performed on each video frame processed in step S312 to avoid unretrievable pixels in the image; then go to step S307 to calculate the H matrix of the re-cropped image and perform rolling shutter correction. Finally, proceed to step S314 to complete the image warping and output the image.
  • the image warping described above can be an affine transformation of the H matrix.
  • the affine transformation is a linear transformation from two-dimensional coordinates (x, y) to two-dimensional coordinates (u, v).
  • the characteristics of the affine transformation are: a straight line is still a straight line after the transformation; the relative position relationship between straight lines remains unchanged, parallel lines stay parallel, and the order of points on a straight line does not change; three pairs of non-collinear corresponding points determine a unique affine transformation. That is, a two-dimensional coordinate is multiplied by a matrix, and the eigenvectors of the matrix determine the direction of the image transformation.
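The "three pairs of non-collinear points determine a unique affine transformation" property can be verified numerically: six unknowns, six equations, solved here with Cramer's rule. All names are illustrative.

```python
def solve3(a, b):
    """Solve a 3x3 linear system via Cramer's rule."""
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det3(a)
    def replaced(col):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][col] = b[i]
        return m
    return [det3(replaced(c)) / d for c in range(3)]

def affine_from_points(src, dst):
    """Unique 2x3 affine matrix mapping three non-collinear source points
    to three destination points."""
    a = [[x, y, 1.0] for x, y in src]
    row_u = solve3(a, [u for u, v in dst])
    row_v = solve3(a, [v for u, v in dst])
    return [row_u, row_v]

def affine_apply(m, p):
    """Apply a 2x3 affine matrix to point p = (x, y) -> (u, v)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

The test also checks the collinearity-preservation property the text states: three collinear inputs map to three collinear outputs.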
  • some steps are not necessary, and may not be required, such as step S303, step S308, step S309, step S312, and step S313.
  • the terminal may perform motion state recognition based on the IMU.
  • based on the angular velocity and acceleration data collected by the IMU and the data collected by the gyroscope, the terminal performs root-mean-square and absolute-integral calculations on these data.
  • the data can also be fed into a pre-trained machine learning model to obtain the processed data information.
  • the terminal classifies and counts the above information, and can identify different motion intensities or motion types.
  • the motion state is divided into several levels according to motion intensity.
  • at level 0, the terminal is in a static state. A motion state counter is set up to count the motion states.
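A minimal sketch of intensity levelling from one of the statistics mentioned above (the root mean square of gyroscope angular velocity). The level thresholds are illustrative assumptions, not values from the patent.

```python
import math

def motion_level(gyro_samples, thresholds=(0.05, 0.5, 2.0)):
    """Classify motion intensity from gyroscope angular-velocity samples
    (rad/s) by their RMS. Level 0 = static; higher levels = stronger
    motion. Threshold values are assumed for illustration."""
    rms = math.sqrt(sum(w * w for w in gyro_samples) / len(gyro_samples))
    level = 0
    for t in thresholds:
        if rms > t:
            level += 1
    return level
```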
  • FIG. 4B shows an example of different motion states.
  • the vertical axis is the gyroscope data;
  • the horizontal axis is time. It can be seen from the figure that the left part shows lower motion intensity and is identified as walking, while the right part shows stronger motion and is identified as running. As shown in FIG. 4C, there are transitions between different motion states: step S401 maintains the same motion state, step S402 switches from a low-intensity motion state to a high-intensity one, and step S403 switches from a high-intensity motion state to a low-intensity one.
  • the terminal pops up a prompt on the display screen asking the user whether to enter the dynamic anti-shake adjustment mode, that is, the above step S308. If the user chooses to enter this mode, the cropping ratio needs to be re-determined, and path smoothing is performed for each video frame whose cropping ratio has been re-determined, keeping the video display smooth and free of violent shaking.
  • the terminal needs to adjust the cropping ratio according to the change of the motion state.
  • regarding the method steps for adjusting the cropping ratio of the video frame: in step S500, the terminal aligns the time stamps of the IMU data with the video frames.
  • in step S501, the current motion state of the terminal is determined.
  • in step S511, if the motion intensity of the terminal remains unchanged, that is, the above step S401, then step S512 is entered.
  • in step S512, the cropping ratio of the video frame is kept unchanged.
  • in step S521, if the motion intensity of the terminal increases, that is, the above step S402, proceed to step S522.
  • in step S522, the initial number of out-of-bounds events of the cropped images is counted.
  • out-of-bounds means that after the cropped image undergoes anti-shake correction, that is, after image warping, the cropped image exceeds the boundary of the current display interface; in other words, the currently displayed interface retrieves undefined pixels.
  • in step S523, it is determined whether the initial out-of-bounds count of the cropped images is greater than a first threshold.
  • in step S524, if the initial out-of-bounds count is not greater than the first threshold, the initial cropping ratio is maintained.
  • in step S525, if the initial out-of-bounds count is greater than the first threshold, the cropping ratio is increased.
  • in step S531, if the motion intensity decreases, that is, the above step S403, go to step S532.
  • in step S532, the initial cropping ratio is first reduced.
  • in step S533, the number of out-of-bounds events after the cropping ratio is reduced is counted.
  • in step S534, it is determined whether the out-of-bounds count after the reduction is less than a second threshold.
  • in step S535, if the out-of-bounds count after the reduction is not less than the second threshold, the current cropping ratio is maintained.
  • in step S536, if the out-of-bounds count after the reduction is less than the second threshold, the reduced cropping ratio is kept.
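Steps S511 to S536 amount to a small decision procedure. The thresholds, the step size, and the `count_oob` callback below are illustrative stand-ins, since the patent does not fix concrete values.

```python
FIRST_THRESHOLD = 3   # assumed value
SECOND_THRESHOLD = 2  # assumed value
STEP = 0.05           # assumed crop-ratio increment

def update_crop_ratio(ratio, transition, count_oob):
    """One pass of the flow: `transition` is 'same' (S401), 'up' (S402) or
    'down' (S403); `count_oob(r)` counts out-of-bounds frames at ratio r."""
    if transition == 'same':                       # S511/S512: keep ratio
        return ratio
    if transition == 'up':                         # S521-S525
        if count_oob(ratio) > FIRST_THRESHOLD:
            return ratio + STEP                    # S525: crop more
        return ratio                               # S524: keep ratio
    trial = ratio - STEP                           # S531/S532: try cropping less
    if count_oob(trial) < SECOND_THRESHOLD:        # S533/S534
        return trial                               # S536: keep the reduction
    return ratio                                   # S535: revert to current ratio
```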
  • the display interface of the terminal may display to prompt the user whether to enter the dynamic anti-shake mode.
  • a prompt of "dynamic anti-shake" on/off will pop up on the video recording interface.
  • the interface 601 is a recording screen of the terminal for video recording, and the interface 601 displays an output image processed by the EIS anti-shake algorithm.
  • the area 602 of the interface 601 automatically pops up a prompt whether to enable "dynamic anti-shake".
  • the area 602 can be located at any position of the interface 601, which is not limited in this application.
  • the area 602 is located in the lower middle area of the interface 601.
  • the text part of the prompt can be located anywhere in the area 602, which is not limited in this application; in this embodiment, the text part of the prompt is located at the top of the area 602.
  • the text part of the prompt may be various words that have a similar meaning to "dynamic anti-shake". This application does not limit the type of language and the way of expression. This embodiment uses the combination of Chinese and English "dynamic anti-shake" as an example for description.
  • Below the text prompt there are prompts for opening and closing. This application does not limit the language types and expressions of opening and closing, as long as they can express the same meaning.
  • the opening and closing prompts can use any font size that does not exceed the area 602, and this application is not limited.
  • the opening and closing prompt font size is smaller than the text prompt part as an example.
  • the user can perform a touch or click operation.
  • the user can touch the text prompt part in the area 602, and can also touch to turn on and off the prompt part.
  • This application is not limited.
  • the user touches any part of the area 602 as an example.
  • the on and off parts of the prompt switch states. When the function is on, the "on" part is displayed in a different color from the "off" part; when it is switched off, the two display colors are swapped. It is understandable that the user can also directly touch the on and off parts: if the user taps the "on" part, the "dynamic anti-shake" functions are turned on, and if the user taps the "off" part, the "dynamic anti-shake" functions are turned off.
  • the terminal can iteratively find a suitable cropping ratio according to a certain step size.
  • the cropping ratio is increased.
  • the terminal predicts the movement trend based on the data collected by the IMU and calculates the current motion intensity of the terminal.
  • the ultimate goal of re-determining the cropping ratio is to find, within the buffer-length window and according to the movement trend and motion intensity, the largest segment that can cause the cropped frame to go out of bounds.
  • the terminal can increase the cropping ratio in steps, where stepping refers to increasing the cropping ratio at regular increments.
  • the terminal calculates the position of the boundary points of the output image in the input image according to the rotation vector of each video frame. From the calculation result it can determine which video frames are out of bounds and count the number of out-of-bounds video frames. By calculating the positions of the boundary points of the output image in the input image, it is also possible to calculate the out-of-bounds size of each video frame and the number of adjustments required to bring the out-of-bounds part of each frame back into the display area.
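The out-of-bounds bookkeeping above can be sketched in one dimension: cropping leaves a margin on each side of the output window, and a frame goes out of bounds when its stabilizing shift exceeds that margin. The 1-D shift model (instead of projecting boundary points through per-frame rotations) is a deliberate simplification.

```python
def count_out_of_bounds(shifts, width, crop_ratio):
    """Cropping by `crop_ratio` leaves a margin of width * crop_ratio / 2
    per side; a frame is out of bounds when |shift| exceeds it. Returns the
    count plus each offending frame's overshoot, the per-frame feedback the
    text mentions."""
    margin = width * crop_ratio / 2
    overshoot = {i: abs(s) - margin
                 for i, s in enumerate(shifts) if abs(s) > margin}
    return len(overshoot), overshoot
```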
  • the terminal stops stepping to adjust the cropping ratio.
  • when the cropping ratio is increased, although the FOV is reduced, fewer out-of-bounds situations are triggered (crop more, FOV shrinks). If increasing the cropping ratio still cannot reduce the out-of-bounds situations, the smoothness of the path needs to be reduced, as shown in Figure 8A and Figure 8B.
  • if the path smoothness has been reduced to a certain condition, such as below a preset threshold, and the out-of-bounds count still cannot meet the requirement, the cropping ratio is increased again, and the above adjustments are performed alternately and iteratively.
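The iterative trade-off between cropping ratio and path smoothness can be sketched as a loop: grow the crop first, then relax smoothing once the crop is capped, until the out-of-bounds count is acceptable. All constants are illustrative, and `count_oob(crop, smooth)` stands in for whatever counter the pipeline supplies.

```python
def fit_crop_and_smoothing(crop, smooth, count_oob, max_oob,
                           crop_step=0.02, smooth_step=0.1,
                           crop_max=0.5, smooth_min=0.2):
    """Trade cropping ratio against path smoothness until the out-of-bounds
    count is acceptable (sketch of the iterative scheme; assumed constants)."""
    while count_oob(crop, smooth) > max_oob:
        if crop < crop_max:
            crop = min(crop + crop_step, crop_max)
        elif smooth > smooth_min:
            smooth = max(smooth - smooth_step, smooth_min)
        else:
            break  # nothing left to trade away
    return crop, smooth
```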
  • the out-of-bounds size and the number of adjustments each video frame needs to return to the display area can be used as feedback information and added to the iteration to adaptively speed up the iterative process until a cropping ratio that meets the conditions is found.
  • the terminal may determine whether to adjust the cropping ratio according to the movement trend. As shown in FIG. 9, in step S900, the terminal determines whether to trigger the trimming ratio adjustment. If the trimming scale adjustment is triggered, step S910 is entered, where the input image and the output image of the video frame match the trimming dynamic model. In step S911, the cropping ratio of each video frame is calculated according to the cropping ratio dynamic model. In step S912, path smoothing and boundary maintaining processing is performed on the output map processed by the new cropping scale, that is, the iterative process in the above-mentioned embodiment is performed. In step S913, image warp is performed on the video frame cropped by the cropping ratio determined by the iterative process.
  • In step S900, if the terminal determines that the cropping ratio adjustment has not been triggered, it proceeds to step S920 to maintain the current cropping ratio.
  • In step S921, path smoothing and boundary-maintaining processing are performed on the video frame that maintains the current cropping ratio.
  • In step S922, image warping (warp) is performed on the processed video frame.
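The branch of FIG. 9 (adjust the cropping ratio, or keep it and go straight to smoothing and warping) can be summarized as follows. The trigger condition and the linear intensity-to-ratio mapping are hypothetical stand-ins for the cropping-ratio dynamic model, which the source does not spell out.

```python
from dataclasses import dataclass

@dataclass
class StabState:
    ratio: float = 0.2        # current crop ratio (S920 keeps this)
    intensity: float = 0.0    # last observed motion intensity
    threshold: float = 1.0    # intensity change that triggers S910

def choose_crop_ratio(state: StabState, new_intensity: float) -> float:
    """S900: decide whether a crop-ratio adjustment is triggered."""
    if abs(new_intensity - state.intensity) > state.threshold:
        # S910/S911: stand-in for fitting the dynamic model and
        # computing the new per-frame crop ratio from it
        state.ratio = min(0.5, 0.2 + 0.05 * new_intensity)
    state.intensity = new_intensity
    # S912/S913 (or S921/S922): path smoothing, boundary maintenance,
    # and image warping would follow with the returned ratio
    return state.ratio
```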
  • 2N+1 frames are stored in the buffer of the current terminal, where N is the distance from the current frame to the latest frame.
  • Frame No. 1 in Figure 10A is the current frame; starting from frame 2N+1, the image warping begins.
  • the abscissa in the figure is time, and the ordinate is the number of times the boundary has been triggered.
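The 2N+1-frame buffer can be pictured as a sliding window in which the middle frame is only warped once N future frames have arrived; the structure below is an assumption consistent with the stated buffer size, not taken from the source.

```python
from collections import deque

class FrameBuffer:
    """Hold 2N+1 frames so the middle frame is smoothed with N past
    and N future frames before warping."""
    def __init__(self, n: int):
        self.n = n
        self.frames = deque(maxlen=2 * n + 1)

    def push(self, frame):
        """Add a frame; return the frame now ready for warping, if any."""
        self.frames.append(frame)
        if len(self.frames) == 2 * self.n + 1:
            return self.frames[self.n]   # current frame, N from each end
        return None                      # still filling the look-ahead
```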
  • the curve 101 in the second quadrant of FIG. 10B represents a situation where the cropped video frame triggers an out-of-bounds when the cropping ratio is 20%.
  • the curve 102 in the first quadrant of FIG. 10B represents a situation where the cropped video frame triggers an out-of-bounds when the cropping ratio is adjusted to 30%.
  • The curve 103 in FIG. 10B is the change curve of the video frame after path smoothing. Furthermore, as shown in FIG. 11, the terminal adjusts the speed at which the cropping ratio changes according to the rate of change of the motion intensity, so that the change of the viewing angle is smoother.
  • The curve in FIG. 11 can be used to indicate the change of the viewing angle, but this application is not limited thereto; straight lines, line segments, and piecewise functions can also be used for direct mapping. Frequent changes of the viewing angle will cause discomfort to the user, and the following strategies can be used to limit them. For example, during video shooting, only one-way adjustment of the angle of view is allowed; the cropping ratio is not allowed to increase and then decrease, or to decrease and then increase. For another example, during video shooting, only a limited number of viewing-angle adjustments are allowed, such as N times, where N is a preset limited number.
  • the angle of view is allowed to be adjusted.
  • The above-mentioned adjustment of the viewing angle refers to adjusting the cropping ratio that determines the viewing angle.
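The two limiting strategies above (one-way adjustment only, and at most N adjustments per recording) can be enforced with a small gate; the class below and its default change budget are illustrative.

```python
class FovPolicy:
    """Limit viewing-angle (crop ratio) changes during one recording:
    one direction only, and at most max_changes adjustments. Both
    strategies come from the source; the numbers are illustrative."""
    def __init__(self, max_changes: int = 3):
        self.max_changes = max_changes
        self.changes = 0
        self.direction = 0    # 0: none yet, +1: only increases, -1: only decreases

    def allow(self, old_ratio: float, new_ratio: float) -> bool:
        if new_ratio == old_ratio:
            return True                      # not a change
        if self.changes >= self.max_changes:
            return False                     # change budget used up
        step = 1 if new_ratio > old_ratio else -1
        if self.direction == 0:
            self.direction = step            # first change fixes the direction
        if step != self.direction:
            return False                     # no increase-then-decrease
        self.changes += 1
        return True
```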
  • Optionally, the mapping relationship between motion intensity and the re-determined cropping ratio can be established according to a deep learning model.
  • The terminal collects a certain number of video frames, segments the video per unit of time, collects IMU and gyroscope data, performs root-mean-square and absolute-value integration on these data, or feeds these data into the training model of machine learning.
  • The terminal motion intensity in each video segment is classified and counted according to the data collected by the above-mentioned sensors.
  • In step S1203, for the above-mentioned set of video clips, the number of out-of-bounds occurrences triggered when each video clip is cropped with different cropping ratios is counted and recorded.
  • In step S1204, it is determined, according to the cropping ratios used by the different recorded video clips, whether the number of triggered out-of-bounds occurrences is less than a preset value, such as a threshold X. If it is not less than the preset value, the process returns to step S1203 to continue counting. If it is less than the preset value, step S1205 is entered. In step S1205, a neural network or machine learning model is trained on the collected sensor data to obtain the correspondence between motion intensity and the optimal cropping ratio.
  • In step S1206, when it is detected for the first time that the cropping ratio needs to be changed according to the motion state or motion intensity, the terminal automatically prompts the user whether to enable the dynamic FOV adjustment mode, as specifically shown in FIG. 6. If the user chooses to turn on the dynamic FOV adjustment mode, the cropping ratio is dynamically adjusted according to the results calculated by the deep learning model.
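Steps S1201 and S1202 reduce each video segment's sensor samples to two statistics (root mean square and absolute-value integral) and an intensity class. A sketch, with an assumed sampling interval and illustrative class thresholds:

```python
import math

def motion_features(gyro, dt=0.005):
    """Root-mean-square and absolute-value integral of angular-rate
    samples over one video segment -- the two statistics the source
    feeds to the training model. gyro is a list of rad/s samples;
    dt is the (assumed) sampling interval in seconds."""
    rms = math.sqrt(sum(w * w for w in gyro) / len(gyro))
    abs_integral = sum(abs(w) for w in gyro) * dt
    return rms, abs_integral

def intensity_class(rms, thresholds=(0.3, 1.0)):
    """Classify segment motion intensity (illustrative thresholds)."""
    if rms < thresholds[0]:
        return "low"
    if rms < thresholds[1]:
        return "medium"
    return "high"
```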
  • the terminal may be an electronic device.
  • FIG. 13A shows a schematic structural diagram of the electronic device 100.
  • The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
  • The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and so on.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than those shown in the figure, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • The processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the different processing units may be independent devices or integrated in one or more processors.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching instructions and executing instructions.
  • a memory may also be provided in the processor 110 to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • the memory can store instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
  • the processor 110 may include one or more interfaces.
  • The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, and so on.
  • the I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 110 may include multiple sets of I2C buses.
  • the processor 110 may be coupled to the touch sensor 180K, charger, flash, camera 193, etc., respectively through different I2C bus interfaces.
  • the processor 110 may couple the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to implement the touch function of the electronic device 100.
  • Both I2S interface and PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • the MIPI interface can be used to connect the processor 110 with the display screen 194, the camera 193 and other peripheral devices.
  • the MIPI interface includes a camera serial interface (camera serial interface, CSI), a display serial interface (display serial interface, DSI), and so on.
  • the processor 110 and the camera 193 communicate through a CSI interface to implement the shooting function of the electronic device 100.
  • the processor 110 and the display screen 194 communicate through a DSI interface to realize the display function of the electronic device 100.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the interface connection relationship between the modules illustrated in the embodiment of the present invention is merely a schematic description, and does not constitute a structural limitation of the electronic device 100.
  • the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor, which is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations and is used for graphics rendering.
  • the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, and the like.
  • the display screen 194 includes a display panel.
  • The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Miniled, a MicroLed, a Micro-oLed, a quantum dot light-emitting diode (QLED), and so on.
  • the electronic device 100 may include one or N display screens 194, and N is a positive integer greater than one.
  • the electronic device 100 can implement a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
  • the ISP is used to process the data fed back from the camera 193. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing and is converted into an image visible to the naked eye.
  • ISP can also optimize the image noise, brightness, and skin color. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 193.
  • the camera 193 is used to capture still images or videos.
  • the object generates an optical image through the lens and is projected to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 100 may include one or N cameras 193, and N is a positive integer greater than one.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects the frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • NPU is a neural-network (NN) computing processor.
  • Through the NPU, applications such as intelligent cognition of the electronic device 100 can be realized, for example image recognition, face recognition, voice recognition, text understanding, and so on.
  • the external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.
  • the internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program (such as a sound playback function, an image playback function, etc.) required by at least one function, and the like.
  • the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by running instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 180A may be provided on the display screen 194.
  • the capacitive pressure sensor may include at least two parallel plates with conductive materials.
  • the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations that act on the same touch position but have different touch operation strengths may correspond to different operation instructions. For example: when a touch operation whose intensity of the touch operation is less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
  • the gyro sensor 180B may be used to determine the movement posture of the electronic device 100.
  • In some embodiments, the angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) can be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the shake angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 100 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and applied to applications such as switching between landscape and portrait modes, pedometers, and so on.
  • the electronic device 100 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure the distance to achieve fast focusing.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light to the outside through the light emitting diode.
  • the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100.
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • The proximity light sensor 180G can also be used in leather-case mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense the brightness of the ambient light.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application locks, fingerprint photographs, fingerprint answering calls, and so on.
  • The touch sensor 180K is also called a "touch device".
  • the touch sensor 180K may be disposed on the display screen 194, and the touch screen is composed of the touch sensor 180K and the display screen 194, which is also called a “touch screen”.
  • the touch sensor 180K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100, which is different from the position of the display screen 194.
  • the button 190 includes a power-on button, a volume button, and so on.
  • the button 190 may be a mechanical button. It can also be a touch button.
  • the electronic device 100 may receive key input, and generate key signal input related to user settings and function control of the electronic device 100.
  • the indicator 192 may be an indicator light, which may be used to indicate the charging status, power change, or to indicate messages, missed calls, notifications, and so on.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present invention takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
  • FIG. 2 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present invention.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Communication between layers is through software interface.
  • the Android system is divided into four layers, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, etc.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and so on.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include videos, images, audios, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, and so on.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication function of the electronic device 100. For example, the management of the call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and it can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, and so on.
  • the notification manager can also be a notification that appears in the status bar at the top of the system in the form of a chart or a scroll bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window. For example, text messages are prompted in the status bar, prompt sounds, electronic devices vibrate, and indicator lights flash.
  • Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
  • the application layer and application framework layer run in a virtual machine.
  • the virtual machine executes the java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • the system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (for example: OpenGL ES), 2D graphics engine (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support multiple audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis, and layer processing.
  • the 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
  • When the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes the touch operation into the original input event (including touch coordinates, time stamp of the touch operation, etc.).
  • the original input events are stored in the kernel layer.
  • the application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation as a touch click operation, and the control corresponding to the click operation is the control of the camera application icon as an example, the camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer.
  • the camera 193 captures still images or videos.

Abstract

Provided in the present application are a video processing method and a mobile terminal, which can dynamically adjust the EIS video frame cropping ratio according to changes in motion intensity, thereby improving the stability of the picture during video capture and the user's video recording experience. The mobile terminal provided by the present application comprises a camera, a display screen, a motion sensor, and a processor. The camera is used to record a video; the display screen is used to display a video recording interface; the motion sensor is used to continuously collect motion data while the camera is recording the video; whether the motion intensity changes is determined according to the motion data; if the motion intensity changes, the video recording interface displays a prompt to start dynamic anti-shake; the display screen receives a touch operation of tapping the prompt; and the processor performs dynamic anti-shake.

Description

Video processing method and mobile terminal
This application claims priority to the Chinese patent application No. 202010345609.8, filed with the China National Intellectual Property Administration on April 27, 2020 and entitled "A video processing method and mobile terminal", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of video technology, and in particular to a video processing method and a mobile terminal.
Background
When a video shooter holds a terminal device to shoot video, the device shakes because the shooter's grip is unsteady or the shooter is moving while filming. An electronic image stabilization (EIS) algorithm uses motion information such as gyroscope data to calculate the motion vector of each pixel during video frame imaging, establishes through a homography matrix the mapping between pixels of the input image and pixels of the stabilized output image, and smooths the motion path.
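The two operations named above — the homography that maps input-frame pixels to stabilized output pixels, and the smoothing of the motion path — can be illustrated generically. This is textbook EIS machinery under simple assumptions (a row-major 3×3 matrix, moving-average smoothing), not the patent's specific computation.

```python
def apply_homography(h, x, y):
    """Map input pixel (x, y) to the output frame with a 3x3
    homography h given as a row-major list of lists."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return xh / w, yh / w

def smooth_path(angles, radius=2):
    """Moving-average smoothing of a one-axis camera rotation path,
    a simple stand-in for the path-smoothing step."""
    out = []
    for i in range(len(angles)):
        lo, hi = max(0, i - radius), min(len(angles), i + radius + 1)
        out.append(sum(angles[lo:hi]) / (hi - lo))
    return out
```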
The realization of EIS relies on cropping the image. In existing EIS algorithms, however, the cropping ratio is fixed, so the amplitude of shake that EIS can remove is limited. For large-motion scenes, an insufficient cropping area means that the pixel information required by the output image exceeds the range of the input image.
Summary
The embodiments of the present application provide a video processing method and a mobile terminal, which can dynamically adjust the EIS video frame cropping ratio according to changes in motion intensity, improving the stability of the picture during video shooting and enhancing the user's video recording experience.
To achieve the above objective, the embodiments of this application adopt the following solutions:
In one aspect, an embodiment of the present application provides a mobile terminal, which includes a camera, a display screen, a motion sensor, and a processor.
The camera is used to record video; the display screen is used to display a video recording interface.
The motion sensor is used to continuously collect motion data while the camera is recording video.
Whether the motion intensity changes is determined according to the motion data.
If the motion intensity changes, the video recording interface is used to display a prompt to start dynamic anti-shake.
The display screen is used to receive a touch operation of tapping the prompt, and the processor is used to perform dynamic anti-shake.
In this solution, when the user records video in a scene with large motion, the mobile terminal can pop up a prompt asking whether to perform dynamic anti-shake, that is, whether to enable dynamic adjustment of the EIS cropping ratio. This gives the user the choice of whether to use the dynamic anti-shake function, because some users may not like the cropping ratio changing continuously.
In one possible design, the motion sensor includes an inertial sensor, an acceleration sensor, and a gyroscope.
In another possible design, the prompt to enable dynamic anti-shake includes a description part and a switch control part.
In another possible design, the camera generates video frames while recording video; the motion data and the video frames are timestamp-aligned, where timestamp alignment establishes a time-based correspondence between the motion data and the video frames; electronic image stabilization is applied to the video frames, where electronic image stabilization crops each video frame and then warps the cropped frame; the rotation vector of each video frame is calculated; path smoothing is performed according to the motion data, where path smoothing optimizes the curve formed by the motion data; the motion state of the mobile terminal is determined; the number of times the warped video frames go out of bounds is counted; if the out-of-bounds count is greater than a first threshold, the cropping ratio is increased; if the out-of-bounds count is less than or equal to the first threshold, the cropping ratio is maintained.
In another possible design, going out of bounds means that some pixels of the video frame before warping are undefined in the video frame after warping.
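As an illustrative aid only (not part of the claimed subject matter), the out-of-bounds check in the designs above can be sketched in Python as follows. The corner-only test, the data layout, and all function names are assumptions introduced here for illustration:

```python
# Hypothetical sketch of the out-of-bounds ("hit boundary") check: after
# stabilization warps the cropped frame, every output pixel must map to a
# defined pixel of the original frame. For a projective warp of a convex
# region it suffices to test the four corners of the crop rectangle.
def apply_homography(h, x, y):
    """Map point (x, y) through a 3x3 homography given as a nested list."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w

def frame_out_of_bounds(h, crop_rect, frame_w, frame_h):
    """True if any corner of the cropped output maps outside the input frame."""
    x0, y0, cw, ch = crop_rect
    corners = [(x0, y0), (x0 + cw, y0), (x0, y0 + ch), (x0 + cw, y0 + ch)]
    for x, y in corners:
        u, v = apply_homography(h, x, y)
        if not (0.0 <= u <= frame_w and 0.0 <= v <= frame_h):
            return True
    return False

def count_out_of_bounds(per_frame_h, crop_rect, frame_w, frame_h):
    """Out-of-bounds count over a buffer of per-frame homographies."""
    return sum(frame_out_of_bounds(h, crop_rect, frame_w, frame_h)
               for h in per_frame_h)
```

The resulting count is what the design compares against the first threshold to decide whether to increase the cropping ratio.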
In another aspect, this embodiment provides a video processing method, including: a camera of a mobile terminal collects video frames;
a motion sensor of the mobile terminal collects motion data;
a processor of the mobile terminal timestamp-aligns the video frames with the motion data;
the processor performs electronic image stabilization on the video frames, where electronic image stabilization crops each video frame and warps the cropped frame;
the rotation vector of each video frame is calculated according to the motion data;
the processor recognizes the motion state of the mobile terminal;
the processor crops the video frames and counts how many times the cropped video frames go out of bounds;
the processor determines whether to adjust the cropping ratio according to the out-of-bounds count;
if the out-of-bounds count is less than or equal to a first threshold, the cropping ratio is maintained, the processor calculates the H matrix corresponding to each video frame, and image warping is performed according to the H matrix;
if the out-of-bounds count is greater than the first threshold, the processor calculates a new cropping ratio for the video frames and generates an initial video frame at the new cropping ratio;
the processor determines the cropping ratio of each video frame according to the change in motion intensity of the mobile terminal;
the processor calculates the H matrix corresponding to each video frame;
and image warping is performed according to the H matrix.
In one possible design, the video frames collected by the camera of the mobile terminal are stored in a buffer in memory.
In another possible design, the motion data includes the acceleration and angular velocity of the mobile terminal.
In another possible design, timestamp alignment means that the processor uses spline interpolation to turn the discrete motion data into continuous curves; the processor performs nonlinear optimization on the continuous curves to obtain the time difference between different curves; the processor executes the nonlinear optimization in a loop, and the loop ends when the time difference meets a specified condition.
In another possible design, the processor performs path smoothing on the video frames according to the rotation vector.
In another possible design, motion path smoothing means that the processor computes the vector between every two adjacent data points in the motion data and traverses all data points; the processor removes one of any two adjacent points whose vectors are identical; the processor removes inflection points in the data curve formed by the motion data; and the processor removes the data points lying between any two points that can be connected directly.
In another possible design, going out of bounds means that some pixels of the video frame before cropping are undefined in the video frame after cropping.
In another possible design, if the out-of-bounds count is greater than the first threshold, the display screen of the mobile terminal shows an interface prompting the user to enable dynamic anti-shake; the display screen receives the user's touch operation and enables dynamic anti-shake.
In another possible design, dynamic anti-shake means that the processor adjusts the cropping ratio according to changes in motion intensity.
In yet another aspect, this embodiment provides a computer-readable storage medium including instructions that, when run on a mobile terminal, cause the mobile terminal to perform the above video processing method.
Description of the drawings
Figure 1 is a schematic diagram of gyroscope data processed by a path smoothing algorithm;
Figure 2A is a schematic diagram of a video cropping ratio in a shaking scene;
Figure 2B is a schematic diagram of another video cropping ratio in a shaking scene;
Figure 3 is a schematic diagram of the steps of a video processing method;
Figure 4A is a schematic diagram of motion states determined by a motion sensor;
Figure 4B is a schematic diagram of gyroscope data in different motion states;
Figure 4C is a schematic diagram of transitions between different motion states;
Figure 5 is a schematic diagram of the steps of a method for adjusting the video cropping ratio;
Figure 6 is a schematic diagram of a UI interface for enabling dynamic FOV;
Figure 7A is a schematic diagram of handling video shake;
Figure 7B is another schematic diagram of handling video shake;
Figure 8A is a schematic diagram of gyroscope data without path smoothing;
Figure 8B is a schematic diagram of gyroscope data after path smoothing;
Figure 9 is a schematic diagram of the steps of a method for determining changes in the cropping ratio according to the motion trend;
Figure 10A is a schematic diagram of video frames collected by a terminal device;
Figure 10B is a schematic diagram showing changes in motion intensity and cropping ratio;
Figure 11 is another schematic diagram showing changes in motion intensity and cropping ratio;
Figure 12 is a schematic diagram of the steps of a method for adjusting the video cropping ratio;
Figure 13A is a schematic diagram of the hardware structure of an electronic device;
Figure 13B is a schematic diagram of the structure of an application framework.
Detailed description of the embodiments
The terms used in the following embodiments are only for the purpose of describing specific embodiments and are not intended to limit this application. As used in the specification and appended claims of this application, the singular forms "a", "an", "said", "the above", "the", and "this" are intended to also include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of this application, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
References in this specification to "one embodiment" or "some embodiments" and the like mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "including", "comprising", "having", and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
The embodiments of this application are described below with reference to the accompanying drawings.
In the process of calculating motion vectors, existing EIS algorithms need to consider the smoothness of the view-angle transformation between frames within a time window; that is, the motion vector sequence and the motion path must remain reasonably smooth. As shown in Figure 1, curve 101 shows, over time, the data collected by the terminal's gyroscope while the videographer holds the terminal to shoot video; the vertical axis represents gyroscope data and the horizontal axis represents time. Curve 102 shows the gyroscope data over time after processing by a path smoothing algorithm. The role of the path smoothing algorithm is to prevent noise from causing large picture changes that disturb viewing when the view angle changes between frames.
The EIS anti-shake algorithm relies on cropping the image, and the cropping ratio is generally preset according to factors such as the complementary metal oxide semiconductor (CMOS) sensor resolution, the image signal processor (ISP) processing capability, and image texture preservation. With a fixed cropping ratio, the shake amplitude that the EIS algorithm can remove is limited by that ratio. For scenes with violent motion, an insufficient crop area causes the pixel information required by the video output image to exceed the range of the video input image. Therefore, to avoid indexing non-existent pixel positions, existing EIS anti-shake algorithms adopt the scheme of enlarging the cropped area; the drawback of this scheme is a reduced compensable motion amplitude and a weakened anti-shake effect. As shown in Figure 2A, image 201 is a video frame cropped with a smaller cropping ratio, and arrow 203 indicates the proportion of the frame to be cropped; in this case, the shake amplitude the EIS algorithm can correct is small. As shown in Figure 2B, image 202 is a video frame cropped with a larger cropping ratio, and arrow 204 indicates the proportion to be cropped; in this case, the correctable shake amplitude is larger. Thus, as the cropping ratio increases, the shake amplitude the EIS anti-shake algorithm can correct increases, making it better suited to scenes with intense motion.
In an embodiment of this application, the cropping ratio can be adjusted dynamically according to the motion amplitude to suit different motion scenes. Under small motion, the EIS algorithm uses a lower cropping ratio; when the motion amplitude increases, it automatically enters a dynamic adjustment mode and controls the cropping ratio automatically, better balancing video stability, sharpness, and field of view (FOV). Figure 3 shows the steps of a method for dynamically adjusting the cropping ratio. In step S301, the video frame metadata collected by the terminal is first timestamp-aligned. The metadata of a video frame comes from different hardware devices whose collected data differ in time: some metadata comes from gyroscope data that has passed through optical image stabilization (OIS), some from the angular velocity and acceleration data of the inertial measurement unit (IMU), and some from the video frames captured by the image capture unit. There are multiple algorithms for aligning the timestamps of different metadata.
The following takes a search algorithm as an example, but the timestamp alignment algorithm is not limited to it. The timestamps and trajectories are recorded discretely, so an optimization method cannot be used directly to obtain the time difference; the trajectories can be turned into continuous curves via spline interpolation, giving approximately continuous time and pose, after which the time difference is found by nonlinear optimization. First, assume the time difference between the two trajectories is within ±10 seconds. Take one trajectory as the reference, choosing the one with the higher frame rate, and shift the timestamps of the other trajectory from -10 seconds to +10 seconds in steps of 1 second. This yields around 20 trajectories with modified timestamps. For each, the corresponding frames are matched to the reference by timestamp, i.e., the frames with the smallest timestamp difference; a solver fits the two trajectories and computes the absolute error, and the time difference of the trajectory with the smallest error is taken, for example -5 seconds. The above operation is then repeated around that value with a step of 0.1 seconds, and so on, ending after the search has been performed with a step of 0.001 seconds. This is essentially a hierarchical search algorithm. To improve search accuracy, a quadratic curve can also be fitted near the time difference that minimizes the absolute error, with the time difference as the independent variable and the absolute error as the dependent variable; the time difference at the minimum of the quadratic curve is taken as the final result. The accuracy of the time search depends on the fitting accuracy of the two trajectories, and the algorithm can be refined using the idea of alternating optimization.
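The hierarchical search described above can be sketched roughly as follows. This is a simplified illustration, with piecewise-linear interpolation standing in for spline interpolation and a plain grid search standing in for nonlinear optimization; all names and default values are assumptions:

```python
# Coarse-to-fine time-offset search: one trajectory is the reference, the
# other is shifted by a candidate offset and interpolated onto the reference
# timestamps; the offset minimizing the mean absolute error is refined with
# a 10x smaller step each round (1 s -> 0.1 s -> ... -> 0.001 s).
def interp(ts, values, t):
    """Piecewise-linear interpolation of a discretely sampled trajectory."""
    if t <= ts[0]:
        return values[0]
    if t >= ts[-1]:
        return values[-1]
    for i in range(1, len(ts)):
        if t <= ts[i]:
            a = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
            return values[i - 1] + a * (values[i] - values[i - 1])

def alignment_error(ref_ts, ref_vals, ts, vals, offset):
    """Mean absolute error between the reference and the shifted trajectory."""
    return sum(abs(rv - interp(ts, vals, rt - offset))
               for rt, rv in zip(ref_ts, ref_vals)) / len(ref_ts)

def find_time_offset(ref_ts, ref_vals, ts, vals,
                     span=10.0, step=1.0, final_step=0.001):
    center = 0.0
    while step >= final_step:
        n = round(span / step)
        candidates = [center + k * step for k in range(-n, n + 1)]
        center = min(candidates,
                     key=lambda off: alignment_error(ref_ts, ref_vals,
                                                     ts, vals, off))
        span, step = step, step / 10.0   # refine around the best candidate
    return center
```

A quadratic fit around the returned offset, as mentioned above, could refine the result further; it is omitted here for brevity.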
The video frames are timestamp-aligned with the data collected by the OIS and IMU; that is, a one-to-one correspondence between the OIS/IMU data and the video frames is established according to the timestamps. In step S302, EIS processing is performed on the timestamp-aligned video frames. After EIS processing, the rotation vector of the terminal is calculated and recorded to obtain the motion path. The rotation vector is a fairly intuitive geometric way to describe simple harmonic motion: draw a vector from the coordinate origin O (the equilibrium position) whose magnitude equals the amplitude A of the harmonic motion, let the angle between A and the x-axis at t = 0 equal the initial phase φ0, and let the vector rotate counterclockwise around O in the plane with angular velocity equal to the angular frequency ω; the vector so constructed is called the rotation vector. Clearly, its projection on the x-axis at any moment, x = A·cos(ωt + φ0), describes a simple harmonic motion. In step S303, path smoothing is performed on the motion path obtained in the previous step. The Floyd path smoothing algorithm is taken as an example below, but the method is not limited to it. First, the Floyd algorithm removes adjacent collinear points from the motion path: all data points are traversed, checking whether two adjacent segment vectors have the same direction. Then redundant inflection points are removed: all data points are traversed again, and points lying between two points that can be connected directly are removed. After these two steps, a smooth motion path is obtained. In step S304, the motion state of the terminal is recognized; the specific recognition method is detailed below.
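A minimal sketch of the two-pass Floyd-style simplification described above, assuming the path is a polyline of (time, value) samples; the tolerance parameter and all names are illustrative, not taken from the patent:

```python
# Pass 1 drops points whose neighbouring segments are collinear; pass 2
# greedily drops any point lying within a tolerance of the straight line
# between two surviving points ("can be connected directly").
def collinear(p, q, r, eps=1e-9):
    """True if q lies on the straight line through p and r (2D cross product)."""
    return abs((q[0] - p[0]) * (r[1] - p[1])
               - (q[1] - p[1]) * (r[0] - p[0])) < eps

def drop_collinear(path):
    if len(path) < 3:
        return path[:]
    out = [path[0]]
    for i in range(1, len(path) - 1):
        if not collinear(path[i - 1], path[i], path[i + 1]):
            out.append(path[i])
    out.append(path[-1])
    return out

def point_line_dist(p, a, b):
    """Distance from point p to the line through a and b."""
    ax, ay = b[0] - a[0], b[1] - a[1]
    px, py = p[0] - a[0], p[1] - a[1]
    norm = (ax * ax + ay * ay) ** 0.5 or 1.0
    return abs(ax * py - ay * px) / norm

def smooth_path(path, tol=0.0):
    path = drop_collinear(path)
    out, i = [path[0]], 0
    while i < len(path) - 1:
        j = len(path) - 1
        # back off until every intermediate point is within the tolerance
        while j > i + 1 and any(point_line_dist(path[k], path[i], path[j]) > tol
                                for k in range(i + 1, j)):
            j -= 1
        out.append(path[j])
        i = j
    return out
```

With a zero tolerance only exactly-collinear runs collapse; a larger tolerance removes more points and yields a smoother (coarser) path.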
In step S305, the number of times the cropped picture goes out of bounds is counted according to the motion state. As shown in Figure 7A, going out of bounds means that after the cropped video frame is warped, part of the picture exceeds the original extent of the uncropped frame. In step S306, it is judged whether to adjust the cropping ratio according to the out-of-bounds count. If the cropping ratio does not need to be adjusted, the method proceeds to step S307. In step S307, the H matrix corresponding to each video frame is calculated, and rolling-shutter correction is performed. The H matrix can be calculated using Rodrigues' formula, although the method is not limited to it: the rotation matrix is computed by Rodrigues' formula and then, combined with the intrinsic matrix and the OIS translation, the homography matrix H corresponding to each pixel of the input image is calculated. Because a CMOS sensor captures video line by line, high-frequency shake, i.e., shake faster than one frame's refresh time, distorts objects within a single frame; this is called the rolling-shutter effect. To correct it, the image is first divided into M strips, the R matrix of the first strip is smoothed across frames, and the other R matrices within the frame are then aligned to the first strip. To obtain a smooth result without overly abrupt transitions between the strips of a frame, a Gaussian smoothing algorithm is applied to each video frame's path. Finally, in step S314, after the H matrix has been computed, image warping (WARP) is performed and the image is output.
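A hedged sketch of the H-matrix construction just described, assuming a pure-rotation camera model (the OIS translation term is omitted) and illustrative intrinsic values; none of the names or defaults come from the patent:

```python
import math

# Rodrigues' formula turns an axis-angle rotation vector (e.g. integrated
# gyro angular velocity) into a rotation matrix R; for a pure rotation the
# image warp is H = K · R · K⁻¹, with K the camera intrinsic matrix.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(rvec):
    """Rotation matrix from axis-angle: R = I + sinθ·K + (1 − cosθ)·K²."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rvec)
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = matmul(K, K)
    s, c = math.sin(theta), math.cos(theta)
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return [[eye[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j] for j in range(3)]
            for i in range(3)]

def rotation_homography(rvec, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0):
    """H = K_intr · R · K_intr⁻¹ (pure-rotation model, no OIS translation)."""
    K_intr = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    K_inv = [[1.0 / fx, 0.0, -cx / fx],
             [0.0, 1.0 / fy, -cy / fy],
             [0.0, 0.0, 1.0]]
    return matmul(matmul(K_intr, rodrigues(rvec)), K_inv)
```

For rolling-shutter correction, one such H would be computed per strip of the frame rather than per frame, as the text explains.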
In step S308, the terminal's display interface prompts the user to enter the dynamic-adjustment anti-shake mode; the specifics are detailed below. In step S309, the user chooses whether to enter this mode; if so, the method proceeds to the next step, and if not, it proceeds to step S307. In step S310, the initial frame is determined and its cropping ratio is calculated. In step S311, the terminal's motion trend is calculated, and the cropping ratios of the already-collected frames other than the initial frame are determined according to that trend. In step S312, motion path smoothing is performed once on the images cropped in step S305. In step S313, crop-boundary protection is applied to the video frames processed in step S312 to avoid unretrievable pixels in the image; the method then proceeds to step S307 to calculate the H matrix for the re-cropped images and perform rolling-shutter correction. Finally, in step S314, image warping is completed and the image is output. The image warping can be an affine transformation based on the H matrix. An affine transformation is a linear transformation mapping two-dimensional coordinates (x, y) to two-dimensional coordinates (u, v). Its characteristics are that straight lines remain straight after the transformation; the relative positional relationships between lines are preserved, parallel lines remain parallel, and the order of points along a line does not change; and three pairs of non-collinear corresponding points determine a unique affine transformation. In effect, a two-dimensional coordinate is multiplied by a matrix whose eigenvectors determine the direction of the image transformation. Among the above steps, some are not essential and may be omitted, for example steps S303, S308, S309, S312, and S313.
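The affine property stated above can be checked numerically. The matrix below is an arbitrary example, not derived from any embodiment:

```python
# An affine map (x, y) -> (u, v) keeps straight lines straight and parallel
# lines parallel: direction vectors of parallel segments stay proportional.
def affine(m, p):
    """Apply a 2x3 affine matrix to point p = (x, y)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def direction(p, q):
    return (q[0] - p[0], q[1] - p[1])

M = [[2.0, 1.0, 5.0],
     [0.5, 3.0, -2.0]]

# Two parallel segments before the transform...
a0, a1 = (0.0, 0.0), (4.0, 2.0)
b0, b1 = (1.0, 5.0), (5.0, 7.0)

# ...still have proportional direction vectors after it.
da = direction(affine(M, a0), affine(M, a1))
db = direction(affine(M, b0), affine(M, b1))
cross = da[0] * db[1] - da[1] * db[0]   # zero ⇔ still parallel
```

A full homography-based warp is projective rather than affine and does not in general preserve parallelism; the affine case above is the special case the text describes.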
In some embodiments, corresponding to step S304 above, the terminal can perform motion-state recognition based on the IMU. As shown in Figure 4A, the terminal takes the angular velocity and acceleration data collected by the IMU together with the gyroscope data and computes root-mean-square values and absolute-value integrals over them; the data can also be fed into a pre-trained machine learning model to obtain processed data information. By classifying and aggregating this information, the terminal can recognize different motion intensities or motion types. In Figure 4A, the motion state is divided into several levels according to motion intensity; at Level 0, the terminal is in a static state. A motion-state counter is set up to count the number of motion states. Figure 4B shows an example of different motion states: the vertical axis is gyroscope data and the horizontal axis is time. The left part of the figure shows lower motion intensity and is recognized as walking, while the right part shows higher intensity and is recognized as running. Figure 4C shows the transitions between different motion states.
Step S401 is maintaining the same motion state, step S402 is switching from a low-intensity to a high-intensity motion state, and step S403 is switching from a high-intensity to a low-intensity motion state. On a transition between motion states, the terminal pops up a prompt on the display asking whether the user wants to enter the dynamic-adjustment anti-shake mode, i.e., step S308 above. If the user chooses to enter this mode, the cropping ratio needs to be re-determined, and path smoothing is applied to the frames whose cropping ratio has been re-determined so that the video display stays smooth without violent shaking.
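A hypothetical sketch of mapping a gyroscope window to an intensity level and detecting the S401/S402/S403 transitions; the RMS statistic matches the text, but the thresholds are invented for illustration:

```python
# Level 0 = static; higher levels correspond to e.g. walking, running.
def motion_level(gyro_window, thresholds=(0.05, 0.5, 2.0)):
    """Return the number of intensity thresholds exceeded by the window RMS."""
    rms = (sum(w * w for w in gyro_window) / len(gyro_window)) ** 0.5
    return sum(1 for t in thresholds if rms > t)

def intensity_change(prev_level, level):
    """+1 for S402 (rising), -1 for S403 (falling), 0 for S401 (unchanged)."""
    return (level > prev_level) - (level < prev_level)
```

In the embodiments above, a nonzero `intensity_change` is what would trigger the step S308 prompt.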
In some other embodiments, corresponding to steps S304, S305, and S306 above, the terminal needs to adjust the cropping ratio according to changes in the motion state. Figure 5 shows the steps of the method for adjusting the cropping ratio of video frames. In step S500, the terminal timestamp-aligns the IMU data with the video frames. In step S501, the terminal's current motion state is judged. In step S511, if the terminal's motion intensity remains unchanged, i.e., step S401 above, the method proceeds to step S512. In step S512, the cropping ratio of the video frames is kept unchanged. In step S521, if the terminal's motion intensity increases, i.e., step S402 above, the method proceeds to step S522. In step S522, the out-of-bounds count of the initially cropped images is collected. Going out of bounds means that after the cropped image undergoes anti-shake correction, i.e., after warping, it exceeds the boundary of the current display interface; in other words, the displayed interface indexes undefined pixels. In step S523, it is judged whether the out-of-bounds count of the initially cropped images is greater than a first threshold.
In step S524, if the out-of-bounds count of the initially cropped images is not greater than the first threshold, the initial cropping ratio is maintained. In step S525, if the count is greater than the first threshold, the cropping ratio is increased. In step S531, if the motion intensity decreases, i.e., step S403 above, the method proceeds to step S532. In step S532, the initial cropping ratio is first reduced. In step S533, the out-of-bounds count of the images after the reduction is collected. In step S534, it is judged whether this count is less than a second threshold. In step S535, if the out-of-bounds count after the reduction is not less than the second threshold, the current cropping ratio is maintained. In step S536, if the count is less than the second threshold, the cropping ratio is reduced.
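The decision flow of Figure 5 can be condensed into the following sketch, with invented threshold and step values (the patent does not specify them):

```python
# On rising intensity (S521..S525) the crop ratio grows when the
# out-of-bounds count exceeds the first threshold; on falling intensity
# (S531..S536) a tentatively reduced ratio is kept only if its out-of-bounds
# count stays below the second threshold; otherwise the ratio is unchanged.
def adjust_crop_ratio(ratio, intensity_change, count_oob,
                      t1=3, t2=1, step=0.02):
    """
    ratio            current crop ratio (fraction of the frame cut away)
    intensity_change +1 rising, 0 unchanged, -1 falling motion intensity
    count_oob        callable: crop ratio -> out-of-bounds count
    """
    if intensity_change == 0:
        return ratio                                  # S512: keep ratio
    if intensity_change > 0:
        return ratio + step if count_oob(ratio) > t1 else ratio
    smaller = ratio - step
    return smaller if count_oob(smaller) < t2 else ratio
```

One call per motion-state decision point reproduces the branch structure S511/S521/S531 of the figure.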
In some embodiments, the terminal's display interface can prompt the user whether to enter the dynamic anti-shake mode. As shown in Figure 6, if the terminal determines during video recording that the cropping ratio needs to be adjusted, a prompt to turn "dynamic anti-shake" on or off pops up on the video recording interface. Interface 601 is the recording screen for video recording and displays the output image processed by the EIS anti-shake algorithm. When the terminal determines in step S308 above that the current cropping ratio needs to be adjusted, a prompt asking whether to enable "dynamic anti-shake" automatically pops up in area 602 of interface 601. Area 602 can be located anywhere on interface 601; this application does not limit its position. In this embodiment, area 602 is located in the lower-middle region of interface 601. Within area 602, the text portion of the prompt can be located anywhere; this application does not limit it, and in this embodiment it is located at the top of area 602.
The text part of the prompt may be various words that have a similar meaning to "dynamic anti-shake". This application does not limit the type of language and the way of expression. This embodiment uses the combination of Chinese and English "dynamic anti-shake" as an example for description. Below the text prompt, there are prompts for opening and closing. This application does not limit the language types and expressions of opening and closing, as long as they can express the same meaning. The opening and closing prompts can use any font size that does not exceed the area 602, and this application is not limited. In this embodiment, the opening and closing prompt font size is smaller than the text prompt part as an example. When a prompt pops up in the area 602, the user can perform a touch or click operation. The user can touch the text prompt part in the area 602, and can also touch to turn on and off the prompt part. This application is not limited. In this embodiment, the user touches any part of the area 602 as an example. After the user touches any part of the area 602, the open and close prompt part will switch states. If it is open, the display color of the open part is different from the color of the closed part. If it is closed, the display color of the original open part Display to the closed part, and the display color of the original closed part is displayed to the open part. It is understandable that the user can also directly touch the opening and closing parts. If the user clicks the opening part, the “dynamic anti-shake” related functions are turned on, and if the user clicks the closed part, the “dynamic anti-shake” related functions are turned off.
In other embodiments, the frame at which a new cropping ratio takes effect, as well as the crop size, may be re-determined according to the current path smoothness, the motion intensity, and the number of triggered out-of-bounds crops. When the motion intensity represented by the data in the video frame buffer reaches a certain threshold and a cropping ratio adjustment is triggered, the terminal can iterate with a certain step size to find a suitable crop size; this embodiment takes an increase of the cropping ratio as an example. The terminal predicts the motion trend based on the data collected by the IMU and calculates the current motion intensity of the terminal. The final goal of re-determining the cropping ratio is to find, within the window defined by the buffer length, the largest segment that can cause the crop to cross the boundary, according to the motion trend and motion intensity.

The terminal can increase the cropping ratio in steps, where stepping refers to increasing the cropping ratio at regular increments. The terminal calculates the positions of the boundary points of the output image in the input image according to the rotation vector of each video frame; from this result it can determine which video frames cross the boundary and count their number. By calculating the positions of the boundary points of the output image in the input image, it is also possible to calculate the out-of-bounds size of each video frame and the number of adjustments required to bring the out-of-bounds portion of each frame back into the display area. As shown in FIG. 7A, if the output image is found to cross the boundary, adjustment according to the initially calculated motion vector reveals that some pixels of the output image are undefined in the input image. In FIG. 7A, the video frame on the left is the input frame; after calculation and image warping, the image on the right, i.e. the output image, is obtained. The situation shown in FIG. 7A is the output image hitting the boundary (hit boundary), that is, some pixels of the output image have no defined source in the input image. The terminal adjusts the cropping ratio step by step; at each step it counts the number of out-of-bounds cropped frames, calculates the out-of-bounds size of each such frame, and calculates the number of adjustments needed to bring each such frame back into the display area. When these quantities satisfy certain conditions, for example when the number of adjustments needed for each out-of-bounds frame meets a preset threshold, the terminal stops stepping the cropping ratio. As shown in FIG. 7B, increasing the cropping ratio reduces the FOV but guarantees fewer triggered boundary crossings (crop more, FOV shrinks).

If increasing the cropping ratio still cannot reduce the boundary crossings, the degree of path smoothing needs to be lowered. As shown in FIG. 8A and FIG. 8B, the degree of path smoothing is reduced; if the path smoothness drops to a certain condition, for example below a preset threshold, while the number of boundary crossings still does not meet the requirement, the cropping ratio is increased again, and the two processes are executed alternately in an iteration. During the iteration, the out-of-bounds size and the number of adjustments each video frame needs to return to the display area can be used as feedback information and fed into the iteration to adaptively speed it up, until a cropping ratio that satisfies the conditions is found.
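The alternating iteration described above (step up the crop ratio; if that is exhausted, relax path smoothing) can be sketched as follows. All names, step sizes, and limits are assumptions, and `count_out_of_bounds` stands in for the boundary-point calculation the patent performs from each frame's rotation vector.

```python
def find_crop_ratio(frames, ratio, smoothness, count_out_of_bounds,
                    max_crossings, ratio_step=0.02, smooth_step=0.1,
                    min_smoothness=0.2, max_ratio=0.5):
    """Illustrative alternating search: increase the crop ratio, and when the
    ratio limit is reached, relax path smoothing instead, until few enough
    frames in the buffer window hit the boundary."""
    while count_out_of_bounds(frames, ratio, smoothness) > max_crossings:
        if ratio + ratio_step <= max_ratio:
            ratio += ratio_step              # crop more, FOV shrinks
        elif smoothness - smooth_step >= min_smoothness:
            smoothness -= smooth_step        # relax path smoothing instead
        else:
            break                            # both knobs exhausted
    return ratio, smoothness
```

In the patent's variant, the out-of-bounds sizes and per-frame adjustment counts would additionally feed back into the step selection to accelerate convergence; the fixed steps here keep the sketch minimal.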
In some embodiments, corresponding to the foregoing step S310, the terminal may determine whether to adjust the cropping ratio according to the motion trend. As shown in FIG. 9, in step S900 the terminal determines whether a cropping ratio adjustment is triggered. If it is triggered, the method proceeds to step S910, in which the input image and the output image of the video frame are matched against a cropping ratio dynamic model. In step S911, the cropping ratio of each video frame is calculated according to the cropping ratio dynamic model. In step S912, path smoothing and boundary keeping are performed on the output image processed with the new cropping ratio, that is, the iterative process of the above embodiment is carried out. In step S913, image warping (warp) is performed on the video frames cropped with the ratio determined by the iterative process. If the terminal determines in step S900 that no cropping ratio adjustment is triggered, the method proceeds to step S920 and the current cropping ratio is maintained.

In step S921, path smoothing and boundary keeping are performed on the video frames that keep the current cropping ratio. In step S922, image warping (warp) is performed on the processed video frames. Specifically, as shown in FIG. 10A, the buffer of the terminal stores 2N+1 frames, where N is the distance from the frame currently sent for display to the latest frame; frame No. 1 in FIG. 10A is the frame currently sent for display, and image warping starts from frame No. 2N+1. In FIG. 10B, the abscissa is time and the ordinate is the calculated number of boundary triggers. The curve 101 in the second quadrant of FIG. 10B represents the boundary crossings triggered by cropped video frames at a cropping ratio of 20%. The curve 102 in the first quadrant of FIG. 10B represents the boundary crossings triggered after the cropping ratio is adjusted to 30%. The curve 103 in FIG. 10B is the video frame change curve after path smoothing. Furthermore, as shown in FIG. 11, the terminal adjusts the speed at which the cropping ratio changes according to the rate of change of the motion intensity, so that the change of the field of view is smoother. The change of the field of view may be represented in the manner of FIG. 11, but this application is not limited thereto; a straight line, line segments, or a piecewise function may also be used for direct mapping. Frequent changes of the field of view cause discomfort to the user, and the following strategies can be used to limit them. For example, during video shooting, the field of view may be allowed to change in only one direction; the cropping ratio is not allowed to first increase and then decrease, or first decrease and then increase. As another example, during video shooting, the field of view may be allowed only a limited number of adjustments, for example N times, where N is a preset finite value.

As another example, when counting the occurrences of different motion states, the field of view is allowed to be adjusted only when the number of consecutive occurrences of a certain motion state reaches a certain limit, for example M times. The above adjustment of the field of view is the adjustment of its cropping ratio.
In other embodiments, the mapping between motion intensity and the re-determined cropping ratio may be established with a deep learning model. As shown in FIG. 12, in step S1201 the terminal collects a certain amount of video frames, segments the video per unit time, and collects IMU and gyroscope data; it computes the root mean square and the absolute-value integral of these data, or feeds the data into a machine learning training model. In step S1202, the motion intensity of the terminal within each video segment is classified and counted according to the data collected by the above sensors. In step S1203, for each video segment in the set, the number of boundary crossings triggered when the segment is cropped with different cropping ratios is counted and recorded. In step S1204, according to the recorded cropping ratios used by the different video segments, it is determined whether the number of triggered boundary crossings is less than a preset value, such as a threshold X. If the number of triggered boundary crossings is not less than the preset value, the method returns to step S1203 to continue counting. If it is less than the preset value, the method proceeds to step S1205.

In step S1205, a neural network or machine learning model is trained on the collected sensor data to obtain the correspondence between motion intensity and the optimal cropping ratio. In step S1206, when it is detected for the first time that the cropping ratio needs to be changed according to the motion state or motion intensity, the terminal automatically prompts the user whether to enable the dynamic FOV adjustment mode, as shown in FIG. 6. If the user chooses to enable the dynamic FOV adjustment mode, the cropping ratio is dynamically adjusted according to the results calculated by the deep learning model.
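The per-segment statistics of steps S1201-S1202 name two concrete features derived from the inertial data: the root mean square and the absolute-value integral of the angular rates. A minimal sketch of computing them is given below; the function name and the uniform sampling interval `dt` are assumptions.

```python
import math

def motion_intensity_features(gyro_samples, dt):
    """Hypothetical feature extraction for steps S1201-S1202: per-segment RMS
    and absolute-value integral of gyroscope angular-rate samples, which
    would feed the motion-intensity classifier or training model."""
    rms = math.sqrt(sum(w * w for w in gyro_samples) / len(gyro_samples))
    abs_integral = sum(abs(w) * dt for w in gyro_samples)  # total swept angle
    return rms, abs_integral
```

The RMS captures how violent the shake is, while the absolute-value integral captures how much total rotation accumulated over the segment; a segment can score high on one and low on the other, which is why the passage lists both.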
In some embodiments, the terminal may be an electronic device. FIG. 13A shows a schematic structural diagram of an electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown in the figure, combine certain components, split certain components, or use a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated in one or more processors.

The controller can generate operation control signals according to instruction operation codes and timing signals, to complete the control of instruction fetching and instruction execution.

A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can store instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.

The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be separately coupled to the touch sensor 180K, a charger, a flash, the camera 193, and the like through different I2C bus interfaces. For example, the processor 110 may be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 communicates with the touch sensor 180K through the I2C bus interface to implement the touch function of the electronic device 100. Both the I2S interface and the PCM interface may be used for audio communication.

The UART interface is a universal serial data bus used for asynchronous communication. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.

The MIPI interface may be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193. The MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and the like. In some embodiments, the processor 110 communicates with the camera 193 through a CSI interface to implement the shooting function of the electronic device 100, and the processor 110 communicates with the display screen 194 through a DSI interface to implement the display function of the electronic device 100.

The GPIO interface can be configured through software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, the GPIO interface may be used to connect the processor 110 to the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, etc.

It can be understood that the interface connection relationships between the modules illustrated in the embodiments of the present invention are merely schematic descriptions and do not constitute a structural limitation of the electronic device 100. In other embodiments of this application, the electronic device 100 may also adopt interface connection modes different from those in the foregoing embodiments, or a combination of multiple interface connection modes.
The electronic device 100 implements the display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing and is connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.

The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.

The electronic device 100 can implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter is opened, light is transmitted through the lens to the photosensitive element of the camera, the light signal is converted into an electrical signal, and the photosensitive element of the camera passes the electrical signal to the ISP for processing, where it is converted into an image visible to the naked eye. The ISP can also optimize the noise, brightness, and skin color of the image through algorithms, and can optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.

The camera 193 is used to capture still images or videos. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and then passes the electrical signal to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.

The digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy.

The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs, so that the electronic device 100 can play or record videos in multiple encoding formats, for example: moving picture experts group (MPEG)-1, MPEG-2, MPEG-3, MPEG-4, and so on.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons of the human brain, it can process input information quickly and can also continuously self-learn. Applications such as intelligent cognition of the electronic device 100, for example image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU.

The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example saving files such as music and videos on the external memory card.

The internal memory 121 may be used to store computer-executable program code, where the executable program code includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and application programs required by at least one function (for example, a sound playback function or an image playback function). The data storage area may store data created during the use of the electronic device 100 (for example, audio data and a phone book). In addition, the internal memory 121 may include a high-speed random access memory and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS). The processor 110 executes the various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor.
The pressure sensor 180A is used to sense pressure signals and can convert a pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be provided on the display screen 194. There are many types of pressure sensor 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates with conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the electronic device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A; the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A. In some embodiments, touch operations that act on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction to view the short message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.

The gyroscope sensor 180B may be used to determine the motion posture of the electronic device 100. In some embodiments, the angular velocities of the electronic device 100 around three axes (i.e., the x, y, and z axes) can be determined through the gyroscope sensor 180B. The gyroscope sensor 180B can be used for anti-shake during shooting. Exemplarily, when the shutter is pressed, the gyroscope sensor 180B detects the angle by which the electronic device 100 shakes, calculates the distance that the lens module needs to compensate according to the angle, and lets the lens counteract the shake of the electronic device 100 through reverse movement, thereby implementing anti-shake. The gyroscope sensor 180B can also be used for navigation and somatosensory game scenarios.
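As an illustration only, and not the patent's algorithm, the compensation described above can be approximated by integrating the gyroscope's angular rate over one frame interval to obtain a shake angle, then converting that angle to an image-plane shift with a pinhole model f·tan(θ). All names, the constant-rate assumption, and the f·tan(θ) model are assumptions made for this sketch.

```python
import math

def lens_compensation(angular_rate, dt, focal_length_mm):
    """Illustrative stabilization math: shake angle from one frame interval
    of gyroscope data, converted to the image-plane shift (in mm) that the
    lens or crop window would need to counter (pinhole model, assumptions)."""
    theta = angular_rate * dt                    # radians of shake this frame
    return focal_length_mm * math.tan(theta)     # compensation shift in mm
```

In practice the angular rate varies within a frame interval, so a real implementation would sum many short gyroscope samples rather than use a single rate; the single-sample form keeps the relationship between angle, focal length, and shift visible.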
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic devices, and apply to applications such as horizontal and vertical screen switching, pedometers, and so on.
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。Distance sensor 180F, used to measure distance. The electronic device 100 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure the distance to achieve fast focusing.
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light to the outside through the light emitting diode. The electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100. The electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power. The proximity light sensor 180G can also be used in leather case mode, and the pocket mode will automatically unlock and lock the screen.
The ambient light sensor 180L is used to sense the brightness of the ambient light. The electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient brightness. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking photos, and can cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, to prevent accidental touches.
The fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the characteristics of the collected fingerprint to implement fingerprint unlocking, application-lock access, fingerprint photographing, fingerprint-based call answering, and so on.
The touch sensor 180K is also called a "touch device". The touch sensor 180K may be disposed on the display screen 194; together, the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch screen". The touch sensor 180K is used to detect touch operations acting on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be disposed on the surface of the electronic device 100, at a position different from that of the display screen 194.
The buttons 190 include a power button, volume buttons, and so on. The buttons 190 may be mechanical buttons or touch-sensitive buttons. The electronic device 100 may receive button input and generate button signal input related to user settings and function control of the electronic device 100.
The indicator 192 may be an indicator light, which may be used to indicate the charging status and changes in battery level, or to indicate messages, missed calls, notifications, and so on.
The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present invention take the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
FIG. 2 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present invention.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers: from top to bottom, the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages.
As shown in FIG. 2, the application packages may include applications such as Camera, Gallery, Calendar, Phone, Maps, Navigation, WLAN, Bluetooth, Music, Video, and Messages.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in FIG. 2, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and so on.
The window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, and so on.
The content provider is used to store and retrieve data and make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.
The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may be composed of one or more views. For example, a display interface that includes an SMS notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide the communication functions of the electronic device 100, for example, management of call status (including connecting, hanging up, and so on).
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages that automatically disappear after a short stay, without user interaction. For example, the notification manager is used to notify of download completion, message reminders, and so on. The notification manager may also present notifications that appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as notifications of applications running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt sound is emitted, the electronic device vibrates, or an indicator light blinks.
The Android runtime includes core libraries and a virtual machine. The Android runtime is responsible for scheduling and management of the Android system.
The core libraries consist of two parts: one part is the functions that the Java language needs to call, and the other part is the core libraries of Android.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system libraries may include multiple functional modules, for example: a surface manager, media libraries, a 3D graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).
The surface manager is used to manage the display subsystem and provides blending of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media libraries can support multiple audio and video encoding formats, for example: MPEG-4, H.264, MP3, AAC, AMR, JPG, and PNG.
The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, layer processing, and so on.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is the layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.
The following exemplifies the workflow of the software and hardware of the electronic device 100 in conjunction with a photo-capture scenario.
When the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including information such as the touch coordinates and the timestamp of the touch operation). The raw input event is stored at the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event. Taking as an example a touch operation that is a tap, where the control corresponding to the tap is the camera application icon, the camera application calls the interface of the application framework layer to start the camera application, which in turn starts the camera driver by calling the kernel layer, and the camera 193 captures a still image or video.
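The capture workflow above can be sketched as an event-dispatch pipeline. The class, function, and control names below are illustrative assumptions, not the actual Android framework API, and the camera-icon bounds are invented values:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RawInputEvent:
    # The kernel layer packages the hardware interrupt into a raw input event
    # carrying the touch coordinates and a timestamp, as described above.
    x: int
    y: int
    timestamp_ms: int

# The framework layer maps screen regions to controls; the camera-icon
# bounds (x0, y0, x1, y1) used here are assumed for illustration.
CONTROL_BOUNDS = {"camera_icon": (0, 0, 100, 100)}

def identify_control(event: RawInputEvent) -> Optional[str]:
    """Framework-layer step: find which control the raw event landed on."""
    for name, (x0, y0, x1, y1) in CONTROL_BOUNDS.items():
        if x0 <= event.x <= x1 and y0 <= event.y <= y1:
            return name
    return None

def handle_tap(event: RawInputEvent) -> str:
    # If the tap lands on the camera icon, the camera application starts and,
    # via the kernel layer, the camera driver is launched.
    if identify_control(event) == "camera_icon":
        return "camera_driver_started"
    return "ignored"
```

A tap at (50, 50) falls inside the assumed camera-icon bounds and therefore starts the camera driver in this sketch.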
The above description is only a specific implementation of this application, but the protection scope of this application is not limited thereto. Any change or substitution readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. The embodiments of this application and the features in the embodiments may be combined with each other provided there is no conflict. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (15)

  1. A mobile terminal, characterized by comprising a camera, a display screen, a motion sensor, and a processor;
    the camera is used to record video; the display screen is used to display a video recording interface;
    the motion sensor is used to continuously collect motion data while the camera is recording video;
    according to the motion data, it is determined whether the motion intensity changes;
    if the motion intensity changes, the video recording interface is used to display a prompt for enabling dynamic anti-shake;
    and the display screen is used to receive a touch operation of tapping the prompt, and the processor is used to perform dynamic anti-shake.
  2. The mobile terminal according to claim 1, wherein the motion sensor comprises an inertial sensor, an acceleration sensor, and a gyroscope.
  3. The mobile terminal according to claim 1, wherein the prompt for enabling dynamic anti-shake comprises a description part of the dynamic anti-shake and a switch control part.
  4. The mobile terminal according to claim 1, wherein the camera generates video frames when recording video; the motion data is timestamp-aligned with the video frames, the timestamp alignment being the establishment of a correspondence between the motion data and the video frames according to time; electronic image stabilization is performed on the video frames, the electronic image stabilization being cropping the video frames and warping the cropped video frames; the rotation vector of each video frame is calculated; path smoothing is performed according to the motion data, the path smoothing being optimization of the curve formed by the motion data; the motion state of the mobile terminal is determined; the number of out-of-bounds occurrences of the warped video frames is counted; if the number of out-of-bounds occurrences is greater than a first threshold, the cropping ratio is increased; and if the number of out-of-bounds occurrences is less than or equal to the first threshold, the cropping ratio is maintained.
  5. The mobile terminal according to claim 4, wherein out-of-bounds means that some pixels of a video frame before the warping are undefined in the corresponding video frame after the warping.
  6. A video processing method, characterized by comprising:
    a camera of a mobile terminal collects video frames;
    a motion sensor of the mobile terminal collects motion data;
    a processor of the mobile terminal performs timestamp alignment on the video frames and the motion data;
    the processor performs electronic image stabilization on the video frames, the electronic image stabilization being cropping the video frames and warping the cropped video frames;
    the rotation vector of each video frame is calculated according to the motion data;
    the processor identifies the motion state of the mobile terminal;
    the processor performs cropping processing on the video frames and counts the number of out-of-bounds occurrences of the video frames after the cropping processing;
    the processor determines, according to the number of out-of-bounds occurrences, whether to adjust the cropping ratio;
    if the number of out-of-bounds occurrences is less than or equal to a first threshold, the cropping ratio is maintained, the processor calculates the H matrix corresponding to each video frame, and image warping is performed according to the H matrix;
    if the number of out-of-bounds occurrences is greater than the first threshold, the processor calculates a new cropping ratio for the video frames and generates initial video frames at the new cropping ratio;
    the processor determines the cropping ratio of each video frame according to changes in the motion intensity of the mobile terminal;
    the processor calculates the H matrix corresponding to each video frame; and
    image warping is performed according to the H matrix.
  7. The method according to claim 6, wherein the video frames collected by the camera of the mobile terminal are stored in a buffer of a memory.
  8. The method according to claim 6, wherein the motion data comprises the acceleration and angular velocity of the mobile terminal.
  9. The method according to claim 6, wherein the timestamp alignment is that the processor uses spline interpolation to convert the motion data from discrete values into a continuous curve; the processor performs nonlinear optimization on the continuous curves to obtain the time difference between different continuous curves; and the processor executes the nonlinear optimization in a loop, the loop ending when the time difference meets a specific condition.
  10. The method according to claim 6, wherein the processor performs path smoothing on the video frames according to the rotation vectors.
  11. The method according to claim 10, wherein the motion path smoothing is that the processor calculates the vector between every two adjacent data points in the motion data and traverses all the data points; the processor removes one of two adjacent data points whose vectors are identical; the processor removes inflection points in the data curve formed by the motion data; and the processor removes the data points between two data points that can be connected directly.
  12. The method according to claim 6, wherein out-of-bounds means that some pixels of a video frame before the cropping processing are undefined in the video frame after the cropping processing.
  13. The method according to claim 6, wherein, if the number of out-of-bounds occurrences is greater than the first threshold, the display screen of the mobile terminal displays an interface prompting the user to enable dynamic anti-shake;
    and the display screen receives a touch operation of the user and enables dynamic anti-shake.
  14. The method according to claim 13, wherein the dynamic anti-shake is that the processor adjusts the cropping ratio according to changes in the motion intensity.
  15. A computer-readable storage medium comprising instructions, characterized in that, when the instructions are run on a mobile terminal, the mobile terminal is caused to execute the method according to any one of claims 6 to 14.
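To make the crop/warp/out-of-bounds logic of claims 4 and 6 concrete, the following is an illustrative numerical sketch, not the patented implementation; the homography values, crop ratio, margin convention, threshold, and step size are all assumptions:

```python
import numpy as np

def warp_corners(h: np.ndarray, corners: np.ndarray) -> np.ndarray:
    """Apply homography h (3x3) to N corner points (Nx2); return warped Nx2."""
    pts = np.hstack([corners, np.ones((len(corners), 1))])
    warped = (h @ pts.T).T
    return warped[:, :2] / warped[:, 2:3]

def crop_out_of_bounds(h, width, height, crop_ratio):
    """Return True if the cropped window, once warped, leaves the source
    frame, i.e. some output pixels would be undefined (the out-of-bounds
    case of claims 5 and 12). crop_ratio is the margin fraction per side."""
    m = crop_ratio
    crop = np.array([[width * m, height * m], [width * (1 - m), height * m],
                     [width * (1 - m), height * (1 - m)],
                     [width * m, height * (1 - m)]])
    w = warp_corners(h, crop)
    return bool((w[:, 0] < 0).any() or (w[:, 0] > width).any()
                or (w[:, 1] < 0).any() or (w[:, 1] > height).any())

def update_crop_ratio(violations, threshold=3, ratio=0.1, step=0.02):
    # Claim 4: if the out-of-bounds count exceeds the first threshold,
    # enlarge the crop ratio; otherwise keep it. Values are assumed.
    return ratio + step if violations > threshold else ratio
```

For example, with a 1000x500 frame and a 10% crop margin, a pure 200-pixel horizontal translation pushes the cropped window past the right edge, so the frame counts as out of bounds; the identity homography does not.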
PCT/CN2021/088267 2020-04-27 2021-04-20 Video processing method and mobile terminal WO2021218694A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010345609.8A CN113572993B (en) 2020-04-27 2020-04-27 Video processing method and mobile terminal
CN202010345609.8 2020-04-27

Publications (1)

Publication Number Publication Date
WO2021218694A1 true WO2021218694A1 (en) 2021-11-04

Family

ID=78157666

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/088267 WO2021218694A1 (en) 2020-04-27 2021-04-20 Video processing method and mobile terminal

Country Status (2)

Country Link
CN (1) CN113572993B (en)
WO (1) WO2021218694A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095659A (en) * 2021-11-29 2022-02-25 厦门美图之家科技有限公司 Video anti-shake method, device, equipment and storage medium
CN115242981A (en) * 2022-07-25 2022-10-25 维沃移动通信有限公司 Video playing method, video playing device and electronic equipment
CN116704046A (en) * 2023-08-01 2023-09-05 北京积加科技有限公司 Cross-mirror image matching method and device
CN116723382A (en) * 2022-02-28 2023-09-08 荣耀终端有限公司 Shooting method and related equipment

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934654A (en) * 2022-03-31 2023-10-24 荣耀终端有限公司 Image ambiguity determining method and related equipment thereof
CN117714875A (en) * 2024-02-06 2024-03-15 博大视野(厦门)科技有限公司 End-to-end video anti-shake method based on deep neural network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007300581A (en) * 2006-05-08 2007-11-15 Casio Comput Co Ltd Moving image photographing apparatus and moving image photographing program
US20120081558A1 (en) * 2010-06-04 2012-04-05 Panasonic Corporation Image capture device, image generating method, and computer program thereof
US20120320227A1 (en) * 2011-06-10 2012-12-20 Tsuchida Yukitaka Imaging apparatus
CN104065876A (en) * 2013-03-22 2014-09-24 卡西欧计算机株式会社 Image processing device and image processing method
CN106454063A (en) * 2015-08-13 2017-02-22 三星电机株式会社 Shakiness correcting method and apparatus
CN106911889A (en) * 2015-09-15 2017-06-30 佳能株式会社 Image blur collection and slant correction equipment and its control method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5777599A (en) * 1992-02-14 1998-07-07 Oki Electric Industry Co., Ltd. Image generation device and method using dithering
JP5094550B2 (en) * 2008-05-20 2012-12-12 キヤノン株式会社 Imaging device
US8810666B2 (en) * 2012-01-16 2014-08-19 Google Inc. Methods and systems for processing a video for stabilization using dynamic crop
US10741286B2 (en) * 2015-05-27 2020-08-11 Ryozo Saito Stress evaluation program for mobile terminal and mobile terminal provided with program
CN110731077B (en) * 2018-03-23 2021-10-01 华为技术有限公司 Video image anti-shake method and terminal
CN110213479B (en) * 2019-04-30 2021-05-04 北京迈格威科技有限公司 Anti-shake method and device for video shooting
CN110602386B (en) * 2019-08-28 2021-05-14 维沃移动通信有限公司 Video recording method and electronic equipment


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095659A (en) * 2021-11-29 2022-02-25 厦门美图之家科技有限公司 Video anti-shake method, device, equipment and storage medium
CN114095659B (en) * 2021-11-29 2024-01-23 厦门美图之家科技有限公司 Video anti-shake method, device, equipment and storage medium
CN116723382A (en) * 2022-02-28 2023-09-08 荣耀终端有限公司 Shooting method and related equipment
CN115242981A (en) * 2022-07-25 2022-10-25 维沃移动通信有限公司 Video playing method, video playing device and electronic equipment
CN116704046A (en) * 2023-08-01 2023-09-05 北京积加科技有限公司 Cross-mirror image matching method and device
CN116704046B (en) * 2023-08-01 2023-11-10 北京积加科技有限公司 Cross-mirror image matching method and device

Also Published As

Publication number Publication date
CN113572993A (en) 2021-10-29
CN113572993B (en) 2022-10-11

Similar Documents

Publication Publication Date Title
WO2021218694A1 (en) Video processing method and mobile terminal
WO2020187157A1 (en) Control method and electronic device
JP7391102B2 (en) Gesture processing methods and devices
WO2022068537A1 (en) Image processing method and related apparatus
WO2021179773A1 (en) Image processing method and device
WO2021052414A1 (en) Slow-motion video filming method and electronic device
WO2021027725A1 (en) Method for displaying page elements and electronic device
CN111669462B (en) Method and related device for displaying image
WO2021082815A1 (en) Display element display method and electronic device
WO2024016564A1 (en) Two-dimensional code recognition method, electronic device, and storage medium
US20230224574A1 (en) Photographing method and apparatus
CN111553846A (en) Super-resolution processing method and device
WO2023093169A1 (en) Photographing method and electronic device
CN111768352A (en) Image processing method and device
CN113938602A (en) Image processing method, electronic device, chip and readable storage medium
CN111524528B (en) Voice awakening method and device for preventing recording detection
CN113099146B (en) Video generation method and device and related equipment
WO2024032124A1 (en) Method for folding and unfolding scroll screen and related product
CN114422686A (en) Parameter adjusting method and related device
WO2022206589A1 (en) Image processing method and related device
WO2022228259A1 (en) Target tracking method and related apparatus
EP4329320A1 (en) Method and apparatus for video playback
EP4210312A1 (en) Photographing method and electronic device
WO2024067551A1 (en) Interface display method and electronic device
CN115150542B (en) Video anti-shake method and related equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21797612

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21797612

Country of ref document: EP

Kind code of ref document: A1