WO2020187065A1 - Video evaluation method, terminal, server, and related product - Google Patents
- Publication number
- WO2020187065A1 (PCT/CN2020/078320)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- evaluated
- value
- frame
- target parameter
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/004—Diagnosis, testing or measuring for television systems or their details for digital television systems
Definitions
- This application relates to the field of video anti-shake, in particular to a video evaluation method, terminal, server and related products.
- One existing method extracts the feature points of adjacent frames in the anti-shake processed video and calculates the homography matrix between adjacent frames. Each component of the homography matrix is then converted to the frequency domain for analysis, and the proportion of low-frequency information in the entire spectrum is calculated. The higher the proportion, the better the anti-shake method.
- However, this evaluation method only analyzes jitter quantitatively at the level of jitter frequency. Videos with the same jitter frequency may differ greatly in actual jitter. For example, at the same jitter frequency, the jitter amplitude of a video with a larger frame is clearly greater than that of a video with a smaller frame, so the accuracy of this evaluation method is low.
- the embodiments of the present application provide a video evaluation method, which improves the accuracy and comprehensiveness of video evaluation.
- The first aspect of this application provides a video evaluation method, including: obtaining a video to be evaluated; calculating a target parameter associated with the video to be evaluated; and then evaluating the video to be evaluated according to the target parameter.
- The target parameter includes at least one of a jitter value, a cropping value, and a distortion value. The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated. The cropping value includes the cropping amount between every two adjacent frames in the video to be evaluated. The distortion value includes the average distance from the grid points on at least one curve, in the curve grid matching the video to be evaluated, to the fitted straight line corresponding to that curve.
- Based on the jitter value, the video can be evaluated in terms of jitter amplitude, which makes the evaluation more accurate.
- Combining the cropping value and the distortion value improves the comprehensiveness of the evaluation method.
- calculating the jitter value associated with the video to be evaluated includes:
- calculating the cropping value associated with the video to be evaluated includes:
- the first frame and the second frame are any two adjacent frames in the video to be evaluated, and the first feature point in the first frame matches the second feature point in the second frame;
- the cropping value is calculated according to the first distance and the second distance.
- calculating the distortion value associated with the video to be evaluated includes:
- The video to be evaluated may be the original video shot by the terminal; that is, the evaluation method of this application is applied to the original video to evaluate its jitter.
- When the video to be evaluated is the original video shot by the terminal, the quantitative index provided by the evaluation method of this application gives the user a more accurate and intuitive understanding of the jitter of the shot video, which improves the practicality of this solution.
- the video to be evaluated may include at least a first video to be evaluated and a second video to be evaluated, and evaluating the video to be evaluated according to the target parameter includes:
- multiple videos to be evaluated can be compared and evaluated, so that the user can intuitively understand the jitter strength of the multiple videos to be evaluated, which expands the application scenarios of this solution.
- the video to be evaluated may be a video in which the original video has been processed by an anti-shake algorithm, that is, the anti-shake algorithm used in the video to be evaluated can be evaluated by the evaluation method of this application.
- the video to be evaluated is specifically the original video processed by the anti-shake algorithm, and the anti-shake algorithm can also be evaluated through this evaluation method, which improves the scalability of the solution.
- The video to be evaluated includes at least a first video to be evaluated and a second video to be evaluated, the first video to be evaluated uses a first anti-shake algorithm, the second video to be evaluated uses a second anti-shake algorithm, and evaluating the video to be evaluated according to the target parameter includes:
- multiple videos to be evaluated can be compared and evaluated, so that the user can intuitively understand the pros and cons of the anti-shake algorithms adopted by the multiple videos to be evaluated, which expands the application scenarios of this solution.
- The smaller the jitter value, the cropping value, and/or the distortion value, the smaller the jitter of the video to be evaluated.
- The smaller the jitter value, the cropping value, and/or the distortion value, the better the anti-shake algorithm used for the video to be evaluated.
- an evaluation standard for videos based on target parameters is provided.
- the evaluation can be based on one of the above three target parameters, or it can be combined with more of the above three target parameters.
- the evaluation result can be directly output, or the target parameter value can be directly output for the user to evaluate, which makes the evaluation method of this scheme more flexible.
- the second aspect of the present application provides a terminal, including:
- Program codes are stored in the memory
- the third aspect of the present application provides a server, including:
- Program codes are stored in the memory
- The fourth aspect of the present application provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to execute the flow of the video evaluation method provided by the first aspect or any implementation of the first aspect of the present application.
- The fifth aspect of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to execute the flow of the video evaluation method provided by the first aspect or any implementation of the first aspect of the present application.
- the embodiment of the application provides a method for evaluating a video.
- A video to be evaluated is obtained, a target parameter associated with the video is then calculated, and the video is evaluated according to the target parameter, where the target parameter may include a jitter value, a cropping value, and a distortion value.
- The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated.
- Because the jitter value reflects jitter amplitude, the pros and cons of the anti-shake method can be evaluated at the level of jitter amplitude, which makes the evaluation more accurate.
- Combining the cropping value and the distortion value improves the comprehensiveness of the evaluation method.
- Figure 1 is a schematic diagram of two different shaking effects presented by shooting a video of the same scene
- FIG. 2 is a schematic diagram of an embodiment of the video evaluation method of this application.
- Figure 3 is a schematic diagram of the video being parsed frame by frame
- Figure 4 is a schematic diagram of calculating the jitter value
- Figure 5 is a comparison diagram of the complete frame picture and the cropped frame picture
- Fig. 6 is a schematic diagram of the distance from the feature point on the frame before cropping to the boundary of the frame
- FIG. 7 is a schematic diagram of the distance from the feature points on the frame picture to the frame picture boundary after cropping
- Fig. 8 is a schematic diagram of frame picture distortion
- Figure 9 is a schematic diagram of comparison between a straight grid and a curved grid
- Figure 10 is a schematic diagram of a fitted straight line corresponding to a curve
- FIG. 11 is a schematic diagram of an embodiment of a terminal of this application.
- Figure 12 is a schematic diagram of an embodiment of a server of this application
- Figure 13 is a schematic diagram of the structure of the server of this application
- Figure 14 is a schematic diagram of the structure of the terminal of this application.
- the embodiments of the present application provide a video evaluation method, which improves the accuracy and comprehensiveness of video evaluation.
- Figure 1 shows two different jitter effects produced by shooting a video of the same scene. A viewer can see at a glance that the jitter in the lower picture is clearly stronger than in the upper picture. However, comparison by eye alone has significant limitations, so a set of quantitative evaluation standards is needed.
- One of the existing methods is to extract the feature points of adjacent frames in the anti-shake processed video and calculate the homography matrix between adjacent frames, and then convert each component in the homography matrix to the frequency domain for analysis. Calculate the proportion of low frequency information to the entire frequency. The higher the proportion, the better the anti-shake method.
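- The frequency-domain ratio used by this existing method can be sketched as follows. This is only an illustration under stated assumptions: the function name `low_frequency_ratio`, the choice of a one-sided discrete Fourier transform, the inclusion of the DC bin, and the `cutoff_bin` parameter are all hypothetical, and the input is taken to be the time series of a single homography component.

```python
import cmath


def low_frequency_ratio(signal, cutoff_bin):
    """Share of spectral energy in bins 0..cutoff_bin of the one-sided DFT.

    `signal` is assumed to be one homography component sampled once per
    frame pair; `cutoff_bin` is an assumed boundary for "low frequency".
    """
    n = len(signal)
    energies = []
    for k in range(n // 2 + 1):
        # Plain DFT coefficient for frequency bin k.
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        energies.append(abs(coeff) ** 2)
    total = sum(energies)
    if total == 0:
        return 0.0
    return sum(energies[: cutoff_bin + 1]) / total


# A constant component has only low-frequency energy; an alternating one
# has only high-frequency energy.
slow_drift = [1.0, 1.0, 1.0, 1.0]
fast_shake = [1.0, -1.0, 1.0, -1.0]
```

- Note that the ratio is unchanged if the signal is scaled, which is consistent with the limitation described next: the measure reflects jitter frequency but not jitter amplitude.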
- However, this evaluation method only analyzes jitter quantitatively at the level of jitter frequency. Videos with the same jitter frequency may differ greatly in actual jitter. For example, at the same jitter frequency, the jitter amplitude of a video with a larger frame is clearly greater than that of a video with a smaller frame, so the accuracy of this evaluation method is low.
- an embodiment of the present application provides a video evaluation method, which is described in detail below, as shown in Figure 2:
- the video evaluation method in this application can be applied to a terminal or a server.
- the terminal can acquire a video to be evaluated through its own shooting, and of course, it can also acquire the video to be evaluated through other methods such as downloading.
- the server can receive the video to be evaluated sent by the terminal. This application does not limit the specific method of obtaining the video to be evaluated.
- the video to be evaluated may be the original video shot by the terminal, or the original video processed by the anti-shake algorithm.
- Users can apply a third-party anti-shake algorithm to the original video shot by the terminal.
- The video may also be the result of the terminal's internal anti-shake processing of the original video it shot.
- The specific video is not limited here.
- the target parameter may specifically include at least one of a jitter value, a crop value, and a distortion value.
- The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated.
- The cropping value includes the cropping amount between every two adjacent frames in the video to be evaluated.
- The distortion value includes the average distance from the grid points on at least one curve, in the curve grid matching the video to be evaluated, to the fitted straight line corresponding to that curve.
- the target parameter is the jitter value.
- The jitter value defined in this application is based on the jitter that occurs when the moving direction of the picture changes during shooting. For example, if the picture moves in the positive direction of the X axis at the beginning of the shooting, this is not regarded as jitter. If the picture is later displaced in a direction other than the positive X-axis direction (for example, the negative X-axis, Y-axis, or Z-axis direction), the video is considered to have jittered.
- A video can be parsed into individual frames. If the video is jittery, the positions of feature points in two adjacent frames (the 9 feature points shown in Figure 2) may change within the frame. By extracting the feature points of each frame, the homography matrix between every two adjacent frames can be calculated. The homography matrix can be understood as describing the mapping between points on the same plane in different images. If the coordinates of the vertices of the previous frame are known, the coordinates of the matching vertices can be calculated from the homography matrix between the adjacent frames.
- the displacement vector of the current frame relative to the previous frame can be obtained.
- the displacement vector of the next frame relative to the current frame can be obtained.
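- The vertex mapping and displacement vectors described above can be sketched as follows. This is a minimal illustration, not the method of this application: the helper names are hypothetical, the homography matrices are assumed to have been estimated already from matched feature points, and every inter-frame displacement is counted as jitter, which ignores the direction-change criterion described earlier.

```python
import math


def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def frame_displacement(h, vertices):
    """Mean displacement vector of the frame vertices under homography h."""
    dx = dy = 0.0
    for vx, vy in vertices:
        mx, my = apply_homography(h, (vx, vy))
        dx += mx - vx
        dy += my - vy
    n = len(vertices)
    return (dx / n, dy / n)


def jitter_value(homographies, vertices):
    """Average displacement magnitude over all adjacent frame pairs.

    Assumption: each homography maps the previous frame to the next one.
    """
    magnitudes = [
        math.hypot(*frame_displacement(h, vertices)) for h in homographies
    ]
    return sum(magnitudes) / len(magnitudes)


# Hypothetical data: one pure translation by (3, 4) pixels, one still pair.
shift = [[1, 0, 3], [0, 1, 4], [0, 0, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
corners = [(0, 0), (100, 0), (100, 50), (0, 50)]
```

- With the sample data, `jitter_value([shift, identity], corners)` averages a displacement of 5 pixels and a displacement of 0 pixels.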
- The target parameter is the cropping value.
- A video that undergoes anti-shake processing is generally cropped in advance, so that only part of the sensor frame (the effective frame shown in Figure 5) is used. Accordingly, the final video picture presented to the user is also cropped by a certain percentage. If there is jitter between two adjacent frames, the next frame may be cropped relative to the previous frame. Therefore, the cropping value is also used as one of the evaluation indicators of the video.
- FIG. 6 and FIG. 7 are two adjacent frames in the video shot in the same scene. It can be seen that FIG. 7 is appropriately cropped relative to FIG. 6.
- First, the matching feature points in the two adjacent frames are extracted. For example, the first feature point in Figure 6 and the second feature point in Figure 7 both correspond to the cup in the image.
- Then the first distance, from the first feature point to the boundary of the frame shown in FIG. 6, and the second distance, from the second feature point to the boundary of the frame shown in FIG. 7, are calculated separately. The cropping value between the two adjacent frames can be calculated from the first distance and the second distance.
- The cropping value can be the cropping variation, that is, the difference between the first distance and the second distance. The cropping value can also take other forms, for example the cropping percentage, that is, the ratio of the first distance to the second distance; this is not specifically limited here. The cropping value associated with the entire video to be evaluated can be obtained by summing the cropping values between every two adjacent frames of the video and then taking the average.
- the number of feature points extracted on each frame of image may be one or multiple, which is not specifically limited here.
- the distance from the feature point to the frame boundary may refer to the distance from the feature point to any boundary in the frame picture, which is not specifically limited here.
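- The two forms of cropping value described above, and the per-video average, can be sketched as follows. The function names are hypothetical, and the distances are assumed to have been measured already from matched feature points to a chosen frame boundary.

```python
def cropping_variation(first_distance, second_distance):
    """Cropping value as the difference between the two distances."""
    return first_distance - second_distance


def cropping_ratio(first_distance, second_distance):
    """Cropping value as the ratio of the first distance to the second."""
    return first_distance / second_distance


def video_cropping_value(distance_pairs, measure):
    """Average a per-pair cropping measure over all adjacent frame pairs.

    `distance_pairs` holds (first_distance, second_distance) tuples, one
    per pair of adjacent frames.
    """
    values = [measure(first, second) for first, second in distance_pairs]
    return sum(values) / len(values)


# Hypothetical distances in pixels for two adjacent-frame pairs.
pairs = [(100.0, 80.0), (50.0, 40.0)]
```

- For the sample pairs, the averaged cropping variation is 15 pixels and the averaged ratio is 1.25.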
- the target parameter is the distortion value.
- If jitter occurs while a video is being shot, the next of two adjacent frames may be distorted relative to the previous one.
- For example, the silhouette of a photographed building is no longer a regular straight line but has a certain degree of curvature.
- The image distortion is quantified, and the quantization result is defined as the distortion value.
- The coordinates of the grid points on the curved grid can be obtained, a fitted straight line corresponding to each curve can then be generated from the coordinates of the grid points on that curve, and finally the average distance from each grid point on the curve to the fitted straight line is calculated to obtain the distortion value.
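- The line fitting and distance averaging described above can be sketched as follows. The sketch assumes an ordinary least-squares fit of the form y = slope * x + intercept, which is only one possible fitting method and does not handle vertical curves; the function names are hypothetical, and the grid-point coordinates are assumed to be available already.

```python
import math


def fit_line(points):
    """Least-squares line y = slope * x + intercept through 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the points are not on a vertical line
    return slope, mean_y - slope * mean_x


def mean_distance_to_fit(points):
    """Average perpendicular distance from the points to their fitted line."""
    slope, intercept = fit_line(points)
    norm = math.hypot(slope, 1.0)
    total = sum(abs(slope * x - y + intercept) for x, y in points)
    return total / (norm * len(points))


def distortion_value(curves):
    """Average the per-curve mean distance over the selected curves."""
    return sum(mean_distance_to_fit(c) for c in curves) / len(curves)


# Hypothetical grid points: one undistorted grid line and one bent line.
straight = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
bent = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
```

- Grid points that already lie on a straight line give a distance of zero, so a larger result indicates stronger curvature of the grid lines.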
- After the target parameter associated with the video to be evaluated is calculated, the video to be evaluated is further evaluated according to the target parameter.
- When the video to be evaluated is an original video shot by the terminal, a smaller jitter value, cropping value, or distortion value indicates smaller jitter in the video, and can also indicate that the terminal's own anti-shake function is better.
- When the video to be evaluated is an original video processed by an anti-shake algorithm, a smaller jitter value, cropping value, or distortion value indicates a better anti-shake algorithm.
- the video to be evaluated can be evaluated according to one of the above three target parameters, or multiple different types of target parameters can be integrated to evaluate the video to be evaluated.
- the specifics are not limited here.
- the evaluation results can be distinguished by different levels, for example, it can be divided into three levels of "good, medium, and poor". Of course, the evaluation results can also be distinguished by other forms such as scoring, and the specifics are not limited here.
- Users can select a single video to evaluate according to their own needs, or input multiple different videos for comparative evaluation. For example, the output evaluation result may be "Video A is better than Video B", which means that the jitter of video A is smaller than that of video B, or that the anti-shake algorithm adopted by video A is better than that of video B; this is not limited here.
- The evaluation of a single video is relatively straightforward: the video is input and the evaluation result is output. For example, the user inputs a video to be evaluated and selects the terminal device that shot it, and the system holds a preset evaluation result for each value range of each target parameter. For example, the evaluation result corresponding to "cropping value 0-10%" is "good", the evaluation result corresponding to "cropping value 10%-20%" is "medium", and the evaluation result corresponding to "cropping value exceeds 20%" is "poor", so the system can generate evaluation results based on the calculated target parameters.
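- The preset mapping from cropping value to evaluation level in this example can be sketched as follows. The function name is hypothetical, and assigning the boundary values 10% and 20% to the lower level is an assumption, since the text does not say which level they belong to.

```python
def grade_cropping(cropping_percentage):
    """Map a cropping percentage (0.15 means 15%) to a preset level."""
    if cropping_percentage <= 0.10:
        return "good"
    if cropping_percentage <= 0.20:
        return "medium"
    return "poor"
```

- For instance, a cropping percentage of 0.15 falls in the 10%-20% range and is graded "medium".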
- The comparative evaluation of multiple different videos can be divided into the following situations: first, different videos shot by the same terminal; second, different videos shot by different terminals; third, the same original video processed by different anti-shake algorithms; fourth, different original videos processed by different anti-shake algorithms.
- Suppose the user wants to evaluate two different videos (video A and video B) shot by the same terminal. Since the frame sizes of videos shot by the same terminal are the same, the user only needs to input video A and video B, and the system feeds back the evaluation result through calculation and analysis.
- If video A and video B are two different original videos shot by the same terminal, the evaluation result can be "the jitter of video A is less than that of video B". If video A and video B are two videos processed by different anti-shake algorithms, the evaluation result can be "the anti-shake algorithm of video A is better than that of video B", and so on.
- Suppose a user wants to evaluate videos (video C and video D) shot by different terminals. Since the frame sizes of videos shot by different terminals may differ, the user can input video C and video D together with the terminals that shot them, and the system feeds back the evaluation result through calculation and analysis.
- If the system combines the above three types of target parameters in the evaluation, the comparisons of the three parameters may disagree. For example, for video A and video B, the jitter value of video A may be smaller than the jitter value of video B, while the cropping value and the distortion value of video A are greater than those of video B. In that case, the system can first set weights for the three parameters. For example, if the three parameters are ranked by importance with the jitter value first, followed by the cropping value and then the distortion value, the weights can be jitter value (60%), cropping value (30%), and distortion value (10%).
- The system then no longer compares the three target parameters of video A and video B one by one. Instead, it calculates a weighted value for video A and for video B by the weighted average method, compares the two weighted values, and outputs the evaluation result.
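- The weighted comparison can be sketched as follows, using the example weights of 60%, 30%, and 10%. The names are hypothetical, and the sketch assumes the three parameters have already been brought to comparable scales, which the text does not specify; a lower weighted value is treated as better because smaller parameter values indicate less jitter.

```python
# Example weights from the text: jitter 60%, cropping 30%, distortion 10%.
WEIGHTS = {"jitter": 0.6, "cropping": 0.3, "distortion": 0.1}


def weighted_value(params, weights=WEIGHTS):
    """Weighted sum of the three target parameters of one video."""
    return sum(weights[name] * params[name] for name in weights)


def compare_videos(params_a, params_b):
    """Return which video has the lower (better) weighted value."""
    value_a = weighted_value(params_a)
    value_b = weighted_value(params_b)
    if value_a < value_b:
        return "A"
    if value_b < value_a:
        return "B"
    return "tie"


# Hypothetical values: A has less jitter but more cropping and distortion.
video_a = {"jitter": 2.0, "cropping": 5.0, "distortion": 3.0}
video_b = {"jitter": 4.0, "cropping": 3.0, "distortion": 1.0}
```

- With these sample values, video A scores about 3.0 and video B about 3.4, so video A is rated better even though two of its three parameters are larger, because the jitter value carries the largest weight.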
- the system can also directly feed back the calculated value of the target parameter to the user for evaluation by the user himself, which is not specifically limited here.
- The evaluation method in this application is applicable to various types of video anti-shake methods.
- It can also be used to evaluate the pan/tilt used when shooting videos; this is not limited here.
- The video to be evaluated is first obtained, the target parameter associated with the video is then calculated, and the video is evaluated according to the target parameter, where the target parameter may include a jitter value, a cropping value, and a distortion value.
- The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated.
- The pros and cons of the anti-shake method can therefore be evaluated at the level of jitter amplitude, which makes the evaluation more accurate.
- Combining the cropping value and the distortion value improves the comprehensiveness of the evaluation method.
- the device that executes the above video evaluation method is a terminal, and an embodiment of the terminal in this application includes:
- the obtaining unit 1101 is configured to obtain a video to be evaluated
- the calculation unit 1102 is used to calculate a target parameter associated with the video to be evaluated.
- The target parameter includes at least one of a jitter value, a cropping value, and a distortion value.
- The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated, the cropping value includes the cropping amount between every two adjacent frames in the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve, in the curve grid matching the video to be evaluated, to the fitted straight line corresponding to that curve;
- the evaluation unit 1103 is configured to evaluate the video to be evaluated according to target parameters.
- The steps performed by the obtaining unit 1101 are similar to step 201 in the embodiment shown in FIG. 2, the steps performed by the calculation unit 1102 are similar to step 202, and the steps performed by the evaluation unit 1103 are similar to step 203; details are not repeated here.
- the device that executes the above-mentioned video evaluation method is a server, and an embodiment of the server in this application includes:
- the obtaining unit 1201 is configured to obtain a video to be evaluated
- the calculation unit 1202 is used to calculate a target parameter associated with the video to be evaluated.
- The target parameter includes at least one of a jitter value, a cropping value, and a distortion value.
- The jitter value includes the average value of the jitter displacement between every two adjacent frames in the video to be evaluated, the cropping value includes the cropping amount between every two adjacent frames in the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve, in the curve grid matching the video to be evaluated, to the fitted straight line corresponding to that curve;
- the evaluation unit 1203 is configured to evaluate the video to be evaluated according to target parameters.
- The steps performed by the obtaining unit 1201 are similar to step 201 in the embodiment shown in FIG. 2, the steps performed by the calculation unit 1202 are similar to step 202, and the steps performed by the evaluation unit 1203 are similar to step 203; details are not repeated here.
- server and terminal in the embodiment of the present application are described above from the perspective of modular functional entities, and the server and terminal in the embodiment of the present application are described from the perspective of hardware processing below:
- FIG. 13 is a schematic diagram of a server structure provided by an embodiment of the present application.
- The server 1300 may vary considerably with configuration or performance, and may include one or more central processing units (CPU) 1322 (for example, one or more processors), memory 1332, and one or more storage media 1330 (for example, one or more storage devices) that store application programs 1342 or data 1344.
- The memory 1332 and the storage medium 1330 may be short-term storage or persistent storage.
- The program stored in the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server.
- the central processing unit 1322 may be configured to communicate with the storage medium 1330, and execute a series of instruction operations in the storage medium 1330 on the server 1300.
- the central processing unit 1322 can execute all or part of the actions in the embodiment shown in FIG. 2 according to instruction operations, and details are not described herein again.
- The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input and output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
- the embodiment of the present application also provides a terminal. As shown in FIG. 14, for ease of description, only the parts related to the embodiment of the present application are shown. For specific technical details that are not disclosed, please refer to the method part of the embodiment of the present application.
- the terminal can be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS), a car computer, etc. Take the terminal as a mobile phone as an example:
- FIG. 14 shows a block diagram of a part of the structure of a mobile phone related to a terminal provided in an embodiment of the present application.
- The mobile phone includes a radio frequency (RF) circuit 1410, a memory 1420, an input unit 1430, a display unit 1440, a sensor 1450, an audio circuit 1460, a wireless fidelity (WiFi) module 1470, a processor 1480, a power supply 1490, and other components.
- The RF circuit 1410 can be used for receiving and sending signals while sending and receiving information or during a call. In particular, downlink information received from the base station is passed to the processor 1480 for processing, and uplink data is sent to the base station.
- the RF circuit 1410 includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like.
- the RF circuit 1410 can also communicate with the network and other devices through wireless communication.
- The above wireless communication can use any communication standard or protocol, including but not limited to the global system of mobile communication (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short messaging service (SMS), etc.
- the memory 1420 may be used to store software programs and modules.
- the processor 1480 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1420.
- the memory 1420 may mainly include a storage program area and a storage data area.
- The storage program area may store an operating system, an application program required by at least one function (such as a sound playback function or an image playback function), and the like; the storage data area may store data created through use of the mobile phone (such as audio data or a phone book).
- the memory 1420 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
- the input unit 1430 can be used to receive input digital or character information, and to generate key signal input related to the user settings and function control of the mobile phone.
- the input unit 1430 may include a touch panel 1431 and other input devices 1432.
- The touch panel 1431, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 1431 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
- The touch panel 1431 may include two parts: a touch detection device and a touch controller.
- The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 1480; it can also receive and execute commands sent by the processor 1480.
- the touch panel 1431 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
- the input unit 1430 may also include other input devices 1432.
- other input devices 1432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, joystick, and the like.
- the display unit 1440 may be used to display information input by the user or information provided to the user and various menus of the mobile phone.
- the display unit 1440 may include a display panel 1441.
- the display panel 1441 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.
- the touch panel 1431 can cover the display panel 1441. When the touch panel 1431 detects a touch operation on or near it, it transmits the operation to the processor 1480 to determine the type of the touch event, and the processor 1480 then provides the corresponding visual output on the display panel 1441 according to the type of the touch event.
- the touch panel 1431 and the display panel 1441 are used as two independent components to implement the input and output functions of the mobile phone, but in some embodiments, the touch panel 1431 and the display panel 1441 can be integrated to implement the input and output functions of the mobile phone.
- the mobile phone may also include at least one sensor 1450, such as a light sensor, a motion sensor, and other sensors.
- the light sensor can include an ambient light sensor and a proximity sensor.
- the ambient light sensor can adjust the brightness of the display panel 1441 according to the brightness of the ambient light.
- the proximity sensor can turn off the display panel 1441 and/or the backlight when the mobile phone is moved to the ear.
- the accelerometer sensor can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection); other sensors that can be configured in the mobile phone, such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, are not described here.
- the audio circuit 1460, the speaker 1461, and the microphone 1462 can provide an audio interface between the user and the mobile phone.
- the audio circuit 1460 can transmit the electrical signal converted from the received audio data to the speaker 1461, which converts it into a sound signal for output; conversely, the microphone 1462 converts the collected sound signal into an electrical signal, which the audio circuit 1460 receives and converts into audio data; the audio data is then output to the processor 1480 for processing and sent via the RF circuit 1410 to, for example, another mobile phone, or output to the memory 1420 for further processing.
- WiFi is a short-distance wireless transmission technology.
- the mobile phone can help users send and receive e-mails, browse web pages, and access streaming media through the WiFi module 1470. It provides users with wireless broadband Internet access.
- although FIG. 14 shows the WiFi module 1470, it can be understood that the module is not a necessary component of the mobile phone and can be omitted as needed without changing the essence of the application.
- the processor 1480 is the control center of the mobile phone. It uses various interfaces and lines to connect the parts of the entire mobile phone, and it performs the various functions of the mobile phone and processes data by running or executing software programs and/or modules stored in the memory 1420 and calling data stored in the memory 1420, thereby monitoring the mobile phone as a whole.
- the processor 1480 may include one or more processing units; preferably, the processor 1480 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1480.
- the mobile phone also includes a power supply 1490 (such as a battery) for supplying power to various components.
- the power supply can be logically connected to the processor 1480 through a power management system, so that functions such as charging, discharging, and power management can be managed through the power management system.
- the mobile phone may also include a camera, a Bluetooth module, etc., which will not be repeated here.
- the processor 1480 is specifically configured to perform all or part of the actions in the embodiment shown in FIG.
- the disclosed system, device, and method may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this application in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
- the aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Image Analysis (AREA)
Abstract
Disclosed in the present application are a video evaluation method, a terminal, a server, and a related product, for improving the accuracy and comprehensiveness of video evaluation. The method in the embodiments of the present application comprises: first acquiring a video to be evaluated; then calculating target parameters associated with the video to be evaluated, the target parameters comprising at least one of a jitter value, a crop value, and a distortion value, the jitter value comprising the average value of the jitter displacement between each two adjacent frames in the video to be evaluated, the crop value comprising the cropping amount between each two adjacent frames in the video to be evaluated, and the distortion value comprising the average distance from a grid point on at least one curve in a curve grid matching the video to be evaluated to a fitted straight line corresponding to the at least one curve; and, on the basis of the target parameters, evaluating the video to be evaluated.
Description
This application claims priority to Chinese patent application No. 201910202676.1, filed with the China National Intellectual Property Administration on March 15, 2019 and entitled "A video evaluation method, terminal, server and related products", the entire contents of which are incorporated in this application by reference.
This application relates to the field of video anti-shake, and in particular to a video evaluation method, a terminal, a server, and related products.
With increases in network speed and terminal computing power, the mobile terminal has become a general-purpose device on which people combine entertainment, work, and study, playing the role of a personal intelligent assistant. Short everyday-life videos and outdoor sports videos have become a mainstream form of entertainment. Shaky footage is hard to avoid during video shooting, and it greatly degrades the viewing experience.
Many video anti-shake methods exist, but their results vary in quality, so the merits of video anti-shake methods need to be evaluated quantitatively. One existing approach extracts the feature points of adjacent frames in the anti-shake-processed video, calculates the homography matrix between adjacent frames, converts each component of the homography matrix to the frequency domain for analysis, and calculates the proportion of low-frequency information in the whole spectrum; the higher the proportion, the better the anti-shake method.
However, this evaluation approach analyzes the video quantitatively only in terms of jitter frequency. Different videos with the same jitter frequency may in fact differ greatly in jitter. For example, at the same jitter frequency, the jitter amplitude of a video with a larger frame is clearly greater than that of a video with a smaller frame, so the accuracy of this evaluation approach is low.
Summary of the invention
The embodiments of this application provide a video evaluation method that improves the accuracy and comprehensiveness of video evaluation.
A first aspect of this application provides a video evaluation method, including:
First, a video to be evaluated is acquired; then target parameters associated with the video to be evaluated are calculated. The target parameters include at least one of a jitter value, a crop value, and a distortion value, where the jitter value includes the average of the jitter displacement between every two adjacent frames in the video to be evaluated, the crop value includes the cropping amount between every two adjacent frames in the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve in a curve grid matching the video to be evaluated to the fitted straight line corresponding to the at least one curve. The video to be evaluated is then evaluated according to the target parameters.
In this implementation, the jitter value allows the video to be evaluated in terms of jitter amplitude, which makes the evaluation more accurate; in addition, combining the crop value and the distortion value makes the evaluation method more comprehensive.
Optionally, in some possible implementations, calculating the jitter value associated with the video to be evaluated includes:
acquiring a homography matrix between a first frame and a second frame in the video to be evaluated, where the first frame and the second frame are any two adjacent frames in the video to be evaluated;
determining a first coordinate set of a first vertex set in the first frame;
calculating a second coordinate set of a second vertex set in the second frame according to the first coordinate set and the homography matrix, where the first vertex set matches the second vertex set; and
calculating the jitter value according to the first coordinate set and the second coordinate set.
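The steps above can be illustrated with a minimal sketch. It assumes the homography matrix is already known and is given as a 3x3 nested list, and it takes the mean Euclidean displacement of the four frame corners as the per-pair result; the function names `apply_homography` and `mean_vertex_displacement` are hypothetical and do not come from the application. The sketch does not separate intended camera motion from jitter, which the detailed description treats as a further step.

```python
from math import hypot

def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography (nested lists) using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def mean_vertex_displacement(H, vertices):
    """Average distance between each first-frame vertex and its matching second-frame vertex."""
    mapped = [apply_homography(H, p) for p in vertices]
    dists = [hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(vertices, mapped)]
    return sum(dists) / len(dists)

# Assumed example: a 1920x1080 frame and a homography that is a pure shift of (3, 4) pixels.
corners = [(0.0, 0.0), (1920.0, 0.0), (1920.0, 1080.0), (0.0, 1080.0)]
H_shift = [[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]]
print(mean_vertex_displacement(H_shift, corners))  # 5.0
```

In practice the homography would be estimated from matched feature points of the two frames; the pure translation here is chosen only so the result is easy to check by hand.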
Optionally, in some possible implementations, calculating the crop value associated with the video to be evaluated includes:
calculating a first distance from a first feature point on a first frame of the video to be evaluated to the boundary of the first frame;
calculating a second distance from a second feature point on a second frame of the video to be evaluated to the boundary of the second frame, where the first frame and the second frame are any two adjacent frames in the video to be evaluated and the first feature point matches the second feature point; and
calculating the crop value according to the first distance and the second distance.
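As a hedged illustration of these steps, the sketch below measures the distance from a matched feature point to each of the four frame edges in both frames and reports the per-edge difference. The text above states only that the crop value is calculated from the two distances, so the per-edge subtraction, the function names `boundary_distances` and `crop_amount`, and the frame sizes are all assumptions made for illustration.

```python
def boundary_distances(point, width, height):
    """Distances from a feature point to the left, top, right and bottom frame edges."""
    x, y = point
    return (x, y, width - x, height - y)

def crop_amount(first, second):
    """Assumed measure: per-edge reduction in distance between the two matched feature points."""
    return tuple(d1 - d2 for d1, d2 in zip(first, second))

# Assumed example: the second frame is the first frame cropped by 10% around its center,
# so the matched feature point stays at the center of both frames.
first = boundary_distances((960.0, 540.0), 1920.0, 1080.0)
second = boundary_distances((864.0, 486.0), 1728.0, 972.0)
print(crop_amount(first, second))  # (96.0, 54.0, 96.0, 54.0)
```

A single scalar could be derived from the four differences (for example their sum or a ratio to the frame size), but which aggregation is intended is not stated in this passage.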
Optionally, in some possible implementations, calculating the distortion value associated with the video to be evaluated includes:
obtaining a fitted straight line corresponding to a target curve according to the grid points on the target curve in the curve grid; and
calculating the average distance from the grid points to the fitted straight line to obtain the distortion value.
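A minimal sketch of these two steps follows. It assumes an ordinary least-squares fit of the form y = a*x + b, which only works for curves that are not vertical, and it uses the perpendicular point-to-line distance; the fitting method and the function names `fit_line` and `distortion_value` are assumptions, since the text does not specify them.

```python
from math import sqrt

def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept (assumes the x values are not all equal)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def distortion_value(points):
    """Average perpendicular distance from the grid points to their fitted straight line."""
    slope, intercept = fit_line(points)
    norm = sqrt(slope * slope + 1.0)
    total = sum(abs(slope * x - y + intercept) for x, y in points)
    return total / (norm * len(points))

# Assumed example grid points: one bent curve and one perfectly straight line.
curved = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
straight = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
print(distortion_value(curved))
print(distortion_value(straight))
```

For the bent curve the fitted line is horizontal at y = 1/3, so the average distance is 4/9; for the straight line every grid point lies on the fit and the value is zero. Near-vertical grid curves would need the roles of x and y swapped or a total least-squares fit.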
The above three implementations provide specific ways of calculating the jitter value, the crop value, and the distortion value, respectively, which improves the feasibility of this solution.
Optionally, in some possible implementations, the video to be evaluated may be a captured original video; that is, the evaluation method of this application is applied to an original video shot by the terminal in order to evaluate the jitter of that video.
In this implementation, the video to be evaluated is specifically the original video shot by the terminal. With the quantitative indicators provided by the evaluation method of this application, the user can gain a more accurate and intuitive understanding of the jitter of the captured video, which improves the practicality of this solution.
Optionally, in some possible implementations, the video to be evaluated may include at least a first video to be evaluated and a second video to be evaluated, and evaluating the video to be evaluated according to the target parameters includes:
comparatively evaluating the first video to be evaluated and the second video to be evaluated according to a first target parameter associated with the first video to be evaluated and a second target parameter associated with the second video to be evaluated.
In this implementation, multiple videos to be evaluated can be compared, so that the user can see directly how strongly each of them jitters, which extends the application scenarios of this solution.
Optionally, in some possible implementations, the video to be evaluated may be a video obtained by processing an original video with an anti-shake algorithm; that is, the evaluation method of this application can be used to evaluate the anti-shake algorithm applied to the video to be evaluated.
In this implementation, the video to be evaluated is specifically the original video after processing by an anti-shake algorithm, so the evaluation approach can also be used to evaluate the anti-shake algorithm, which improves the extensibility of this solution.
Optionally, in some possible implementations, the video to be evaluated includes at least a first video to be evaluated and a second video to be evaluated, the first video to be evaluated uses a first anti-shake algorithm, the second video to be evaluated uses a second anti-shake algorithm, and evaluating the video to be evaluated according to the target parameters includes:
comparatively evaluating the first anti-shake algorithm and the second anti-shake algorithm according to a first target parameter associated with the first video to be evaluated and a second target parameter associated with the second video to be evaluated.
In this implementation, multiple videos to be evaluated can be compared, so that the user can see directly the relative merits of the anti-shake algorithms used by each of them, which extends the application scenarios of this solution.
Optionally, in some possible implementations, the smaller the jitter value, the crop value, and/or the distortion value, the smaller the jitter of the video to be evaluated.
Optionally, in some possible implementations, the smaller the jitter value, the crop value, and/or the distortion value, the better the anti-shake algorithm used by the video to be evaluated.
The above two implementations provide a criterion for evaluating a video according to the target parameters. The evaluation may be based on any one of the three target parameters or on a combination of several of them. In addition, the evaluation result may be output directly, or the target parameter values may be output directly for the user to evaluate, which makes the evaluation method of this solution more flexible.
A second aspect of this application provides a terminal, including:
a processor, a memory, a bus, and an input/output interface;
where program code is stored in the memory; and
when the processor invokes the program code in the memory, the processor performs the steps of the first aspect of this application or of any implementation of the first aspect.
A third aspect of this application provides a server, including:
a processor, a memory, a bus, and an input/output interface;
where program code is stored in the memory; and
when the processor invokes the program code in the memory, the processor performs the steps of the first aspect of this application or of any implementation of the first aspect.
A fourth aspect of this application provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to execute the process of the video evaluation method provided by the first aspect of this application or by any implementation of the first aspect.
A fifth aspect of this application provides a computer program product which, when its instructions are run on a computer, causes the computer to execute the process of the video evaluation method provided by the first aspect of this application or by any implementation of the first aspect.
It can be seen from the above technical solutions that the embodiments of this application have the following advantages:
The embodiments of this application provide a video evaluation method in which a video to be evaluated is first acquired, target parameters associated with the video to be evaluated are then calculated, and the video to be evaluated is evaluated according to the target parameters. The target parameters may include a jitter value, a crop value, and a distortion value, where the jitter value includes the average of the jitter displacement between every two adjacent frames in the video to be evaluated. The jitter value allows the merits of an anti-shake approach to be evaluated in terms of jitter amplitude, which makes the evaluation more accurate; in addition, combining the crop value and the distortion value makes the evaluation method more comprehensive.
FIG. 1 is a schematic diagram of two different jitter effects in videos shot of the same scene;
FIG. 2 is a schematic diagram of an embodiment of the video evaluation method of this application;
FIG. 3 is a schematic diagram of a video parsed into individual frames;
FIG. 4 is a schematic diagram of calculating the jitter value;
FIG. 5 is a comparison of a complete frame and a cropped frame;
FIG. 6 is a schematic diagram of the distance from a feature point to the frame boundary before cropping;
FIG. 7 is a schematic diagram of the distance from a feature point to the frame boundary after cropping;
FIG. 8 is a schematic diagram of a distorted frame;
FIG. 9 is a schematic comparison of a straight-line grid and a curve grid;
FIG. 10 is a schematic diagram of the fitted straight line corresponding to a curve;
FIG. 11 is a schematic diagram of an embodiment of the terminal of this application;
FIG. 12 is a schematic diagram of an embodiment of the server of this application;
FIG. 13 is a schematic structural diagram of the server of this application;
FIG. 14 is a schematic structural diagram of the terminal of this application.
The embodiments of this application provide a video evaluation method that improves the accuracy and comprehensiveness of video evaluation.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
As is well known, jitter has always been the enemy of video shooting, and anti-shake remains a cutting-edge technology today. Current anti-shake technology generally falls into hardware anti-shake and software anti-shake. Hardware anti-shake generally means fitting a larger image sensor and a larger optical anti-shake module into the extremely limited space of the device body; the optical anti-shake module mainly improves stability for still photography and performs less well against the larger camera shake that occurs during video shooting. Software anti-shake generally refers to algorithm-based electronic anti-shake: the picture is cropped in advance during shooting so that only part of the sensor frame is used. Because part of the frame must be cropped away for use by the anti-shake algorithm, the video finally presented to the user is also cropped by a certain proportion and cannot fully show the wide field of view of a wide-angle lens.
As the above description shows, many different anti-shake approaches exist, but none of them solves the problem of video jitter perfectly, so the jitter of a video, or the anti-shake approach the video uses, needs to be evaluated. FIG. 1 shows two different jitter effects in videos shot of the same scene; a viewer can judge by eye that the lower picture jitters noticeably more than the upper one. Comparison by the human eye alone, however, is very limited, so a set of quantitative evaluation criteria is needed.
One existing approach extracts the feature points of adjacent frames in the anti-shake-processed video, calculates the homography matrix between adjacent frames, converts each component of the homography matrix to the frequency domain for analysis, and calculates the proportion of low-frequency information in the whole spectrum; the higher the proportion, the better the anti-shake method. However, this evaluation approach analyzes the video quantitatively only in terms of jitter frequency. Different videos with the same jitter frequency may in fact differ greatly in jitter. For example, at the same jitter frequency, the jitter amplitude of a video with a larger frame is clearly greater than that of a video with a smaller frame, so the accuracy of this evaluation approach is low.
To this end, the embodiments of this application provide a video evaluation method, which is described in detail below with reference to FIG. 2:
201. Acquire a video to be evaluated.
The video evaluation method in this application can be applied to a terminal as well as to a server. A terminal can acquire a video to be evaluated by shooting it itself, or by other means such as downloading. A server can receive a video to be evaluated sent by a terminal. This application does not limit how the video to be evaluated is acquired.
Optionally, the video to be evaluated may be an original video shot by the terminal, or a video obtained by processing the original video with an anti-shake algorithm. For example, to improve the anti-shake effect, the user may apply a third-party anti-shake algorithm to the original video shot by the terminal. Of course, the original video shot by the terminal may itself already have undergone the terminal's internal anti-shake processing; this is not limited here.
202. Calculate target parameters associated with the video to be evaluated.
In this embodiment, the target parameters may specifically include at least one of a jitter value, a crop value, and a distortion value. The jitter value includes the average of the jitter displacement between every two adjacent frames in the video to be evaluated, the crop value includes the cropping amount between every two adjacent frames in the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve in a curve grid matching the video to be evaluated to the fitted straight line corresponding to the at least one curve.
下面分别对上述三种类型目标参数的具体算法进行详细说明:The specific algorithms of the above three types of target parameters are described in detail below:
1、目标参数为抖动数值。1. The target parameter is the jitter value.
从实际拍摄视频和视觉感受的方面来看,只有拍摄视频时画面移动的方向发生变化了才能给人视觉造成抖动的感觉,当画面连续沿着同一个方向移动时给人的是稳定的视觉感受。因此,本申请所定义的抖动数值是基于拍摄时画面移动方向发生了变化的抖动,例如,拍摄开始时画面是朝着X轴的正方向移动,此时并不视作抖动,但是如果之后画面在除了X轴正方向之外的其他方向上有位移(例如,X轴负方向、Y轴或Z轴方向),那么将视作视频发生了抖动。From the perspective of actual video shooting and visual experience, only when the direction of the screen movement changes when shooting the video can it give people a sense of jitter. When the screen continues to move in the same direction, it gives people a stable visual experience. . Therefore, the jitter value defined in this application is based on the jitter when the moving direction of the screen changes during shooting. For example, the screen moves in the positive direction of the X axis at the beginning of the shooting. At this time, it is not regarded as jitter, but if the screen moves later If there is displacement in directions other than the positive X-axis direction (for example, the negative X-axis, Y-axis, or Z-axis direction), it will be considered that the video has jittered.
Referring to Figure 3, a video can be decomposed into individual frames. If the video jitters, the positions of the feature points in two adjacent frames (such as the 9 feature points shown in Figure 2) may change within the frame. By extracting the feature points of each frame, the homography matrix between every two adjacent frames can be calculated. A homography matrix describes the mapping of points on the same plane between different images. Therefore, if the coordinates of the vertices of the previous frame are known, the coordinates of the matching vertices in the next frame can be calculated from the homography matrix between the adjacent frames.
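The vertex mapping described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the homography is already known and is given as a 3x3 nested list, and the names `apply_homography`, `map_vertices`, `H_shift`, and `corners` are hypothetical.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (xh / w, yh / w)


def map_vertices(H, vertices):
    """Map every vertex of one frame into the adjacent frame."""
    return [apply_homography(H, v) for v in vertices]


# Hypothetical example: a homography that only translates the picture.
H_shift = [[1.0, 0.0, 5.0],
           [0.0, 1.0, -2.0],
           [0.0, 0.0, 1.0]]
corners = [(0.0, 0.0), (1920.0, 0.0), (1920.0, 1080.0), (0.0, 1080.0)]
moved = map_vertices(H_shift, corners)
```

In practice the homography itself would be estimated from matched feature points, which this sketch does not cover.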
Based on the above, the calculation of the jitter value is illustrated below with an example:
Referring to Figure 4, assume that the previous frame and the current frame move in the same direction, while the next frame jitters relative to the current frame, so the jitter value of the next frame relative to the current frame must be determined. First, define the coordinates of the four vertices A, B, C, and D of the previous frame. Then calculate the coordinates of the corresponding four vertices of the current frame from the homography matrix between the previous frame and the current frame, and likewise calculate the coordinates of the corresponding four vertices of the next frame from the homography matrix between the current frame and the next frame. The difference between the vertex coordinates of the current frame and those of the previous frame gives the displacement vector of the current frame relative to the previous frame, and the difference between the vertex coordinates of the next frame and those of the current frame gives the displacement vector of the next frame relative to the current frame. A coordinate system o-xy is established along the displacement vector of the current frame, and the displacement vector of the next frame is decomposed in that coordinate system. The decomposed component whose direction differs from that of the current frame's displacement vector is the jitter value of the next frame relative to the current frame. Summing the jitter values between every two adjacent frames of the whole video and taking the mean gives the jitter value associated with the whole video to be evaluated.
It should be noted that calculating the coordinates of four vertices in each frame is only an example. In practice, a larger number of vertices may be used to calculate the corresponding coordinates, which is not limited here.
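The decomposition step can be sketched as follows. This is one possible reading of the text, stated as an assumption: the x axis of o-xy is taken along the current frame's displacement, and the jitter is the part of the next displacement that does not continue in that direction (the perpendicular component plus any backward component). The function names are hypothetical, and treating the whole next displacement as jitter when the current displacement is zero is also an assumption.

```python
import math


def mean_displacement(prev_vertices, next_vertices):
    """Average displacement of matched vertices between two frames."""
    n = len(prev_vertices)
    dx = sum(b[0] - a[0] for a, b in zip(prev_vertices, next_vertices)) / n
    dy = sum(b[1] - a[1] for a, b in zip(prev_vertices, next_vertices)) / n
    return (dx, dy)


def jitter_between(d_current, d_next):
    """Magnitude of the part of d_next that deviates from d_current's direction."""
    norm = math.hypot(d_current[0], d_current[1])
    if norm == 0:
        # Assumption: with no reference direction, all motion counts as jitter.
        return math.hypot(d_next[0], d_next[1])
    ux, uy = d_current[0] / norm, d_current[1] / norm
    along = d_next[0] * ux + d_next[1] * uy    # component on the x axis of o-xy
    across = -d_next[0] * uy + d_next[1] * ux  # component on the y axis of o-xy
    # Forward motion along the same direction is not jitter; backward motion is.
    return math.hypot(min(along, 0.0), across)


def video_jitter(displacements):
    """Mean jitter over consecutive pairs of per-frame displacement vectors."""
    values = [jitter_between(a, b)
              for a, b in zip(displacements, displacements[1:])]
    return sum(values) / len(values) if values else 0.0
```

For instance, `jitter_between((2.0, 0.0), (3.0, 4.0))` counts only the 4-pixel sideways component, because the 3-pixel component continues in the existing direction of movement.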
2. The target parameter is the crop value.
As shown in Figure 5, video that undergoes anti-shake processing is generally cropped in advance so that only part of the sensor frame (the effective frame shown in Figure 5) is used, and the picture finally presented to the user is correspondingly cropped by a certain proportion. If jitter occurs between two adjacent frames, the later frame may have been cropped relative to the earlier one. The crop value is therefore also used as one of the evaluation indicators for the video.
Based on the above, the calculation of the crop value is illustrated below with an example:
Referring to Figures 6 and 7, which show two adjacent frames of a video of the same scene, it can be seen that Figure 7 has been cropped relative to Figure 6. First, matching feature points are extracted from the two adjacent frames. For example, the first feature point in Figure 6 and the second feature point in Figure 7 are both the cup in the image. Next, the first distance, from the first feature point to the border of the frame shown in Figure 6, and the second distance, from the second feature point to the border of the frame shown in Figure 7, are calculated. The crop value between the two adjacent frames can then be calculated from the first distance and the second distance. Specifically, the crop value may be the change in cropping, that is, the difference between the first distance and the second distance. It may also take other forms, such as a crop percentage, that is, the ratio of the first distance to the second distance, which is not limited here. Summing the crop values between every two adjacent frames of the whole video and taking the mean gives the crop value associated with the whole video to be evaluated.
It should be noted that, in practice, one or more feature points may be extracted from each frame, which is not limited here. In addition, the distance from a feature point to the frame border may be the distance from the feature point to any one border of the frame, which is not limited here.
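The crop calculation above can be sketched as follows. This is a minimal illustration under stated assumptions: the matched feature point is already located in both frames, the same border is used for both distances, and all function names and the `border` parameter are hypothetical.

```python
def distance_to_border(point, width, height, border="left"):
    """Distance from a feature point to one chosen border of its frame."""
    x, y = point
    distances = {
        "left": x,
        "right": width - x,
        "top": y,
        "bottom": height - y,
    }
    return distances[border]


def crop_change(first_distance, second_distance):
    """Crop value as a difference: first distance minus second distance."""
    return first_distance - second_distance


def crop_ratio(first_distance, second_distance):
    """Crop value as a ratio of the first distance to the second distance."""
    if second_distance == 0:
        raise ValueError("second distance is zero; ratio is undefined")
    return first_distance / second_distance


def video_crop_value(per_pair_values):
    """Mean of the crop values computed for each pair of adjacent frames."""
    if not per_pair_values:
        return 0.0
    return sum(per_pair_values) / len(per_pair_values)
```

Whether the difference or the ratio is used, the same form should be applied to every frame pair so that the mean over the video is meaningful.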
3. The target parameter is the distortion value.
Referring to Figure 8, if jitter occurs while a video is being shot, the later of two adjacent frames may appear warped relative to the earlier one. For example, in Figure 8 the outline of the building captured in the later frame is no longer a regular straight line but is bent to some degree. This application quantifies this image warping and defines the result as the distortion value.
Based on the above, the calculation of the distortion value is illustrated below with an example:
Referring to Figure 9, assume that the previous frame is normal and the next frame is warped. A uniform grid of straight lines can be laid out on the previous frame, and the coordinates of each grid point (the intersections of the lines) determined. The homography matrix between the two frames is also calculated, and the grid points of the straight grid are transformed into the next frame through the homography matrix. Because of the nonlinear characteristics of the homography matrix, intersection points that were originally collinear are no longer collinear after the transformation, so the grid matched to the next frame is a curved grid rather than a straight one. Referring to Figure 10, the coordinates of the corresponding grid points on the curved grid can be obtained from the coordinates of the grid points on the straight grid and the homography matrix. A fitted straight line corresponding to a curve in the curved grid can then be generated, and the average distance from the grid points on that curve to the fitted straight line is calculated to obtain the distortion value.
It should be noted that, in practice, the distortion value may be calculated from a single curve of the curved grid, or several curves may be selected and their distortion values averaged, which is not limited here.
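The final step, fitting a line to the grid points of one curve and averaging their distances to it, can be sketched as follows. The text does not say which fitting method is used; this sketch assumes a total least squares fit (the line through the centroid along the principal direction of the points). All function names are hypothetical.

```python
import math


def fit_line(points):
    """Total least squares line fit; returns (centroid, unit direction)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # Orientation of the principal axis of the 2D point scatter.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def mean_distance_to_line(points, centroid, direction):
    """Average perpendicular distance from the points to the fitted line."""
    cx, cy = centroid
    dx, dy = direction
    total = sum(abs((p[0] - cx) * dy - (p[1] - cy) * dx) for p in points)
    return total / len(points)


def distortion_value(curve_points):
    """Distortion of one curve: mean distance of its grid points to the fit."""
    centroid, direction = fit_line(curve_points)
    return mean_distance_to_line(curve_points, centroid, direction)


def mean_distortion(curves):
    """Average the distortion values of several curves of the curved grid."""
    return sum(distortion_value(c) for c in curves) / len(curves)
```

Grid points that remain collinear give a distortion value of zero, and the value grows as the points bend away from a straight line.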
203. Evaluate the video to be evaluated according to the target parameters.
In this embodiment, after the target parameters associated with the video to be evaluated have been calculated, the video is evaluated according to them. As can be understood from the description above, when the video to be evaluated is an original video shot by a terminal, a smaller jitter value, crop value, or distortion value indicates that the video shakes less. If the terminal has its own anti-shake function, this also indicates that the terminal's anti-shake function is better. When the video to be evaluated is an original video that has been processed by an anti-shake algorithm, a smaller jitter value, crop value, or distortion value indicates that the anti-shake algorithm applied to the video is better.
It should be noted that, in practice, the video may be evaluated according to one of the three target parameters or according to a combination of several different types of target parameter, which is not limited here. The evaluation result may be expressed in grades, for example "good", "medium", and "poor", or in other forms such as a score, which is not limited here. In addition, a user may select a single video to evaluate, or input several different videos and evaluate them by comparison. For example, an output evaluation result of "Video A is better than Video B" means that video A shakes less than video B, or that the anti-shake algorithm applied to video A is better than that applied to video B, which is not limited here.
The video evaluation process is illustrated below with some examples:
1. Evaluating a single video is straightforward: a video is input and an evaluation result is output. For example, the user inputs the video to be evaluated and selects the terminal device that shot it. The system is preset with the evaluation results corresponding to specific values of each target parameter. For instance, a crop value of 0-10% corresponds to "good", a crop value of 10%-20% corresponds to "medium", and a crop value above 20% corresponds to "poor". The system can therefore generate the evaluation result from the calculated target parameters.
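The preset mapping from a crop value to a grade can be sketched as follows. The thresholds come from the example in the text; how the shared boundaries (exactly 10% and exactly 20%) are assigned is an assumption here, since the stated ranges overlap at those points. The function name `rate_crop_value` is hypothetical.

```python
def rate_crop_value(crop_percent):
    """Map a crop value, in percent, to a grade using the example thresholds."""
    if crop_percent < 0:
        raise ValueError("crop value cannot be negative")
    if crop_percent <= 10:
        # Assumption: exactly 10% is graded "good".
        return "good"
    if crop_percent <= 20:
        # Assumption: exactly 20% is graded "medium".
        return "medium"
    return "poor"
```

The same pattern would apply to the jitter value and the distortion value, each with its own preset thresholds.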
2. Comparative evaluation of several different videos falls into the following cases: first, different videos shot by the same terminal; second, different videos shot by different terminals; third, videos obtained by processing the same original video with different anti-shake algorithms; fourth, videos obtained by processing different original videos with different anti-shake algorithms. For example, suppose the user wants to evaluate two different videos (video A and video B) shot by the same terminal. Because videos shot by the same terminal have the same frame size, the user only needs to input video A and video B, and the system returns the evaluation result after calculation and analysis. If video A and video B are two different original videos shot by the same terminal, the evaluation result may be "Video A shakes less than video B". If video A and video B are two videos processed by different anti-shake algorithms, the evaluation result may be "The anti-shake algorithm of video A is better than that of video B". As another example, suppose the user wants to evaluate videos shot by different terminals (video C and video D). Because videos shot by different terminals may have different frame sizes, the user can also input the terminals that shot video C and video D when inputting the videos, and the system returns the evaluation result after calculation and analysis.
It should be noted that, if the system combines the three types of target parameter in its evaluation, the comparisons of the three parameters may disagree. For example, the jitter value of video A may be smaller than that of video B while the crop value and the distortion value of video A are both larger than those of video B. In that case the system may be preset with a weight for each of the three parameters. If, in order of importance, the jitter value ranks highest, followed by the crop value and then the distortion value, the weights may be 60% for the jitter value, 30% for the crop value, and 10% for the distortion value. The system then no longer compares the three target parameters of video A and video B one by one. Instead, it calculates a weighted value for each video by weighted averaging, compares the weighted values of video A and video B, and outputs the evaluation result. Alternatively, the system may return the calculated values of the target parameters directly to the user, who makes the evaluation, which is not limited here.
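The weighted comparison can be sketched as follows. The weights are those given in the example; the parameter values are made up for illustration and are assumed to be already normalized to a comparable scale, which the text does not specify. The names `WEIGHTS`, `weighted_score`, and `compare_videos` are hypothetical.

```python
# Example weights from the text: jitter 60%, crop 30%, distortion 10%.
WEIGHTS = {"jitter": 0.6, "crop": 0.3, "distortion": 0.1}


def weighted_score(params, weights=WEIGHTS):
    """Weighted combination of the target parameters; lower is better."""
    return sum(weights[name] * params[name] for name in weights)


def compare_videos(name_a, params_a, name_b, params_b):
    """Compare two videos by weighted score and phrase the result."""
    score_a = weighted_score(params_a)
    score_b = weighted_score(params_b)
    if score_a < score_b:
        return f"{name_a} is better than {name_b}"
    if score_b < score_a:
        return f"{name_b} is better than {name_a}"
    return f"{name_a} and {name_b} are equivalent"


# Illustrative values only: A has less jitter but more crop and distortion.
video_a = {"jitter": 0.2, "crop": 0.5, "distortion": 0.4}
video_b = {"jitter": 0.6, "crop": 0.3, "distortion": 0.2}
result = compare_videos("Video A", video_a, "Video B", video_b)
```

With these illustrative numbers the heavier jitter weight decides the comparison in favour of video A, even though video A is worse on the other two parameters.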
It should be noted that the evaluation method of this application applies to all types of anti-shake approach for video. Besides the anti-shake algorithms mentioned above, it can also be used, for example, to evaluate the gimbal used when shooting a video, which is not limited here.
In the embodiments of this application, the video to be evaluated is first obtained, the target parameters associated with it are then calculated, and the video is evaluated according to those target parameters. The target parameters may include a jitter value, a crop value, and a distortion value. The jitter value includes the average jitter displacement between every two adjacent frames of the video to be evaluated, so the quality of an anti-shake approach can be evaluated in terms of jitter amplitude, which makes the evaluation more accurate. In addition, combining the crop value and the distortion value makes the evaluation method more comprehensive.
The video evaluation method of this application has been described above. The devices that execute the video evaluation method are described below.
Referring to FIG. 11, the device that executes the above video evaluation method is a terminal. An embodiment of the terminal in this application includes:
an obtaining unit 1101, configured to obtain a video to be evaluated;
a calculation unit 1102, configured to calculate target parameters associated with the video to be evaluated, where the target parameters include at least one of a jitter value, a crop value, and a distortion value, the jitter value includes the average jitter displacement between every two adjacent frames of the video to be evaluated, the crop value includes the amount of cropping between every two adjacent frames of the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve of the curved grid matched to the video to be evaluated to the fitted straight line corresponding to that curve;
an evaluation unit 1103, configured to evaluate the video to be evaluated according to the target parameters.
Specifically, the steps performed by the obtaining unit 1101 are similar to step 201 in the embodiment shown in FIG. 2, the steps performed by the calculation unit 1102 are similar to step 202 in the embodiment shown in FIG. 2, and the steps performed by the evaluation unit 1103 are similar to step 203 in the embodiment shown in FIG. 2. Details are not repeated here.
Referring to FIG. 12, the device that executes the above video evaluation method is a server. An embodiment of the server in this application includes:
an obtaining unit 1201, configured to obtain a video to be evaluated;
a calculation unit 1202, configured to calculate target parameters associated with the video to be evaluated, where the target parameters include at least one of a jitter value, a crop value, and a distortion value, the jitter value includes the average jitter displacement between every two adjacent frames of the video to be evaluated, the crop value includes the amount of cropping between every two adjacent frames of the video to be evaluated, and the distortion value includes the average distance from the grid points on at least one curve of the curved grid matched to the video to be evaluated to the fitted straight line corresponding to that curve;
an evaluation unit 1203, configured to evaluate the video to be evaluated according to the target parameters.
Specifically, the steps performed by the obtaining unit 1201 are similar to step 201 in the embodiment shown in FIG. 2, the steps performed by the calculation unit 1202 are similar to step 202 in the embodiment shown in FIG. 2, and the steps performed by the evaluation unit 1203 are similar to step 203 in the embodiment shown in FIG. 2. Details are not repeated here.
The server and terminal in the embodiments of this application have been described above from the perspective of modular functional entities. They are described below from the perspective of hardware processing:
FIG. 13 is a schematic structural diagram of a server provided by an embodiment of this application. The server 1300 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 1322 (for example, one or more processors), a memory 1332, and one or more storage media 1330 (for example, one or more mass storage devices) that store application programs 1342 or data 1344. The memory 1332 and the storage medium 1330 may provide transient or persistent storage. A program stored in the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 1322 may be configured to communicate with the storage medium 1330 and to execute, on the server 1300, the series of instruction operations in the storage medium 1330.
The central processing unit 1322 can perform all or part of the actions in the embodiment shown in FIG. 2 according to the instruction operations. Details are not repeated here.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.
An embodiment of this application also provides a terminal, as shown in FIG. 14. For ease of description, only the parts related to the embodiments of this application are shown. For specific technical details that are not disclosed, refer to the method part of the embodiments of this application. The terminal may be any terminal device, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS) terminal, or an in-vehicle computer. The following takes a mobile phone as an example:
FIG. 14 is a block diagram of part of the structure of a mobile phone related to the terminal provided in an embodiment of this application. Referring to FIG. 14, the mobile phone includes a radio frequency (RF) circuit 1410, a memory 1420, an input unit 1430, a display unit 1440, a sensor 1450, an audio circuit 1460, a wireless fidelity (WiFi) module 1470, a processor 1480, a power supply 1490, and other components. Those skilled in the art will understand that the structure shown in FIG. 14 does not limit the mobile phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the mobile phone are described in detail below with reference to FIG. 14:
The RF circuit 1410 can be used to receive and send signals while information is being sent and received or during a call. In particular, it receives downlink information from a base station and passes it to the processor 1480 for processing, and it sends uplink data to the base station. Generally, the RF circuit 1410 includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), and a duplexer. In addition, the RF circuit 1410 can communicate with networks and other devices through wireless communication. This wireless communication can use any communication standard or protocol, including but not limited to global system of mobile communication (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, and short messaging service (SMS).
The memory 1420 can be used to store software programs and modules. The processor 1480 executes the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1420. The memory 1420 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area may store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 1420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input unit 1430 can be used to receive input numeric or character information and to generate key signal input related to the user settings and function control of the mobile phone. Specifically, the input unit 1430 may include a touch panel 1431 and other input devices 1432. The touch panel 1431, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 1431 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1431 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 1480, and can receive and execute commands sent by the processor 1480. The touch panel 1431 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1431, the input unit 1430 may also include other input devices 1432. Specifically, the other input devices 1432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick.
The display unit 1440 can be used to display information input by the user, information provided to the user, and the various menus of the mobile phone. The display unit 1440 may include a display panel 1441, which may optionally be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 1431 may cover the display panel 1441. When the touch panel 1431 detects a touch operation on or near it, it transmits the operation to the processor 1480 to determine the type of touch event, and the processor 1480 then provides the corresponding visual output on the display panel 1441 according to the type of touch event. Although in FIG. 14 the touch panel 1431 and the display panel 1441 are two separate components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 1431 and the display panel 1441 may be integrated to implement those input and output functions.
The mobile phone may also include at least one sensor 1450, such as a light sensor, a motion sensor, or another sensor. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1441 according to the ambient light, and the proximity sensor can turn off the display panel 1441 and/or the backlight when the mobile phone is moved to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and can detect the magnitude and direction of gravity when stationary. It can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that the mobile phone may be equipped with, such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor, are not described here.
音频电路1460、扬声器1461,传声器1462可提供用户与手机之间的音频接口。音频电路1460可将接收到的音频数据转换后的电信号,传输到扬声器1461,由扬声器1461转换为声音信号输出;另一方面,传声器1462将收集的声音信号转换为电信号,由音频电路1460接收后转换为音频数据,再将音频数据输出处理器1480处理后,经RF电路1410以发送给比如另一手机,或者将音频数据输出至存储器1420以便进一步处理。The audio circuit 1460, the speaker 1461, and the microphone 1462 can provide an audio interface between the user and the mobile phone. The audio circuit 1460 can transmit the electrical signal converted from the received audio data to the speaker 1461, which is converted into a sound signal for output by the speaker 1461; on the other hand, the microphone 1462 converts the collected sound signal into an electrical signal, and the audio circuit 1460 After being received, it is converted into audio data, and then processed by the audio data output processor 1480, and sent to, for example, another mobile phone via the RF circuit 1410, or the audio data is output to the memory 1420 for further processing.
WiFi属于短距离无线传输技术,手机通过WiFi模块1470可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图14示出了WiFi模块1470,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变申请的本质的范围内而省略。WiFi is a short-distance wireless transmission technology. The mobile phone can help users send and receive e-mails, browse web pages, and access streaming media through the WiFi module 1470. It provides users with wireless broadband Internet access. Although FIG. 14 shows the WiFi module 1470, it is understandable that it is not a necessary component of the mobile phone, and can be omitted as needed without changing the essence of the application.
处理器1480是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器1420内的软件程序和/或模块,以及调用存储在存储器1420内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器1480可包括一个或多个处理单元;优选的,处理器1480可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主 要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1480中。The processor 1480 is the control center of the mobile phone. It uses various interfaces and lines to connect the various parts of the entire mobile phone. It executes by running or executing software programs and/or modules stored in the memory 1420, and calling data stored in the memory 1420. Various functions and processing data of the mobile phone can be used to monitor the mobile phone as a whole. Optionally, the processor 1480 may include one or more processing units; preferably, the processor 1480 may integrate an application processor and a modem processor, where the application processor mainly processes the operating system, user interface, and application programs, etc. , The modem processor mainly deals with wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 1480.
手机还包括给各个部件供电的电源1490(比如电池),优选的,电源可以通过电源管理系统与处理器1480逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。The mobile phone also includes a power supply 1490 (such as a battery) for supplying power to various components. Preferably, the power supply can be logically connected to the processor 1480 through a power management system, so that functions such as charging, discharging, and power management can be managed through the power management system.
尽管未示出,手机还可以包括摄像头、蓝牙模块等,在此不再赘述。Although not shown, the mobile phone may also include a camera, a Bluetooth module, etc., which will not be repeated here.
In the embodiments of this application, the processor 1480 is specifically configured to perform all or some of the actions in the embodiment shown in FIG. 2; details are not repeated here.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.
Claims (14)
- A video evaluation method, comprising:
  obtaining a video to be evaluated;
  calculating a target parameter associated with the video to be evaluated, wherein the target parameter comprises at least one of a jitter value, a cropping value, and a distortion value; the jitter value comprises the average of the jitter displacements between every two adjacent frames in the video to be evaluated; the cropping value comprises the cropping amount between every two adjacent frames in the video to be evaluated; and the distortion value comprises the average distance from the grid points on at least one curve in a curved grid matched to the video to be evaluated to the fitted straight line corresponding to the at least one curve; and
  evaluating the video to be evaluated according to the target parameter.
- The method according to claim 1, wherein the target parameter is the jitter value, and calculating the target parameter associated with the video to be evaluated comprises:
  obtaining a homography matrix between a first frame and a second frame in the video to be evaluated, the first frame and the second frame being any two adjacent frames in the video to be evaluated;
  determining a first coordinate set of a first vertex set in the first frame;
  calculating, according to the first coordinate set and the homography matrix, a second coordinate set of a second vertex set in the second frame, the first vertex set matching the second vertex set; and
  calculating the jitter value according to the first coordinate set and the second coordinate set.
- The method according to claim 1, wherein the target parameter is the cropping value, and calculating the target parameter associated with the video to be evaluated comprises:
  calculating a first distance from a first feature point on a first frame of the video to be evaluated to the boundary of the first frame;
  calculating a second distance from a second feature point on a second frame of the video to be evaluated to the boundary of the second frame, the first frame and the second frame being any two adjacent frames in the video to be evaluated, and the first feature point matching the second feature point; and
  calculating the cropping value according to the first distance and the second distance.
- The method according to claim 1, wherein the target parameter is the distortion value, and calculating the target parameter associated with the video to be evaluated comprises:
  obtaining, according to the grid points on a target curve in the curved grid, a fitted straight line corresponding to the target curve; and
  calculating the average distance from the grid points to the fitted straight line to obtain the distortion value.
- The method according to any one of claims 1 to 4, wherein the video to be evaluated is an original captured video.
- The method according to claim 5, wherein the video to be evaluated comprises at least a first video to be evaluated and a second video to be evaluated, and evaluating the video to be evaluated according to the target parameter comprises:
  comparing the first video to be evaluated and the second video to be evaluated according to a first target parameter associated with the first video to be evaluated and a second target parameter associated with the second video to be evaluated.
- The method according to any one of claims 1 to 4, wherein the video to be evaluated is a video obtained by processing an original captured video with an anti-shake algorithm.
- The method according to claim 7, wherein the video to be evaluated comprises at least a first video to be evaluated and a second video to be evaluated, the first video to be evaluated uses a first anti-shake algorithm, the second video to be evaluated uses a second anti-shake algorithm, and evaluating the video to be evaluated according to the target parameter comprises:
  comparing the first anti-shake algorithm and the second anti-shake algorithm according to a first target parameter associated with the first video to be evaluated and a second target parameter associated with the second video to be evaluated.
- The method according to claim 5, wherein the smaller the jitter value, the cropping value, and/or the distortion value, the less jitter the video to be evaluated has.
- The method according to claim 7, wherein the smaller the jitter value, the cropping value, and/or the distortion value, the better the anti-shake algorithm used for the video to be evaluated.
- A terminal, comprising:
  a processor, a memory, a bus, and an input/output interface;
  wherein program code is stored in the memory; and
  the processor, when calling the program code in the memory, performs the method according to any one of claims 1 to 10.
- A server, comprising:
  a processor, a memory, a bus, and an input/output interface;
  wherein program code is stored in the memory; and
  the processor, when calling the program code in the memory, performs the method according to any one of claims 1 to 10.
- A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 10.
- A computer program product comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 10.
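The three target parameters named in the claims (jitter value, cropping value, distortion value) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claims do not fix the exact formulas, so the following are assumptions made for illustration only: the vertex set is taken to be the four frame corners, the per-pair jitter displacement is the mean Euclidean corner displacement, the cropping amount is the absolute change in a matched feature point's distance to the nearest frame edge, and the fitted line is a total-least-squares line. All function names are hypothetical.

```python
import math


def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def jitter_value(homographies, width, height):
    """Average corner displacement over all adjacent frame pairs.

    Assumption: one homography per adjacent pair, the vertex set is the
    four frame corners, and the displacement of a pair is the mean
    Euclidean distance between each corner and its mapped position.
    """
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    per_pair = []
    for H in homographies:
        moved = [apply_homography(H, c) for c in corners]
        total = sum(math.hypot(a[0] - b[0], a[1] - b[1])
                    for a, b in zip(corners, moved))
        per_pair.append(total / len(corners))
    return sum(per_pair) / len(per_pair)


def boundary_distance(point, width, height):
    """Distance from a point inside the frame to the nearest frame edge."""
    x, y = point
    return min(x, y, width - x, height - y)


def crop_value(point1, point2, width, height):
    """Assumed cropping amount for one adjacent pair: absolute change in
    the boundary distance of a matched feature point."""
    return abs(boundary_distance(point1, width, height)
               - boundary_distance(point2, width, height))


def distortion_value(grid_points):
    """Mean orthogonal distance from grid points to their fitted line.

    Assumption: the fitted line is the total-least-squares line through
    the centroid, which also handles vertical curves.
    """
    n = len(grid_points)
    cx = sum(x for x, _ in grid_points) / n
    cy = sum(y for _, y in grid_points) / n
    sxx = sum((x - cx) ** 2 for x, _ in grid_points)
    syy = sum((y - cy) ** 2 for _, y in grid_points)
    sxy = sum((x - cx) * (y - cy) for x, y in grid_points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # line direction
    nx, ny = -math.sin(theta), math.cos(theta)      # unit normal to the line
    return sum(abs((x - cx) * nx + (y - cy) * ny)
               for x, y in grid_points) / n
```

Under these assumptions, two stabilized videos (or two anti-shake algorithms) would be compared as in claims 6 and 8 by computing the same values for each and preferring the smaller ones.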
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910202676.1A CN110062222B (en) | 2019-03-15 | 2019-03-15 | Video evaluation method, terminal, server and related products |
CN201910202676.1 | 2019-03-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020187065A1 (en) | 2020-09-24 |
Family
ID=67316142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/078320 WO2020187065A1 (en) | 2019-03-15 | 2020-03-07 | Video evaluation method, terminal, server, and related product |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110062222B (en) |
WO (1) | WO2020187065A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110062222B (en) * | 2019-03-15 | 2021-06-29 | 华为技术有限公司 | Video evaluation method, terminal, server and related products |
CN110351579B (en) * | 2019-08-16 | 2021-05-28 | 深圳特蓝图科技有限公司 | Intelligent video editing method |
CN112584341B (en) * | 2019-09-30 | 2022-12-27 | 华为云计算技术有限公司 | Communication method and device |
CN114401395A (en) * | 2021-12-30 | 2022-04-26 | 中铁第四勘察设计院集团有限公司 | Method and system for detecting loose installation of camera based on video intelligent analysis |
CN117012228A (en) * | 2023-07-28 | 2023-11-07 | 支付宝(杭州)信息技术有限公司 | Method and device for training evaluation model and evaluating video quality |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006295626A (en) * | 2005-04-12 | 2006-10-26 | Canon Inc | Fish-eye image processing apparatus, method thereof and fish-eye imaging apparatus |
CN102572501A (en) * | 2010-12-23 | 2012-07-11 | 华东师范大学 | Video quality evaluation method and device capable of taking network performance and video self-owned characteristics into account |
CN103700069A (en) * | 2013-12-11 | 2014-04-02 | 武汉工程大学 | ORB (object request broker) operator-based reference-free video smoothness evaluation method |
CN105812788A (en) * | 2016-03-24 | 2016-07-27 | 北京理工大学 | Video stability quality assessment method based on interframe motion amplitude statistics |
CN110062222A (en) * | 2019-03-15 | 2019-07-26 | 华为技术有限公司 | A kind of evaluation method of video, terminal, server and Related product |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104349039B (en) * | 2013-07-31 | 2017-10-24 | 展讯通信(上海)有限公司 | Video anti-fluttering method and device |
CN104135597B (en) * | 2014-07-04 | 2017-12-15 | 上海交通大学 | A kind of video jitter automatic testing method |
CN106251317B (en) * | 2016-09-13 | 2018-12-18 | 野拾(北京)电子商务有限公司 | Space photography stabilization processing method and processing device |
CN107046640B (en) * | 2017-02-23 | 2018-09-07 | 北京理工大学 | It is a kind of based on interframe movement slickness without reference video stabilised quality evaluation method |
- 2019-03-15: CN application CN201910202676.1A, patent CN110062222B (en), status: Active
- 2020-03-07: WO application PCT/CN2020/078320, publication WO2020187065A1 (en), status: Application Filing
Non-Patent Citations (1)
Title |
---|
LIU, SHUAICHENG ET AL.: "Bundled Camera Paths for Video Stabilization", ACM TRANSACTIONS ON GRAPHICS, vol. 32, no. 4, 31 July 2013 (2013-07-31), XP055282111, DOI: 20200604082319Y * |
Also Published As
Publication number | Publication date |
---|---|
CN110062222B (en) | 2021-06-29 |
CN110062222A (en) | 2019-07-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20773662; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20773662; Country of ref document: EP; Kind code of ref document: A1 |