CN109495772B - Video quality ranking method and system - Google Patents


Info

Publication number
CN109495772B
Authority
CN
China
Prior art keywords
video data
video
evaluation information
pairs
sequencing
Prior art date
Legal status
Active
Application number
CN201710811287.XA
Other languages
Chinese (zh)
Other versions
CN109495772A (en)
Inventor
徐浩晖
江文斐
梅大为
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN201710811287.XA
Publication of CN109495772A
Application granted
Publication of CN109495772B
Anticipated expiration

Classifications

    • H04N21/26258 — Content or additional data distribution scheduling for generating a list of items to be played back in a given order, e.g. a playlist, or scheduling item distribution according to such a list (under H04N21/00, Selective content distribution, e.g. interactive television or video on demand [VOD])
    • G06F18/2113 — Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation (under G06F18/00, Pattern recognition)
    • H04N21/458 — Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; updating operations, e.g. for OS modules; time-related management operations
    • H04N21/8133 — Monomedia components involving additional data specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • H04N21/8456 — Structuring of content by decomposing the content in the time domain, e.g. in time segments

Abstract

Embodiments of the present application disclose a video quality ranking method and system. The method provides a video data set that includes at least one video data pair, and comprises the following steps: reading a target video data pair from the video data set; receiving evaluation information published by a user for the target video data pair; constructing a training sample based on the video data in the target pair and inputting it into a ranking model to obtain a ranking result, where the ranking result indicates the video data with the better quality in the target pair; and correcting the ranking model according to the difference between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the resulting ranking result is consistent with the evaluation information. The technical scheme provided by the application can improve ranking efficiency.

Description

Video quality ranking method and system
Technical Field
The present application relates to the field of computer technologies, and in particular to a method and a system for ranking video quality.
Background
With the continuous development of computer technology, many video editing technologies have emerged, including, for example, video transcoding, video compression, and video beautification.
Currently, when an original video recorded by a video recording device is uploaded to a video playing website, the website usually limits the data size and format of the video, so the original video needs to be compressed and/or transcoded. Videos processed by different compression and transcoding techniques generally exhibit different effects; for example, picture clarity and/or speech intelligibility may change.
In addition, to make the video picture more attractive to the user, the original video can be beautified. Likewise, videos processed with different beautification techniques may exhibit different effects.
To select, from among several video editing techniques, one that users find acceptable, a video quality ranking method can currently be used. Specifically, the user can be shown both the edited video, processed by the video editing technique, and the original video, which has not been processed. After watching the edited video and the original video, the user scores each based on its picture quality, and the two videos are ranked according to the scores. Whether the video editing technique currently in use is accepted by the user can then be judged from the ranking result.
However, in the prior art, ranking video quality in this way relies on people manually watching the videos, which is both costly and inefficient.
Disclosure of Invention
Embodiments of the present application aim to provide a video quality ranking method and system that can improve the efficiency of ranking.
To achieve the above object, an embodiment of the present application provides a video data set that includes at least one video data pair, where the two video data in a pair derive from the same original video data and at least one of the two is obtained by processing that original video data. The method comprises: reading a target video data pair from the video data set, and sequentially displaying the videos represented by the two video data in the target pair to a user; receiving evaluation information published by the user for the target video data pair, where the evaluation information identifies the video data with the better quality in the target pair; constructing a training sample based on the video data in the target pair and inputting it into a ranking model to obtain a ranking result, where the ranking result indicates the video data with the better quality in the target pair; and correcting the ranking model according to the difference between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the resulting ranking result is consistent with the evaluation information.
To achieve the above object, the present application further provides a video quality ranking system comprising a memory and a processor. The memory stores a computer program and a video data set; the video data set includes at least one video data pair, the two video data in a pair derive from the same original video data, and at least one of the two is obtained by processing that original video data. When executed by the processor, the computer program implements the following functions: reading a target video data pair from the video data set, and sequentially displaying the videos represented by the two video data in the target pair to a user; receiving evaluation information published by the user for the target video data pair, where the evaluation information identifies the video data with the better quality in the target pair; constructing a training sample based on the video data in the target pair and inputting it into a ranking model to obtain a ranking result indicating the video data with the better quality in the target pair; and correcting the ranking model according to the difference between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the resulting ranking result is consistent with the evaluation information.
As can be seen, with the technical scheme provided by the application, after the videos represented by the two video data of a target video data pair are shown to a user, evaluation information published by the user for that pair can be received. The evaluation information identifies the video data the user finds to be of better quality. If the better-quality video data is a processed one, the video editing technique used in the processing is approved by the user; otherwise, the user does not approve of that technique. A large number of comparable video data pairs can thus be provided to users and their evaluation feedback collected. The ranking model can then be trained using the video data in these pairs as training samples. During training, the ranking model outputs a ranking result for each pair. Because the initial ranking model may be inaccurate, the ranking result may be inconsistent with the evaluation information fed back by the user. In that case, the ranking model can be corrected according to the difference between the ranking result and the evaluation information, so that the corrected model correctly predicts the pair's ranking. Once training is complete, whenever the quality of the videos represented by a new video data pair needs to be ranked, the new pair can be fed directly into the trained ranking model as input, and the model outputs its ranking result.
In this way, the technical scheme provided by the application combines user evaluation information with machine learning to train a high-precision ranking model, so that new video data pairs can be ranked automatically and the efficiency of video quality ranking is improved.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a video quality ranking method according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram illustrating interception of video clip data according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram illustrating determination of a plurality of video segment data according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a video quality ranking system according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. The described embodiments are obviously only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without inventive work shall fall within the scope of protection of the present application.
The application provides a video quality ranking method that can be applied to a terminal device with data processing capability, for example a desktop computer, a notebook computer, a tablet computer, or a workstation. In this embodiment, a video data set may be provided in the terminal device, and the video data set includes at least one video data pair. A video data pair comprises two comparable video data that share the same original video data. A video datum in the pair may be the original video data itself, or may be obtained by processing the original video data. For example, suppose the original video data is data recorded by a video recording apparatus that has not undergone any video editing. Then, in the pair, one video datum may be the original video data and the other may be obtained by processing the original with a video editing technique. Of course, both video data in the pair may be products of video editing, as long as the techniques used differ: for example, one may be obtained by compressing the pictures of the original video data and the other by beautifying them. In a practical application scenario, to ensure the pair reflects a real difference, at least one of the two video data should be obtained by processing the original video data. Because both correspond to the same original video data, the video content and duration of the two are the same, but the displayed pictures or the audio effects differ.
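The constraints on a video data pair described above can be sketched as a small data structure. All names here (`VideoPair`, `reflects_editing_difference`, the field names) are illustrative and not taken from the patent:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VideoPair:
    """A comparable pair of video data sharing one original recording."""
    original_id: str               # identifier of the shared original video
    video_a: str                   # first version (may be the original itself)
    video_b: str                   # second version
    edit_a: Optional[str] = None   # editing technique applied to video_a; None = unprocessed
    edit_b: Optional[str] = None   # editing technique applied to video_b

    def reflects_editing_difference(self) -> bool:
        # At least one of the two must be a processed version, so the
        # pair reflects a real difference introduced by video editing.
        return self.edit_a is not None or self.edit_b is not None

pair = VideoPair("orig_001", "orig_001.mp4", "orig_001_min.mp4",
                 edit_b="picture compression")
```

Both videos carry the same `original_id`, mirroring the requirement that they share identical content and duration while differing only in picture or audio effect.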
Referring to fig. 1, the method for sorting video quality provided by the present application may include the following steps.
S1: and reading a group of target video data pairs in the video data set, and sequentially displaying videos represented by two video data in the target video data pairs to a user.
In this embodiment, the terminal device may read a target video data pair of the video data set from memory. The target pair may be a randomly chosen pair in the set. After reading it, the two video data it contains may be rendered in sequence so that the videos they represent can be presented on a display; the user can then watch the corresponding videos on the display of the terminal device. Alternatively, the user may use an electronic device different from the terminal device: after reading the target video data, the terminal device can send it to the user's electronic device, where rendering and display take place.
In this embodiment, the order in which the two videos are presented to the user may be random, to prevent the user from guessing which video data corresponds to the video currently being watched, which would make the ranking result inaccurate. In addition, once a user has watched the videos represented by a given video data pair, that pair is no longer provided to the same user, since evaluations made after the user has become familiar with the video content are not accurate enough. Pausing, reviewing, and fast-forwarding may also be prohibited during playback, to ensure the video is watched in full and cannot be watched repeatedly.
In one embodiment, it is considered that if a user watches a long video, the user is likely to remember only the most recent part of the content and to base the evaluation information on that part alone. In view of this, in the present embodiment, shorter video content can be provided to the user each time, ensuring the accuracy of the evaluation information the user gives. Specifically, after reading a target video data pair to be provided to a user, the terminal device may determine whether the playing duration of the video data in the target pair is greater than or equal to a specified duration threshold. The specified duration threshold may be a preset time constant, for example 20 seconds. When the playing duration reaches or exceeds the threshold, a video clip can be cut from each video and provided to the user instead. Specifically, referring to fig. 2, video clip data with the same playing position and the same playing duration may be cut from the two video data of the target pair to form a video clip data pair; the videos represented by the two clips share the same start time node and end time node. For example, if both videos in the target pair read by the terminal device are 5 minutes long and the specified duration threshold is 30 seconds, a 25-second clip can be cut from the same playing position in each of the two videos, yielding a video clip data pair composed of the two clips.
It should be noted that, in a practical application scenario, to ensure the randomness of the captured clips, the start time node of the clip may be chosen at random, and the clip duration may be chosen at random within a certain range. For example, if the range is 10 to 30 seconds, the duration of the final clip is a random value between 10 and 30 seconds.
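A sketch of this clip-interception logic — random start, random duration within a range, and the same window applied to both videos of a pair — is shown below. The function name and the default values are assumptions for illustration, not specified by the patent:

```python
import random

def intercept_clip(duration_s, threshold_s=20.0, min_len=10.0, max_len=30.0,
                   rng=None):
    """Choose a (start, end) window in seconds to cut from BOTH videos of a pair.

    Returns None when the video is shorter than the threshold, in which
    case the full videos are shown instead of clips.
    """
    rng = rng or random.Random()
    if duration_s < threshold_s:
        return None
    # Random clip duration within the configured range, capped by the video.
    clip_len = rng.uniform(min_len, min(max_len, duration_s))
    # Random start position such that the clip fits inside the video.
    start = rng.uniform(0.0, duration_s - clip_len)
    # The same window is used for both videos of the pair, so the two
    # clips share identical start and end time nodes.
    return (start, start + clip_len)
```

For a 5-minute (300 s) video this yields a 10–30 s window somewhere inside it; a 15-second video, being under the threshold, would be shown whole.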
In this embodiment, after the video clip data pair is cut, the clips represented by its two video clip data may be shown to the user in sequence. In particular, the two clips may be rendered and presented to the user in random order.
S3: and receiving evaluation information issued by the user aiming at the target video data pair, wherein the evaluation information is used for judging the video data with better quality in the target video data pair.
In this embodiment, after watching the two videos represented by a target video data pair, the user may publish evaluation information for the pair. The evaluation information identifies the video data with the better quality in the target pair. In particular, after watching both videos the user may choose among provided options, for example "the first video is better", "the second video is better", and "the two videos do not differ significantly". After the user selects an option, the terminal device receives the evaluation information corresponding to that option, obtaining the user's judgment of the two video data in the target pair.
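The three options can be mapped to numeric evaluation labels. The particular encoding (+1 / -1 / 0) and the option strings are assumptions chosen for illustration; the patent does not fix an encoding:

```python
def evaluation_from_option(option):
    """Map the user's selected option to a numeric evaluation label.

    +1: the first video is better; -1: the second video is better;
     0: the two videos do not differ significantly.
    """
    mapping = {
        "first video better": 1,
        "second video better": -1,
        "no significant difference": 0,
    }
    return mapping[option]
```

A numeric label in this form lines up directly with the ranking result the model outputs in step S5, making the comparison in step S7 a simple subtraction.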
In one embodiment, when the playing duration of the video data in the target pair is long and video clip data must be cut from it, the clips represented by the two video clip data in the cut clip pair may be shown to the user. After watching the clips, the user selects the corresponding option, producing clip evaluation information for the clip pair. The clip evaluation information identifies the video clip data with the better quality in the clip pair.
In this embodiment, after the user identifies the better-quality clip in a clip pair, the better-quality video data in the target pair can be determined from that judgment. In one application scenario, to speed up the determination of video quality, only one video clip pair may be cut from a target video data pair, and the clip evaluation information the user publishes for it may be taken directly as the evaluation information of the target pair. Specifically, the video data to which the better-quality clip belongs is regarded as the better-quality video data in the target pair. For example, suppose the target pair contains video data A and video data B, clip A1 is cut from A, and clip B1 is cut from B. If the user's clip evaluation information indicates that clip A1 is better than clip B1, video data A may be considered better than video data B.
In another embodiment, to avoid errors caused by a single evaluation of video clip data, multiple video clip pairs may be cut from the target pair, and the evaluation information of the target pair may be obtained by combining the clip evaluation information of all of them.
In the present embodiment, at least two video clip pairs are cut from a target video data pair. For each clip pair, local evaluation information published by the user for that pair is obtained. The local evaluation information identifies the video data to which the better-quality clip belongs. For example, 10 clip pairs may be cut from the target pair, and the user publishes local evaluation information for each. Beyond identifying the better clip in each pair, the local evaluation information also identifies the video data that clip belongs to: in the example of the previous embodiment, when clip A1 is better than clip B1, the local evaluation information indicates that the better-quality clip belongs to video data A.
In this way, after each clip pair has been evaluated, several pieces of local evaluation information are obtained, each naming one of the two video data. The number of identical pieces of local evaluation information can then be counted, where two pieces are identical if they name the same video data. For example, referring to fig. 3, among 4 pieces of local evaluation information it can be found that 3 indicate the better clip belongs to video data A and 1 indicates it belongs to video data B. The video data named by the most frequent local evaluation information can then be taken, on the minority-obeys-majority principle, as the better-quality video data in the target pair; in this example, video data A. Of course, in a practical application scenario there may be a tie for most frequent, in which case additional users can be asked to evaluate until a single most frequent local evaluation emerges.
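The majority-vote aggregation over local evaluation information can be sketched as follows; function and variable names are illustrative:

```python
from collections import Counter

def aggregate_local_evaluations(local_evals):
    """Pick the better video by majority vote over per-clip judgments.

    local_evals: a list naming, for each clip pair, the video data the
    better-quality clip belonged to. Returns the winning video data, or
    None on a tie, in which case more users would be asked to evaluate.
    """
    counts = Counter(local_evals).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no single most-frequent local evaluation yet
    return counts[0][0]

# The example from the description: 3 of 4 clip judgments favour video A.
assert aggregate_local_evaluations(["A", "A", "B", "A"]) == "A"
```

Returning `None` on a tie leaves room for the fallback described above: keep collecting evaluations until one video data is strictly most frequent.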
S5: and constructing a training sample based on the video data in the target video data pair, and inputting the training sample into a sequencing model to obtain a sequencing result, wherein the sequencing result is used for representing the video data with better quality in the target video data pair.
In the present embodiment, a large number of video data pairs can be provided in step S1, and the corresponding evaluation information obtained in step S3. These pairs and their evaluation information can serve as training samples for machine learning, to train a model that matches the users' evaluation behavior: the model evaluates input video data and outputs a ranking result for it.
In this embodiment, a training sample may be constructed based on the video data in the target pair and input into a ranking model. In a practical application scenario, the ranking model may be a RankSVM model, and the training samples may be constructed in the form RankSVM expects. Specifically, a first feature vector of the first video data and a second feature vector of the second video data in the target pair are extracted; a feature vector may be obtained by applying a Convolutional Neural Network (CNN) to the pictures of the video represented by the video data. From the first and second feature vectors, a forward training sample and a reverse training sample are constructed; the two differ in the order of the feature vectors. In the forward sample the first feature vector precedes the second: the i-th forward training sample can be written (x_i, y_i), where x_i is the first feature vector and y_i the second. Correspondingly, in the reverse sample the second feature vector precedes the first: the i-th reverse training sample can be written (y_i, x_i).
In this embodiment, the forward and reverse training samples are associated with respective theoretical values: the forward sample with a first theoretical value and the reverse sample with a second theoretical value. A theoretical value represents the theoretical ranking result corresponding to the order of the feature vectors in the sample. For example, if the user's evaluation information for the first and second video data indicates that the first has the better quality, the forward sample's first theoretical value may be 1, because the first feature vector is ranked before the second, matching the actual quality order of the video data. In the reverse sample the second feature vector comes before the first, so its second theoretical value may be -1, indicating that the order of the feature vectors is opposite to the actual quality order. The first and second theoretical values thus differ from each other.
In this embodiment, after the forward and reverse training samples are obtained, each may be input into the ranking model, yielding a first actual value for the forward sample and a second actual value for the reverse sample. It should be noted that in the initial training phase the parameters of the ranking model may be inaccurate, so the first actual value may differ from the first theoretical value and the second actual value from the second theoretical value.
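The forward/reverse sample construction and a RankSVM-style pairwise prediction can be sketched as below. The linear scorer sign(w · (a − b)) is the standard RankSVM formulation; the concrete feature vectors and the weight vector are invented here purely for illustration:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def make_training_samples(x, y, first_is_better=True):
    """Build the forward sample (x, y) and the reverse sample (y, x).

    x, y: CNN feature vectors of the first and second video data.
    The theoretical value is +1 when the feature order matches the
    quality order and -1 when it is reversed.
    """
    t = 1 if first_is_better else -1
    return [((x, y), t), ((y, x), -t)]

def predict(w, pair):
    """Pairwise ranking prediction: sign of w . (first - second)."""
    a, b = pair
    diff = [ai - bi for ai, bi in zip(a, b)]
    return 1 if dot(w, diff) >= 0 else -1

# Hypothetical feature vectors where the first video is the better one.
x, y = [0.9, 0.1], [0.2, 0.8]
samples = make_training_samples(x, y, first_is_better=True)
w = [1.0, -1.0]  # a weight vector that happens to rank x above y
assert all(predict(w, pair) == t for pair, t in samples)
```

Training both orderings of the same pair with opposite theoretical values is what makes the pairwise labels symmetric: a model that scores the forward sample +1 must score the reverse sample -1.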
S7: correcting the ranking model according to the difference value between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the obtained ranking result is consistent with the evaluation information.
In this embodiment, the parameters of the ranking model are initially default values set when the model is constructed, and these defaults may not fit the current training samples. The ranking model therefore needs to be corrected according to the ranking result it outputs and the user's actual evaluation information.
In this embodiment, the ranking result output by the ranking model may be a numerical value representing the order of the two video data. For example, a value of 1 may indicate that the first video data is ranked before the second video data, and a value of -1 may indicate that the second video data is ranked before the first. When the ranking result is inconsistent with the evaluation information actually published by the user for the two video data, the parameters in the ranking model need to be adjusted. Specifically, a difference value between the ranking result and the evaluation information may be calculated and input to the ranking model as feedback, so that the ranking model can adjust its internal parameters. In a practical application scenario, the internal parameters may form one or more parameter matrices, whose values are adjusted in fixed value intervals. When the ranking model receives the difference value between the ranking result and the evaluation information, the amount to adjust can be determined from the size of the difference value and the value interval. For example, if the difference value is 3 times the maximum allowable error, the adjustment may be set to 3 times the value interval. In this way, the larger the difference value, the larger the adjustment.
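The proportional rule above can be illustrated with a toy calculation. The patent does not give an exact formula, so `adjustment_magnitude` and its parameters are hypothetical names for a sketch of the stated rule:

```python
def adjustment_magnitude(difference, max_allowed_error, value_interval):
    """Hypothetical illustration: the parameter adjustment is as many
    value intervals as the difference value is multiples of the
    maximum allowable error, so larger differences produce
    proportionally larger adjustments."""
    multiples = abs(difference) / max_allowed_error
    return multiples * value_interval

# A difference of 3x the maximum allowable error yields an adjustment
# of 3x the value interval, matching the example in the text.
step = adjustment_magnitude(6.0, 2.0, 0.5)
```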
In this embodiment, when the input training samples are a forward training sample and a reverse training sample, the ranking model may be corrected by using the training results of the forward training sample and the training results of the reverse training sample, respectively. Specifically, after the forward training sample and the reverse training sample are input into a ranking model to obtain a first actual value corresponding to the forward training sample and a second actual value corresponding to the reverse training sample, a first difference value between the first actual value and the first theoretical value may be calculated, and a parameter in the ranking model is corrected by using the first difference value. Likewise, a second difference value between the second actual value and the second theoretical value may be calculated, and the parameter in the ranking model may be corrected using the second difference value.
In this embodiment, after the ranking model is corrected, the training samples may be input into the corrected ranking model again, and correction may be repeated according to the ranking results obtained, until the ranking results match the evaluation information. In this way, through training on a large amount of data, the parameters of the ranking model can be made suitable for most video data.
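The correct-and-reinput loop can be sketched minimally as follows, assuming a simple linear pairwise model with a perceptron-style update as a stand-in for the actual ranking model (the patent mentions RankSVM; this is only an analogy, and all names are illustrative):

```python
import numpy as np

def train_pairwise_ranker(samples, lr=0.1, max_epochs=100):
    """Keep re-inputting the training samples and correcting the
    parameters until every ranking result agrees with the user's
    evaluation. `samples` holds ((x, y), label) pairs where label is
    the theoretical value +1 or -1."""
    dim = len(samples[0][0][0])
    w = np.zeros(dim)                        # the model's parameters
    for _ in range(max_epochs):
        consistent = True
        for (x, y), label in samples:
            score = np.dot(w, x - y)         # ranking result for this pair
            if score * label <= 0:           # disagrees with the evaluation
                w += lr * label * (x - y)    # correct the parameters
                consistent = False
        if consistent:                       # all results match: stop
            break
    return w
```

After training, the sign of `np.dot(w, x - y)` plays the role of the ranking result for a new pair of feature vectors.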
In one embodiment, after the training phase of the ranking model is completed, the ranking model can be used to rank two video data to be ranked. The two pieces of video data to be sorted may be data that has not been evaluated by the user, and since the parameters in the sorting model have been adjusted in the training phase, the sorting model may process the two pieces of video data to be sorted based on the parameters adapted to the evaluation behavior of the user, thereby obtaining the sorting result corresponding to the two pieces of video data to be sorted.
In one embodiment, multiple video editing methods may be used to process the same original video data, thereby obtaining different processed video data. To compare the merits of these video editing methods, the processed video data can be ranked. In this case the number of video data to be ranked is at least three. To ensure that the ranking model, which operates on pairs, can still be applied, the video data to be ranked may be grouped two by two into sub-video data pairs. For example, if there are three video data A, B, and C to be ranked, the three sub-video data pairs (A, B), (A, C), and (B, C) may be formed, each containing two video data. The sub-video data pairs can then be ranked in turn using the ranking model described above: each pair is input into the corrected ranking model, yielding a ranking result for the two video data in that pair. For example, the ranking results of the three sub-pairs might be A better than B, A better than C, and B better than C. Based on these pairwise results, the video data to be ranked can then be ordered; in this example, the final ranking would be A best, B second, and C worst. In this way the quality of the videos can be ranked, reflecting the relative merits of the various video editing methods.
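The two-by-two grouping and aggregation above can be sketched as follows. Here `better(a, b)` stands in for the trained ranking model, and tallying pairwise wins is one reasonable aggregation; the patent does not prescribe a specific one:

```python
from itertools import combinations

def rank_videos(videos, better):
    """Order three or more videos from pairwise ranking results.
    `better(a, b)` returns True when video a's quality beats video b's;
    each video's win count over all two-by-two sub-pairs determines
    the final order."""
    wins = {v: 0 for v in videos}
    for a, b in combinations(videos, 2):     # every sub-video data pair
        if better(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
    return sorted(videos, key=lambda v: wins[v], reverse=True)

# Stub comparator reproducing the example: A beats B, A beats C, B beats C.
order = rank_videos(["A", "B", "C"], lambda a, b: a < b)
# order is ["A", "B", "C"]: A best, B second, C worst.
```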
In one embodiment, the environment in which the user watches the videos may influence the final evaluation information. For example, two different video data may appear quite different when played on a higher-resolution device than on a lower-resolution one: on the lower-resolution device the difference between the two videos may be barely perceptible, while on the higher-resolution device a clear difference can be found. Therefore, when receiving the evaluation information published by the user for the target video data pair, the environment information under which the videos represented by the pair were played can be recorded as well and associated with the evaluation information. The environment information may include, for example, the resolution of the terminal device that plays the video, its model, or the size of its display screen. After environment information is introduced, a ranking model corrected based on the evaluation information can be dedicated to ranking video data pairs in the environment characterized by that environment information. For example, if the ranking model was trained on evaluations made in the playback environment of a large-screen mobile phone, it can subsequently be used specifically to rank videos played on large-screen mobile phones, giving it higher ranking precision.
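One way to realize this, sketched under the assumption that a separate ranking model is kept per recorded environment (all names here are illustrative, not from the patent):

```python
def new_ranking_model():
    """Stand-in constructor for a fresh ranking model (hypothetical)."""
    return {"corrections": 0}

models = {}   # one ranking model per playback environment

def record_evaluation(env, first_is_better):
    """Associate an evaluation with its playback environment, so that
    only the model dedicated to that environment is corrected by it.
    `env` might bundle resolution, device model and screen size."""
    if env not in models:
        models[env] = new_ranking_model()
    models[env]["corrections"] += 1   # a real system would run step S7 here
    return models[env]

large_phone = ("1440x3200", "large-screen phone")
small_phone = ("720x1280", "small-screen phone")
record_evaluation(large_phone, True)
record_evaluation(large_phone, False)
record_evaluation(small_phone, True)
```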
Referring to fig. 4, the present application further provides a video quality ranking system, which includes a memory and a processor, where the memory stores a computer program and a video data set, the video data set includes at least one group of video data pairs, and two video data in the video data pairs have the same original video data; at least one of the two video data is obtained by processing the original video data; the computer program, when executed by the processor, implements the following functions.
S1: reading a group of target video data pairs in the video data set, and sequentially displaying the videos represented by the two video data in the target video data pair to a user.
S3: receiving evaluation information published by the user for the target video data pair, the evaluation information indicating the video data with better quality in the target video data pair.
S5: constructing a training sample based on the video data in the target video data pair, and inputting the training sample into a ranking model to obtain a ranking result, the ranking result representing the video data with better quality in the target video data pair.
S7: correcting the ranking model according to the difference value between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the obtained ranking result is consistent with the evaluation information.
In this embodiment, presenting to the user videos of two video data representations in the target video data pair comprises:
when the playing time length corresponding to the video data in the target video data pair is greater than or equal to a specified time length threshold value, respectively intercepting video segment data pairs with the same playing position and the same playing time length from the two video data of the target video data pair, and sequentially displaying video segments represented by the two video segment data in the video segment data pairs to a user.
In this embodiment, the computer program, when executed by the processor, further implements the following functions:
receiving segment evaluation information of the user aiming at the video segment data pairs, wherein the segment evaluation information is used for judging the video segment data with better quality in the video segment data pairs;
and taking the video data to which the video segment data with the better quality belongs as the video data with the better quality in the target video data pair.
In this embodiment, the computer program, when executed by the processor, further implements the following functions:
respectively extracting a first feature vector of first video data and a second feature vector of second video data in the target video data pair, and constructing a forward training sample and a reverse training sample based on the extracted first feature vector and second feature vector; wherein the first feature vector in the forward training samples precedes the second feature vector, and the second feature vector in the reverse training samples precedes the first feature vector.
In this embodiment, the forward training sample is associated with a first theoretical value, and the reverse training sample is associated with a second theoretical value; wherein the first theoretical value and the second theoretical value are different;
accordingly, the computer program, when executed by the processor, further implements the functions of:
respectively inputting the forward training samples and the reverse training samples into a sequencing model to obtain first actual values corresponding to the forward training samples and second actual values corresponding to the reverse training samples;
calculating a first difference value between the first actual value and the first theoretical value, and correcting the parameters in the sequencing model by using the first difference value;
and calculating a second difference value between the second actual value and the second theoretical value, and correcting the parameters in the sequencing model by using the second difference value.
In this embodiment, the computer program, when executed by the processor, further implements the following functions:
recording the environment information under which the videos represented by the target video data pair are played, and associating the environment information with the evaluation information;
accordingly, the ranking model corrected based on the evaluation information is used to rank the pairs of video data in the environment characterized by the environment information.
In this embodiment, the Memory includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card).
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth.
The specific functions implemented by the memory and the processor of the video quality ranking system provided in the embodiments of the present description may be explained in comparison with the foregoing embodiments in the present description, and can achieve the technical effects of the foregoing embodiments, and will not be described herein again.
Therefore, according to the technical solution provided by the present application, after the videos represented by the two video data in a group of target video data pairs are displayed to a user, evaluation information published by the user for the target video data pair can be received. The evaluation information may characterize the video data that the user finds to be of better quality. If the video data with better quality is processed video data, the video editing technique adopted in the processing is approved by the user; otherwise, the user does not approve of that technique. In this way, a large number of comparable video data pairs can be provided to users, and the evaluation information they feed back can be collected. The ranking model may then be trained using the video data in the video data pairs as training samples. During training, the ranking model outputs ranking results for the video data pairs, but since the initial ranking model may be inaccurate, a ranking result may not be consistent with the evaluation information fed back by the user. In this case, the ranking model may be corrected according to the difference value between the ranking result and the evaluation information, so that the corrected ranking model can correctly predict the ranking result of the pair of video data. After the training process is completed, when the quality of the videos represented by a new video data pair needs to be ranked, the new pair can be used directly as input to the trained ranking model, which outputs its ranking result.
Therefore, the technical scheme provided by the application can be combined with the evaluation information of the user and the machine learning method to train the sequencing model with higher precision, so that new video data pairs can be automatically sequenced, and the efficiency of video quality sequencing is improved.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures: designers almost always obtain a corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, such programming is nowadays mostly implemented with "logic compiler" software rather than by manually making the integrated circuit chip. A logic compiler is similar to the software compiler used in program development, and the source code to be compiled must be written in a particular programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that hardware circuitry that implements the logical method flows can be readily obtained by merely slightly programming the method flows into an integrated circuit using the hardware description languages described above.
It is also known to the person skilled in the art that, in addition to implementing the video quality ranking system in the form of pure computer-readable program code, it is entirely possible to logically program the method steps so that the system implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a video quality ranking system may therefore be considered a hardware component, and the means included therein for performing the various functions may also be considered structures within the hardware component; or the means for performing the functions may even be regarded as both software modules for performing the method and structures within the hardware component.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, embodiments of the ranking system for video quality may be explained with reference to the introduction of embodiments of the method described above.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Although the present application has been described in terms of embodiments, those of ordinary skill in the art will recognize that there are numerous variations and permutations of the present application without departing from the spirit of the application, and it is intended that the appended claims encompass such variations and permutations without departing from the spirit of the application.

Claims (15)

1. A method for ordering video quality is characterized in that a video data set is provided, the video data set comprises at least one group of video data pairs, and two video data in the video data pairs have the same original video data; wherein at least one of the two video data is obtained by processing the original video data, and the method comprises:
reading a group of target video data pairs in the video data set, and sequentially displaying videos represented by two video data in the target video data pairs to a user; wherein the target video data pair is a random set of video data pairs in the video data set;
receiving evaluation information issued by the user for the target video data pair, wherein the evaluation information is used for judging video data with better quality in the target video data pair;
constructing a training sample based on the video data in the target video data pair, and inputting the training sample into a sequencing model to obtain a sequencing result, wherein the sequencing result is used for representing the video data with better quality in the target video data pair;
correcting the ranking model according to the difference value between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the obtained ranking result is consistent with the evaluation information; repeatedly utilizing the video data pairs in the video data set to train to obtain a trained sequencing model; the trained ranking model is used for ranking the quality of the video data to be ranked.
2. The method of claim 1, wherein presenting to a user videos that are representative of both video data in the target video data pair comprises:
when the playing time length corresponding to the video data in the target video data pair is greater than or equal to a specified time length threshold value, respectively intercepting video segment data pairs with the same playing position and the same playing time length from the two video data of the target video data pair, and sequentially displaying video segments represented by the two video segment data in the video segment data pairs to a user.
3. The method of claim 2, wherein receiving rating information of the user for the target video data pair comprises:
receiving segment evaluation information of the user aiming at the video segment data pairs, wherein the segment evaluation information is used for judging the video segment data with better quality in the video segment data pairs;
and taking the video data to which the video segment data with the better quality belongs as the video data with the better quality in the target video data pair.
4. The method of claim 2, wherein the number of said pairs of video segment data is at least two groups; accordingly, the method further comprises:
local evaluation information issued by the user aiming at the video segment data pair is obtained, and the local evaluation information is used for representing video data to which the video segment data with better quality belongs;
and counting the number of the same local evaluation information, and taking the video data represented by the local evaluation information with the largest number as the video data with the better quality in the target video data pair.
5. The method of claim 1, wherein constructing training samples based on the video data in the target video data pair comprises:
respectively extracting a first feature vector of first video data and a second feature vector of second video data in the target video data pair, and constructing a forward training sample and a reverse training sample based on the extracted first feature vector and second feature vector; wherein the first feature vector in the forward training samples precedes the second feature vector, and the second feature vector in the reverse training samples precedes the first feature vector.
6. The method of claim 5, wherein the forward training samples are associated with a first theoretical value and the reverse training samples are associated with a second theoretical value; wherein the first theoretical value and the second theoretical value are different; the first theoretical value is used for representing a theoretical sorting result corresponding to the sorting mode of the feature vectors in the forward training sample, and the second theoretical value is used for representing a theoretical sorting result corresponding to the sorting mode of the feature vectors in the reverse training sample; wherein the first theoretical value and the second theoretical value are determined according to the evaluation information;
correspondingly, inputting the training samples into a ranking model, and obtaining a ranking result comprises:
respectively inputting the forward training samples and the reverse training samples into a sequencing model to obtain first actual values corresponding to the forward training samples and second actual values corresponding to the reverse training samples;
correspondingly, correcting the ranking model according to the difference value between the ranking result and the evaluation information comprises:
calculating a first difference value between the first actual value and the first theoretical value, and correcting the parameters in the sequencing model by using the first difference value;
and calculating a second difference value between the second actual value and the second theoretical value, and correcting the parameters in the sequencing model by using the second difference value.
7. The method of claim 1, wherein after correcting the ranking model by the training samples, the method further comprises:
and inputting the two video data to be sorted into the trained sorting model to obtain sorting results corresponding to the two video data to be sorted.
8. The method according to claim 1 or 7, wherein when the number of video data to be sorted is at least three, the method further comprises:
forming sub video data pairs by the video data to be sequenced in pairs, and inputting the sub video data pairs into the sequencing model after correction in sequence to obtain sequencing results of two video data in the sub video data pairs;
and sequencing the video data to be sequenced based on the sequencing result of the sub video data pairs.
9. The method of claim 1, wherein upon receiving rating information published by the user for the target video data pair, the method further comprises:
recording the environment information under which the videos represented by the target video data pair are played, and associating the environment information with the evaluation information;
accordingly, the ranking model corrected based on the evaluation information is used to rank the pairs of video data in the environment characterized by the environment information.
10. A video quality ranking system, comprising a memory and a processor, wherein the memory stores a computer program and a video data set, wherein the video data set comprises at least one set of video data pairs, and wherein two video data of the video data pairs have the same original video data; at least one of the two video data is obtained by processing the original video data; the computer program, when executed by the processor, implements the functions of:
reading a group of target video data pairs in the video data set, and sequentially displaying videos represented by two video data in the target video data pairs to a user; wherein the target video data pair is a random set of video data pairs in the video data set;
receiving evaluation information issued by the user for the target video data pair, wherein the evaluation information is used for judging video data with better quality in the target video data pair;
constructing a training sample based on the video data in the target video data pair, and inputting the training sample into a sequencing model to obtain a sequencing result, wherein the sequencing result is used for representing the video data with better quality in the target video data pair;
correcting the ranking model according to the difference value between the ranking result and the evaluation information, so that when the training sample is input into the corrected ranking model again, the obtained ranking result is consistent with the evaluation information; repeatedly utilizing the video data pairs in the video data set to train to obtain a trained sequencing model; the trained ranking model is used for ranking the quality of the video data to be ranked.
11. The system of claim 10, wherein presenting to a user videos of both video data representations in the target video data pair comprises:
when the playing time length corresponding to the video data in the target video data pair is greater than or equal to a specified time length threshold value, respectively intercepting video segment data pairs with the same playing position and the same playing time length from the two video data of the target video data pair, and sequentially displaying video segments represented by the two video segment data in the video segment data pairs to a user.
12. The system of claim 11, wherein the computer program, when executed by the processor, further performs the functions of:
receiving segment evaluation information of the user aiming at the video segment data pairs, wherein the segment evaluation information is used for judging the video segment data with better quality in the video segment data pairs;
and taking the video data to which the video segment data with the better quality belongs as the video data with the better quality in the target video data pair.
13. The system of claim 10, wherein the computer program, when executed by the processor, further performs the functions of:
respectively extracting a first feature vector of first video data and a second feature vector of second video data in the target video data pair, and constructing a forward training sample and a reverse training sample based on the extracted first feature vector and second feature vector; wherein the first feature vector in the forward training samples precedes the second feature vector, and the second feature vector in the reverse training samples precedes the first feature vector.
14. The system of claim 13, wherein the forward training samples are associated with a first theoretical value and the reverse training samples are associated with a second theoretical value; wherein the first theoretical value and the second theoretical value are different; the first theoretical value is used for representing a theoretical sorting result corresponding to the sorting mode of the feature vectors in the forward training sample, and the second theoretical value is used for representing a theoretical sorting result corresponding to the sorting mode of the feature vectors in the reverse training sample; wherein the first theoretical value and the second theoretical value are determined according to the evaluation information;
accordingly, the computer program, when executed by the processor, further implements the functions of:
respectively inputting the forward training samples and the reverse training samples into a sequencing model to obtain first actual values corresponding to the forward training samples and second actual values corresponding to the reverse training samples;
calculating a first difference value between the first actual value and the first theoretical value, and correcting the parameters in the sequencing model by using the first difference value;
and calculating a second difference value between the second actual value and the second theoretical value, and correcting the parameters in the sequencing model by using the second difference value.
15. The system of claim 10, wherein the computer program, when executed by the processor, further performs the functions of:
recording environment information of the video represented by the target video data while that video is played, and associating the environment information with the evaluation information;
accordingly, the ranking model corrected based on the evaluation information is used to rank the pairs of video data in the environment characterized by the environment information.
CN201710811287.XA 2017-09-11 2017-09-11 Video quality sequencing method and system Active CN109495772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710811287.XA CN109495772B (en) 2017-09-11 2017-09-11 Video quality sequencing method and system


Publications (2)

Publication Number Publication Date
CN109495772A CN109495772A (en) 2019-03-19
CN109495772B true CN109495772B (en) 2021-10-15

Family

ID=65687646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710811287.XA Active CN109495772B (en) 2017-09-11 2017-09-11 Video quality sequencing method and system

Country Status (1)

Country Link
CN (1) CN109495772B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114741048A (en) * 2022-05-20 2022-07-12 中译语通科技股份有限公司 Sample sorting method and device, computer equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101282481A (en) * 2008-05-09 2008-10-08 中国传媒大学 Method for evaluating video quality based on artificial neural net
CN101426150A (en) * 2008-12-08 2009-05-06 青岛海信电子产业控股股份有限公司 Video image quality evaluation method and system
CN101540048A (en) * 2009-04-21 2009-09-23 北京航空航天大学 Image quality evaluating method based on support vector machine
CN101853400A (en) * 2010-05-20 2010-10-06 武汉大学 Multiclass image classification method based on active learning and semi-supervised learning
CN104318562A (en) * 2014-10-22 2015-01-28 百度在线网络技术(北京)有限公司 Method and device for confirming quality of internet images

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015087903A (en) * 2013-10-30 2015-05-07 ソニー株式会社 Apparatus and method for information processing



Similar Documents

Publication Publication Date Title
US11917223B2 (en) Methods, systems, and media for presenting media content items belonging to a media content group
WO2015081776A1 (en) Method and apparatus for processing video images
CN109508406B (en) Information processing method and device and computer readable storage medium
CN107707967A (en) The determination method, apparatus and computer-readable recording medium of a kind of video file front cover
CN111385606A (en) Video preview method and device and intelligent terminal
CN109996122B (en) Video recommendation method and device, server and storage medium
CN111479169A (en) Video comment display method, electronic equipment and computer storage medium
CN109922334A (en) A kind of recognition methods and system of video quality
CN109743589B (en) Article generation method and device
US10021433B1 (en) Video-production system with social-media features
CN110019954A (en) A kind of recognition methods and system of the user that practises fraud
CN112507163A (en) Duration prediction model training method, recommendation method, device, equipment and medium
CN111279709A (en) Providing video recommendations
CN110418191A (en) A kind of generation method and device of short-sighted frequency
CN104702986A (en) Ranking method and device of program list
CN111683274A (en) Bullet screen advertisement display method, device and equipment and computer readable storage medium
CN107770624A (en) It is a kind of it is live during multimedia file player method, device and storage medium
CN113613075A (en) Video recommendation method and device and cloud server
CN106815284A (en) The recommendation method and recommendation apparatus of news video
CN105045882A (en) Hot word processing method and device
CN108153882A (en) A kind of data processing method and device
TW201907323A (en) Method for displaying and providing video result item, client, and server
CN109495772B (en) Video quality sequencing method and system
CN108985244B (en) Television program type identification method and device
CN112492382B (en) Video frame extraction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200514

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C

Applicant before: Youku network technology (Beijing) Co., Ltd

GR01 Patent grant
GR01 Patent grant