WO2021135983A1 - Video transcoding method and apparatus, server and storage medium - Google Patents

Video transcoding method and apparatus, server and storage medium

Info

Publication number
WO2021135983A1
Authority
WO
WIPO (PCT)
Prior art keywords
transcoding
transcoded
video
rate
gears
Prior art date
Application number
PCT/CN2020/137513
Other languages
French (fr)
Chinese (zh)
Inventor
刘晓娟
Original Assignee
百果园技术(新加坡)有限公司
刘晓娟
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司, 刘晓娟
Publication of WO2021135983A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234363 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00 Diagnosis, testing or measuring for television systems or their details
    • H04N17/004 Diagnosis, testing or measuring for television systems or their details for digital television systems

Definitions

  • The embodiments of the present application relate to the field of video processing, for example to a method, device, server, and storage medium for video transcoding.
  • CRF: constant rate factor
  • Video playback quality is usually measured as objective playback quality. However, because of the human eye's contrast sensitivity, nonlinear brightness response, frequency sensitivity, and masking effects, video streams with different content can differ greatly in subjective playback quality after being transcoded at the same bitrate. For example, viewers are more sensitive to distortion in slow-moving pictures and less likely to notice distortion in pictures with intense motion. Therefore, using a preset fixed CRF for each bitrate when transcoding every video at multiple bitrates cannot guarantee that the different videos a user watches reach the same subjective quality.
  • One existing approach performs perceptual analysis on the content of the source video stream to determine the video category to which it belongs, such as movies, sports, or animation, and then transcodes the source video stream at multiple bitrates using the CRFs set in advance for the different bitrates under that category.
  • The corresponding CRF is then selected according to each user's bandwidth to deliver the video. However, setting CRFs per video category cannot accurately measure the subjective playback quality of video streams with different content within the same category.
  • The embodiments of the application provide a method, device, server, and storage medium for video transcoding that keep the subjective quality of different videos consistent after they are transcoded at the same bitrate gear and that make bitrate allocation more reasonable.
  • An embodiment of the present application provides a method for video transcoding.
  • The method includes: determining, according to the picture coding features of the video to be transcoded, the transcoding factor with which the video reaches the subjective quality index specified by each to-be-transcoded bitrate gear after being transcoded at that gear; and transcoding the video to be transcoded using the different to-be-transcoded bitrate gears and their corresponding transcoding factors.
  • An embodiment of the application provides a video transcoding device. The device includes: a transcoding factor determination module, configured to determine, according to the picture coding features of the video to be transcoded, the transcoding factor with which the video reaches the subjective quality index specified by each to-be-transcoded bitrate gear after being transcoded at that gear; and a video transcoding module, configured to transcode the video to be transcoded using the different to-be-transcoded bitrate gears and their corresponding transcoding factors.
  • An embodiment of the present application provides a server, which includes: one or more processors; and a storage device configured to store one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the video transcoding method described in any embodiment of the present application.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the video transcoding method described in any embodiment of the present application is implemented.
  • FIG. 1A is a flowchart of a method for video transcoding provided in Embodiment 1 of this application;
  • FIG. 1B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 1 of this application;
  • FIG. 2A is a flowchart of a video transcoding method provided in Embodiment 2 of this application;
  • FIG. 2B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 2 of this application;
  • FIG. 3A is a flowchart of a video transcoding method provided in Embodiment 3 of this application.
  • FIG. 3B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 3 of this application.
  • FIG. 4 is a schematic structural diagram of a video transcoding device provided in Embodiment 4 of this application.
  • FIG. 5 is a schematic structural diagram of a server provided in Embodiment 5 of this application.
  • FIG. 1A is a flowchart of a method for video transcoding provided in Embodiment 1 of this application. This embodiment can be applied to the case of performing multi-rate transcoding on any video.
  • The video transcoding method provided in this embodiment can be executed by the video transcoding device provided in the embodiments of this application. The device can be implemented in software and/or hardware and is integrated in the server that executes the method.
  • The server can be a back-end server that stores different video data.
  • The method may include the following steps:
  • The server performs multi-bitrate transcoding on the source video according to multiple preset bitrate gears, so that the source video can later be delivered to each user at an adapted bitrate.
  • The video to be transcoded is a source video of any content type, uploaded to the server by a user, that needs to be transcoded at multiple bitrates.
  • The picture coding features are the basic parameters of the source video at a specific bitrate gear that can be used to evaluate the objective encoding quality of the video frames after the source video is encoded at that gear, for example the peak signal-to-noise ratio (PSNR) of the encoded video frames, the coding bitrate, and the coding quantization parameters.
  • The picture coding features form a feature set that characterizes the temporal and spatial complexity of the video picture.
  • The specific bitrate to which the picture coding features correspond can be the coding bitrate used when the user uploaded the source video, or a specific bitrate used by the server for preliminary transcoding after receiving the source video.
  • The to-be-transcoded bitrate gears are multiple transcoding bitrates set in advance for the source video so that delivery can match network bandwidth that changes in real time.
  • The subjective quality index is the subjective playback quality required for the user's viewing experience when the video to be transcoded is transcoded at a to-be-transcoded bitrate gear and then played on the user terminal. Because the Video Multimethod Assessment Fusion (VMAF) algorithm can better measure the relationship between source video content and viewers' subjective experience, the subjective quality index in this embodiment can be expressed as a VMAF score that measures the subjective playback quality of the source video after transcoding.
  • The transcoding factor in this embodiment is used, at each to-be-transcoded bitrate gear, to control the transcoding quality of videos with different content.
  • Multiple different transcoding factors are preset under each to-be-transcoded bitrate gear, so that the transcoding factor matching the subjective quality index specified for that gear can later be selected and used to transcode the source video at that gear.
  • The transcoding factors in this embodiment may be multiple CRFs preset under each to-be-transcoded bitrate gear.
  • After the server obtains the video to be transcoded, it first determines the picture coding features of the video as encoded at a specific bitrate gear, which can be used to evaluate the encoding quality of the video. Then, as shown in FIG. 1B, at each to-be-transcoded bitrate gear, the video can be transcoded multiple times using that gear and the different transcoding factors set under it.
  • For each transcoding factor at the gear, the feature transcoding effect corresponding to the picture coding features of the video after transcoding is looked up, and it is determined whether that effect reaches the subjective quality index specified for the gear.
  • The transcoding factor with which the specified subjective quality index is reached at the gear is taken as the transcoding factor adapted to the video at that gear, and is later used for the actual transcoding. Repeating these steps determines the adapted transcoding factor at each to-be-transcoded bitrate gear and ensures the accuracy of the factor selected at each gear.
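The per-gear selection described above can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: `select_crf`, `plan_ladder`, and `toy_score` are hypothetical names, the score function stands in for "transcode and measure a VMAF-like score", and picking the largest CRF that still meets the target (with a fallback to the smallest CRF) is an assumption about how ties are resolved.

```python
from typing import Callable, Dict, Sequence


def select_crf(crf_candidates: Sequence[int], target_score: float,
               score_fn: Callable[[int], float]) -> int:
    """Pick one preset CRF for a single bitrate gear.

    Assumption: among the candidates that reach the target score, the
    largest CRF (strongest compression) is preferred to avoid wasting bits.
    If none reaches the target, fall back to the smallest CRF.
    """
    meeting = [crf for crf in crf_candidates if score_fn(crf) >= target_score]
    return max(meeting) if meeting else min(crf_candidates)


def plan_ladder(gear_targets: Dict[int, float], crf_candidates: Sequence[int],
                score_fn: Callable[[int, int], float]) -> Dict[int, int]:
    """Map each gear (kbps) to the CRF selected for it."""
    ladder = {}
    for kbps, target in gear_targets.items():
        ladder[kbps] = select_crf(crf_candidates, target,
                                  lambda crf: score_fn(kbps, crf))
    return ladder


def toy_score(kbps: int, crf: int) -> float:
    """Made-up quality model: score rises with bitrate, falls with CRF."""
    return min(100.0, 70.0 + kbps / 100.0 - 1.5 * (crf - 18))


# Hypothetical gears (kbps -> target score) and preset CRF candidates.
GEARS = {800: 70.0, 1500: 80.0, 3000: 90.0}
CRFS = [18, 21, 24, 27, 30]
LADDER = plan_ladder(GEARS, CRFS, toy_score)
```

In a real system the toy score would be replaced by an actual measurement or, as in the later embodiments, by a model prediction.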
  • After the transcoding factor adapted to the video at each to-be-transcoded bitrate gear is determined, the video can be transcoded at each gear using that gear and its adapted transcoding factor. This achieves multi-bitrate transcoding of the video, allocates bitrate reasonably, keeps the subjective quality of different videos consistent after transcoding at the same bitrate gear, avoids unnecessary bitrate waste, and saves bandwidth.
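One way to picture "transcoding with a gear and its transcoding factor" is a CRF encode capped at the gear's bitrate. The sketch below only builds command lines and does not run them. The source does not name an encoder, so the use of `ffmpeg` with `libx264`, the mapping of the gear to `-maxrate`, the buffer size of twice the cap, and the output file naming are all assumptions for illustration.

```python
from typing import Dict, List


def build_ffmpeg_command(source: str, output: str,
                         gear_kbps: int, crf: int) -> List[str]:
    """Build (but do not execute) a capped-CRF x264 encode command."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",
        "-crf", str(crf),                  # transcoding factor for this gear
        "-maxrate", f"{gear_kbps}k",       # assumption: gear acts as a bitrate cap
        "-bufsize", f"{2 * gear_kbps}k",   # assumption: buffer of twice the cap
        "-c:a", "copy",
        output,
    ]


def build_ladder_commands(source: str, ladder: Dict[int, int]) -> List[List[str]]:
    """One command per gear; `ladder` maps gear (kbps) to the selected CRF."""
    return [
        build_ffmpeg_command(source, f"out_{kbps}k.mp4", kbps, crf)
        for kbps, crf in sorted(ladder.items())
    ]


COMMANDS = build_ladder_commands("input.mp4", {3000: 24, 800: 21})
```

Each command list could be passed to `subprocess.run` on a machine where `ffmpeg` is installed.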
  • In the technical solution provided in this embodiment, the subjective quality index that the transcoded video must reach is specified in advance for each to-be-transcoded bitrate gear, so the same subjective quality index applies to every video transcoded at the same gear, which keeps subjective quality consistent across videos at that gear. At each gear, multiple transcoding factors are available for transcoding the video.
  • The effect of transcoding any video with a preset transcoding factor is compared with the specified subjective quality index, which accurately measures the subjective playback quality of the video at each bitrate gear and ensures the accuracy of the transcoding factor selected at each gear.
  • The video is then transcoded at multiple bitrates using the different to-be-transcoded bitrate gears and their corresponding transcoding factors, which allocates bitrate reasonably, avoids unnecessary bitrate waste, and saves bandwidth.
  • FIG. 2A is a flowchart of a video transcoding method provided in Embodiment 2 of this application
  • FIG. 2B is a schematic diagram of the principle of a video transcoding process provided in Embodiment 2 of this application.
  • This embodiment is described on the basis of the above-mentioned embodiment. This embodiment mainly explains the process of determining the transcoding factor adapted to the video to be transcoded under different to-be-transcoding rate gears.
  • This embodiment may include the following steps:
  • Because this embodiment uses a pre-trained neural network model to determine the transcoding factors adapted at the different to-be-transcoded bitrate gears, it is first necessary to obtain the comprehensive transcoding features of the video to be transcoded at the multiple gears.
  • For each to-be-transcoded bitrate gear, this embodiment can fuse the picture coding features of the video encoded at a specific bitrate gear, the bitrate value of the to-be-transcoded bitrate gear, and the subjective quality index specified in advance for that gear, to generate the comprehensive transcoding feature of the video at that gear. Repeating this step generates a comprehensive transcoding feature of the video at each to-be-transcoded bitrate gear.
  • Fusing the picture coding features of the video to be transcoded, the to-be-transcoded bitrate gear, and the subjective quality index specified for that gear to obtain the comprehensive transcoding feature of the video at that gear may include: performing dimension-expanding fusion on these three inputs to obtain the comprehensive transcoding feature of the video at that gear.
  • The comprehensive transcoding features of the video at the different gears can include features in many dimensions, so that a large number of dimensional features can later be analyzed together.
  • Therefore, a fourth-order cross-product operation can be performed in turn on the picture coding features of the video, the bitrate value of the to-be-transcoded bitrate gear, and the subjective quality index specified for that gear. This fuses the three inputs so that the feature dimension of the fused comprehensive transcoding feature is expanded further than that of directly concatenated features, yielding the comprehensive transcoding feature of the video after dimension-expanding fusion at that gear.
  • the comprehensive transcoding feature after dimension expansion and fusion contains a large
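The dimension-expanding fusion can be illustrated with a small sketch. The source does not define the "fourth-order cross-product operation" precisely, so this example assumes one plausible reading: concatenate the inputs and append every product of up to four of the concatenated values. The function name `expand_and_fuse` and the `max_order` parameter are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import List, Sequence


def expand_and_fuse(picture_features: Sequence[float], gear_kbps: float,
                    target_score: float, max_order: int = 4) -> List[float]:
    """Fuse the three inputs into one expanded feature vector.

    Assumed interpretation: all monomials of degree 1..max_order over the
    concatenated inputs (degree 1 reproduces plain concatenation).
    """
    base = list(picture_features) + [gear_kbps, target_score]
    fused: List[float] = []
    for order in range(1, max_order + 1):
        for combo in combinations_with_replacement(base, order):
            fused.append(prod(combo))
    return fused


# One picture feature plus gear and target gives 3 base values,
# which expand to 3 + 6 + 10 + 15 = 34 fused values.
FUSED = expand_and_fuse([0.5], 2.0, 3.0)
```

In practice the inputs would likely be normalized first, since raw bitrates raised to the fourth power dominate the other terms.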
  • S220: Perform a transcoding determination on the comprehensive transcoding features of the video to be transcoded at the different to-be-transcoded bitrate gears through a pre-built transcoding classification model, and determine the transcoding factor adapted to the video at each gear.
  • This embodiment selects a large number of video samples in advance, labels each sample with the transcoding factor adapted to it at each to-be-transcoded bitrate gear, and trains the transcoding classification model until it can accurately determine the transcoding factor adapted to any video at the different gears. After the comprehensive transcoding features of the video to be transcoded at the different gears are determined, as shown in FIG. 2B, they can be input in turn into the pre-built transcoding classification model, which performs the corresponding transcoding determination on each of them to determine the transcoding factor adapted to the video at each gear.
  • In training, the transcoding factor of each video sample is used as the corresponding sample label. For each video sample, a fourth-order cross-product operation is performed on the sample's picture coding features, the to-be-transcoded bitrate gear, and the specified subjective quality index to obtain the comprehensive transcoding feature after dimension-expanding fusion. The comprehensive transcoding features of a large number of video samples form the training sample set, on which the initially configured transcoding classification model is trained for multi-class classification, with the network parameters updated continuously until training is complete.
  • Performing the transcoding determination on the comprehensive transcoding features of the video to be transcoded at the different to-be-transcoded bitrate gears through the pre-built transcoding classification model, and determining the transcoding factor adapted to the video at each gear, may include: for each to-be-transcoded bitrate gear, inputting the comprehensive transcoding feature of the video at that gear into the transcoding classification model to obtain the classification scores of the video under the different preset transcoding factors; and taking the preset transcoding factor with the highest classification score as the transcoding factor adapted to the video at that gear.
  • Multiple transcoding factors are preset in the transcoding classification model of this embodiment. The comprehensive transcoding feature of the video to be transcoded at each to-be-transcoded bitrate gear is input into the model, which analyzes it and outputs the classification scores of the video under the different preset transcoding factors. The preset transcoding factor with the highest classification score at each gear is then selected as the transcoding factor adapted to the video at that gear.
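The final selection step is an argmax over the classification scores. A minimal sketch, assuming the model returns one score per preset transcoding factor in the same order as the preset list (`pick_transcoding_factor` is a hypothetical helper name):

```python
from typing import Sequence


def pick_transcoding_factor(scores: Sequence[float],
                            preset_factors: Sequence[int]) -> int:
    """Return the preset factor whose classification score is highest.

    Ties are resolved in favour of the earliest preset factor.
    """
    if not scores or len(scores) != len(preset_factors):
        raise ValueError("need exactly one score per preset factor")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return preset_factors[best_index]


# Hypothetical scores for three preset CRFs at one gear.
CHOSEN = pick_transcoding_factor([0.1, 0.7, 0.2], [20, 23, 26])
```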
  • The transcoding classification model in this embodiment may be composed of two or more layers of transcoding classification sub-models.
  • For example, a small three-layer neural network is used.
  • The first layer can be a fully connected layer in series with a convolution.
  • The second and third layers are each batch normalization, a fully connected layer, and a convolution in series.
  • Finally, a logistic regression layer outputs the classification score of the video to be transcoded under each preset transcoding factor.
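The layer order described above can be sketched as a forward pass in plain Python. This is an untrained toy with random weights, not the patented model: the layer widths, the 3-tap convolution kernel, the ReLU activations, the inference-style batch normalization with fixed statistics, and the use of softmax for the logistic regression layer are all assumptions, and `TinyTranscodingClassifier` is a hypothetical name.

```python
import math
import random
from typing import List

Vector = List[float]


def dense(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def conv1d_same(x: Vector, kernel: Vector) -> Vector:
    """1-D convolution with zero padding that keeps the input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]


def batch_norm(x: Vector, mean: Vector, var: Vector, eps: float = 1e-5) -> Vector:
    """Inference-time batch normalization with stored statistics."""
    return [(v - m) / math.sqrt(s + eps) for v, m, s in zip(x, mean, var)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]  # activation choice is an assumption


def softmax(x: Vector) -> Vector:
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


class TinyTranscodingClassifier:
    def __init__(self, n_inputs: int, n_hidden: int, n_factors: int, seed: int = 0):
        rng = random.Random(seed)

        def matrix(rows: int, cols: int) -> List[Vector]:
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w1, self.b1 = matrix(n_hidden, n_inputs), [0.0] * n_hidden
        self.w2, self.b2 = matrix(n_hidden, n_hidden), [0.0] * n_hidden
        self.w3, self.b3 = matrix(n_factors, n_hidden), [0.0] * n_factors
        self.kernel = [0.25, 0.5, 0.25]
        self.mean, self.var = [0.0] * n_hidden, [1.0] * n_hidden

    def scores(self, fused_feature: Vector) -> Vector:
        # Layer 1: fully connected, then convolution.
        h = relu(conv1d_same(dense(fused_feature, self.w1, self.b1), self.kernel))
        # Layer 2: batch normalization, fully connected, convolution.
        h = batch_norm(h, self.mean, self.var)
        h = relu(conv1d_same(dense(h, self.w2, self.b2), self.kernel))
        # Layer 3: batch normalization, fully connected, convolution.
        h = batch_norm(h, self.mean, self.var)
        h = conv1d_same(dense(h, self.w3, self.b3), self.kernel)
        # Output: one classification score per preset transcoding factor.
        return softmax(h)


MODEL = TinyTranscodingClassifier(n_inputs=4, n_hidden=8, n_factors=5)
SCORES = MODEL.scores([0.1, 0.2, 0.3, 0.4])
```

The scores form a probability distribution over the preset transcoding factors, so the highest one can be selected directly.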
  • The technical solution provided in this embodiment fuses the picture coding features of the video to be transcoded, the to-be-transcoded bitrate gear, and the specified subjective quality index to obtain the comprehensive transcoding feature of the video at each gear, and uses the pre-built transcoding classification model to perform a transcoding determination on those features and determine the transcoding factor adapted to the video at each gear.
  • FIG. 3A is a flowchart of a method for video transcoding provided in Embodiment 3 of this application
  • FIG. 3B is a schematic diagram of the principle of a video transcoding process provided in Embodiment 3 of this application.
  • This embodiment is described on the basis of the foregoing embodiment. This embodiment mainly explains the extraction process of the picture coding features and subjective quality indicators of the video to be transcoded.
  • This embodiment may include the following steps:
  • S310: Extract the picture coding features of the video to be transcoded after it is transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears, together with the subjective quality indexes specified for the different gears.
  • Before the picture coding features of the video to be transcoded are acquired, the video is first transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears. The basic picture features are extracted from the video transcoded at that gear and serve as the picture coding features of the video in this embodiment. The video content of the transcoded video is analyzed with the VMAF algorithm to determine the subjective quality index specified for the video at each to-be-transcoded bitrate gear, for use in the subsequent determination of the adapted transcoding factor.
  • Extracting the picture coding features of the video to be transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears may include: transcoding the video using the lowest bitrate gear and the fixed transcoding factor under that gear; and extracting multiple key frames of the video after transcoding at the lowest bitrate gear, and combining the key information in the multiple key frames to obtain the picture coding features of the video, where the key information characterizes the picture coding features of the video.
  • A corresponding fixed transcoding factor is set in advance for the lowest bitrate gear. The video to be transcoded is transcoded using the lowest bitrate gear and this fixed transcoding factor, and key information that characterizes the picture is extracted from the resulting video. The resolution and target quality of the video at the different gears are then combined with this key information, through a dimension-expansion method, to obtain the picture coding features of the video.
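Combining key-frame information into picture coding features might look like the sketch below. The source does not list which key information is used or how it is combined; the per-frame fields here (PSNR, frame size, quantization parameter) follow the examples the text gives for picture coding features, and averaging them is an assumption. `KeyFrameStats` and `picture_coding_features` are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass(frozen=True)
class KeyFrameStats:
    """Statistics of one key frame after the lowest-gear transcode."""
    psnr_db: float     # peak signal-to-noise ratio of the encoded frame
    size_bits: int     # encoded size, a proxy for the coding bitrate
    qp: float          # coding quantization parameter


def picture_coding_features(frames: Sequence[KeyFrameStats]) -> List[float]:
    """Combine key-frame statistics into one feature vector (mean of each field)."""
    if not frames:
        raise ValueError("at least one key frame is required")
    return [
        float(mean(f.psnr_db for f in frames)),
        float(mean(f.size_bits for f in frames)),
        float(mean(f.qp for f in frames)),
    ]


FEATURES = picture_coding_features([
    KeyFrameStats(psnr_db=40.0, size_bits=1000, qp=20.0),
    KeyFrameStats(psnr_db=38.0, size_bits=3000, qp=24.0),
])
```

Richer summaries (minimum, variance, per-scene statistics) would fit the same interface if the averaged values prove too coarse.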
  • S320: Determine, according to the picture coding features of the video to be transcoded, the transcoding factor with which the video reaches the subjective quality index specified by each to-be-transcoded bitrate gear after being transcoded at that gear.
  • The technical solution provided in this embodiment obtains the picture coding features of the video to be transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears, which ensures the accuracy of the picture coding features. It then determines, from those features, the transcoding factor with which the subjective quality index specified for each gear is reached after transcoding at that gear.
  • The effect of transcoding any video with the predicted transcoding factor is compared with the specified subjective quality index, which accurately measures the subjective playback quality of the video at each bitrate and ensures the accuracy of the transcoding factor selected at each gear. The video is then transcoded at multiple bitrates using the different to-be-transcoded bitrate gears and their corresponding transcoding factors, which allocates bitrate reasonably.
  • FIG. 4 is a schematic structural diagram of a video transcoding apparatus provided in Embodiment 4 of this application.
  • The device may include: a transcoding factor determination module 410, configured to determine, according to the picture coding features of the video to be transcoded, the transcoding factor with which the video reaches the subjective quality index specified by each to-be-transcoded bitrate gear after being transcoded at that gear; and a video transcoding module 420, configured to transcode the video to be transcoded using the different to-be-transcoded bitrate gears and their corresponding transcoding factors.
  • In the technical solution provided in this embodiment, the subjective quality index that the transcoded video must reach is specified in advance for each to-be-transcoded bitrate gear, so the same subjective quality index applies to every video transcoded at the same gear, which keeps subjective quality consistent across videos at that gear. At each gear, multiple transcoding factors are available for transcoding the video.
  • The effect of transcoding any video with a preset transcoding factor is compared with the specified subjective quality index, which accurately measures the subjective playback quality of the video at each bitrate gear and ensures the accuracy of the transcoding factor selected at each gear.
  • The video is then transcoded at multiple bitrates using the different to-be-transcoded bitrate gears and their corresponding transcoding factors. This allocates bitrate reasonably, keeps the subjective quality of different videos consistent at each bitrate gear, avoids unnecessary bitrate waste, and saves bandwidth.
  • The above-mentioned transcoding factor determination module 410 may include: a feature fusion unit, configured to fuse, for each to-be-transcoded bitrate gear, the picture coding features of the video to be transcoded, the gear, and the subjective quality index specified for the gear, to obtain the comprehensive transcoding feature of the video at that gear; and a transcoding factor adaptation unit, configured to perform, through the pre-built transcoding classification model, a transcoding determination on the comprehensive transcoding features of the video at the different gears, and determine the transcoding factor adapted to the video at each gear.
  • The above-mentioned feature fusion unit is configured to: for each to-be-transcoded bitrate gear, perform dimension-expanding fusion on the picture coding features of the video to be transcoded, the gear, and the subjective quality index specified for the gear, to obtain the comprehensive transcoding feature of the video at that gear.
  • The above-mentioned transcoding factor adaptation unit is configured to: for each to-be-transcoded bitrate gear, input the comprehensive transcoding feature of the video to be transcoded at that gear into the transcoding classification model to obtain the classification scores of the video under the different preset transcoding factors; and take the preset transcoding factor with the highest classification score as the transcoding factor adapted to the video at that gear.
  • The above-mentioned transcoding classification model may be composed of two or more layers of transcoding classification sub-models.
  • The above-mentioned video transcoding device may further include: a transcoding parameter extraction module, configured to extract the picture coding features of the video to be transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears, together with the subjective quality indexes specified for the different gears.
  • The above-mentioned transcoding parameter extraction module can be configured to extract the picture coding features of the video to be transcoded at the lowest bitrate gear among the different to-be-transcoded bitrate gears as follows: transcode the video using the lowest bitrate gear and the fixed transcoding factor under that gear; extract multiple key frames of the video after transcoding at the lowest bitrate gear; and combine the key information in the multiple key frames to obtain the picture coding features of the video, where the key information characterizes the picture coding features of the video.
  • The video transcoding device provided in this embodiment is applicable to the video transcoding method provided in any of the foregoing embodiments, and has the corresponding functions.
  • FIG. 5 is a schematic structural diagram of a server provided in Embodiment 5 of this application.
  • the server includes a processor 50, a storage device 51, and a communication device 52; the number of processors 50 in the server may be one or more, and one processor 50 is taken as an example in FIG. 5.
  • the processor 50, the storage device 51, and the communication device 52 in the server may be connected by a bus or other means; in FIG. 5, connection by a bus is taken as an example.
  • the storage device 51 can be configured to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the video transcoding method provided in the embodiments of the present application.
  • the processor 50 executes various functional applications and data processing of the server by running the software programs, instructions, and modules stored in the storage device 51, that is, realizes the above-mentioned video transcoding method.
  • the storage device 51 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system and an application program required by at least one function; the storage data area may store data created according to the use of the terminal, and the like.
  • the storage device 51 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the storage device 51 may further include a memory remotely provided with respect to the processor 50, and these remote memories may be connected to the server through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the communication device 52 may be configured to implement a network connection or a mobile data connection between the server and the terminal.
  • the server provided in this embodiment can be configured to execute the video transcoding method provided in any of the foregoing embodiments, and has corresponding functions.
  • the sixth embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the video transcoding method in any of the foregoing embodiments can be implemented.
  • the method may include: determining, according to the picture encoding features of the video to be transcoded, the transcoding factor used when the video to be transcoded, after being transcoded at each of the different to-be-transcoded rate gears, reaches the subjective quality index specified by that to-be-transcoded rate gear; and transcoding the video to be transcoded using the different to-be-transcoded rate gears and the corresponding transcoding factors.
  • the computer-executable instructions contained in the storage medium provided by the embodiments of the present application are not limited to the method operations described above, and can also perform related operations in the video transcoding method provided by any embodiment of the present application.
  • the multiple units and modules included are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized;
  • the specific names of multiple functional units are only used to facilitate distinguishing from each other, and are not used to limit the protection scope of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Disclosed are a video transcoding method and apparatus, a server and a storage medium. The method comprises: according to a picture encoding feature of a video to be transcoded, determining a transcoding factor selected and used when said video reaches, after being transcoded at each different rate gear to be used for transcoding, a subjective quality index specified by each of the rate gears to be transcoded; and by using the different rate gears to be transcoded and corresponding transcoding factors, transcoding said video.

Description

Method, device, server and storage medium for video transcoding
This application claims priority to Chinese patent application No. 201911410012.0, filed with the Chinese Patent Office on December 31, 2019, the entire content of which is incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of video processing, for example, to a method, device, server, and storage medium for video transcoding.
Background
In a video transcoding system, in order to adapt to the network bandwidth of different terminals at different times and to the corresponding terminal processing capabilities, the received source video stream often needs to be transcoded at multiple bit rates according to different bit rate gears and the constant rate factor (CRF) preset for each bit rate gear. The CRF enables the source video stream to reach the corresponding video playback quality after being transcoded at that bit rate gear, so that video streams at adaptive bit rate gears can subsequently be distributed to different terminals.
When the source video stream is transcoded at different bit rate gears with the corresponding preset CRF, the video playback quality considered is mainly the objective playback quality of the video. However, owing to the contrast sensitivity, luminance nonlinearity, frequency sensitivity, and masking effects of human vision, the subjective playback quality perceived by users can differ considerably between video streams of different content transcoded at the same bit rate gear. For example, users are sensitive to distortion in slowly moving video pictures but are less likely to notice distortion in pictures with intense motion. Therefore, performing multi-rate transcoding on every video with the corresponding preset fixed CRF at each bit rate gear cannot guarantee that the different videos watched by users reach the same subjective quality.
In another approach, perceptual analysis is first performed on the content of the source video stream to determine the video category to which it belongs, such as movie, sports, or animation, and the source video stream is then transcoded at multiple bit rates using the CRFs preset for the different bit rate gears under that video category. By setting corresponding CRFs under different video categories, a corresponding CRF can be selected according to each user's bandwidth to deliver the video; however, this approach cannot accurately measure the subjective playback quality that users perceive for video streams of different content within the same video category.
Summary of the invention
The embodiments of the present application provide a method, device, server, and storage medium for video transcoding, which ensure that different videos to be transcoded have consistent subjective quality after being transcoded at the same bit rate gear and make bit rate allocation more reasonable.
An embodiment of the present application provides a method for video transcoding. The method includes: determining, according to the picture encoding features of a video to be transcoded, the transcoding factor used when the video to be transcoded, after being transcoded at each of different to-be-transcoded rate gears, reaches the subjective quality index specified by that to-be-transcoded rate gear; and transcoding the video to be transcoded using the different to-be-transcoded rate gears and the corresponding transcoding factors.
An embodiment of the present application provides a video transcoding device. The device includes: a transcoding factor determination module, configured to determine, according to the picture encoding features of a video to be transcoded, the transcoding factor used when the video to be transcoded, after being transcoded at each of different to-be-transcoded rate gears, reaches the subjective quality index specified by that to-be-transcoded rate gear; and a video transcoding module, configured to transcode the video to be transcoded using the different to-be-transcoded rate gears and the corresponding transcoding factors.
An embodiment of the present application provides a server, which includes: one or more processors; and a storage device configured to store one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the video transcoding method described in any embodiment of the present application.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the video transcoding method described in any embodiment of the present application is implemented.
Description of the drawings
FIG. 1A is a flowchart of a method for video transcoding provided in Embodiment 1 of this application;
FIG. 1B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 1 of this application;
FIG. 2A is a flowchart of a video transcoding method provided in Embodiment 2 of this application;
FIG. 2B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 2 of this application;
FIG. 3A is a flowchart of a video transcoding method provided in Embodiment 3 of this application;
FIG. 3B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 3 of this application;
FIG. 4 is a schematic structural diagram of a video transcoding device provided in Embodiment 4 of this application;
FIG. 5 is a schematic structural diagram of a server provided in Embodiment 5 of this application.
Detailed description
The application is described below with reference to the drawings and embodiments. It can be understood that the embodiments described here are only used to explain the application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the application rather than the entire structure.
Embodiment 1
FIG. 1A is a flowchart of a method for video transcoding provided in Embodiment 1 of this application. This embodiment is applicable to multi-rate transcoding of any video. The video transcoding method provided in this embodiment can be executed by the video transcoding device provided in the embodiments of this application; the device can be implemented in software and/or hardware and integrated in the server that executes the method, and the server may be a back-end server that stores different video data.
Referring to FIG. 1A, the method may include the following steps:
S110: according to the picture encoding features of the video to be transcoded, determine the transcoding factor selected when the video to be transcoded, after being transcoded at each of the different to-be-transcoded rate gears, reaches the subjective quality index specified by that to-be-transcoded rate gear.
To avoid stalling during playback, when multiple users download a video from the server, each needs to select the version of the video whose bit rate matches the current network bandwidth. Because a user's network bandwidth changes in real time during download, the server transcodes every source video at multiple bit rates according to several preset bit rate gears, so that playback does not stall under various network bandwidths and the source video can subsequently be delivered to users at an adaptive bit rate.
The video to be transcoded is a source video of any content type that is uploaded to the server by other users and needs multi-rate transcoding. The picture encoding features are the basic parameters, contained in the source video at a specific bit rate gear, that can be used to evaluate the objective encoding quality of the video frames of the source video after encoding at that specific bit rate gear, for example the peak signal-to-noise ratio (PSNR) of the encoded video frames, the encoding bit rate, and the encoding quantization parameters. The picture encoding features also form a feature set that characterizes the spatio-temporal complexity of the video pictures. The specific bit rate gear to which the picture encoding features correspond may be the encoding bit rate used when the user uploaded the source video, or a specific bit rate used by the server for preliminary transcoding after receiving the source video. The to-be-transcoded rate gears are the multiple transcoding bit rates preset for the source video that can match the network bandwidth as it changes in real time. The subjective quality index is the subjective playback quality that the user's viewing experience requires when the video to be transcoded is played on the user terminal after being transcoded at the different to-be-transcoded rate gears. Because the Video Multimethod Assessment Fusion (VMAF) algorithm measures the relationship between source video content and the subjective viewing experience of different users well, the subjective quality index in this embodiment can be expressed as a VMAF score, which measures the subjective playback quality of the source video after transcoding.
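Of the picture encoding features named above, PSNR has a standard closed form, 10·log10(MAX² / MSE). The following is a minimal sketch of that formula for two equally sized sequences of 8-bit sample values; the function name `psnr` and the flat-list frame representation are assumptions made for illustration and do not come from the application.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], distorted: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists.

    Hypothetical helper: frames are represented as flat lists of sample values.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the encoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the encoded frame is numerically closer to the reference; this is an objective measure and is distinct from the VMAF-based subjective quality index.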
The bit rate gear determines the overall file size of the video to be transcoded per unit time but cannot guarantee the transcoding quality of videos with different content. The transcoding factor in this embodiment is therefore used to control, at the different to-be-transcoded rate gears, the transcoding quality of videos to be transcoded with different content. As shown in FIG. 1B, this embodiment presets several different transcoding factors for each to-be-transcoded rate gear, so that the transcoding factor matching the specified subjective quality index can subsequently be selected at each gear and then used to transcode the source video at that gear. The transcoding factors in this embodiment may be multiple CRFs preset for each to-be-transcoded rate gear.
Optionally, after obtaining the video to be transcoded, the server first determines the picture encoding features that, after the video is encoded at a specific bit rate gear, can be used to evaluate multiple aspects of the picture quality of the encoded video. As shown in FIG. 1B, at each to-be-transcoded rate gear, the video to be transcoded can be transcoded multiple times using that gear together with each of the different transcoding factors set for that gear. The feature transcoding effect corresponding to the picture encoding features after transcoding with each transcoding factor at that gear is then looked up, and it is judged whether the feature transcoding effect obtained with each transcoding factor reaches the subjective quality index specified by that gear. The transcoding factor with which the specified subjective quality index is reached at that gear is taken as the transcoding factor adapted to the video to be transcoded at that gear, for use in the subsequent actual transcoding. Following these steps, the transcoding factor adapted to the video to be transcoded is determined at each of the different to-be-transcoded rate gears, which ensures the accuracy of the transcoding factor selected at each gear.
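The per-gear selection described in S110 can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: `measure` is a hypothetical callable standing in for "transcode at this gear with this CRF and score the result" (for example with a VMAF score), and the tie-breaking rule of picking the largest CRF that still meets the target is an assumption, since the text does not say which factor to prefer when several reach the index.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def select_factor(candidate_crfs: Sequence[int], target_score: float,
                  measure: Callable[[int], float]) -> Optional[int]:
    """Return the largest candidate CRF whose score meets the target, else None.

    Assumption: a larger CRF means stronger compression, so the largest
    CRF that still reaches the target spends the fewest bits.
    """
    meeting = [crf for crf in candidate_crfs if measure(crf) >= target_score]
    return max(meeting) if meeting else None


def select_factors_per_gear(
    gears: Dict[int, Tuple[Sequence[int], float]],
    measure: Callable[[int, int], float],
) -> Dict[int, Optional[int]]:
    """gears maps gear bit rate (kbps) -> (preset CRFs, target quality score)."""
    chosen: Dict[int, Optional[int]] = {}
    for gear_kbps, (crfs, target) in gears.items():
        # Bind the current gear so the inner callable only varies the CRF.
        chosen[gear_kbps] = select_factor(
            crfs, target, lambda crf, g=gear_kbps: measure(g, crf))
    return chosen


def toy_measure(gear_kbps: int, crf: int) -> float:
    """Made-up quality model for demonstration only; not real VMAF data."""
    return 100.0 - 2.0 * crf + gear_kbps / 100.0
```

In a real pipeline `measure` would run an encoder and a quality metric, which is expensive; that cost is what motivates the classification model introduced in Embodiment 2.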
S120: transcode the video to be transcoded using the different to-be-transcoded rate gears and the corresponding transcoding factors.
Optionally, after the transcoding factor adapted to the video to be transcoded has been determined at each of the different to-be-transcoded rate gears, the video can be transcoded at each gear using that gear and its adapted transcoding factor, achieving multi-rate transcoding of the video. Bit rate is thereby allocated reasonably, different videos to be transcoded have consistent subjective quality after being transcoded at each of the different bit rate gears, and unnecessary bit rate waste is avoided, saving bandwidth resources.
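As one possible realization of S120, the sketch below builds one encoder command line per gear. The application does not name an encoder; the use of ffmpeg with libx264, the `-crf`, `-maxrate`, and `-bufsize` options, the buffer size of twice the gear, and the output file naming are all assumptions made for illustration. The function only constructs argument lists and does not run them.

```python
from typing import Dict, List


def build_transcode_commands(src: str, factors: Dict[int, int]) -> List[List[str]]:
    """Build one ffmpeg argument list per gear.

    factors maps gear bit rate (kbps) -> CRF chosen for that gear.
    """
    commands: List[List[str]] = []
    for gear_kbps, crf in sorted(factors.items()):
        commands.append([
            "ffmpeg", "-y", "-i", src,
            "-c:v", "libx264",
            "-crf", str(crf),                 # quality target for this gear
            "-maxrate", f"{gear_kbps}k",      # cap the bit rate at the gear
            "-bufsize", f"{2 * gear_kbps}k",  # rate-control buffer (assumed 2x)
            "-c:a", "copy",
            f"out_{gear_kbps}k.mp4",          # hypothetical output name
        ])
    return commands
```

Each returned list could be passed to `subprocess.run` on a machine where ffmpeg is installed.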
In the technical solution provided by this embodiment, the subjective quality index that the transcoded video is required to reach is specified in advance for each to-be-transcoded rate gear, so that the same subjective quality index applies to every video to be transcoded at the same bit rate gear and the subjective quality of videos transcoded at the same bit rate gear stays consistent. At each to-be-transcoded rate gear, the video to be transcoded is transcoded with several transcoding factors, and the transcoding factor with which the picture encoding features of the video, after transcoding, reach the subjective quality index specified for that gear is determined; this yields the transcoding factor for each gear. Comparing the effect of transcoding any video with the preset transcoding factors against the specified subjective quality index accurately measures the subjective playback quality of that video at the different bit rate gears and ensures the accuracy of the transcoding factor selected at each gear. The video to be transcoded is then transcoded at multiple bit rates using the different to-be-transcoded rate gears and the corresponding transcoding factors, which allocates bit rate reasonably and avoids unnecessary bit rate waste, saving bandwidth resources.
Embodiment 2
FIG. 2A is a flowchart of a video transcoding method provided in Embodiment 2 of this application, and FIG. 2B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 2 of this application. This embodiment builds on the foregoing embodiment and mainly explains how the transcoding factor adapted to the video to be transcoded at each of the different to-be-transcoded rate gears is determined.
Optionally, as shown in FIG. 2A, this embodiment may include the following steps:
S210: for each to-be-transcoded rate gear, fuse the picture encoding features of the video to be transcoded, the to-be-transcoded rate gear, and the subjective quality index specified by that gear to obtain the comprehensive transcoding feature of the video to be transcoded at that gear.
Optionally, because this embodiment uses a pre-trained neural network model to determine the transcoding factor adapted at each to-be-transcoded rate gear, the transcoding features of the video to be transcoded at the multiple to-be-transcoded rate gears must first be obtained. For each to-be-transcoded rate gear, this embodiment can fuse the picture encoding features of the video after encoding at a specific bit rate gear, the bit rate value of the to-be-transcoded rate gear, and the subjective quality index specified in advance for that gear, generating the comprehensive transcoding feature of the video at that gear. Following these steps, the comprehensive transcoding feature of the video to be transcoded is generated at every to-be-transcoded rate gear.
Exemplarily, as shown in FIG. 2B, fusing the picture encoding features of the video to be transcoded, the to-be-transcoded rate gear, and the subjective quality index specified by that gear to obtain the comprehensive transcoding feature of the video at that gear may include: performing dimension-expanding fusion on the picture encoding features of the video to be transcoded, the to-be-transcoded rate gear, and the subjective quality index specified by that gear, to obtain the comprehensive transcoding feature of the video to be transcoded at that gear.
To ensure the prediction accuracy of the neural network model, the comprehensive transcoding feature of the video to be transcoded at each to-be-transcoded rate gear is required to contain features in multiple dimensions, so that a large number of dimensional features can subsequently be fused and analyzed. This embodiment can therefore fuse the picture encoding features of the video to be transcoded, the bit rate value of the to-be-transcoded rate gear, and the subjective quality index specified for that gear by performing a fourth-order cross-product operation on them in sequence. The feature dimension of the fused comprehensive transcoding feature is thereby larger than that obtained by direct concatenation, yielding the dimension-expanded comprehensive transcoding feature of the video at that gear, which contains feature information in a large number of different dimensions.
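The text does not define the fourth-order cross-product operation precisely. One plausible reading, shown here strictly as an assumption, is polynomial feature crossing: the picture encoding features, the gear bit rate, and the target quality score are concatenated, and every product of one to four of those values (with repetition) becomes a feature. The function name and argument layout are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import List, Sequence


def expand_features(picture_features: Sequence[float], gear_kbps: float,
                    target_score: float, order: int = 4) -> List[float]:
    """Dimension-expanding fusion via feature crossing (assumed interpretation).

    Returns all monomials of degree 1..order over the concatenated inputs.
    """
    base = list(picture_features) + [gear_kbps, target_score]
    expanded: List[float] = []
    for degree in range(1, order + 1):
        # Each combination picks `degree` inputs, repeats allowed.
        for combo in combinations_with_replacement(base, degree):
            expanded.append(prod(combo))
    return expanded
```

With three base values this produces 3 + 6 + 10 + 15 = 34 features, compared with 3 for direct concatenation, which matches the stated goal of a larger feature dimension. In practice the inputs would likely need scaling first, since fourth powers of raw bit rate values grow quickly.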
S220: use the pre-built transcoding classification model to perform transcoding determination on the comprehensive transcoding feature of the video to be transcoded at each of the different to-be-transcoded rate gears, and determine the transcoding factor adapted to the video at each gear.
Optionally, this embodiment selects a large number of video samples in advance, sets labels giving the transcoding factor adapted to each video sample at the different to-be-transcoded rate gears, and trains the transcoding classification model until it can accurately determine the transcoding factor adapted to any video at the different to-be-transcoded rate gears. After the comprehensive transcoding features of the video to be transcoded at the different gears have been determined, as shown in FIG. 2B, they can be input in turn into the pre-built transcoding classification model, which performs the corresponding transcoding determination on each of them and thereby determines the transcoding factor adapted to the video at each gear.
When the transcoding classification model of this embodiment is trained, a large number of video samples are first selected, and for each video sample the subjective quality index at the different to-be-transcoded rate gears and the transcoding factor with which that index is reached are determined; that transcoding factor serves as the sample label. For each video sample, a fourth-order cross-product operation is performed on its picture encoding features, the to-be-transcoded rate gear, and the specified subjective quality index to obtain the dimension-expanded comprehensive transcoding feature. The dimension-expanded comprehensive transcoding features of the video samples form the training sample set, on which the initially configured transcoding classification model is trained as a multi-class classifier, with the network parameters of the model updated continuously until training is complete.
Exemplarily, as shown in FIG. 2B, performing transcoding determination on the comprehensive transcoding feature of the video to be transcoded at each of the different to-be-transcoded rate gears through the pre-built transcoding classification model, and determining the transcoding factor adapted to the video at each gear, may include: for each to-be-transcoded rate gear, inputting the comprehensive transcoding feature of the video at that gear into the transcoding classification model to obtain the classification scores of the video to be transcoded under the different preset transcoding factors; and taking the preset transcoding factor with the highest classification score as the transcoding factor adapted to the video at that gear.
Multiple transcoding factors are preset in the transcoding classification model of this embodiment. The comprehensive transcoding feature of the video to be transcoded at each to-be-transcoded rate gear is input into the model, which analyzes it and outputs the classification scores of the video under the different preset transcoding factors in the model; the preset transcoding factor with the highest classification score at each gear is then selected as the transcoding factor adapted to the video at that gear.
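The final selection step is an argmax over the classification scores. A minimal sketch, assuming the model returns one score per preset transcoding factor in the same order as the preset list (the function name and that ordering are assumptions):

```python
from typing import Sequence


def adapt_factor(scores: Sequence[float], preset_crfs: Sequence[int]) -> int:
    """Return the preset CRF whose classification score is highest."""
    if not scores or len(scores) != len(preset_crfs):
        raise ValueError("scores and preset_crfs must be non-empty and aligned")
    # Index of the highest score; ties resolve to the first occurrence.
    best = max(range(len(scores)), key=scores.__getitem__)
    return preset_crfs[best]
```

For example, scores of 0.1, 0.7, and 0.2 over preset CRFs 20, 24, and 28 select CRF 24 for that gear.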
To limit the extra computation introduced by the transcoding classification model, the model in this embodiment may be composed of two or more layers of transcoding classification sub-models. For example, it may be implemented as a small three-layer neural network: the first layer may be a fully connected layer followed by a convolution; the second and third layers are each batch normalization, a fully connected layer, and a convolution in series; and a final logistic regression layer outputs the classification score of the video to be transcoded under each preset transcoding factor.
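To give a feel for how small such a classifier can be, here is a simplified forward pass in plain Python. It is not the network described above: the convolution steps are omitted, a per-vector standardization stands in for batch normalization, and the ReLU activation, the weights, and all function names are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Sequence[float], weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def normalize(x: Sequence[float], eps: float = 1e-5) -> Vector:
    """Zero-mean, unit-variance scaling (stand-in for batch normalization)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def relu(x: Sequence[float]) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Sequence[float]) -> Vector:
    """Multi-class logistic regression output: scores that sum to 1."""
    shift = max(x)
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x: Sequence[float], layers: Sequence[Tuple[Matrix, Vector]]) -> Vector:
    """Apply the layers in order; normalization precedes every layer but the first."""
    h = list(x)
    for i, (weights, bias) in enumerate(layers):
        if i > 0:
            h = normalize(h)
        h = dense(h, weights, bias)
        if i < len(layers) - 1:
            h = relu(h)
    return softmax(h)
```

The last layer should have one output row per preset transcoding factor, so that each softmax entry can be read as the score of one factor.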
S230: Transcode the video to be transcoded using the different target bitrate gears and their corresponding transcoding factors.
The technical solution provided in this embodiment fuses the picture coding feature of the video to be transcoded, each target bitrate gear, and the subjective quality index specified for that gear to obtain the comprehensive transcoding feature of the video at each gear. A pre-built transcoding classification model then performs a transcoding determination on the comprehensive transcoding feature at each gear and determines the transcoding factor adapted to the video at that gear, which ensures that the factor selected for each gear is an accurate fit. Bitrate is thereby allocated reasonably: different videos transcoded at the same bitrate gear have consistent subjective quality, unnecessary bitrate is not wasted, and bandwidth is saved.
Embodiment 3
FIG. 3A is a flowchart of a video transcoding method provided in Embodiment 3 of this application, and FIG. 3B is a schematic diagram of the principle of the video transcoding process provided in Embodiment 3. This embodiment builds on the foregoing embodiments and mainly explains how the picture coding feature and the subjective quality index of the video to be transcoded are extracted.
Optionally, as shown in FIG. 3A, this embodiment may include the following steps:
S310: Extract the picture coding feature of the video to be transcoded after it is transcoded at the lowest of the different target bitrate gears, together with the subjective quality index specified for each target bitrate gear.
Optionally, before obtaining the picture coding feature, this embodiment first transcodes the video to be transcoded at the lowest of the different target bitrate gears and extracts basic picture features from the resulting video; these serve as the picture coding feature of the video to be transcoded. At the same time, the video content of the video to be transcoded is analyzed with the VMAF algorithm to determine the subjective quality index specified for the video at each target bitrate gear, so that the adapted transcoding factor can be determined later.
Exemplarily, as shown in FIG. 3B, extracting the picture coding feature of the video to be transcoded after it is transcoded at the lowest of the different target bitrate gears may include: transcoding the video using the lowest bitrate gear and a fixed transcoding factor for that gear; extracting key information from multiple key frames of the video transcoded at the lowest bitrate gear; and combining the resolution and target quality of the video at the different bitrate gears with that key information through a dimension expansion method to obtain the picture coding feature of the video to be transcoded.
First, the lowest bitrate gear is selected from the different target bitrate gears. This embodiment presets a fixed transcoding factor for the lowest bitrate gear, and the video to be transcoded is transcoded with that gear and that fixed factor. Key information that characterizes the picture is then extracted from the video transcoded at the lowest bitrate gear, and the resolution and target quality of the video at the different gears are combined with this key information through a dimension expansion method to obtain the picture coding feature of the video to be transcoded.
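The feature extraction step might look roughly like the sketch below. The text does not say which key information is read from the key frames or how the dimension expansion works, so everything concrete here is an assumption: `KeyFrameInfo` with its `bits` and `avg_qp` fields is hypothetical, the aggregation into mean and maximum is one arbitrary choice, and plain concatenation stands in for the dimension expansion method.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence, Tuple


@dataclass
class KeyFrameInfo:
    """Hypothetical per-key-frame statistics from the lowest-gear transcode."""
    bits: int       # encoded size of the key frame
    avg_qp: float   # average quantization parameter of the key frame


def picture_coding_features(
    key_frames: Sequence[KeyFrameInfo],
    gears: Sequence[Tuple[int, float]],
) -> List[float]:
    """Aggregate key-frame statistics and append (height, target quality) per gear."""
    if not key_frames:
        raise ValueError("at least one key frame is required")
    bits = [kf.bits for kf in key_frames]
    qps = [kf.avg_qp for kf in key_frames]
    features = [float(mean(bits)), float(max(bits)), float(mean(qps))]
    for height, target_quality in gears:
        features.extend([float(height), float(target_quality)])
    return features
```

Because the probe transcode runs only once, at the lowest gear, the cost of feature extraction stays independent of the number of gears.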
S320: According to the picture coding feature of the video to be transcoded, determine, for each of the different target bitrate gears, the transcoding factor to be used so that the video, after being transcoded at that gear, reaches the subjective quality index specified for that gear.
S330: Transcode the video to be transcoded using the different target bitrate gears and their corresponding transcoding factors.
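As an illustration of step S330, the sketch below builds one encoder command per gear. It assumes, purely for the example, that the transcoding factor maps to the x264 constant rate factor (CRF) and that each gear is a resolution with a bitrate cap; the text does not state either. The `Gear` type and the output file naming are hypothetical. The commands are only constructed, not executed.

```python
from typing import Dict, List, NamedTuple, Sequence


class Gear(NamedTuple):
    """Hypothetical description of one bitrate gear."""
    name: str
    height: int     # output height in pixels
    max_kbps: int   # bitrate cap for this gear


def build_ffmpeg_args(src: str, gear: Gear, crf: int) -> List[str]:
    """Build an ffmpeg command line for one gear, using CRF as the factor."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{gear.height}",          # keep aspect ratio, even width
        "-c:v", "libx264", "-crf", str(crf),
        "-maxrate", f"{gear.max_kbps}k",
        "-bufsize", f"{2 * gear.max_kbps}k",
        "-c:a", "copy",
        f"{gear.name}.mp4",
    ]


def plan_transcodes(src: str, gears: Sequence[Gear],
                    factor_by_gear: Dict[str, int]) -> List[List[str]]:
    """One command per gear, each with the factor chosen for that gear."""
    return [build_ffmpeg_args(src, g, factor_by_gear[g.name]) for g in gears]
```

Each returned list could be handed to `subprocess.run` on a machine with ffmpeg and libx264 installed.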
The technical solution provided in this embodiment obtains the picture coding feature of the video to be transcoded after it is transcoded at the lowest of the different target bitrate gears, which ensures the accuracy of the picture coding feature. It then determines, for each target bitrate gear, the transcoding factor with which the video, given its picture coding feature, reaches the subjective quality index specified for that gear after transcoding. The result of transcoding any video with the predicted transcoding factor is compared against the specified subjective quality index, so that the subjective playback quality of any video at each bitrate gear is measured accurately and the transcoding factor selected for each gear is accurate. The video is then transcoded at multiple bitrates using the different target bitrate gears and their corresponding transcoding factors. Bitrate is thereby allocated reasonably: different videos transcoded at the same bitrate gear have consistent subjective quality, unnecessary bitrate is not wasted, and bandwidth is saved.
Embodiment 4
FIG. 4 is a schematic structural diagram of a video transcoding apparatus provided in Embodiment 4 of this application. As shown in FIG. 4, the apparatus may include: a transcoding factor determination module 410, configured to determine, according to the picture coding feature of the video to be transcoded and for each of the different target bitrate gears, the transcoding factor to be used so that the video, after being transcoded at that gear, reaches the subjective quality index specified for that gear; and a video transcoding module 420, configured to transcode the video to be transcoded using the different target bitrate gears and their corresponding transcoding factors.
In the technical solution provided in this embodiment, the subjective quality index that the transcoded video must reach is specified in advance for each target bitrate gear, so that videos to be transcoded share the same subjective quality index at the same bitrate gear and their subjective quality after transcoding at that gear stays consistent. At each target bitrate gear, the video is transcoded with several candidate transcoding factors, and the factor with which the video, given its picture coding feature, reaches the subjective quality index specified for that gear is identified; this fixes the transcoding factor for each gear. The result of transcoding any video with the preset transcoding factor is compared against the specified subjective quality index, so that the subjective playback quality of any video at each bitrate gear is measured accurately and the transcoding factor selected for each gear is accurate. The video is then transcoded at multiple bitrates using the different target bitrate gears and their corresponding transcoding factors. Bitrate is thereby allocated reasonably: different videos transcoded at the same bitrate gear have consistent subjective quality, unnecessary bitrate is not wasted, and bandwidth is saved.
The transcoding factor determination module 410 may include: a feature fusion unit, configured to fuse, for each target bitrate gear, the picture coding feature of the video to be transcoded, that gear, and the subjective quality index specified for that gear, to obtain the comprehensive transcoding feature of the video at that gear; and a transcoding factor adaptation unit, configured to perform, through the pre-built transcoding classification model, a transcoding determination on the comprehensive transcoding features of the video at the different target bitrate gears and to determine the transcoding factor adapted to the video at each gear.
The feature fusion unit is configured to: for each target bitrate gear, perform dimension-expanding fusion on the picture coding feature of the video to be transcoded, that gear, and the subjective quality index specified for that gear, to obtain the comprehensive transcoding feature of the video at that gear.
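One possible reading of dimension-expanding fusion is sketched below; the text does not define the operation, so this is an assumption and not the method of the application. Here the two scalar inputs (the gear's bitrate and its quality target) are each repeated a few times before being concatenated with the picture coding feature, so that they occupy more than one position in the fused vector. The function name and the `repeat` parameter are hypothetical.

```python
from typing import List, Sequence


def fuse_features(
    picture_features: Sequence[float],
    gear_kbps: float,
    quality_target: float,
    repeat: int = 4,
) -> List[float]:
    """Concatenate picture features with tiled gear and quality scalars."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    return (
        list(picture_features)
        + [float(gear_kbps)] * repeat
        + [float(quality_target)] * repeat
    )
```

The fused vector has `len(picture_features) + 2 * repeat` entries and would be computed once per gear before classification.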
The transcoding factor adaptation unit is configured to: for each target bitrate gear, input the comprehensive transcoding feature of the video to be transcoded at that gear into the transcoding classification model to obtain classification scores of the video under different preset transcoding factors; and take the preset transcoding factor with the highest classification score as the transcoding factor adapted to the video at that gear.
The transcoding classification model may be composed of two or more layers of transcoding classification sub-models.
The video transcoding apparatus may further include: a transcoding parameter extraction module, configured to extract the picture coding feature of the video to be transcoded after it is transcoded at the lowest of the different target bitrate gears, together with the subjective quality index specified for each target bitrate gear.
The transcoding parameter extraction module may be configured to extract that picture coding feature as follows: transcode the video to be transcoded using the lowest of the different target bitrate gears and a fixed transcoding factor for that gear; extract multiple key frames from the video transcoded at the lowest bitrate gear and merge the key information in those key frames to obtain the picture coding feature of the video to be transcoded, where the key information characterizes the picture coding feature of the video to be transcoded.
The video transcoding apparatus provided in this embodiment is applicable to the video transcoding method provided in any of the foregoing embodiments and has the corresponding functions.
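The module split of the apparatus can be pictured with the following sketch. Only the reference numerals 410 and 420 come from the text; the class names, the callable-based wiring, and the method names `determine` and `run` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class TranscodingFactorModule:
    """Counterpart of module 410: feature fusion unit plus factor adaptation unit."""
    fuse: Callable[[Sequence[float], str], Sequence[float]]   # feature fusion unit
    classify: Callable[[Sequence[float]], List[float]]        # classification model
    presets: Sequence[int]                                    # preset factors

    def determine(self, picture_features: Sequence[float],
                  gears: Sequence[str]) -> Dict[str, int]:
        result: Dict[str, int] = {}
        for gear in gears:
            scores = self.classify(self.fuse(picture_features, gear))
            best = max(range(len(scores)), key=scores.__getitem__)
            result[gear] = self.presets[best]
        return result


@dataclass
class VideoTranscodingModule:
    """Counterpart of module 420: transcodes once per gear with its factor."""
    transcode: Callable[[str, str, int], str]

    def run(self, src: str, factors: Dict[str, int]) -> List[str]:
        return [self.transcode(src, gear, factor)
                for gear, factor in factors.items()]
```

Passing the fusion, classification, and transcoding steps in as callables keeps the sketch independent of any particular model or encoder.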
Embodiment 5
FIG. 5 is a schematic structural diagram of a server provided in Embodiment 5 of this application. As shown in FIG. 5, the server includes a processor 50, a storage device 51, and a communication device 52. The server may have one or more processors 50; FIG. 5 takes one processor 50 as an example. The processor 50, the storage device 51, and the communication device 52 in the server may be connected by a bus or in other ways; FIG. 5 takes a bus connection as an example.
As a computer-readable storage medium, the storage device 51 may be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the video transcoding method provided in the embodiments of this application. By running the software programs, instructions, and modules stored in the storage device 51, the processor 50 executes the various functional applications and data processing of the server, that is, implements the video transcoding method described above.
The storage device 51 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the terminal, and the like. In addition, the storage device 51 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the storage device 51 may further include memory located remotely from the processor 50, and this remote memory may be connected to the server through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The communication device 52 may be configured to implement a network connection or a mobile data connection between the server and a terminal.
The server provided in this embodiment may be configured to execute the video transcoding method provided in any of the foregoing embodiments and has the corresponding functions.
Embodiment 6
Embodiment 6 of this application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the video transcoding method of any of the foregoing embodiments. The method may include: determining, according to the picture coding feature of the video to be transcoded and for each of the different target bitrate gears, the transcoding factor to be used so that the video, after being transcoded at that gear, reaches the subjective quality index specified for that gear; and transcoding the video to be transcoded using the different target bitrate gears and their corresponding transcoding factors.
Of course, in the storage medium containing computer-executable instructions provided by the embodiments of this application, the computer-executable instructions are not limited to the method operations described above and may also perform related operations of the video transcoding method provided in any embodiment of this application.
From the above description of the implementations, those skilled in the art will understand that this application can be implemented by software together with the necessary general-purpose hardware, and can of course also be implemented in hardware. On this understanding, the technical solution of this application may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), a hard disk, or an optical disc, and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in any embodiment of this application.
It is worth noting that the units and modules in the above embodiment of the video transcoding apparatus are divided only according to functional logic; the division is not limited to the one described, provided the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the scope of protection of this application.

Claims (10)

  1. A video transcoding method, comprising:
    determining, according to a picture coding feature of a video to be transcoded and for each target bitrate gear among different target bitrate gears, a transcoding factor to be used so that the video to be transcoded, after being transcoded at said each target bitrate gear, reaches a subjective quality index specified for said each target bitrate gear;
    transcoding the video to be transcoded using the different target bitrate gears and the corresponding transcoding factors.
  2. The method according to claim 1, wherein determining, according to the picture coding feature of the video to be transcoded and for each target bitrate gear among the different target bitrate gears, the transcoding factor to be used so that the video to be transcoded, after being transcoded at said each target bitrate gear, reaches the subjective quality index specified for said each target bitrate gear comprises:
    fusing the picture coding feature of the video to be transcoded, said each target bitrate gear, and the subjective quality index specified for said each target bitrate gear, to obtain a comprehensive transcoding feature of the video to be transcoded at said each target bitrate gear;
    performing, through a pre-built transcoding classification model, a transcoding determination on the comprehensive transcoding features of the video to be transcoded at the different target bitrate gears, and determining the transcoding factor adapted to the video to be transcoded at each of the different target bitrate gears.
  3. The method according to claim 2, wherein fusing the picture coding feature of the video to be transcoded, said each target bitrate gear, and the subjective quality index specified for said each target bitrate gear, to obtain the comprehensive transcoding feature of the video to be transcoded at said each target bitrate gear comprises:
    performing dimension-expanding fusion on the picture coding feature of the video to be transcoded, said each target bitrate gear, and the subjective quality index specified for said each target bitrate gear, to obtain the comprehensive transcoding feature of the video to be transcoded at said each target bitrate gear.
  4. The method according to claim 2, wherein performing, through the pre-built transcoding classification model, the transcoding determination on the comprehensive transcoding features of the video to be transcoded at the different target bitrate gears, and determining the transcoding factor adapted to the video to be transcoded at each of the different target bitrate gears comprises:
    inputting the comprehensive transcoding feature of the video to be transcoded at each target bitrate gear into the transcoding classification model to obtain classification scores of the video to be transcoded under different preset transcoding factors;
    taking the preset transcoding factor with the highest classification score as the transcoding factor adapted to the video to be transcoded at said each target bitrate gear.
  5. The method according to claim 2, wherein the transcoding classification model is composed of at least two layers of transcoding classification sub-models.
  6. The method according to any one of claims 1-5, further comprising, before determining, for each target bitrate gear among the different target bitrate gears, the transcoding factor to be used so that the video to be transcoded, after being transcoded at said each target bitrate gear, reaches the subjective quality index specified for said each target bitrate gear:
    extracting the picture coding feature of the video to be transcoded after it is transcoded at the lowest bitrate gear among the different target bitrate gears, and the subjective quality indexes specified for the different target bitrate gears.
  7. The method according to claim 6, wherein extracting the picture coding feature of the video to be transcoded after it is transcoded at the lowest bitrate gear among the different target bitrate gears comprises:
    transcoding the video to be transcoded using the lowest bitrate gear among the different target bitrate gears and a fixed transcoding factor for the lowest bitrate gear;
    extracting multiple key frames from the video to be transcoded after it is transcoded at the lowest bitrate gear, and merging key information in the multiple key frames to obtain the picture coding feature of the video to be transcoded, wherein the key information is used to characterize the picture coding feature of the video to be transcoded.
  8. A video transcoding apparatus, comprising:
    a transcoding factor determination module, configured to determine, according to a picture coding feature of a video to be transcoded and for each target bitrate gear among different target bitrate gears, a transcoding factor to be used so that the video to be transcoded, after being transcoded at said each target bitrate gear, reaches a subjective quality index specified for said each target bitrate gear;
    a video transcoding module, configured to transcode the video to be transcoded using the different target bitrate gears and the corresponding transcoding factors.
  9. A server, comprising:
    at least one processor;
    a storage device, configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the video transcoding method according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the video transcoding method according to any one of claims 1-7.
PCT/CN2020/137513 2019-12-31 2020-12-18 Video transcoding method and apparatus, server and storage medium WO2021135983A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911410012.0A CN111107395B (en) 2019-12-31 2019-12-31 Video transcoding method, device, server and storage medium
CN201911410012.0 2019-12-31

Publications (1)

Publication Number Publication Date
WO2021135983A1 true WO2021135983A1 (en) 2021-07-08

Family

ID=70424089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/137513 WO2021135983A1 (en) 2019-12-31 2020-12-18 Video transcoding method and apparatus, server and storage medium

Country Status (2)

Country Link
CN (1) CN111107395B (en)
WO (1) WO2021135983A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111107395B (en) * 2019-12-31 2021-08-03 广州市百果园网络科技有限公司 Video transcoding method, device, server and storage medium
CN111726656B (en) * 2020-07-20 2022-07-26 有半岛(北京)信息科技有限公司 Transcoding method, device, server and storage medium of live video
CN111901631B (en) * 2020-07-30 2023-02-17 有半岛(北京)信息科技有限公司 Transcoding method, device, server and storage medium for live video
CN111970565A (en) * 2020-09-21 2020-11-20 Oppo广东移动通信有限公司 Video data processing method and device, electronic equipment and storage medium
CN115134639B (en) * 2021-03-24 2023-12-19 北京字跳网络技术有限公司 Video gear determining method, device, server, storage medium and system
CN113259730B (en) * 2021-07-06 2021-12-14 北京达佳互联信息技术有限公司 Code rate adjustment method and device for live broadcast
CN113891155B (en) * 2021-09-29 2024-04-05 百果园技术(新加坡)有限公司 Video playing gear determining method, video playing method and related devices
CN114025190B (en) * 2021-11-03 2023-06-20 北京达佳互联信息技术有限公司 Multi-code rate scheduling method and multi-code rate scheduling device
CN114040230B (en) * 2021-11-08 2024-03-29 北京达佳互联信息技术有限公司 Video code rate determining method and device, electronic equipment and storage medium thereof
CN114598927A (en) * 2022-03-03 2022-06-07 京东科技信息技术有限公司 Method and system for scheduling transcoding resources and scheduling device
CN114760506B (en) * 2022-04-11 2024-02-09 北京字跳网络技术有限公司 Video transcoding evaluation method, device, equipment and storage medium
CN115002520B (en) * 2022-04-14 2024-04-02 百果园技术(新加坡)有限公司 Video stream data processing method, device, equipment and storage medium
CN115379248B (en) * 2022-07-14 2023-12-12 百果园技术(新加坡)有限公司 Video source stream replacement method, system, equipment and storage medium
CN115379291B (en) * 2022-07-19 2023-12-26 百果园技术(新加坡)有限公司 Code table updating method, device, equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109286825A (en) * 2018-12-14 2019-01-29 北京百度网讯科技有限公司 Method and apparatus for handling video
US10298969B2 (en) * 2016-11-10 2019-05-21 University Of Louisiana At Lafayette Architecture and method for high performance on demand video transcoding
CN110418177A (en) * 2019-04-19 2019-11-05 腾讯科技(深圳)有限公司 Method for video coding, device, equipment and storage medium
CN110493196A (en) * 2019-07-24 2019-11-22 深圳市瑞讯云技术有限公司 A kind of video code conversion unit and video code conversion component
CN111107395A (en) * 2019-12-31 2020-05-05 广州市百果园网络科技有限公司 Video transcoding method, device, server and storage medium

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150172680A1 (en) * 2013-12-16 2015-06-18 Arris Enterprises, Inc. Producing an Output Need Parameter for an Encoder
CN105187835B (en) * 2014-05-30 2019-02-15 阿里巴巴集团控股有限公司 Content-based adaptive video transcoding method and device
CN107820084B (en) * 2016-09-13 2020-02-07 北京金山云网络技术有限公司 Video perception coding method and device
CN106713956B (en) * 2016-11-16 2020-09-15 上海交通大学 Code rate control and version selection method and system for dynamic self-adaptive video streaming media
AU2017368324B2 (en) * 2016-12-01 2022-08-25 Brightcove, Inc. Optimization of encoding profiles for media streaming
CN109391825A (en) * 2017-08-03 2019-02-26 腾讯科技(深圳)有限公司 Video transcoding method and device, server, and readable storage medium
CN109660825B (en) * 2017-10-10 2021-02-09 腾讯科技(深圳)有限公司 Video transcoding method and device, computer equipment and storage medium
US10587669B2 (en) * 2017-12-20 2020-03-10 Facebook, Inc. Visual quality metrics
CN108174290B (en) * 2018-01-25 2019-05-24 北京百度网讯科技有限公司 Method and apparatus for handling video
CN109729384B (en) * 2018-12-18 2021-11-19 广州市百果园信息技术有限公司 Video transcoding selection method and device
CN110248189B (en) * 2019-06-14 2021-07-27 北京字节跳动网络技术有限公司 Video quality prediction method, device, medium and electronic equipment
CN110248195B (en) * 2019-07-17 2021-11-05 北京百度网讯科技有限公司 Method and apparatus for outputting information

Also Published As

Publication number Publication date
CN111107395B (en) 2021-08-03
CN111107395A (en) 2020-05-05

Similar Documents

Publication Publication Date Title
WO2021135983A1 (en) Video transcoding method and apparatus, server and storage medium
US20220030244A1 (en) Content adaptation for streaming
JP6928041B2 (en) Methods and equipment for processing video
KR102082816B1 (en) Method for improving the resolution of streaming files
Wang et al. Rich features for perceptual quality assessment of UGC videos
WO2021147448A1 (en) Video data processing method and apparatus, and storage medium
EP3583777A1 (en) A method and technical equipment for video processing
CN109819282B (en) Video user category identification method, device and medium
KR102472971B1 (en) Method, system, and computer program to optimize video encoding using artificial intelligence model
CN110620924B (en) Method and device for processing coded data, computer equipment and storage medium
WO2023134523A1 (en) Content adaptive video coding method and apparatus, device and storage medium
Karim et al. Quality of service (QoS): measurements of image formats in social cloud computing
CN114513655A (en) Live video quality evaluation method, video quality adjustment method and related device
CN116233445B (en) Video encoding and decoding processing method and device, computer equipment and storage medium
CN111050174A (en) Image compression method, device and system
CN112752117A (en) Video caching method, device, equipment and storage medium
CN112383824A (en) Video advertisement filtering method, device and storage medium
CN111726656A (en) Transcoding method, device, server and storage medium for live video
WO2024017106A1 (en) Code table updating method, apparatus, and device, and storage medium
CN113452996A (en) Video coding and decoding method and device
CN114827617B (en) Video coding and decoding method and system based on perception model
CN113542780B (en) Method and device for removing compression artifacts of live webcast video
KR102430177B1 (en) System for rapid management of large scale moving pictures and method thereof
CN111901631B (en) Transcoding method, device, server and storage medium for live video
Huang et al. Semantic video adaptation using a preprocessing method for mobile environment

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20910690; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20910690; Country of ref document: EP; Kind code of ref document: A1)