CN111970565A - Video data processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111970565A
Authority
CN
China
Prior art keywords
video
processed
quality
threshold
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010997869.3A
Other languages
Chinese (zh)
Inventor
葛冬冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010997869.3A
Publication of CN111970565A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses a video data processing method and device, electronic equipment and a storage medium, and relates to the technical field of data processing. The method comprises the following steps: acquiring a video to be processed and a target code rate corresponding to the video to be processed, wherein the target code rate is lower than an initial code rate of the video to be processed; transcoding the video to be processed based on the target code rate; acquiring a quality index value corresponding to the video to be processed after transcoding processing; and if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, adjusting the target code rate, and performing transcoding processing on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value. The method realizes the adjustment of the target code rate on the premise of ensuring the video quality of the video to be processed, thereby realizing the code rate saving and the network bandwidth saving.

Description

Video data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a method and an apparatus for processing video data, an electronic device, and a storage medium.
Background
With the development of network technology, video services have developed rapidly, and users' requirements for the quality of the videos they watch keep rising. In a related approach to improving video quality, video data may be transcoded. However, as the types of video data and the volume of video traffic increase, transcoding video data in a way that guarantees video quality tends to waste bitrate, which in turn wastes network bandwidth.
Disclosure of Invention
The present application provides a video data processing method, apparatus, electronic device and storage medium to address the above-mentioned problems.
In a first aspect, an embodiment of the present application provides a video data processing method, where the method includes: acquiring a video to be processed and a target code rate corresponding to the video to be processed, wherein the target code rate is lower than an initial code rate of the video to be processed; transcoding the video to be processed based on the target code rate; acquiring a quality index value corresponding to the video to be processed after transcoding processing; and if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, adjusting the target code rate, and performing transcoding processing on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
In a second aspect, an embodiment of the present application provides a video data processing apparatus, including: the first obtaining module is used for obtaining a video to be processed and a target code rate corresponding to the video to be processed, wherein the target code rate is lower than an initial code rate of the video to be processed; the first processing module is used for transcoding the video to be processed based on the target code rate; the second acquisition module is used for acquiring a quality index value corresponding to the video to be processed after transcoding processing; and the second processing module is used for adjusting the target code rate if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, and carrying out transcoding processing on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more application programs being configured to perform the video data processing method provided by the first aspect above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code can be called by a processor to execute the video data processing method provided in the first aspect.
According to the video data processing method, the video data processing device, the electronic equipment and the storage medium, the video to be processed and the target code rate corresponding to the video to be processed are obtained, then transcoding processing is carried out on the video to be processed based on the target code rate, then the quality index value corresponding to the video to be processed after transcoding processing is obtained, then if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, the target code rate is adjusted, and transcoding processing is carried out on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value. Therefore, transcoding processing is performed on the video to be processed based on the target code rate lower than the initial code rate of the video to be processed, when the difference value between the quality index value corresponding to the video to be processed after transcoding processing and the quality threshold value is larger than the first threshold value, the target code rate is adjusted, transcoding processing is performed on the video to be processed again based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value, so that the target code rate can be adjusted on the premise of ensuring the video quality of the video to be processed, code rate saving is achieved, and network bandwidth is saved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 shows a schematic diagram of an application environment provided by an embodiment of the present application.
Fig. 2 shows a flowchart of a video data processing method according to an embodiment of the present application.
Fig. 3 is a flowchart illustrating a video data processing method according to another embodiment of the present application.
Fig. 4 shows a schematic diagram of a slicing manner for slicing a video to be transcoded according to this embodiment.
Fig. 5 is a flowchart of a method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present disclosure.
Fig. 6 is a schematic diagram illustrating a determination manner of determining quality thresholds corresponding to a plurality of video segments according to content complexity corresponding to each of the plurality of video segments according to an embodiment of the present application.
Fig. 7 is a flowchart of another method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present disclosure.
Fig. 8 is a schematic diagram illustrating a determination manner of determining quality thresholds corresponding to a plurality of video segments according to segment positions corresponding to the plurality of video segments according to an embodiment of the present application.
Fig. 9 is a flowchart of a method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present disclosure.
Fig. 10 is a flowchart illustrating a video data processing method according to another embodiment of the present application.
Fig. 11 is a flowchart illustrating a video data processing method according to still another embodiment of the present application.
Fig. 12 is a block diagram illustrating a video data processing apparatus according to an embodiment of the present application.
Fig. 13 shows a block diagram of an electronic device according to an embodiment of the present application.
Fig. 14 illustrates a storage unit for storing or carrying program codes for implementing a video data processing method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
With rising living standards, smart terminals have become nearly universal in daily life, so the number of users and the viewing frequency of video services such as short videos and live webcasts have increased sharply, while users' requirements for the quality of the videos they watch keep rising.
One way to improve video quality is to transcode the video data; specifically, a transcoding model can be optimized and the optimized model used to improve video quality. However, training a transcoding model requires a large number of data sources, and different models differ in training speed and accuracy on data of different scales or types, so relying on an optimized transcoding model alone does not necessarily yield an optimal transcoding result. In addition, to ensure transcoding efficiency, video data needs to be transcoded according to a uniform transcoding format, and as the amount of video data grows, the server's storage cost for that data grows as well.
The inventor found that this problem can be improved by obtaining the video to be processed and the target code rate corresponding to the video to be processed, transcoding the video to be processed based on the target code rate, obtaining the quality index value corresponding to the transcoded video, and, if the difference between the quality index value and the quality threshold is larger than a first threshold, adjusting the target code rate and transcoding the video to be processed again based on the adjusted target code rate until the difference is smaller than or equal to the first threshold. Because the video is transcoded at a target code rate lower than its initial code rate, and the target code rate is adjusted and the transcoding repeated whenever the difference exceeds the first threshold, the target code rate can be adjusted while the video quality of the video to be processed is preserved, which saves code rate and network bandwidth.
Therefore, to address the above problem, the inventor proposes a video data processing method, an apparatus, an electronic device, and a storage medium, which can adjust the target bitrate of the video to be processed on the premise of ensuring its video quality, thereby saving bitrate and network bandwidth.
An application environment of the embodiments of the present application is described below.
Referring to Fig. 1, an application environment of a video data processing method according to an embodiment of the present application is schematically illustrated. As shown in Fig. 1, the application environment may be understood as a network system 10 according to an embodiment of the present application. The network system 10 includes a mobile terminal 11 and a server 12. The mobile terminal 11 may be any device having communication and storage functions, including but not limited to a PC (Personal Computer), a PDA (Personal Digital Assistant), a tablet computer, a smart television, a smart phone, a smart wearable device, or another smart communication device having a network connection function. The server 12 may be a single server (network access server), a server cluster (cloud server) composed of a plurality of servers, or a cloud computing center (database server).
In the embodiments of the present application, a video shot by a user with the mobile terminal 11 can be sent over a network to the server 12 for storage. As the number of videos grows, the server 12 can segment a video to be stored into a plurality of video segments and then adjust the code rate of some or all of those segments, lowering the code rate as far as possible while preserving video quality, which saves code rate and network bandwidth resources.
It should be noted that, as a method for improving transcoding efficiency, in the embodiment of the present application, when a video to be stored is split, some servers may be responsible for splitting the video, and other different servers may be responsible for transcoding the split video, so that video segmentation and video transcoding may be processed in parallel, thereby improving transcoding efficiency. The specific number of some servers and other servers may not be limited.
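The division of labour described above can be sketched in a few lines. This is only an illustration under stated assumptions: worker threads stand in for the "other servers", and `transcode_segments_in_parallel` and `transcode_one` are hypothetical names that do not appear in the application.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")  # type of a video segment (e.g. a file path)
R = TypeVar("R")  # type of a transcoding result


def transcode_segments_in_parallel(
    segments: Iterable[S],
    transcode_one: Callable[[S], R],
    max_workers: int = 4,
) -> List[R]:
    """Hand each segment to a pool of workers and collect the results.

    Worker threads are an assumption standing in for separate
    transcoding servers; the application does not fix their number.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input segments,
        # so the transcoded segments can be concatenated afterwards.
        return list(pool.map(transcode_one, segments))
```

Because `segments` may be any iterable, a generator that yields segments as they are cut can be passed in, so transcoding of early segments may start while later ones are still being produced.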
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to Fig. 2, a flowchart of a video data processing method according to an embodiment of the present application is shown. This embodiment provides a video data processing method applicable to a server, and the method includes:
Step S110: acquiring a video to be processed and a target code rate corresponding to the video to be processed.
In this embodiment, the video to be processed may be a video to be stored whose file size is greater than or equal to a target threshold. For example, the video to be processed may be a video with a file size of 10 MB or more. The specific value of the target threshold is not limited and may be a value other than 10 MB, for example 20 MB or 30 MB.
As a mode for reducing the storage space occupied by the video to be processed, the target code rate corresponding to the video to be processed can be obtained, and then the video to be processed can be transcoded based on the target code rate, so that the code rate is saved.
In this embodiment, the target bitrate corresponding to the video to be processed may be set to be lower than the initial bitrate of the video to be processed, for example, assuming that the initial bitrate of the video to be processed is 2048kbps, the target bitrate may be set to 1024kbps (this is merely an example, and the specific value may not be limited).
Optionally, to preserve the quality of the video to be processed as far as possible, one implementation determines, when setting the target bitrate, whether the video to be processed is a video file of an important level. Whether a video belongs to the important level may be determined from the name of the video file; alternatively, video files of the important level may carry a special identifier, for example "imp".
In one embodiment, if the video to be processed is a video file of the important level, the difference between the initial bitrate and the target bitrate may be limited to at most a third threshold, so that the target bitrate stays as close as possible to the initial bitrate while remaining below it, preserving video quality to the greatest extent. In another embodiment, if the video to be processed is not a video file of the important level, the difference between the initial bitrate and the target bitrate is limited to at most a fourth threshold, so that the bitrate of the video to be processed is reduced as much as possible, its storage space is reduced as much as possible, and network bandwidth resources are saved when it is transmitted. The fourth threshold is greater than the third threshold, and the specific values of the two thresholds are not limited. Note that in both cases the target bitrate is lower than the initial bitrate.
For example, in a specific application scenario, assuming that the initial bitrate is 2048kbps, the third threshold is 200kbps, and the fourth threshold is 500kbps, if the video to be processed is a video file belonging to an important level, the target bitrate can be set to 1900 kbps; if the video to be processed is a video file not belonging to the important grade, the target bitrate can be set to 1600 kbps. The above numerical values are merely described as examples, and do not limit the present invention, and may be set according to technical requirements in actual implementation.
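The rule above can be written as a small helper. This is a minimal sketch, not the applicant's implementation: the function name `pick_target_bitrate`, the substring check on the file name, and the choice to use the full allowed reduction are all assumptions made for illustration. The application's own example picks 1900 kbps and 1600 kbps, which are also within the allowed limits.

```python
def pick_target_bitrate(
    initial_kbps: int,
    filename: str,
    third_threshold_kbps: int = 200,   # max reduction for important videos
    fourth_threshold_kbps: int = 500,  # max reduction for other videos
    marker: str = "imp",               # identifier from the text's example
) -> int:
    """Return a target bitrate below the initial bitrate.

    Assumption: importance is detected by a naive substring match on the
    file name, and the largest reduction allowed for the level is used.
    """
    if initial_kbps < 2:
        raise ValueError("initial bitrate too small to reduce")
    if fourth_threshold_kbps <= third_threshold_kbps:
        raise ValueError("fourth threshold must exceed third threshold")

    important = marker in filename
    max_reduction = third_threshold_kbps if important else fourth_threshold_kbps

    # Never reduce so far that the target reaches zero.
    reduction = min(max_reduction, initial_kbps - 1)
    return initial_kbps - reduction
```

With the thresholds from the example above (200 kbps and 500 kbps) and an initial bitrate of 2048 kbps, this sketch yields 1848 kbps for a file whose name contains the marker and 1548 kbps otherwise.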
Step S120: transcoding the video to be processed based on the target code rate.
Optionally, after the target code rate is obtained, a transcoding parameter corresponding to the to-be-processed video may also be obtained, so that transcoding processing may be performed on the to-be-processed video based on the target code rate and the transcoding parameter, so that the code rate of the to-be-processed video after transcoding is the target code rate. For example, if the initial bitrate corresponding to the video to be processed is 2048kbps and the target bitrate corresponding to the video to be processed is 1024kbps, the bitrate of the transcoded video to be processed is 1024 kbps. The transcoding parameters may include parameters of the image width and height of the video to be processed, the transcoding type, the transcoding frame rate, and the like.
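As one possible illustration of combining the target code rate with transcoding parameters, the sketch below assembles an FFmpeg command line. The application does not name any transcoder, so the use of FFmpeg, the `libx264` encoder default, and the function name `build_transcode_command` are assumptions. Note also that `-b:v` requests an average bitrate, so the actual bitrate of the output may deviate somewhat from the target.

```python
from typing import List, Optional


def build_transcode_command(
    src: str,
    dst: str,
    target_kbps: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
    fps: Optional[float] = None,
    codec: str = "libx264",  # assumed encoder; any available encoder works
) -> List[str]:
    """Build (but do not run) an ffmpeg argument list for one transcode."""
    cmd = ["ffmpeg", "-y", "-i", src, "-c:v", codec, "-b:v", f"{target_kbps}k"]
    if width and height:
        # Image width and height from the transcoding parameters.
        cmd += ["-vf", f"scale={width}:{height}"]
    if fps:
        # Transcoding frame rate.
        cmd += ["-r", str(fps)]
    cmd.append(dst)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed.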
Step S130: acquiring a quality index value corresponding to the transcoded video to be processed.
In one implementation, the transcoded video may be compared with the video to be processed before transcoding to obtain the quality index value (i.e., a VMAF value) corresponding to the transcoded video. VMAF (Video Multimethod Assessment Fusion) is a video quality evaluation index that combines a human visual model with machine learning, and it may be used to compare the quality and efficiency of various transcoding technologies. Optionally, in the calculation process, the quality index value may be calculated by comparing the value of each pixel of the transcoded video image with the value of the corresponding pixel of the video image before transcoding.
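To make the idea of a pixel-by-pixel comparison concrete, the sketch below computes PSNR between a reference frame and a transcoded frame. PSNR is used here only as a simple stand-in: it is not VMAF, which fuses several perceptual features with a trained model, and the function name `psnr` and the flat-list frame representation are assumptions for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], distorted: Sequence[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are given as flat sequences of pixel values (assumed 8-bit by
    default). Higher values mean the distorted frame is closer to the
    reference; identical frames give infinity.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")

    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

A per-video score could then be formed by averaging the per-frame values; how frames are pooled is a design choice the text leaves open.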
Step S140: if the difference between the quality index value and the quality threshold is larger than a first threshold, adjusting the target code rate, and transcoding the video to be processed based on the adjusted target code rate until the difference is smaller than or equal to the first threshold.
The quality threshold (which may be a VMAF quality threshold in this embodiment) may be set in advance. Optionally, the quality threshold may be set according to the initial code rate of the video to be processed, and the quality threshold corresponding to the video to be processed with a higher initial code rate may be greater than the quality threshold corresponding to the video to be processed with a lower initial code rate. In this embodiment, the video picture quality corresponding to the quality threshold is close to the picture quality of the video to be processed.
In one implementation, once the quality index value corresponding to the transcoded video has been obtained, the difference between the quality index value and the quality threshold is computed, and it is then determined whether the difference is smaller than or equal to the first threshold, so that the target bitrate can be adjusted flexibly according to the result. The first threshold may be understood as a critical value, that is, the largest difference that is still accepted. In some embodiments, the first threshold may be 0, in which case the target code rate needs to be adjusted whenever the quality index value is not equal to the quality threshold. In other embodiments, the first threshold may be a small value, for example a value from 0 to 5 (the specific value may be adjusted according to actual requirements).
In one embodiment, if the difference between the quality index value and the quality threshold is greater than the first threshold and the quality index value is greater than the quality threshold, the picture quality of the transcoded video is closer to that of the original video to be processed than the picture quality corresponding to the quality threshold is. Since the bitrate of the transcoded video is the current target bitrate, the target bitrate can be reduced to shrink the file size of the video, which reduces the storage space it occupies and saves network transmission bandwidth. The amount by which the target code rate is reduced may be set according to actual needs and is not limited herein. To make the video file as small as possible while ensuring video quality, the transcoding process described above is then performed again on the video to be processed based on the reduced target bitrate, until the difference between the quality index value and the quality threshold is less than or equal to the first threshold.
In another embodiment, if the difference between the quality index value and the quality threshold is greater than the first threshold and the quality index value is smaller than the quality threshold, this indicates that the file size of the transcoded video is smaller but its picture quality is weaker than the picture quality corresponding to the quality threshold. In this case, the target code rate may be increased; the specific amount of the increase is not limited. To improve (or maintain) the picture quality of the video while keeping its file size as small as possible, the transcoding process described above is then performed again on the video to be processed based on the increased target bitrate, until the difference between the quality index value and the quality threshold is less than or equal to the first threshold.
Optionally, after the target bitrate has been adjusted according to either of the two embodiments, the quality index value corresponding to the video to be processed may be equal to the quality threshold, or may approach the quality threshold arbitrarily closely; that is, the difference between the quality index value and the quality threshold is less than or equal to the first threshold.
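The adjustment loop of step S140 can be sketched as follows. This is a hedged illustration, not the claimed method itself: the names `tune_bitrate` and `score_at`, the fixed starting step, the halving of the step when the direction reverses, and the `max_rounds` cap are assumptions added so that the loop is guaranteed to terminate. The application leaves the size of each adjustment open.

```python
from typing import Callable, Tuple


def tune_bitrate(
    score_at: Callable[[int], float],  # transcodes at a bitrate, returns quality
    target_kbps: int,
    initial_kbps: int,
    quality_threshold: float,
    first_threshold: float,
    step_kbps: int = 128,
    max_rounds: int = 20,
) -> Tuple[int, float]:
    """Adjust the target bitrate until quality is within the tolerance."""
    step = step_kbps
    last_direction = 0
    quality = score_at(target_kbps)

    for _ in range(max_rounds):
        diff = quality - quality_threshold
        if abs(diff) <= first_threshold:
            break  # difference is within the first threshold: done

        # Quality above the threshold: lower the bitrate; below: raise it.
        direction = -1 if diff > 0 else 1
        if last_direction and direction != last_direction:
            step = max(1, step // 2)  # overshoot: take smaller steps

        # Keep the target positive and below the initial bitrate.
        candidate = max(1, min(target_kbps + direction * step, initial_kbps - 1))
        if candidate == target_kbps:
            break  # cannot move any further within the allowed range

        target_kbps, last_direction = candidate, direction
        quality = score_at(target_kbps)  # transcode again and re-measure

    return target_kbps, quality
```

In practice `score_at` would wrap a real transcode followed by a quality measurement such as VMAF; a cheap stand-in function is enough to exercise the control flow.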
In the video data processing method provided by this embodiment, a video to be processed and a target code rate corresponding to the video to be processed are obtained, transcoding processing is performed on the video to be processed based on the target code rate, a quality index value corresponding to the video to be processed after transcoding processing is obtained, then, if a difference between the quality index value and a quality threshold is greater than a first threshold, the target code rate is adjusted, and transcoding processing is performed on the video to be processed based on the adjusted target code rate until the difference is less than or equal to the first threshold. Therefore, transcoding processing is carried out on the video to be processed based on the target code rate lower than the initial code rate of the video to be processed, when the difference value between the quality index value corresponding to the video to be processed after transcoding processing and the quality threshold value is larger than the first threshold value, the target code rate is adjusted, transcoding processing is carried out on the video to be processed again based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value, the target code rate can be adjusted on the premise that the video quality of the video to be processed is guaranteed, the file size of the adjusted video to be processed is reduced compared with that before adjustment, code rate saving is achieved, and network bandwidth resources are saved.
Referring to fig. 3, a flowchart of a video data processing method according to another embodiment of the present application is shown, where the present embodiment provides a video data processing method applicable to a server, and the method includes:
Step S210: Acquire the video to be processed and the target bitrates respectively corresponding to a plurality of video segments.
In this embodiment, the video to be processed may include a plurality of video segments, and the plurality of video segments are obtained by segmenting based on the video to be transcoded. Optionally, the server may be configured with an automatic compression function, and when the automatic compression function is in an on state, the video that needs to be stored (i.e., the video that corresponds to the video to be stored when the storage instruction is received) may be defaulted as the video to be transcoded. In this way, in order to improve the transcoding efficiency of the video to be processed, the video to be transcoded can be segmented. For example, the server may segment the video to be transcoded by taking a Group of Pictures (GOP) or a multiple of the GOP as a unit to obtain a plurality of video segments, and use the video segments as the video to be processed. One GOP may be a group of continuous pictures, and the number of picture frames included in one GOP is not limited in this embodiment.
Optionally, in the actual segmentation process, for a segment of video to be transcoded, multiple segmentation modes may be available. Fig. 4 is a schematic diagram illustrating a slicing manner for slicing a video to be transcoded according to an embodiment. As shown in fig. 4, the video to be transcoded may be sliced in units of a single GOP, and each video segment sliced in this way includes the same number of picture frames. The number of picture frames included in a single GOP may be set according to actual needs, for example, if the picture content of the video to be transcoded is less, the video to be transcoded may be segmented by taking the GOP including a smaller number of picture frames as a unit, so as to obtain a plurality of video segments in "manner one" shown in fig. 4 (i.e., video segment 1, video segment 2, video segment 3, video segment 4, video segment 5, video segment 6, video segment 7, and video segment 8 corresponding to manner one in fig. 4). Similarly, if the picture content of the video to be transcoded is more, the video to be transcoded may be segmented in units of GOPs including a larger number of picture frames, or the video to be transcoded is segmented in units of multiples of GOPs including a smaller number of picture frames, so as to obtain a plurality of video segments in the "mode two" shown in fig. 4 (i.e., the video segment 1, the video segment 2, the video segment 3, and the video segment 4 corresponding to the mode two in fig. 4). It is easy to see that the number of picture frames included in a single video clip in "mode two" shown in fig. 4 is greater than the number of picture frames included in a single video clip in "mode one".
Optionally, the video to be transcoded may be split alternately in units of GOPs including a smaller number of picture frames and in units of GOPs including a larger number of picture frames; in this manner, the multiple video clips in "manner three" shown in fig. 4 (video clip 1, video clip 2, video clip 3, video clip 4, video clip 5, and video clip 6 corresponding to manner three in fig. 4) may be obtained after the splitting. Optionally, the splitting manner may be determined according to the performance and the number of cores of the server. For example, a single-core server may correspond to splitting manner one, a dual-core server to splitting manner two, and a multi-core server to splitting manner three. This mapping is only an example, and the specific splitting manner may be chosen according to the actual situation.
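The three splitting manners can be illustrated with a short sketch. It assumes a fixed GOP length and models "manner three" as alternating between one GOP and two GOPs per segment; the function name, the 48-frame input, and the GOP size of 6 are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def split_by_gop(
    frames: Sequence,
    gop_size: int,
    gops_per_segment: Sequence[int] = (1,),
) -> List[list]:
    """Cut frames into segments whose lengths are multiples of one GOP.

    gops_per_segment is cycled through, so (1,) gives one GOP per segment,
    (2,) gives two GOPs per segment, and (1, 2) alternates between them.
    """
    if gop_size <= 0 or any(n <= 0 for n in gops_per_segment):
        raise ValueError("gop_size and gops_per_segment must be positive")
    segments, start, index = [], 0, 0
    while start < len(frames):
        length = gop_size * gops_per_segment[index % len(gops_per_segment)]
        segments.append(list(frames[start:start + length]))
        start += length
        index += 1
    return segments


frames = list(range(48))                       # stand-in for 48 picture frames
manner_one = split_by_gop(frames, 6)           # 8 segments of one GOP each
manner_two = split_by_gop(frames, 6, (2,))     # 4 segments of two GOPs each
manner_three = split_by_gop(frames, 6, (1, 2)) # alternating short and long segments
```

With these inputs the three calls yield 8, 4, and 6 segments, matching the segment counts listed for the three manners in fig. 4.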
In this embodiment, the target bitrate corresponding to each of the plurality of video segments may be set, and optionally, the target bitrate corresponding to each of the plurality of video segments may be the same or different. In some embodiments, the target bitrate corresponding to each video clip can be lower than the initial bitrate of the video clip; in other embodiments, the target bitrate corresponding to each video segment may be lower than the overall initial bitrate of the video to be processed (alternatively, the overall initial bitrate may be an average of the initial bitrates corresponding to the plurality of video segments, respectively).
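As a rough sketch of the two options above, the initial target bitrates could be derived by scaling either each segment's own initial bitrate or the average initial bitrate. The scaling factor `ratio` and the function name are assumptions; the embodiment only requires that the target be lower than the reference bitrate.

```python
from typing import List


def initial_targets(
    initial_bitrates: List[float],
    ratio: float = 0.8,               # assumed reduction factor, must be in (0, 1)
    use_overall_average: bool = False,
) -> List[float]:
    """Return one target bitrate per segment, each below its reference bitrate."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be between 0 and 1")
    if use_overall_average:
        # Reference is the average of the segments' initial bitrates.
        average = sum(initial_bitrates) / len(initial_bitrates)
        return [average * ratio for _ in initial_bitrates]
    # Reference is each segment's own initial bitrate.
    return [bitrate * ratio for bitrate in initial_bitrates]


per_segment = initial_targets([4000.0, 2000.0, 3000.0])
from_average = initial_targets([4000.0, 2000.0, 3000.0], use_overall_average=True)
```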
Step S220: Transcode the plurality of video segments respectively based on their corresponding target bitrates.
As one manner, if the target bitrates corresponding to the plurality of video segments differ, each video segment may be transcoded based on its own target bitrate; in this manner, the bitrate of each transcoded video segment is its corresponding target bitrate. For the specific process of transcoding each of the plurality of video segments, reference may be made to the description in the foregoing embodiment, and details are not repeated here.
Step S230: Acquire the quality index values respectively corresponding to the plurality of transcoded video segments.
The quality index value corresponding to each video segment may be calculated by combining the transcoded video segment and the video segment before transcoding, and the specific calculation principle may refer to the description in the foregoing embodiments, which is not described herein again.
Optionally, the quality index value corresponding to the video segment with the higher target code rate may be higher, and the quality index value corresponding to the video segment with the lower target code rate may be lower. In one embodiment, a quality threshold may be set, the quality index value may be compared with the quality threshold, and the target bitrate may be adjusted according to the comparison result. However, if the quality index values corresponding to a plurality of video segments are compared with a uniform quality threshold, when the target bitrate corresponding to a certain video segment is increased, the target bitrate corresponding to other video segments may be increased by mistake, and thus, the bitrate may be wasted.
As a way to improve the above problem, the quality thresholds corresponding to the plurality of video segments may be determined based on the quality characteristics corresponding to the plurality of video segments, that is, if the quality characteristics corresponding to the plurality of video segments are different, the quality thresholds corresponding to the plurality of video segments may be different, which is described in detail below.
As one implementation, referring to fig. 5, a flowchart of a method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present application is shown. As shown in fig. 5, one way to determine the quality thresholds respectively corresponding to the plurality of video segments may include the following steps:
Step S231: Acquire the content complexity of each video segment.
In this embodiment, the quality feature corresponding to the video clip may be the content complexity of the video clip. The content complexity is used to characterize how fast the content of the picture scene of the video clip changes, the picture color type, and/or how much/how fast the picture color changes. For example, the higher the content complexity of a video clip, the faster the picture scene content between adjacent picture frames within the video clip changes; the lower the content complexity of a video clip, the slower the picture scene content between adjacent picture frames within the video clip changes. As another example, the higher the content complexity of a video clip, the more/faster the picture color categories between adjacent picture frames within the video clip change; the lower the content complexity of the video segment, the less/slower the picture color class changes between adjacent picture frames within the video segment. Specifically, the content complexity of the video clip can be obtained according to the difference degree between adjacent picture frames in the video clip. Alternatively, the difference degree may be obtained based on the similarity of picture contents between adjacent picture frames in the video segment, or the difference degree may be obtained based on how fast the motion vectors of the pixels of the adjacent picture frames in the video segment change.
As one implementation, the similarity of the picture content between adjacent picture frames can be obtained, how fast the picture scene content changes can be determined from the value of the similarity, and the content complexity of the video clip can thereby be obtained. Optionally, the smaller the similarity of the picture content between adjacent picture frames, the faster the picture scene content changes between those frames; the greater the similarity, the slower the picture scene content changes between those frames. In this implementation, the similarity of the picture content between adjacent picture frames may be scored, and the similarity score may then be subtracted from a target score (for example 100, without limitation) to obtain a score for the degree of difference. Optionally, a larger difference score indicates a larger degree of difference, and a smaller difference score indicates a smaller degree of difference.
Optionally, once the similarity of the picture content between adjacent picture frames has been obtained, the variation width of that similarity across the adjacent picture frame pairs of the video clip may be obtained. Optionally, the larger the variation width, the faster the picture scene content of the corresponding video clip changes, and the smaller the variation width, the slower it changes. For example, assume that video segment A includes 5 picture frames, namely picture frame 1, picture frame 2, picture frame 3, picture frame 4, and picture frame 5. If the similarity of the picture content between picture frames 1 and 2 is 8, between picture frames 2 and 3 is 15, between picture frames 3 and 4 is 23, and between picture frames 4 and 5 is 40, it can be determined that the picture scene content of the video segment changes quickly. On the other hand, if the similarity of the picture content between picture frames 1 and 2 is 8, between picture frames 2 and 3 is 9, between picture frames 3 and 4 is 11, and between picture frames 4 and 5 is 15, it can be determined that the picture scene content of the video segment changes slowly.
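The scoring described above can be made concrete with a small sketch. The difference score follows the stated rule (target score minus similarity score). The variation width is assumed here to be the range (maximum minus minimum) of the similarity values within the segment, since the text does not define it precisely; both function names are hypothetical.

```python
from typing import List


def difference_scores(similarities: List[float], target_score: float = 100) -> List[float]:
    """Difference score per adjacent frame pair: target score minus similarity."""
    return [target_score - s for s in similarities]


def variation_width(similarities: List[float]) -> float:
    """Assumed definition: spread of the similarity values within the segment."""
    return max(similarities) - min(similarities)


fast_changing = [8, 15, 23, 40]   # similarities from the first example
slow_changing = [8, 9, 11, 15]    # similarities from the second example

width_fast = variation_width(fast_changing)
width_slow = variation_width(slow_changing)
scores = difference_scores(fast_changing)
```

Under this assumed definition the first example has the larger width, consistent with the conclusion that its scene content changes faster.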
As another implementation, how fast the motion vectors of the pixels of adjacent picture frames change can be obtained, and how fast the picture scene content changes can then be determined from it. Optionally, if the motion vectors of the pixels of adjacent picture frames change faster, it may be determined that the degree of difference between the adjacent picture frames is larger, indicating that the picture scene content changes faster between those frames; if the motion vectors change more slowly, it may be determined that the degree of difference is smaller, indicating that the picture scene content changes more slowly.
Optionally, the more color types the picture frames in a video clip contain, the higher the content complexity of the video clip; the fewer color types they contain, the lower the content complexity. Likewise, the more the picture color changes between adjacent picture frames in a video clip, the higher the content complexity of the video clip; the less the picture color changes between adjacent picture frames, the lower the content complexity.
Step S232: Determine the quality thresholds respectively corresponding to the video segments based on the content complexity of each video segment.
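One possible way to quantify how fast motion vectors change is sketched below. It assumes each picture frame is summarized by a single average motion vector `(dx, dy)` and measures the mean Euclidean distance between consecutive vectors; this summary and the function name are assumptions, not something the embodiment specifies.

```python
import math
from typing import List, Tuple


def motion_change_rate(vectors: List[Tuple[float, float]]) -> float:
    """Mean Euclidean distance between the motion vectors of consecutive frames."""
    steps = [math.dist(a, b) for a, b in zip(vectors, vectors[1:])]
    return sum(steps) / len(steps) if steps else 0.0


# Hypothetical per-frame average motion vectors for two segments.
busy_segment = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)]
calm_segment = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]

busy_rate = motion_change_rate(busy_segment)
calm_rate = motion_change_rate(calm_segment)
```

A larger rate would then be read as a larger degree of difference between adjacent frames.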
In this embodiment, the higher the content complexity, the larger the quality threshold corresponding to the video segment is. As one approach, quality thresholds corresponding to respective video segments may be determined based on content complexity of the respective video segments of the plurality of video segments.
For example, in a specific application scenario, please refer to fig. 6, which illustrates a schematic diagram of a determination manner for determining quality thresholds corresponding to video segments according to content complexity of the video segments according to an embodiment of the present application. As shown in fig. 6, it is assumed that the video to be processed includes 4 video segments, i.e., video segment 1, video segment 2, video segment 3, and video segment 4 shown in fig. 6, if the size relationship of the content complexity corresponding to the 4 video segments is: "video segment 3> video segment 2> video segment 1> video segment 4", then it may be determined that video segment 3 corresponds to a quality threshold of 90, video segment 2 corresponds to a quality threshold of 85, video segment 1 corresponds to a quality threshold of 80, and video segment 4 corresponds to a quality threshold of 75.
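The ranking in this example can be reproduced with a short sketch. The complexity scores below are hypothetical values chosen only to give the ordering "video segment 3 > video segment 2 > video segment 1 > video segment 4", and the linear spacing of 5 between thresholds is an assumption read off the example values.

```python
from typing import Dict


def thresholds_by_complexity(
    complexities: Dict[int, float],
    highest: int = 90,   # threshold for the most complex segment (from the example)
    step: int = 5,       # assumed spacing between consecutive ranks
) -> Dict[int, int]:
    """Assign higher quality thresholds to segments with higher content complexity."""
    ranked = sorted(complexities, key=complexities.get, reverse=True)
    return {segment: highest - step * rank for rank, segment in enumerate(ranked)}


# Hypothetical complexity scores keyed by video segment number.
complexities = {1: 0.4, 2: 0.6, 3: 0.9, 4: 0.2}
thresholds = thresholds_by_complexity(complexities)
```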
By determining the quality threshold corresponding to the video segment according to the content complexity, when the difference between the quality index value of the transcoded video segment and the quality threshold is smaller than or equal to the first threshold, the video quality of the video segment with more complex content can be maintained at a higher level, the watching experience of a user is improved, the video quality of the video segment with less complex content can be maintained at a lower level, so that the code rate of the video segment can be reduced, and the storage space occupied by the video segment can be reduced.
As another implementation manner, please refer to fig. 7, which illustrates a flowchart of another method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present application. As shown in fig. 7, another way to determine the quality thresholds corresponding to the plurality of video segments may include the following steps:
Step S233: Acquire the segment position of each video segment in the video to be transcoded.
In this embodiment, the quality feature corresponding to the video segment may be the segment position of the video segment. The segment position may be understood as the relative position of the video segment among the plurality of video segments; optionally, the plurality of video segments may be arranged from front to back in the order of the playing time of the video to be processed. It is to be understood that each video segment includes a plurality of frames of video images. Optionally, the segment position may be determined by the time (e.g., the shooting time) of the first frame of video image of each video segment, or by the time of the last frame of video image of each video segment. Optionally, the intervals between the start frames of adjacent video segments used for determining the segment positions may be equal.
Step S234: Determine the quality thresholds respectively corresponding to the video segments based on the segment position of each video segment in the video to be transcoded.
In this embodiment, the more the video segment is centered in the segment position, the larger the corresponding quality threshold may be. It should be noted that the central position in this embodiment is the middle position of the playing timing sequence of the video to be processed. For example, assuming that the playing time length of the video to be processed is 12 minutes, the position of the video segment corresponding to the playing time length of the video to be processed from 4 th minute to 8 th minute may be taken as the center position of the video to be processed. As one way, the quality thresholds corresponding to the video segments may be determined based on the segment positions of the video segments in the video to be transcoded. Optionally, in order to quickly confirm the segment position of the video segment, segment region division may be performed on the plurality of video segments, so that the segment position of the video segment may be quickly confirmed according to the position of the segment region to which the video segment belongs.
Wherein each segment region may include at least one video segment. For example, assuming that the video to be processed includes 10 video segments, i.e., video segment 1, video segment 2, video segment 3, video segment 4, video segment 5, video segment 6, video segment 7, video segment 8, video segment 9, and video segment 10, if the segment region of the video to be processed is divided into three segments, i.e., segment region 1, segment region 2, and segment region 3, the segment region 1 may include video segment 1, video segment 2, video segment 3, and video segment 4, the segment region 2 may include video segment 5, video segment 6, video segment 7, and video segment 8, and the segment region 3 may include video segment 9 and video segment 10. Optionally, the foregoing division is merely described as an example, and the number of video segment regions included in each to-be-processed video and the number of video segments included in each video segment region may be set according to an actual situation in actual implementation, which is not limited herein.
For example, in a specific application scenario, please refer to fig. 8, which illustrates a schematic diagram of a determination manner of determining quality thresholds corresponding to a plurality of video segments according to segment positions corresponding to the video segments according to an embodiment of the present application. As shown in fig. 8, it is assumed that the video to be processed includes 3 video segment regions, namely, a segment region 13, a segment region 14 (the segment positions are relatively centered), and a segment region 15, and it is assumed that the segment region 13 includes two video segments, the segment region 14 includes several video segments, and the segment region 15 includes two video segments. Optionally, the server may divide the video segment regions according to the video content of the video to be processed, for example, for some movies with a long preamble or a relatively short beginning and a relatively short ending, the first segment region may include more video segments than the last segment region.
Optionally, the quality threshold corresponding to a video segment whose segment position is relatively centered may be set higher than the quality thresholds corresponding to the other video segments (i.e., the video segments whose segment positions are not centered). As shown in fig. 8, the quality thresholds corresponding to the video segments included in segment region 14 are all greater than the quality thresholds corresponding to the video segments included in segment region 13 and segment region 15. Optionally, the quality thresholds corresponding to different video segments belonging to the central segment region may be the same or different. For example, in segment region 14 shown in fig. 8, the quality thresholds corresponding to the two rightmost video segments are both 90 (i.e., the same), while the quality thresholds corresponding to the two leftmost video segments are 85 and 90 (i.e., different). Similarly, the quality thresholds corresponding to different video segments belonging to a non-centered segment region may be the same or different. For example, the quality thresholds corresponding to the two video segments in segment region 13 shown in fig. 8 are both 80 (i.e., the same), while the quality thresholds corresponding to the two video segments in segment region 15 are 80 and 75 (i.e., different).
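A position-based threshold can be sketched as follows. The sketch assumes that "central" means the segment's midpoint falls in the middle third of the playing time, which matches the earlier example of minutes 4 to 8 in a 12-minute video; the threshold values 90 and 80 and the function names are illustrative assumptions.

```python
def is_central(start_s: float, end_s: float, duration_s: float) -> bool:
    """True if the segment midpoint lies in the middle third of the video (assumed rule)."""
    midpoint = (start_s + end_s) / 2
    return duration_s / 3 <= midpoint <= 2 * duration_s / 3


def threshold_by_position(
    start_s: float,
    end_s: float,
    duration_s: float,
    central: int = 90,   # assumed threshold for centered segments
    other: int = 80,     # assumed threshold for non-centered segments
) -> int:
    """Give centered segments a higher quality threshold than the rest."""
    return central if is_central(start_s, end_s, duration_s) else other


duration = 12 * 60                                       # a 12-minute video, in seconds
opening = threshold_by_position(0, 60, duration)         # first minute
middle = threshold_by_position(300, 360, duration)       # minute 5 to minute 6
ending = threshold_by_position(660, 720, duration)       # last minute
```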
As another implementation, please refer to fig. 9, which illustrates a flowchart of another method for determining quality thresholds corresponding to a plurality of video segments according to an embodiment of the present application. As shown in fig. 9, yet another way of determining quality thresholds corresponding to a plurality of video segments may include the steps of:
Step S235: Input the plurality of video segments into a machine learning model.
In order to improve the accuracy of video transcoding, as one way, the plurality of video segments may be input into a machine learning model that uses deep learning. Optionally, the machine learning model may be used to predict the quality threshold corresponding to a video segment, and may be obtained in advance by training on a large number of video segments.
Step S236: Acquire the quality thresholds that are output by the machine learning model and respectively correspond to the plurality of video segments.
In this embodiment, the quality thresholds output by the machine learning model and corresponding to the plurality of video segments can be obtained. By acquiring the quality thresholds corresponding to the video segments by means of the machine learning model, the calculation pressure of the server can be reduced and the transcoding efficiency improved. Meanwhile, the accuracy of video transcoding can be improved.
It should be noted that, during the actual operation of determining the quality threshold corresponding to each of the plurality of video segments, the quality thresholds corresponding to the plurality of video segments may be determined in combination with at least two of the above embodiments, for example, the quality thresholds corresponding to the plurality of video segments may be determined in combination with machine learning and content complexity, or the quality thresholds corresponding to the plurality of video segments may be determined in combination with machine learning and segment position, or the quality thresholds corresponding to the plurality of video clips may be determined in combination with the content complexity and the clip position (the order of determination of both may be unlimited), or the quality thresholds corresponding to the multiple video segments may be determined jointly by combining machine learning, content complexity, and segment positions, and a specific implementation process is not repeated here, and all quality threshold determination schemes including the above embodiments are within the scope of the present application.
Step S240: if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, adjusting the target code rate corresponding to the video segment with the difference value larger than the first threshold value, and performing transcoding processing on the video segment based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
In this embodiment, if there is a video segment with a difference between the quality index value and the quality threshold being greater than the first threshold among the plurality of video segments, the target bitrate corresponding to the video segment with the difference being greater than the first threshold may be adjusted, and the transcoding process may be performed on the corresponding video segment again based on the adjusted target bitrate until the difference between the quality index value and the quality threshold is less than or equal to the first threshold. For the specific rate adjustment principle and the transcoding process, reference may be made to the description in the foregoing embodiments, and details are not described here.
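Selecting which segments need another transcoding pass can be sketched as below. The difference is assumed to be an absolute difference, and the quality values and per-segment thresholds are hypothetical numbers used only for illustration.

```python
from typing import Dict, List


def segments_to_adjust(
    quality_values: Dict[int, float],
    quality_thresholds: Dict[int, float],
    first_threshold: float,
) -> List[int]:
    """Return segments whose quality differs from their own threshold by more than first_threshold."""
    return [
        segment
        for segment, quality in quality_values.items()
        if abs(quality - quality_thresholds[segment]) > first_threshold
    ]


# Hypothetical measurements after the first transcoding pass.
quality_values = {1: 79.6, 2: 88.0, 3: 84.0, 4: 75.2}
quality_thresholds = {1: 80, 2: 85, 3: 90, 4: 75}
pending = segments_to_adjust(quality_values, quality_thresholds, first_threshold=1.0)
```

Only the returned segments would have their target bitrates adjusted and be transcoded again; segments already within the first threshold are left untouched.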
The video data processing method provided by this embodiment implements transcoding processing on a plurality of video segments of a video to be processed, based on target code rates that are lower than initial code rates of the video segments of the video to be processed, and when a difference between a quality index value corresponding to each of the transcoded video segments and a corresponding quality threshold is greater than a first threshold, adjusts the target code rate corresponding to the video segment whose difference is greater than the first threshold, and transcodes the corresponding video segment again based on the adjusted target code rate until the difference is less than or equal to the first threshold, so that the target code rate corresponding to the video segment can be adjusted on the premise of ensuring the video quality of the video segment, thereby implementing code rate saving and network bandwidth saving. Meanwhile, the quality threshold corresponding to each video clip is determined based on the quality characteristics such as content complexity, clip position and the like corresponding to each video clip, and the reliability and accuracy of the quality threshold are improved.
Referring to fig. 10, a flowchart of a video data processing method according to another embodiment of the present application is shown, where the present embodiment provides a video data processing method applicable to a server, and the method includes:
Step S310: Acquire the video to be processed and the target bitrates respectively corresponding to a plurality of video segments.
Step S320: Call a plurality of threads to transcode the plurality of video segments respectively based on their corresponding target bitrates.
Optionally, in order to facilitate improving transcoding efficiency, in this embodiment, multiple threads may be called to perform transcoding processing on multiple video segments respectively based on respective target code rates corresponding to the multiple video segments. The multiple threads may be pre-created threads or threads created in real time. Alternatively, the server may automatically determine the number of calling threads according to its processing performance or execution performance.
It is understood that, if the processing performance or the running performance of the server is weak, the number of threads invoked by the server may be smaller than the number of video segments; in this case, the threads cannot be assigned to the video segments one to one. As one implementation, if the number of threads is smaller than the number of video segments, the video segments that meet a specified condition may be acquired from the plurality of video segments as target video segments (there may be more than one). The plurality of threads are then called to transcode each target video segment based on its corresponding target bitrate. After transcoding of every target video segment is completed, the plurality of threads are called to transcode the remaining video segments, each based on its corresponding target bitrate.
Wherein the specified condition may include: the content complexity of the video segment is higher than the second threshold (alternatively, the second threshold in this embodiment may be understood as a complexity threshold); or the video clip is a clip at the middle position of the video to be processed, and the middle position is the middle position of the playing time sequence of the video to be processed. Optionally, the specific value of the second threshold may not be limited, for example, the second threshold may be 80, 85, or 90.
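The two-stage scheduling described above can be sketched with a thread pool. Here `transcode` and `is_priority` are hypothetical callbacks (the actual transcoder and the specified condition, respectively), and the use of `concurrent.futures.ThreadPoolExecutor` is one possible choice rather than something the embodiment prescribes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, List


def transcode_in_two_batches(
    segments: List[Hashable],
    targets: Dict[Hashable, float],
    transcode: Callable[[Hashable, float], object],   # hypothetical transcoder
    is_priority: Callable[[Hashable], bool],          # hypothetical specified condition
    num_threads: int,
) -> Dict[Hashable, object]:
    """Transcode segments meeting the condition first, then all remaining segments."""
    priority = [s for s in segments if is_priority(s)]
    remaining = [s for s in segments if not is_priority(s)]
    results: Dict[Hashable, object] = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for batch in (priority, remaining):
            # Consuming the map iterator waits for the whole batch to finish,
            # so the remaining segments start only after the target segments.
            outputs = list(pool.map(lambda s: transcode(s, targets[s]), batch))
            results.update(zip(batch, outputs))
    return results
```

A caller would pass, for example, `is_priority=lambda s: complexity[s] > second_threshold` to reproduce the first specified condition.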
It should be noted that, when the target video segment is selected in the transcoding processing stage, if the specified condition is that the content complexity of the video segment is higher than the second threshold, in this case, when the quality threshold corresponding to the video segment is determined in the subsequent step, as an implementation manner, the quality threshold corresponding to the video segment may be determined quickly directly according to the determination result of whether the content complexity of the video segment is higher than the second threshold, for example, if it is determined that the content complexity of the video segment is higher than the second threshold, the quality threshold corresponding to the video segment may be set to be larger; if the content complexity of the video segment is determined to be lower than the second threshold, the quality threshold corresponding to the video segment may be set to be smaller. Wherein the higher the content complexity, the higher the quality threshold corresponding to the video segment may be.
As another embodiment, when the content complexity of the video segment is determined to be higher than the second threshold, the position of the video segment may further be obtained. If the content complexity is higher than the second threshold and the video segment is located at the middle position of the video to be processed, the quality threshold corresponding to the video segment may be set relatively large; if the content complexity is higher than the second threshold but the video segment is not located at the middle position of the video to be processed, the quality threshold corresponding to the video segment may be set relatively small.
By further determining, once the content complexity of the video segment is found to be higher than the second threshold, whether the video segment is located at the middle position of the video to be processed, and then determining the quality threshold corresponding to the video segment from that result, the core picture content of the video to be processed can keep a higher playback quality. This improves the accuracy of the quality threshold setting and the user's viewing experience.
Similarly, if the specified condition is that the video segment is located at the middle position of the video to be processed, then when the quality threshold corresponding to the video segment is subsequently determined, as one implementation, the quality threshold may be determined quickly and directly from the result of determining whether the video segment is located at the middle position of the video to be processed. For example, if the video segment is determined to be located at the middle position of the video to be processed, the quality threshold corresponding to the video segment may be set relatively large; if the video segment is determined not to be located at the middle position of the video to be processed, the quality threshold corresponding to the video segment may be set relatively small.
As another embodiment, when the video segment is determined to be located at the middle position of the video to be processed, the content complexity of the video segment may further be obtained. If the video segment is located at the middle position of the video to be processed and its content complexity is higher than the second threshold, the quality threshold corresponding to the video segment may be set relatively large; if the video segment is located at the middle position of the video to be processed and its content complexity is lower than the second threshold, the quality threshold corresponding to the video segment may be set relatively small.
By further acquiring the content complexity of a video segment once it is determined to be located at the middle position of the video to be processed, a video segment with higher content complexity in the middle of the playback timeline can be played at a higher bitrate, which improves playback quality and the user's viewing experience, while a video segment with relatively lower content complexity in the middle of the playback timeline is played at a relatively lower bitrate, which saves bitrate and reduces wasted bandwidth.
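The combined rule described above (a larger quality threshold only when a segment is both complex and located in the middle of the playback timeline) can be sketched as follows. This is a minimal illustration, not the method of the application: the `Segment` type, the reading of "middle position" as the middle third of the timeline, and the numeric threshold values 93.0 and 85.0 are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    index: int         # position in playback order, 0-based
    complexity: float  # hypothetical content-complexity score


def is_middle_segment(index: int, total: int) -> bool:
    """Assumption: the 'middle position' is the middle third of the timeline."""
    return total // 3 <= index < total - total // 3


def pick_quality_threshold(seg: Segment, total: int, second_threshold: float,
                           high: float = 93.0, low: float = 85.0) -> float:
    """Return the larger threshold only for complex segments in the middle."""
    is_complex = seg.complexity > second_threshold
    if is_complex and is_middle_segment(seg.index, total):
        return high
    return low
```

For a nine-segment video, a complex segment at index 4 would receive the larger threshold, while the same segment at index 0 would receive the smaller one.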
Step S330: acquiring quality index values respectively corresponding to the plurality of transcoded video segments.
Step S340: if the difference between the quality index value and the quality threshold is greater than a first threshold, adjusting the target bitrate corresponding to the video segment whose difference is greater than the first threshold, and transcoding that video segment based on the adjusted target bitrate until the difference is less than or equal to the first threshold.
The video data processing method provided by this embodiment implements transcoding processing on a plurality of video segments of a video to be processed, based on target code rates that are lower than initial code rates of the video segments of the video to be processed, and when a difference between a quality index value corresponding to each of the transcoded video segments and a corresponding quality threshold is greater than a first threshold, adjusts the target code rate corresponding to the video segment whose difference is greater than the first threshold, and transcodes the corresponding video segment again based on the adjusted target code rate until the difference is less than or equal to the first threshold, so that the target code rate corresponding to the video segment can be adjusted on the premise of ensuring the video quality of the video segment, thereby implementing code rate saving and network bandwidth saving. The multiple video clips are transcoded respectively by calling the multiple threads based on the target code rates corresponding to the multiple video clips, and transcoding efficiency is improved.
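The multi-threaded transcoding mentioned above can be sketched with a thread pool. This is a minimal illustration under stated assumptions: `transcode_one` is a hypothetical callable supplied by the caller (for example, one that launches an external encoder), and `max_workers=4` is an arbitrary default; neither comes from the application.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode_all(segments, target_bitrates, transcode_one, max_workers=4):
    """Call transcode_one(segment, bitrate) for each segment on a thread pool.

    Results are returned in the same order as the input segments.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcode_one, segments, target_bitrates))
```

Because `pool.map` preserves input order, the returned list can be paired with the segments directly when the quality index values are later compared against the quality thresholds.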
Referring to fig. 11, a flowchart of a video data processing method according to still another embodiment of the present application is shown, where the present embodiment provides a video data processing method applicable to a server, and the method includes:
Step S410: acquiring the video to be processed and the target bitrates respectively corresponding to the plurality of video segments.
Step S420: transcoding the plurality of video segments respectively based on their corresponding target bitrates.
Step S430: acquiring quality index values respectively corresponding to the plurality of transcoded video segments.
Step S440: if the difference between the quality index value and the quality threshold is greater than a first threshold, adjusting the target bitrate corresponding to the video segment whose difference is greater than the first threshold, and transcoding that video segment based on the adjusted target bitrate until the difference is less than or equal to the first threshold.
Step S450: judging whether, for each of the plurality of video segments, the difference between its quality index value and its quality threshold is less than or equal to the first threshold.
Optionally, in order to ensure that the transcoded video to be processed can be played normally, it may be determined whether the difference between the quality index value corresponding to each of the plurality of video segments and the quality threshold corresponding to each of the plurality of video segments is smaller than or equal to the first threshold, so as to determine whether to start video splicing according to the determination result.
Optionally, to facilitate generating adaptive media stream content based on adaptive bitrate streaming protocols such as DASH (Dynamic Adaptive Streaming over HTTP) and HLS (HTTP Live Streaming), video segments whose difference between the transcoded quality index value and the quality threshold is greater than the first threshold may be stored, and video segments whose difference between the transcoded quality index value and the quality threshold is less than or equal to the first threshold may also be stored, so as to reduce secondary development work and improve development efficiency.
Optionally, in some embodiments, the quality threshold in this embodiment may also be a threshold determined based on other video quality evaluation criteria. Other video quality evaluation criteria may include a Peak Signal to Noise Ratio (PSNR) method, a Structural SIMilarity (SSIM) method, and the like. Or at least two of the VMAF, PSNR, and SSIM methods may be used in combination to jointly confirm the quality threshold corresponding to the video segment. For example, when determining the quality threshold corresponding to a video segment by the VMAF method, the PSNR value and/or the SSIM value of the video segment may be calculated at the same time, and then the quality threshold corresponding to the video segment may be determined according to the calculation result.
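As an illustration of one of the alternative metrics named above, PSNR can be computed from the mean squared error between reference and transcoded pixel values using the standard definition PSNR = 10 * log10(MAX^2 / MSE). The sketch below operates on flat sequences of pixel values and assumes 8-bit samples (`max_value=255`); how such a value would be combined with VMAF or SSIM to fix a quality threshold is not specified in the application and is not shown here.

```python
import math


def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally sized, non-empty sequences of pixel values."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```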
Step S460: splicing the plurality of video segments to obtain a target video.
In one approach, if, for each of the plurality of video segments, the difference between its quality index value and its quality threshold is less than or equal to the first threshold, splicing of the plurality of video segments may be started to obtain the target video. In this embodiment, the bitrate of the target video is lower than the initial bitrate of the video to be processed. Visually, the video quality of the target video is comparable to the video quality of the video to be processed.
In another approach, if there is a video segment whose difference between the quality index value and the quality threshold is greater than the first threshold, splicing may be started only after the difference for every video segment is less than or equal to the first threshold.
Optionally, in this embodiment, if the difference between the quality index value corresponding to a transcoded video segment and the quality threshold corresponding to that video segment is less than or equal to the first threshold, the corresponding thread may be ended, or the corresponding thread may be instructed to start transcoding another video segment whose difference is still greater than the first threshold.
Optionally, in some embodiments, if the difference between the quality index value corresponding to a video segment and the quality threshold corresponding to that video segment is still greater than the first threshold after multiple rounds of transcoding, the quality threshold corresponding to the video segment may be adjusted. The specific adjustment value may be selected according to the actual situation and is not illustrated here.
The video data processing method provided by this embodiment implements transcoding processing on a plurality of video segments of a video to be processed, based on target code rates that are lower than initial code rates of the video segments of the video to be processed, and when a difference between a quality index value corresponding to each of the transcoded video segments and a corresponding quality threshold is greater than a first threshold, adjusts the target code rate corresponding to the video segment whose difference is greater than the first threshold, and transcodes the corresponding video segment again based on the adjusted target code rate until the difference is less than or equal to the first threshold, so that the target code rate corresponding to the video segment can be adjusted on the premise of ensuring the video quality of the video segment, thereby implementing code rate saving and network bandwidth saving. By splicing a plurality of video segments of which the difference value between the quality index value and the quality threshold value is smaller than or equal to the first threshold value, the whole storage space of the video to be processed can be reduced, and the storage cost is further reduced.
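The gating of the splicing step on every segment having converged can be sketched as follows. This is an illustration only: the difference is read here as an absolute difference (an assumption, consistent with the bitrate being adjusted in either direction), and byte concatenation merely stands in for real container-level splicing, which would normally be done by a muxing tool.

```python
def splice_if_ready(segments, first_threshold):
    """Splice only when every segment is within tolerance of its threshold.

    segments: list of (payload_bytes, quality_value, quality_threshold)
    tuples in playback order. Returns (spliced_bytes, []) when ready,
    otherwise (None, indices_of_segments_still_out_of_tolerance).
    """
    pending = [i for i, (_, q, t) in enumerate(segments)
               if abs(q - t) > first_threshold]
    if pending:
        return None, pending
    return b"".join(payload for payload, _, _ in segments), []
```

The list of pending indices identifies which segments still need another round of bitrate adjustment before splicing can start.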
Referring to fig. 12, a block diagram of a video data processing apparatus according to an embodiment of the present disclosure is shown, in which the video data processing apparatus 500 may operate in a server, and the apparatus 500 includes: the first obtaining module 510, the first processing module 520, the second obtaining module 530, and the second processing module 540:
a first obtaining module 510, configured to obtain a video to be processed and a target bitrate corresponding to the video to be processed, where the target bitrate is lower than an initial bitrate of the video to be processed.
A first processing module 520, configured to perform transcoding processing on the video to be processed based on the target bitrate.
A second obtaining module 530, configured to obtain a quality index value corresponding to the transcoded video to be processed.
A second processing module 540, configured to, if the difference between the quality index value and the quality threshold is greater than a first threshold, adjust the target bitrate, and perform, based on the adjusted target bitrate, the transcoding processing on the video to be processed until the difference is less than or equal to the first threshold.
In an embodiment, the second processing module 540 may be configured to decrease the target bitrate if the difference is greater than the first threshold and the quality index value is greater than the quality threshold. In this way, the video to be processed may be transcoded based on the decreased target bitrate until the difference is less than or equal to the first threshold.
In another embodiment, the second processing module 540 may be configured to increase the target bitrate if the difference is greater than the first threshold and the quality index value is less than the quality threshold. In this way, the video to be processed may be transcoded based on the increased target bitrate until the difference is less than or equal to the first threshold.
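The adjust-and-retranscode behaviour of the second processing module can be sketched as a loop. This is a minimal sketch under stated assumptions, not the implementation of the application: `transcode_and_score` is a hypothetical callable that transcodes at a given bitrate and returns the quality index value, the multiplicative `step`, the step halving on overshoot, and the `max_rounds` cap are all choices introduced here, and `toy_quality` is a toy monotone model used only to make the example runnable.

```python
def transcode_until_converged(target_bitrate, quality_threshold, first_threshold,
                              transcode_and_score, step=0.1, max_rounds=50):
    """Adjust the bitrate until |quality - quality_threshold| <= first_threshold."""
    bitrate = float(target_bitrate)
    last_direction = 0
    for _ in range(max_rounds):
        quality = transcode_and_score(bitrate)
        diff = quality - quality_threshold
        if abs(diff) <= first_threshold:
            return bitrate, quality
        # Quality above the threshold: lower the bitrate; below: raise it.
        direction = -1 if diff > 0 else 1
        if last_direction and direction != last_direction:
            step /= 2  # overshoot: shrink the step (assumption, not from the source)
        bitrate *= 1 + direction * step
        last_direction = direction
    raise RuntimeError("quality did not converge within max_rounds")


def toy_quality(bitrate):
    """Toy stand-in for a real encoder plus quality metric (higher bitrate, higher score)."""
    return 100.0 * bitrate / (bitrate + 500.0)
```

Starting below the bitrate needed for the quality threshold, the loop raises the bitrate; starting above it, the loop lowers the bitrate, which is where the bitrate saving comes from.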
Optionally, the apparatus 500 may further include a video segmentation module, which may be configured to segment the video to be transcoded to obtain a video to be processed including a plurality of video segments. In this case, the first obtaining module 510 may be configured to obtain target bitrate corresponding to each of the plurality of video segments; the first processing module 520 may be configured to transcode the plurality of video segments based on the respective corresponding target bitrate; the second obtaining module 530 may be configured to obtain quality indicator values corresponding to the transcoded multiple video segments respectively; the second processing module 540 may be configured to, if the difference between the quality index value and the quality threshold is greater than a first threshold, adjust a target bitrate corresponding to the video segment whose difference is greater than the first threshold, and perform the transcoding processing on the video segment based on the adjusted target bitrate until the difference is less than or equal to the first threshold.
Optionally, the apparatus 500 may further include a threshold setting module. When the video to be processed includes a plurality of video segments, the threshold setting module operates before the target bitrate is adjusted in response to the difference between the quality index value and the quality threshold being greater than the first threshold. As one implementation, the threshold setting module may be configured to obtain the content complexity of each video segment, and to determine the quality thresholds respectively corresponding to the video segments based on the content complexity of each video segment, where a video segment with higher content complexity has a larger quality threshold.
As another implementation manner, the threshold setting module may be configured to obtain a segment position of each video segment in the video to be transcoded; and determining quality threshold values respectively corresponding to the video segments based on the segment positions of the video segments in the video to be transcoded.
As yet another implementation, the threshold setting module may be configured to input the plurality of video segments into a machine learning model, the machine learning model being configured to predict a quality threshold corresponding to a video segment; and acquiring quality thresholds which are output by the machine learning model and respectively correspond to the plurality of video segments.
As an embodiment, the first processing module 520 may be configured to invoke a plurality of threads to respectively transcode the plurality of video segments based on their corresponding target bitrates. If the number of threads is smaller than the number of video segments, the video segments that meet a specified condition may first be acquired as target video segments; the plurality of threads are then invoked to transcode the target video segments based on their corresponding target bitrates; and the plurality of threads are then invoked to transcode the remaining video segments, other than the target video segments, based on the target bitrates corresponding to those remaining video segments. Optionally, the specified condition in this embodiment may include: the content complexity of the video segment is higher than a second threshold; or the video segment is located at the middle position of the video to be processed, the middle position being the middle of the playback timeline of the video to be processed.
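The ordering used when there are fewer threads than segments can be sketched as follows: segments meeting the specified condition are queued first and the remaining segments afterwards. This is an illustration only; the `(index, complexity)` tuple layout and the reading of "middle position" as the middle third of the timeline are assumptions introduced here.

```python
def prioritize(segments, num_threads, second_threshold):
    """Order segments so that target segments are transcoded first.

    segments: list of (index, complexity) tuples in playback order.
    If there are enough threads for every segment, the order is unchanged.
    """
    total = len(segments)
    if num_threads >= total:
        return list(segments)

    def is_target(seg):
        index, complexity = seg
        # Assumption: 'middle position' means the middle third of the timeline.
        in_middle = total // 3 <= index < total - total // 3
        return complexity > second_threshold or in_middle

    targets = [s for s in segments if is_target(s)]
    others = [s for s in segments if not is_target(s)]
    return targets + others
```

Submitting the returned list to a thread pool in order would let the available threads work on the target segments before the remaining ones.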
Optionally, the apparatus 500 may further include a splicing module. After the target bitrate corresponding to each video segment whose difference is greater than the first threshold has been adjusted and that video segment has been transcoded based on the adjusted target bitrate until the difference is less than or equal to the first threshold, the splicing module may be configured to splice the plurality of video segments to obtain the target video if, for every one of the plurality of video segments, the difference between its quality index value and its quality threshold is less than or equal to the first threshold. Optionally, in this embodiment, the bitrate of the target video is lower than the initial bitrate of the video to be processed.
Optionally, the apparatus 500 may further include a storage module, configured to store the video segment of which the difference between the quality index value and the quality threshold value is greater than a first threshold value after transcoding; and storing the video segment of which the difference value between the quality index value and the quality threshold value is smaller than or equal to the first threshold value after transcoding.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Referring to fig. 13, based on the video data processing method and apparatus, an embodiment of the present application further provides an electronic device 100 capable of executing the video data processing method. The electronic device 100 may be the server 12 described in fig. 1. The electronic device 100 includes a memory 102 and one or more processors 104 (only one shown) that are communicatively coupled to each other. The memory 102 stores a program that can execute the contents of the foregoing embodiments, and the processor 104 can execute the program stored in the memory 102.
The processor 104 may include one or more processing cores. The processor 104 connects the various components of the electronic device 100 through various interfaces and circuitry, and performs the various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 102 and by invoking data stored in the memory 102. Alternatively, the processor 104 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 104 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU is used for rendering and drawing display content; and the modem is used to handle wireless communications. It is understood that the modem may also not be integrated into the processor 104 and may instead be implemented by a separate communication chip.
The memory 102 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 102 may be used to store instructions, programs, code sets, or instruction sets. The memory 102 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing embodiments, and the like. The data storage area may store data created by the electronic device 100 during use (e.g., phone book, audio-video data, chat log data), and the like.
Referring to fig. 14, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 600 stores program code that can be called by a processor to execute the method described in the above method embodiments.
The computer-readable storage medium 600 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 600 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 600 has storage space for program code 610 for performing any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 610 may, for example, be compressed in a suitable form.
To sum up, according to the video data processing method, the video data processing device, the electronic device, and the storage medium provided by the embodiments of the present application, a video to be processed and a target bitrate corresponding to the video to be processed are obtained, then transcoding processing is performed on the video to be processed based on the target bitrate, then a quality index value corresponding to the video to be processed after transcoding processing is obtained, then, if a difference between the quality index value and a quality threshold is greater than a first threshold, the target bitrate is adjusted, and transcoding processing is performed on the video to be processed based on the adjusted target bitrate until the difference is less than or equal to the first threshold. Therefore, transcoding processing is performed on the video to be processed based on the target code rate lower than the initial code rate of the video to be processed, when the difference value between the quality index value corresponding to the video to be processed after transcoding processing and the quality threshold value is larger than the first threshold value, the target code rate is adjusted, transcoding processing is performed on the video to be processed again based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value, so that the target code rate can be adjusted on the premise of ensuring the video quality of the video to be processed, code rate saving is achieved, and network bandwidth is saved.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A method of video data processing, the method comprising:
acquiring a video to be processed and a target code rate corresponding to the video to be processed, wherein the target code rate is lower than an initial code rate of the video to be processed;
transcoding the video to be processed based on the target code rate;
acquiring a quality index value corresponding to the video to be processed after transcoding processing;
and if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, adjusting the target code rate, and performing transcoding processing on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
2. The method of claim 1, wherein the adjusting the target bitrate if the difference between the quality indicator value and the quality threshold is greater than a first threshold comprises:
if the difference value is larger than a first threshold value and the quality index value is larger than the quality threshold value, reducing the target code rate;
the transcoding the to-be-processed video based on the adjusted target bitrate until the difference is less than or equal to the first threshold includes:
and performing transcoding processing on the video to be processed based on the reduced target code rate until the difference value is smaller than or equal to the first threshold value.
3. The method of claim 1, wherein the adjusting the target bitrate if the difference between the quality indicator value and the quality threshold is greater than a first threshold comprises:
if the difference value is larger than a first threshold value and the quality index value is smaller than the quality threshold value, increasing the target code rate;
the transcoding the to-be-processed video based on the adjusted target bitrate until the difference is less than or equal to the first threshold includes:
and performing transcoding processing on the video to be processed based on the increased target code rate until the difference value is smaller than or equal to the first threshold value.
4. The method of claim 1, wherein the to-be-processed video includes a plurality of video segments, the video segments are obtained by splitting based on a to-be-transcoded video, the obtaining a target bitrate corresponding to the to-be-processed video, transcoding the to-be-processed video based on the target bitrate, obtaining a quality index value corresponding to the to-be-processed video after transcoding, if a difference between the quality index value and a quality threshold is greater than a first threshold, adjusting the target bitrate, and transcoding the to-be-processed video based on the adjusted target bitrate until the difference is less than or equal to the first threshold, includes:
acquiring target code rates corresponding to the plurality of video clips respectively;
respectively transcoding the plurality of video segments based on the respective corresponding target code rates;
acquiring quality index values respectively corresponding to the plurality of video segments after transcoding processing;
if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, adjusting the target code rate corresponding to the video segment with the difference value larger than the first threshold value, and performing transcoding processing on the video segment based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
5. The method of claim 4, wherein before adjusting the target bitrate if the difference between the quality indicator value and the quality threshold is greater than a first threshold, the method further comprises:
acquiring the content complexity of each video clip;
and determining quality thresholds respectively corresponding to the video clips based on the content complexity of the video clips, wherein the quality threshold corresponding to the video clip with higher content complexity is larger.
6. The method of claim 4, wherein before adjusting the target bitrate if the difference between the quality indicator value and the quality threshold is greater than a first threshold, the method further comprises:
acquiring the segment position of each video segment in the video to be transcoded;
and determining quality threshold values respectively corresponding to the video segments based on the segment positions of the video segments in the video to be transcoded.
7. The method of claim 4, wherein before adjusting the target bitrate if the difference between the quality indicator value and the quality threshold is greater than a first threshold, the method further comprises:
inputting the plurality of video segments into a machine learning model, the machine learning model for predicting quality thresholds corresponding to the video segments;
and acquiring quality thresholds which are output by the machine learning model and respectively correspond to the plurality of video segments.
8. The method of claim 4, wherein the transcoding the plurality of video segments based on the respective corresponding target bitrate respectively comprises:
and calling a plurality of threads, and transcoding the plurality of video segments respectively based on the respective corresponding target code rates.
9. The method of claim 8, wherein the invoking the plurality of threads to transcode the plurality of video segments based on the respective corresponding target bitrate comprises:
if the number of the threads is smaller than the number of the video clips, acquiring the video clips meeting specified conditions in the video clips as target video clips;
calling the multiple threads to transcode the target video clip based on the respective corresponding target code rates;
and calling the multiple threads to transcode the video clips except the target video clip in the multiple video clips based on the target code rates corresponding to the video clips except the target video clip.
10. The method of claim 9, wherein the specified condition comprises: the content complexity of the video segment is higher than a second threshold; or the video clip is a clip at the middle position of the video to be processed, and the middle position is the middle position of the playing time sequence of the video to be processed.
11. The method of claim 4, wherein if the difference between the quality index value and the quality threshold is greater than a first threshold, adjusting a target bitrate corresponding to a video segment with the difference greater than the first threshold, and performing the transcoding process on the video segment based on the adjusted target bitrate until the difference is less than or equal to the first threshold, further comprising:
and if the difference value between the quality index value corresponding to each video clip in the plurality of video clips and the quality threshold value corresponding to each video clip is smaller than or equal to the first threshold value, splicing the plurality of video clips to obtain the target video.
12. The method of claim 4, further comprising:
storing the video clips of which the difference value between the quality index value and the quality threshold value is larger than a first threshold value after transcoding;
and storing the video segment of which the difference value between the quality index value and the quality threshold value is smaller than or equal to the first threshold value after transcoding.
13. A video data processing apparatus, operating on a server, the apparatus comprising:
the first obtaining module is used for obtaining a video to be processed and a target code rate corresponding to the video to be processed, wherein the target code rate is lower than an initial code rate of the video to be processed;
the first processing module is used for transcoding the video to be processed based on the target code rate;
the second acquisition module is used for acquiring a quality index value corresponding to the video to be processed after transcoding processing;
and the second processing module is used for adjusting the target code rate if the difference value between the quality index value and the quality threshold value is larger than a first threshold value, and carrying out transcoding processing on the video to be processed based on the adjusted target code rate until the difference value is smaller than or equal to the first threshold value.
14. An electronic device comprising one or more processors and memory;
one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-12.
15. A computer-readable storage medium, having program code stored therein, wherein the program code when executed by a processor performs the method of any of claims 1-12.
CN202010997869.3A 2020-09-21 2020-09-21 Video data processing method and device, electronic equipment and storage medium Pending CN111970565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010997869.3A CN111970565A (en) 2020-09-21 2020-09-21 Video data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010997869.3A CN111970565A (en) 2020-09-21 2020-09-21 Video data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111970565A (en) 2020-11-20

Family

ID=73387489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010997869.3A Pending CN111970565A (en) 2020-09-21 2020-09-21 Video data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111970565A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114430501A (en) * 2021-12-28 2022-05-03 上海网达软件股份有限公司 Content adaptive encoding method and system for file transcoding
CN115002520A (en) * 2022-04-14 2022-09-02 百果园技术(新加坡)有限公司 Video stream data processing method, device, equipment and storage medium
CN115361571A (en) * 2022-08-04 2022-11-18 武汉依迅北斗时空技术股份有限公司 Playing method and device of cloud storage video data

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101578875A (en) * 2007-01-04 2009-11-11 英国电讯有限公司 Video signal encoding
CN102137258A (en) * 2011-03-22 2011-07-27 宁波大学 Method for controlling three-dimensional video code rates
CN105263066A (en) * 2014-06-13 2016-01-20 珠海全志科技股份有限公司 Mobile equipment video stream transmission control method and system
US20160212373A1 (en) * 2015-01-16 2016-07-21 Microsoft Technology Licensing, Llc Dynamically updating quality to higher chroma sampling rate
WO2019037471A1 (en) * 2017-08-24 2019-02-28 中兴通讯股份有限公司 Video processing method, video processing device and terminal
CN110225340A (en) * 2019-05-31 2019-09-10 北京猿力未来科技有限公司 Control method and device for video coding, computing equipment and storage medium
CN111107395A (en) * 2019-12-31 2020-05-05 广州市百果园网络科技有限公司 Video transcoding method, device, server and storage medium
CN111263243A (en) * 2020-02-17 2020-06-09 网易(杭州)网络有限公司 Video coding method and device, computer readable medium and electronic equipment
CN111327950A (en) * 2020-03-05 2020-06-23 腾讯科技(深圳)有限公司 Video transcoding method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114430501A (en) * 2021-12-28 2022-05-03 上海网达软件股份有限公司 Content adaptive encoding method and system for file transcoding
CN115002520A (en) * 2022-04-14 2022-09-02 百果园技术(新加坡)有限公司 Video stream data processing method, device, equipment and storage medium
CN115002520B (en) * 2022-04-14 2024-04-02 百果园技术(新加坡)有限公司 Video stream data processing method, device, equipment and storage medium
CN115361571A (en) * 2022-08-04 2022-11-18 武汉依迅北斗时空技术股份有限公司 Playing method and device of cloud storage video data

Similar Documents

Publication Publication Date Title
JP6469788B2 (en) Using quality information for adaptive streaming of media content
EP3562163B1 (en) Audio-video synthesis method and system
CN106791956B (en) Network live broadcast stutter processing method and device
US20220030244A1 (en) Content adaptation for streaming
CN111970565A (en) Video data processing method and device, electronic equipment and storage medium
US20140219634A1 (en) Video preview creation based on environment
US20220232222A1 (en) Video data processing method and apparatus, and storage medium
US20170103577A1 (en) Method and apparatus for optimizing video streaming for virtual reality
CN110662114B (en) Video processing method and device, electronic equipment and storage medium
CN106688239A (en) Video downloading method, apparatus, and system
CN111404882B (en) Media stream processing method and device
CN111093094A (en) Video transcoding method, device and system, electronic equipment and readable storage medium
US20200296470A1 (en) Video playback method, terminal apparatus, and storage medium
CN112584119A (en) Self-adaptive panoramic video transmission method and system based on reinforcement learning
CN112153415B (en) Video transcoding method, device, equipment and storage medium
CN112929712A (en) Video code rate adjusting method and device
EP4152755A1 (en) Methods, systems, and apparatuses for adaptive bitrate ladder construction based on dynamically adjustable neural networks
CN111263243A (en) Video coding method and device, computer readable medium and electronic equipment
CN111083536B (en) Method and device for adjusting video code rate
CN112866746A (en) Multi-path streaming cloud game control method, device, equipment and storage medium
CN111031032A (en) Cloud video transcoding method and device, decoding method and device, and electronic device
CN105323593A (en) Multimedia transcoding scheduling method and multimedia transcoding scheduling device
CN110784731B (en) Data stream transcoding method, device, equipment and medium
CN105338371A (en) Multimedia transcoding scheduling method and apparatus
CN104333765A (en) Processing method and device of video live streams

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201120