CN108174290A - Method and apparatus for processing video - Google Patents

Method and apparatus for processing video

Info

Publication number
CN108174290A
Authority
CN
China
Prior art keywords
video
transcoded
sub-video
complexity
image group
Prior art date
Legal status
Granted
Application number
CN201810073414.5A
Other languages
Chinese (zh)
Other versions
CN108174290B (en)
Inventor
邢怀飞
郭帆
史纯华
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810073414.5A
Publication of CN108174290A
Application granted
Publication of CN108174290B
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display

Abstract

Embodiments of the present application disclose a method and apparatus for processing video. A specific embodiment of the method includes: acquiring a to-be-transcoded video and dividing the to-be-transcoded video into at least two to-be-transcoded sub-videos; and, for each of the at least two to-be-transcoded sub-videos, performing the following steps: determining attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determining a transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video. This embodiment improves the reliability of video processing.

Description

Method and apparatus for processing video
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a method and apparatus for processing video.
Background art
Video transcoding refers to converting a video stream that has been compression-encoded into another video stream, so as to adapt to different network bandwidths, different terminal processing capabilities, or different user requirements. The development of video transcoding technology is closely related to ever-increasing demand and to the digitization of broadcasting; at present, the main application fields of transcoding technology are digital television broadcasting and digital media front-end processing.
Summary of the invention
Embodiments of the present application propose a method and apparatus for processing video.
In a first aspect, an embodiment of the present application provides a method for processing video, the method including: acquiring a to-be-transcoded video and dividing the to-be-transcoded video into at least two to-be-transcoded sub-videos, wherein the to-be-transcoded video includes at least two image groups and each of the at least two to-be-transcoded sub-videos includes at least one image group; and, for each of the at least two to-be-transcoded sub-videos, performing the following steps: determining attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video, wherein the complexity prediction model is used to characterize the correspondence between feature vectors of videos and complexity indexes; determining a transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
In some embodiments, dividing the to-be-transcoded video into at least two to-be-transcoded sub-videos includes: determining attribute features of the image groups included in the to-be-transcoded video; for each image group of the at least two image groups included in the to-be-transcoded video, determining an image group adjacent to the image group as a neighboring image group of the image group and determining the difference between the feature values of the attribute features of the image group and of its neighboring image group; determining whether the determined difference is less than a preset threshold; and, in response to determining that the determined difference is less than the preset threshold, merging the image group and its neighboring image group into one to-be-transcoded sub-video.
In some embodiments, the complexity prediction model is obtained by training as follows: acquiring a sample video and a pre-calibrated complexity index characterizing the complexity of the sample video; determining attribute features of the sample video and generating a feature vector of the sample video based on the determined attribute features; and, using a machine learning method, inputting the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result for the sample video, comparing the complexity index prediction result with the pre-calibrated complexity index characterizing the complexity of the sample video, and adjusting the parameters of the complexity prediction model according to the comparison result until the difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index characterizing the complexity of the sample video is less than a preset difference value.
In some embodiments, the attribute features include at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity.
In some embodiments, the method further includes: merging the generated transcoded sub-videos to generate a target video.
In some embodiments, the method further includes: outputting the target video.
In a second aspect, an embodiment of the present application provides an apparatus for processing video, the apparatus including: a division unit configured to acquire a to-be-transcoded video and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos, wherein the to-be-transcoded video includes at least two image groups and each of the at least two to-be-transcoded sub-videos includes at least one image group; and an execution unit configured to, for each of the at least two to-be-transcoded sub-videos, perform the following steps: determining attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video, wherein the complexity prediction model is used to characterize the correspondence between feature vectors of videos and complexity indexes; determining a transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
In some embodiments, the division unit includes: a determination module configured to determine attribute features of the image groups included in the to-be-transcoded video; and a merging module configured to, for each image group of the at least two image groups included in the to-be-transcoded video, determine an image group adjacent to the image group as a neighboring image group of the image group, determine the difference between the feature values of the attribute features of the image group and of its neighboring image group, determine whether the determined difference is less than a preset threshold, and, in response to determining that the determined difference is less than the preset threshold, merge the image group and its neighboring image group into one to-be-transcoded sub-video.
In some embodiments, the complexity prediction model is obtained by training as follows: acquiring a sample video and a pre-calibrated complexity index characterizing the complexity of the sample video; determining attribute features of the sample video and generating a feature vector of the sample video based on the determined attribute features; and, using a machine learning method, inputting the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result for the sample video, comparing the complexity index prediction result with the pre-calibrated complexity index characterizing the complexity of the sample video, and adjusting the parameters of the complexity prediction model according to the comparison result until the difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index characterizing the complexity of the sample video is less than a preset difference value.
In some embodiments, the attribute features include at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity.
In some embodiments, the apparatus further includes: a merging unit configured to merge the generated transcoded sub-videos to generate a target video.
In some embodiments, the apparatus further includes: an output unit configured to output the target video.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above embodiments of the method for processing video.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the method of any of the above embodiments of the method for processing video.
The method and apparatus for processing video provided by the embodiments of the present application acquire a to-be-transcoded video, divide the to-be-transcoded video into at least two to-be-transcoded sub-videos and, for each of the at least two to-be-transcoded sub-videos, perform the following steps: determining attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determining a transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video. The complexity of each to-be-transcoded sub-video is thus taken into account, a transcoding bitrate is allocated to each to-be-transcoded sub-video according to its complexity, and the to-be-transcoded sub-video is then transcoded accordingly, which improves the reliability of video processing.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for processing video according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the method for processing video according to the present application;
Fig. 4 is a flowchart of another embodiment of the method for processing video according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for processing video according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing a server of the embodiments of the present application.
Detailed description of embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely used to explain the related invention and do not limit the invention. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.
It should be noted that, provided there is no conflict, the embodiments of the present application and the features of the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for processing video or of the apparatus for processing video of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like. Various client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, video players, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, for example a video processing server that processes the videos displayed on the terminal devices 101, 102, 103. The video processing server may analyze and otherwise process received data such as a to-be-transcoded video, and feed the processing result (for example, a transcoded sub-video) back to the terminal devices.
It should be noted that the method for processing video provided by the embodiments of the present application is generally executed by the server 105; correspondingly, the apparatus for processing video is generally disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for processing video according to the present application is shown. The method for processing video includes the following steps:
Step 201: acquire a to-be-transcoded video and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos.
In the present embodiment, an electronic device on which the method for processing video runs (for example, the server shown in Fig. 1) may acquire the to-be-transcoded video through a wired or wireless connection and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos. Here, the to-be-transcoded video may include at least two groups of pictures (GOPs, also referred to below as image groups), and each of the at least two to-be-transcoded sub-videos may include at least one image group. Specifically, the above electronic device may acquire a to-be-transcoded video pre-stored locally, or acquire a to-be-transcoded video sent by a terminal (for example, the terminal devices 101, 102, 103 shown in Fig. 1).
In the present embodiment, the above electronic device may use the image groups included in the to-be-transcoded video as the basis for dividing the to-be-transcoded video. Specifically, as an example, the above electronic device may divide the to-be-transcoded video into at least two to-be-transcoded sub-videos with the image group as the division unit. For example, if the to-be-transcoded video includes three image groups, the above electronic device may, with the image group as the division unit, divide the to-be-transcoded video into three to-be-transcoded sub-videos, each of which includes one image group. It should be noted that the image group is the basic unit of video structure: one image group is one group of consecutive pictures. In practice, MPEG (Moving Picture Experts Group) coding divides pictures (i.e., frames) into three types, I, P, and B: I-frames are intra-coded frames, P-frames are forward-predicted frames, and B-frames are bidirectionally interpolated frames. Here, an image group generally takes an I-frame as its starting image, so the above electronic device may distinguish the image groups included in the to-be-transcoded video by identifying I-frames and divide the to-be-transcoded video with the image group as the division basis. It can be understood that the to-be-transcoded video has a duration, and the at least two image groups included in the to-be-transcoded video may be arranged in chronological order.
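To make the I-frame-based division concrete, the following is a minimal Python sketch (not part of the original disclosure; the `Frame` record with an `is_iframe` flag is a hypothetical stand-in for whatever frame metadata the decoder exposes) that groups a frame sequence into image groups by starting a new group at every I-frame:
```python
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    index: int
    is_iframe: bool  # True for I-frames, False for P/B-frames

def split_into_gops(frames: List[Frame]) -> List[List[Frame]]:
    """Split a frame sequence into image groups (GOPs), each starting at an I-frame."""
    gops: List[List[Frame]] = []
    for frame in frames:
        if frame.is_iframe or not gops:
            gops.append([])          # an I-frame opens a new image group
        gops[-1].append(frame)
    return gops

# Example: I P B P | I B P -> two image groups of 4 and 3 frames
frames = [Frame(0, True), Frame(1, False), Frame(2, False), Frame(3, False),
          Frame(4, True), Frame(5, False), Frame(6, False)]
print([len(g) for g in split_into_gops(frames)])  # [4, 3]
```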
In some optional implementations of the present embodiment, the to-be-transcoded video may be divided into at least two to-be-transcoded sub-videos as follows:
First, the attribute features of the image groups included in the to-be-transcoded video may be determined, where the attribute features of an image group may be used to characterize the complexity of the image group. Specifically, as an example, an attribute feature of an image group may be the average number of bits of the images included in the image group, the number of bits of the I-frame included in the image group, the temporal complexity of the image group, and so on. It should be noted that the number of bits of an image may be used to measure the amount of information contained in the image: the larger the number of bits, the more information the image contains and the more complex the image is. The temporal complexity of an image group may be used to characterize the temporal variation of the image group and may be expressed as a numerical value, a larger value indicating higher temporal complexity. As an example, the temporal complexity of an image group may be expressed by the SAD (Sum of Absolute Differences) information of the images included in the image group.
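As one possible way to compute such a SAD-based temporal-complexity value, the sketch below accumulates the absolute differences between consecutive frames; it assumes the decoded frames of one image group are already available as grayscale NumPy arrays, which is an illustrative assumption rather than part of the original description:
```python
import numpy as np

def gop_temporal_complexity(frames: list[np.ndarray]) -> float:
    """Average SAD between consecutive frames of one image group (grayscale arrays)."""
    if len(frames) < 2:
        return 0.0
    sads = [np.abs(cur.astype(np.int32) - prev.astype(np.int32)).sum()
            for prev, cur in zip(frames, frames[1:])]
    return float(np.mean(sads))

# Example with two tiny synthetic 4x4 "frames"
prev = np.zeros((4, 4), dtype=np.uint8)
cur = np.full((4, 4), 10, dtype=np.uint8)
print(gop_temporal_complexity([prev, cur]))  # 160.0  (16 pixels * |10 - 0|)
```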
Then, for each image group of the at least two image groups included in the to-be-transcoded video, an image group adjacent to the image group may be determined as a neighboring image group of the image group, and the difference between the feature values of the attribute features of the image group and of its neighboring image group may be determined. Here, the feature value of an attribute feature of an image group may be used to measure the complexity of the image group; specifically, the larger the feature value, the higher the complexity of the image group. Next, it may be determined whether the determined difference is less than a preset threshold; and, in response to determining that the determined difference is less than the preset threshold, the image group and its neighboring image group are merged into one to-be-transcoded sub-video. A sketch of this rule follows the worked example below.
As an illustration, suppose the to-be-transcoded video includes three image groups, the attribute feature of an image group is its temporal complexity, and the preset threshold is 10. In chronological order, the to-be-transcoded video can be expressed as "first image group; second image group; third image group". To divide the to-be-transcoded video into at least two to-be-transcoded sub-videos, the above electronic device may first determine the attribute features of the first, second, and third image groups. Specifically, the attribute feature and feature value of the first image group may be expressed as "temporal complexity: 79", those of the second image group as "temporal complexity: 60", and those of the third image group as "temporal complexity: 65". The above electronic device may then determine the difference between the feature values of the attribute features of the first image group and of its neighboring image group (the second image group), i.e., difference = 79 − 60 = 19. Here, the value 19 is not less than the preset threshold 10, so the first and second image groups are not merged. Next, the above electronic device may determine the difference between the feature values of the attribute features of the second image group and of its neighboring image group (the third image group), i.e., difference = 65 − 60 = 5. Here, the value 5 is less than the preset threshold 10, so the second and third image groups are merged into one to-be-transcoded sub-video. It should be noted that an image group that is not merged may itself serve as a to-be-transcoded sub-video. It can be understood that although the neighboring image groups of the second image group include both the first and the third image group, the analysis of the second image group against the first image group has already been completed when the first image group served as the analysis reference, so that step can be omitted when the second image group serves as the analysis reference. Likewise, the analysis of the third image group against the second image group, with the third image group as the analysis reference, can be omitted. Here, the image group serving as the analysis reference is the image group on the basis of which its neighboring image group is determined.
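A minimal sketch of this merging rule is given below; the per-image-group temporal-complexity values (79, 60, 65) and the threshold of 10 are taken from the worked example above, and the representation of each sub-video as a list of image-group indices is an illustrative choice:
```python
def merge_gops_into_subvideos(complexities: list[float], threshold: float) -> list[list[int]]:
    """Merge adjacent image groups whose temporal-complexity difference is below the threshold.

    Returns groups of GOP indices, each group forming one to-be-transcoded sub-video.
    """
    sub_videos = [[0]]
    for i in range(1, len(complexities)):
        if abs(complexities[i] - complexities[i - 1]) < threshold:
            sub_videos[-1].append(i)      # merge with the previous image group
        else:
            sub_videos.append([i])        # start a new to-be-transcoded sub-video
    return sub_videos

# Worked example from the description: complexities 79, 60, 65 and threshold 10
print(merge_gops_into_subvideos([79, 60, 65], 10))  # [[0], [1, 2]]
```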
It should be noted that the above step of dividing the to-be-transcoded video into to-be-transcoded sub-videos based on the image groups included in the to-be-transcoded video is applicable to the case where the to-be-transcoded video contains scene switches. Moreover, using the image group as the division basis of the to-be-transcoded video allows images belonging to different scenes to be divided into different to-be-transcoded sub-videos, which in turn reduces abrupt changes in the video quality of the transcoded sub-videos.
Step 202: for each of the at least two to-be-transcoded sub-videos, perform the following steps: determine the attribute features of the to-be-transcoded sub-video; generate a feature vector of the to-be-transcoded sub-video based on the determined attribute features; input the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determine the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcode the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
In the present embodiment, for each of the at least two to-be-transcoded sub-videos obtained in step 201, the following steps may be performed:
Step 2021: determine the attribute features of the to-be-transcoded sub-video. Here, the attribute features may be used to characterize the complexity of the to-be-transcoded sub-video; for example, an attribute feature may be the duration of the to-be-transcoded sub-video.
In some optional implementations of the present embodiment, the attribute features may include, but are not limited to, at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity. Here, the bit count may be the average number of bits of the images included in the to-be-transcoded sub-video, the average number of bits of the I-frame images included in the to-be-transcoded sub-video, and so on. The spatial complexity may be used to characterize the complexity of the texture contained in the images.
Step 2022: generate a feature vector of the to-be-transcoded sub-video based on the determined attribute features. Specifically, the feature values of the determined attribute features may be determined, and the feature vector of the to-be-transcoded sub-video may then be generated based on the determined feature values. For example, if the attribute features and their feature values are expressed as "average I-frame bit count: 5; bitrate: 50", the feature values of the attribute features are 5 and 50, and the feature vector of the to-be-transcoded sub-video may then be [5, 50].
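A small sketch of this step, under the assumption that the determined attribute features are held in a mapping keyed by illustrative feature names mirroring the example above:
```python
import numpy as np

def build_feature_vector(attributes: dict[str, float]) -> np.ndarray:
    """Turn the determined attribute-feature values into a feature vector, in a fixed order."""
    order = ["iframe_avg_bits", "bitrate"]          # fixed feature order, assumed here
    return np.array([attributes[name] for name in order], dtype=np.float32)

# Example from the description: "average I-frame bit count: 5; bitrate: 50" -> [5, 50]
print(build_feature_vector({"iframe_avg_bits": 5, "bitrate": 50}))  # [ 5. 50.]
```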
Step 2023: input the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video. Here, the magnitude of the complexity index may be used to measure how complex the video is. Specifically, as an example, the value range of the complexity index may be [0.1, 1], where a complexity index of 0.1 characterizes the least complex video and a complexity index of 1 characterizes the most complex video.
Here, the complexity prediction model may be used to characterize the correspondence between feature vectors of videos and complexity indexes, where a complexity index may be used to characterize the complexity of a video. Specifically, as an example, the complexity prediction model may be a correspondence table that a technician pre-establishes based on statistics of a large number of feature vectors and of complexity indexes characterizing the complexity of videos, and that stores correspondences between multiple feature vectors and complexity indexes; it may also be a calculation formula that a technician presets based on statistics of a large amount of data and stores in the above electronic device, which performs a numerical calculation on one or more values of the feature vector. The calculation result of this formula may serve as the complexity index characterizing the complexity of the video; for example, the formula may sum the values in the feature vector, and the resulting sum may serve as the complexity index characterizing the complexity of the video.
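The two variants described above, a pre-established correspondence table and a preset calculation formula, might look like the following sketch; the table entries, the normalization constant, and the clipping into the example range [0.1, 1] are all illustrative assumptions:
```python
import numpy as np

# Variant 1: a pre-established correspondence table from feature vectors to complexity indexes
COMPLEXITY_TABLE = {
    (5.0, 50.0): 0.8,     # illustrative entries only
    (2.0, 20.0): 0.3,
}

def predict_by_table(feature_vector: np.ndarray) -> float:
    return COMPLEXITY_TABLE[tuple(float(x) for x in feature_vector)]

# Variant 2: a preset calculation formula over the feature-vector values
def predict_by_formula(feature_vector: np.ndarray) -> float:
    raw = float(np.sum(feature_vector)) / 100.0   # assumed normalization of the summed values
    return float(np.clip(raw, 0.1, 1.0))          # assumed clipping into the [0.1, 1] range

vec = np.array([5.0, 50.0])
print(predict_by_table(vec), predict_by_formula(vec))  # 0.8 0.55
```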
In some optional implementations of the present embodiment, the above complexity prediction model may be obtained by training as follows:
First, the above electronic device may acquire a sample video and a pre-calibrated complexity index characterizing the complexity of the sample video. Specifically, a technician may pre-calibrate the complexity index of the sample video based on, for example, its duration, resolution, and bitrate. The magnitude of the complexity index may be used to measure the complexity of the sample video; illustratively, the larger the complexity index, the higher the complexity of the sample video. The above sample video may include multiple videos, and for each of the multiple videos a complexity index characterizing the complexity of that video may be pre-calibrated.
Then, the above electronic device may determine the attribute features of the sample video and generate a feature vector of the sample video based on the determined attribute features of the sample video.
Finally, the above electronic device may use a machine learning method to input the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result for the sample video, compare the complexity index prediction result of the sample video with the pre-calibrated complexity index characterizing the complexity of the sample video, and adjust the parameters of the complexity prediction model according to the comparison result, until the difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index characterizing the complexity of the sample video is less than a preset difference value.
Specifically, the above electronic device may train a neural network by the backpropagation algorithm and determine the trained neural network as the complexity prediction model. It should be noted that the backpropagation algorithm consists of two processes in the learning procedure: the forward propagation of the signal and the backward propagation of the error. In a feedforward network, the input signal enters through the input layer, is computed by the hidden layers, and is output by the output layer; the output value is compared with the pre-calibrated complexity index, and if there is an error, the error is propagated backward from the output layer toward the input layer, during which a gradient descent algorithm may be used to adjust the neuron weights (for example, the convolution kernel parameters of a convolutional layer). Here, a preset loss function may be used to characterize the error between the output value and the pre-calibrated complexity index.
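The following sketch illustrates the described training loop on a deliberately simplified stand-in for the neural network, a single linear layer adjusted by gradient descent on a squared-error comparison; the sample feature vectors, pre-calibrated indexes, learning rate, and stopping threshold are all illustrative assumptions:
```python
import numpy as np

# Illustrative training data: feature vectors of sample videos and pre-calibrated complexity indexes
X = np.array([[5.0, 50.0], [2.0, 20.0], [8.0, 80.0]])   # feature vectors of the sample videos
y = np.array([0.55, 0.22, 0.88])                        # pre-calibrated complexity indexes

w = np.zeros(2)                 # model parameters, adjusted during training
lr = 1e-4                       # learning rate (assumed)
preset_difference = 0.05        # training stops once every prediction error is below this (assumed)

for step in range(10000):
    pred = X @ w                          # forward pass: complexity index prediction results
    err = pred - y                        # comparison with the pre-calibrated complexity indexes
    if np.max(np.abs(err)) < preset_difference:
        break
    w -= lr * (2 / len(X)) * (X.T @ err)  # gradient step: adjust parameters according to the error

print("predictions:", np.round(X @ w, 3), "calibrated:", y)
```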
Step 2024: determine the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate. Here, the preset bitrate may be a bitrate input by a technician through the above electronic device.
Specifically, as an example, the above electronic device may multiply the obtained complexity index characterizing the complexity of the to-be-transcoded sub-video by the preset bitrate to obtain the transcoding bitrate of the to-be-transcoded sub-video. For example, if the obtained complexity index characterizing the complexity of the to-be-transcoded sub-video is 0.8 and the preset bitrate is 50, the transcoding bitrate of the to-be-transcoded sub-video is 40 (0.8 × 50 = 40).
Optionally, the complexity index has a maximum value and a minimum value. In this case, the above electronic device may first divide the complexity index characterizing the complexity of the to-be-transcoded sub-video by the maximum value of the complexity index to obtain a relative complexity index of the to-be-transcoded sub-video, and then multiply the obtained relative complexity index of the to-be-transcoded sub-video by the preset bitrate to obtain the transcoding bitrate of the to-be-transcoded sub-video.
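Both bitrate rules can be written down directly; the sketch below reproduces the 0.8 × 50 = 40 example and the optional variant that first normalizes by the maximum of the complexity-index range (the maximum of 1.0 follows the example range [0.1, 1] mentioned earlier):
```python
def transcoding_bitrate(complexity_index: float, preset_bitrate: float) -> float:
    """Transcoding bitrate = complexity index * preset bitrate."""
    return complexity_index * preset_bitrate

def transcoding_bitrate_relative(complexity_index: float, preset_bitrate: float,
                                 index_max: float = 1.0) -> float:
    """Optional variant: normalize by the maximum of the complexity-index range first."""
    relative_complexity = complexity_index / index_max
    return relative_complexity * preset_bitrate

print(transcoding_bitrate(0.8, 50))            # 40.0 (example from the description)
print(transcoding_bitrate_relative(0.8, 50))   # 40.0 with index_max = 1.0
```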
Step 2025: transcode the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
It should be noted that, among the parameters used in the transcoding process of the at least two to-be-transcoded sub-videos included in the above to-be-transcoded video, apart from the transcoding bitrate of each to-be-transcoded sub-video determined through the above steps, the other transcoding parameters of the to-be-transcoded sub-videos may be identical.
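As one possible realization of this transcoding step, sketched under the assumption that the ffmpeg command-line tool with an H.264 encoder is used (the patent does not prescribe a particular transcoder, and the file names and bitrate value below are illustrative), each sub-video could be re-encoded at its own determined bitrate while all other encoding parameters stay the same:
```python
import subprocess

def transcode_sub_video(input_path: str, output_path: str, bitrate_kbps: float) -> None:
    """Re-encode one to-be-transcoded sub-video at the transcoding bitrate determined for it."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", input_path,
         "-c:v", "libx264", "-b:v", f"{int(bitrate_kbps)}k",  # only the bitrate differs per sub-video
         output_path],
        check=True,
    )

# Hypothetical file names; the bitrate value comes from step 2024
transcode_sub_video("sub_video_1.mp4", "sub_video_1_transcoded.mp4", bitrate_kbps=4000)
```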
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for processing video according to the present embodiment. In the application scenario of Fig. 3, the server 301 may first acquire a to-be-transcoded video 303 sent by the terminal device 302 and divide the to-be-transcoded video 303 into two to-be-transcoded sub-videos 3031 and 3032, where the to-be-transcoded video 303 may include at least two image groups and each of the two to-be-transcoded sub-videos 3031, 3032 may include at least one image group. Then, for the to-be-transcoded sub-video 3031 and the to-be-transcoded sub-video 3032, the server 301 may perform the following steps: determining the attribute features of the to-be-transcoded sub-videos 3031 and 3032, respectively; generating the feature vectors of the to-be-transcoded sub-videos 3031, 3032 based on the determined attribute features; and inputting the generated feature vectors into the pre-trained complexity prediction model, respectively, to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video 3031 and a complexity index characterizing the complexity of the to-be-transcoded sub-video 3032, where the complexity prediction model may be used to characterize the correspondence between feature vectors of videos and complexity indexes. Then, based on the obtained complexity indexes and the preset bitrate, the server 301 may determine the transcoding bitrate 304 of the to-be-transcoded sub-video 3031 and the transcoding bitrate 305 of the to-be-transcoded sub-video 3032. Finally, the server 301 may transcode the to-be-transcoded sub-videos 3031 and 3032 based on the determined transcoding bitrates 304 and 305, respectively, to generate transcoded sub-videos 306 and 307.
The method provided by the above embodiment of the present application acquires a to-be-transcoded video, divides the to-be-transcoded video into at least two to-be-transcoded sub-videos and, for each of the at least two to-be-transcoded sub-videos, performs the following steps: determining the attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determining the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and the preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video. The complexity of each to-be-transcoded sub-video is thus taken into account, a transcoding bitrate is allocated to each to-be-transcoded sub-video according to its complexity, and video transcoding is then performed on the to-be-transcoded sub-video, which improves the reliability of video processing.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for processing video is shown. The flow 400 of the method for processing video includes the following steps:
Step 401: acquire a to-be-transcoded video and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos.
In the present embodiment, an electronic device on which the method for processing video runs (for example, the server shown in Fig. 1) may acquire the to-be-transcoded video through a wired or wireless connection and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos. Here, the to-be-transcoded video may include at least two groups of pictures (GOPs), and each of the at least two to-be-transcoded sub-videos may include at least one image group. Specifically, the above electronic device may acquire a to-be-transcoded video pre-stored locally, or acquire a to-be-transcoded video sent by a terminal (for example, the terminal devices 101, 102, 103 shown in Fig. 1).
In the present embodiment, the above electronic device may use the image groups included in the to-be-transcoded video as the basis for dividing the to-be-transcoded video.
Step 402: for each of the at least two to-be-transcoded sub-videos, perform the following steps: determine the attribute features of the to-be-transcoded sub-video; generate a feature vector of the to-be-transcoded sub-video based on the determined attribute features; input the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determine the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcode the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
In the present embodiment, for each of the at least two to-be-transcoded sub-videos obtained in step 401, the following steps may be performed:
Step 4021: determine the attribute features of the to-be-transcoded sub-video. Here, the attribute features may be used to characterize the complexity of the to-be-transcoded sub-video.
Step 4022: generate a feature vector of the to-be-transcoded sub-video based on the determined attribute features. Specifically, the feature values of the determined attribute features may be determined, and the feature vector of the to-be-transcoded sub-video may then be generated based on the determined feature values.
Step 4023: input the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video. Here, the magnitude of the complexity index may be used to measure how complex the video is.
Here, the complexity prediction model may be used to characterize the correspondence between feature vectors of videos and complexity indexes, where a complexity index may be used to characterize the complexity of a video.
Step 4024: determine the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate. Here, the preset bitrate may be a bitrate input by a technician through the above electronic device.
Step 4025: transcode the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
The above steps 401 and 402 are respectively consistent with steps 201 and 202 in the foregoing embodiment; the descriptions of steps 201 and 202 above also apply to steps 401 and 402 and are not repeated here.
Step 403: merge the generated transcoded sub-videos to generate a target video.
In the present embodiment, based on the transcoded sub-videos generated in step 402, the above electronic device may merge the generated transcoded sub-videos to generate a target video. It can be understood that the to-be-transcoded video has a duration, so the at least two to-be-transcoded sub-videos included in the to-be-transcoded video may be arranged in chronological order; correspondingly, the generated transcoded sub-videos may be merged in chronological order.
As an illustration, suppose the duration of the to-be-transcoded video is 40 minutes. In chronological order, the to-be-transcoded video may include a "first to-be-transcoded sub-video; second to-be-transcoded sub-video; third to-be-transcoded sub-video", where the time period corresponding to the first to-be-transcoded sub-video may be "00:00-14:00", the time period corresponding to the second to-be-transcoded sub-video may be "14:01-28:00", and the time period corresponding to the third to-be-transcoded sub-video may be "28:01-40:00". The transcoded sub-video generated from the first to-be-transcoded sub-video is the first transcoded sub-video; the transcoded sub-video generated from the second to-be-transcoded sub-video is the second transcoded sub-video; and the transcoded sub-video generated from the third to-be-transcoded sub-video is the third transcoded sub-video. The time period corresponding to the first transcoded sub-video is identical to that of the first to-be-transcoded sub-video used to generate it; the time period corresponding to the second transcoded sub-video is identical to that of the second to-be-transcoded sub-video used to generate it; and the time period corresponding to the third transcoded sub-video is identical to that of the third to-be-transcoded sub-video used to generate it. Accordingly, in chronological order, the above electronic device may merge the generated first, second, and third transcoded sub-videos to generate the target video, in which the transcoded sub-videos are arranged as "first transcoded sub-video; second transcoded sub-video; third transcoded sub-video".
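One way to carry out this chronological merge, sketched under the assumption that ffmpeg's concat demuxer with stream copy is available (the file names and start times below are illustrative), is to sort the transcoded sub-videos by their start time and concatenate them:
```python
import subprocess

def merge_transcoded_sub_videos(sub_videos: list[tuple[str, str]], output_path: str) -> None:
    """Merge transcoded sub-videos into the target video in chronological order.

    sub_videos: list of (start_time, file_path) pairs, e.g. ("00:00", "part1.mp4").
    """
    ordered = sorted(sub_videos)                       # chronological order by start time
    with open("concat_list.txt", "w") as f:
        for _, path in ordered:
            f.write(f"file '{path}'\n")
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                    "-i", "concat_list.txt", "-c", "copy", output_path], check=True)

merge_transcoded_sub_videos(
    [("14:01", "second_transcoded.mp4"), ("00:00", "first_transcoded.mp4"),
     ("28:01", "third_transcoded.mp4")],
    "target_video.mp4",
)
```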
Step 404: output the target video.
In the present embodiment, based on the target video obtained in step 403, the above electronic device may output the above target video. Specifically, as an example, the above electronic device may output the target video to a client (for example, the terminal devices 101, 102, 103 shown in Fig. 1) or output it to the local storage of the above electronic device.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the method for processing video in the present embodiment highlights the steps of merging and outputting the transcoded sub-videos. The scheme described in the present embodiment thus introduces further video processing steps after video transcoding, thereby achieving more comprehensive video processing.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for processing video. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for processing video of the present embodiment includes: a division unit 501 and an execution unit 502. The division unit 501 is configured to acquire a to-be-transcoded video and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos; the execution unit 502 is configured to, for each of the at least two to-be-transcoded sub-videos, perform the following steps: determining the attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determining the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
In the present embodiment, the division unit 501 may acquire the to-be-transcoded video through a wired or wireless connection and divide the to-be-transcoded video into at least two to-be-transcoded sub-videos. Here, the to-be-transcoded video may include at least two groups of pictures (GOPs), and each of the at least two to-be-transcoded sub-videos may include at least one image group. Specifically, the above electronic device may acquire a to-be-transcoded video pre-stored locally, or acquire a to-be-transcoded video sent by a terminal (for example, the terminal devices 101, 102, 103 shown in Fig. 1).
In the present embodiment, the division unit 501 may use the image groups included in the to-be-transcoded video as the basis for dividing the to-be-transcoded video. Specifically, as an example, the above electronic device may divide the to-be-transcoded video into at least two to-be-transcoded sub-videos with the image group as the division unit. For example, if the to-be-transcoded video includes three image groups, the division unit 501 may, with the image group as the division unit, divide the to-be-transcoded video into three to-be-transcoded sub-videos, each of which includes one image group. It should be noted that the image group is the basic unit of video structure: one image group is one group of consecutive pictures. In practice, MPEG coding divides pictures (i.e., frames) into three types, I, P, and B: I-frames are intra-coded frames, P-frames are forward-predicted frames, and B-frames are bidirectionally interpolated frames. Here, an image group generally takes an I-frame as its starting image, so the division unit 501 may distinguish the image groups included in the to-be-transcoded video by identifying I-frames and divide the to-be-transcoded video with the image group as the division basis.
It should be noted that the above step of dividing the to-be-transcoded video into to-be-transcoded sub-videos based on the image groups included in the to-be-transcoded video is applicable to the case where the to-be-transcoded video contains scene switches. Moreover, using the image group as the division basis of the to-be-transcoded video allows images belonging to different scenes to be divided into different to-be-transcoded sub-videos, which in turn reduces abrupt changes in the video quality of the transcoded sub-videos.
In the present embodiment, for each of the at least two to-be-transcoded sub-videos obtained by the division unit 501, the execution unit 502 may perform the following steps: determining the attribute features of the to-be-transcoded sub-video, where the attribute features may be used to characterize the complexity of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video, where the magnitude of the complexity index may be used to measure how complex the video is and the complexity prediction model may be used to characterize the correspondence between feature vectors of videos and complexity indexes; determining the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate, where the preset bitrate may be a bitrate input by a technician through the above electronic device; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video.
It should be noted that, among the parameters used in the transcoding process of the at least two to-be-transcoded sub-videos included in the above to-be-transcoded video, apart from the transcoding bitrate of each to-be-transcoded sub-video determined through the above steps, the other transcoding parameters of the to-be-transcoded sub-videos may be identical.
In some optional implementations of the present embodiment, the division unit 501 may include: a determination module configured to determine the attribute features of the image groups included in the to-be-transcoded video; and a merging module configured to, for each image group of the at least two image groups included in the to-be-transcoded video, determine an image group adjacent to the image group as a neighboring image group of the image group, determine the difference between the feature values of the attribute features of the image group and of its neighboring image group, determine whether the determined difference is less than a preset threshold, and, in response to determining that the determined difference is less than the preset threshold, merge the image group and its neighboring image group into one to-be-transcoded sub-video.
In some optional implementations of the present embodiment, the complexity prediction model may be obtained by training as follows: acquiring a sample video and a pre-calibrated complexity index characterizing the complexity of the sample video; determining the attribute features of the sample video and generating a feature vector of the sample video based on the determined attribute features of the sample video; and, using a machine learning method, inputting the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result for the sample video, comparing the complexity index prediction result of the sample video with the pre-calibrated complexity index characterizing the complexity of the sample video, and adjusting the parameters of the complexity prediction model according to the comparison result until the difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index characterizing the complexity of the sample video is less than a preset difference value.
In some optional implementations of the present embodiment, the attribute features may include, but are not limited to, at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity.
In some optional implementations of the present embodiment, the apparatus 500 for processing video may further include: a merging unit configured to merge the generated transcoded sub-videos to generate a target video.
In some optional implementations of the present embodiment, the apparatus 500 for processing video may further include: an output unit configured to output the target video.
In the apparatus 500 for processing video provided by the above embodiment of the present application, the division unit 501 acquires a to-be-transcoded video and divides the to-be-transcoded video into at least two to-be-transcoded sub-videos, and then, for each of the at least two to-be-transcoded sub-videos, the execution unit 502 performs the following steps: determining the attribute features of the to-be-transcoded sub-video; generating a feature vector of the to-be-transcoded sub-video based on the determined attribute features; inputting the generated feature vector into the pre-trained complexity prediction model to obtain a complexity index characterizing the complexity of the to-be-transcoded sub-video; determining the transcoding bitrate of the to-be-transcoded sub-video based on the obtained complexity index and a preset bitrate; and transcoding the to-be-transcoded sub-video based on the determined transcoding bitrate to generate a transcoded sub-video. The complexity of each to-be-transcoded sub-video is thus taken into account, a transcoding bitrate is allocated to each to-be-transcoded sub-video according to its complexity, and video transcoding is then performed on the to-be-transcoded sub-video, which improves the reliability of video processing.
Referring now to Fig. 6, a structural schematic diagram of a computer system 600 suitable for implementing a server of the embodiments of the present application is shown. The server shown in Fig. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. Various programs and data required for the operation of the system 600 are also stored in the RAM 603. The CPU 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, which can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, comprising a computer program carried on a computer-readable medium, the computer program comprising program code for performing the method as illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical memory device, a magnetic memory device, or any suitable combination of the above. In the present application, the computer-readable storage medium may be any tangible medium containing or storing a program, and the program may be used by, or used in combination with, an instruction execution system, apparatus or device. In the present application, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium, and is capable of sending, propagating or transmitting a program for use by, or in combination with, an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The flow charts and block diagrams in the accompanying drawings illustrate the architectures, functions and operations that may be implemented by the systems, methods and computer program products according to the various embodiments of the present application. In this regard, each block in a flow chart or block diagram may represent a module, a program segment, or a portion of code, and the module, program segment, or portion of code comprises one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions denoted in the blocks may occur in a sequence different from that shown in the drawings. For example, two blocks presented in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flow charts, and any combination of blocks in the block diagrams and/or flow charts, may be implemented by a dedicated hardware-based system performing the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by means of software or by means of hardware. The described units may also be provided in a processor; for example, a processor may be described as comprising a division unit and an execution unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the division unit may also be described as "a unit for dividing a video to be transcoded into at least two sub-videos to be transcoded".
In another aspect, the present application further provides a computer-readable medium. The computer-readable medium may be included in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs, and when the one or more programs are executed by the apparatus, the apparatus is caused to: obtain a video to be transcoded and divide the video to be transcoded into at least two sub-videos to be transcoded, wherein the video to be transcoded includes at least two image groups, and each sub-video to be transcoded of the at least two sub-videos to be transcoded includes at least one image group; and, for each sub-video to be transcoded of the at least two sub-videos to be transcoded, perform the following steps: determining attribute features of the sub-video to be transcoded; generating a feature vector of the sub-video to be transcoded based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index for characterizing the complexity of the sub-video to be transcoded, wherein the complexity prediction model is used to characterize a correspondence between feature vectors of videos and complexity indices; determining a transcoding bitrate of the sub-video to be transcoded based on the obtained complexity index and a preset bitrate; and transcoding the sub-video to be transcoded based on the determined transcoding bitrate to generate a transcoded sub-video.
The above description is merely a description of the preferred embodiments of the present application and of the technical principles applied. It should be understood by those skilled in the art that the scope of invention involved in the present application is not limited to the technical solutions formed by the specific combinations of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (14)

1. A method for processing video, comprising:
obtaining a video to be transcoded and dividing the video to be transcoded into at least two sub-videos to be transcoded, wherein the video to be transcoded includes at least two image groups, and each sub-video to be transcoded of the at least two sub-videos to be transcoded includes at least one image group; and
for each sub-video to be transcoded of the at least two sub-videos to be transcoded, performing the following steps: determining attribute features of the sub-video to be transcoded; generating a feature vector of the sub-video to be transcoded based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index for characterizing a complexity of the sub-video to be transcoded, wherein the complexity prediction model is used to characterize a correspondence between feature vectors of videos and complexity indices; determining a transcoding bitrate of the sub-video to be transcoded based on the obtained complexity index and a preset bitrate; and transcoding the sub-video to be transcoded based on the determined transcoding bitrate to generate a transcoded sub-video.
2. The method according to claim 1, wherein the dividing the video to be transcoded into at least two sub-videos to be transcoded comprises:
determining attribute features of the image groups included in the video to be transcoded; and
for each image group of the at least two image groups included in the video to be transcoded: determining an image group adjacent to the image group as a neighboring image group of the image group, and determining a difference between characteristic values of the attribute features of the image group and the neighboring image group of the image group; determining whether the determined difference is less than a preset threshold; and in response to determining that the determined difference is less than the preset threshold, merging the image group and the neighboring image group of the image group into one sub-video to be transcoded.
3. The method according to claim 1, wherein the complexity prediction model is obtained through training as follows:
obtaining a sample video and a pre-calibrated complexity index for characterizing a complexity of the sample video;
determining attribute features of the sample video, and generating a feature vector of the sample video based on the determined attribute features of the sample video; and
using a machine learning method, inputting the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result of the sample video, comparing the complexity index prediction result of the sample video with the pre-calibrated complexity index for characterizing the complexity of the sample video, and adjusting parameters of the complexity prediction model according to the comparison result, so that a difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index for characterizing the complexity of the sample video is less than a preset difference value.
4. The method according to claim 1, wherein the attribute features include at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity.
5. The method according to any one of claims 1-4, wherein the method further comprises:
combining the generated transcoded sub-videos to generate a target video.
6. The method according to claim 5, wherein the method further comprises:
outputting the target video.
7. An apparatus for processing video, comprising:
a division unit, configured to obtain a video to be transcoded and divide the video to be transcoded into at least two sub-videos to be transcoded, wherein the video to be transcoded includes at least two image groups, and each sub-video to be transcoded of the at least two sub-videos to be transcoded includes at least one image group; and
an execution unit, configured to, for each sub-video to be transcoded of the at least two sub-videos to be transcoded, perform the following steps: determining attribute features of the sub-video to be transcoded; generating a feature vector of the sub-video to be transcoded based on the determined attribute features; inputting the generated feature vector into a pre-trained complexity prediction model to obtain a complexity index for characterizing a complexity of the sub-video to be transcoded, wherein the complexity prediction model is used to characterize a correspondence between feature vectors of videos and complexity indices; determining a transcoding bitrate of the sub-video to be transcoded based on the obtained complexity index and a preset bitrate; and transcoding the sub-video to be transcoded based on the determined transcoding bitrate to generate a transcoded sub-video.
8. The apparatus according to claim 7, wherein the division unit comprises:
a determining module, configured to determine attribute features of the image groups included in the video to be transcoded; and
a merging module, configured to, for each image group of the at least two image groups included in the video to be transcoded: determine an image group adjacent to the image group as a neighboring image group of the image group, and determine a difference between characteristic values of the attribute features of the image group and the neighboring image group of the image group; determine whether the determined difference is less than a preset threshold; and, in response to determining that the determined difference is less than the preset threshold, merge the image group and the neighboring image group of the image group into one sub-video to be transcoded.
9. The apparatus according to claim 7, wherein the complexity prediction model is obtained through training as follows:
obtaining a sample video and a pre-calibrated complexity index for characterizing a complexity of the sample video;
determining attribute features of the sample video, and generating a feature vector of the sample video based on the determined attribute features of the sample video; and
using a machine learning method, inputting the generated feature vector of the sample video into the complexity prediction model to obtain a complexity index prediction result of the sample video, comparing the complexity index prediction result of the sample video with the pre-calibrated complexity index for characterizing the complexity of the sample video, and adjusting parameters of the complexity prediction model according to the comparison result, so that a difference between the complexity index prediction result of the sample video and the pre-calibrated complexity index for characterizing the complexity of the sample video is less than a preset difference value.
10. The apparatus according to claim 7, wherein the attribute features include at least one of the following: bit count, bitrate, resolution, frame rate, spatial complexity, and temporal complexity.
11. The apparatus according to any one of claims 7-10, wherein the apparatus further comprises:
a combining unit, configured to combine the generated transcoded sub-videos to generate a target video.
12. The apparatus according to claim 11, wherein the apparatus further comprises:
an output unit, configured to output the target video.
13. A server, comprising:
one or more processors; and
a storage device, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-6.
14. A computer-readable storage medium, storing a computer program thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
CN201810073414.5A 2018-01-25 2018-01-25 Method and apparatus for handling video Active CN108174290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810073414.5A CN108174290B (en) 2018-01-25 2018-01-25 Method and apparatus for handling video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810073414.5A CN108174290B (en) 2018-01-25 2018-01-25 Method and apparatus for handling video

Publications (2)

Publication Number Publication Date
CN108174290A true CN108174290A (en) 2018-06-15
CN108174290B CN108174290B (en) 2019-05-24

Family

ID=62515617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810073414.5A Active CN108174290B (en) 2018-01-25 2018-01-25 Method and apparatus for handling video

Country Status (1)

Country Link
CN (1) CN108174290B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170048555A1 (en) * 2012-09-11 2017-02-16 Texas Instruments Incorporated Method and system for constraining slice header processing overhead in video coding
CN103905742A (en) * 2014-04-10 2014-07-02 北京数码视讯科技股份有限公司 Video file segmentation method and device
CN105187835A (en) * 2014-05-30 2015-12-23 阿里巴巴集团控股有限公司 Adaptive video transcode method and device based on contents
CN104159127A (en) * 2014-08-21 2014-11-19 北京奇艺世纪科技有限公司 Method, device and system of video transcoding
US20170302981A1 (en) * 2016-04-16 2017-10-19 Ittiam Systems (P) Ltd. Content delivery edge storage optimized media delivery to adaptive bitrate (abr) streaming clients
CN106254867A (en) * 2016-08-08 2016-12-21 暴风集团股份有限公司 Based on picture group, video file is carried out the method and system of transcoding
CN106791930A (en) * 2017-01-04 2017-05-31 北京百度网讯科技有限公司 A kind of method for processing video frequency and device
CN106817588A (en) * 2017-02-06 2017-06-09 网宿科技股份有限公司 Transcoding control method and device, net cast method and system
CN107071562A (en) * 2017-05-15 2017-08-18 深圳市茁壮网络股份有限公司 A kind of transcoding parameter method to set up and device
CN107566851A (en) * 2017-09-05 2018-01-09 成都索贝数码科技股份有限公司 A kind of video segment storage method and system accessed applied to media data

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020119515A1 (en) * 2018-12-11 2020-06-18 阿里巴巴集团控股有限公司 Video transcoding method and device
CN111314737A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Video transcoding method and device
CN109286825B (en) * 2018-12-14 2021-04-30 北京百度网讯科技有限公司 Method and apparatus for processing video
CN109286825A (en) * 2018-12-14 2019-01-29 北京百度网讯科技有限公司 Method and apparatus for handling video
US10897620B2 (en) 2018-12-14 2021-01-19 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for processing a video
CN110139102A (en) * 2019-05-23 2019-08-16 北京百度网讯科技有限公司 Prediction technique, device, equipment and the storage medium of video encoding complexity
US11259029B2 (en) 2019-05-23 2022-02-22 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, device, apparatus for predicting video coding complexity and storage medium
CN110139102B (en) * 2019-05-23 2021-09-21 北京百度网讯科技有限公司 Method, device, equipment and storage medium for predicting video coding complexity
CN110248195A (en) * 2019-07-17 2019-09-17 北京百度网讯科技有限公司 Method and apparatus for output information
CN110248195B (en) * 2019-07-17 2021-11-05 北京百度网讯科技有限公司 Method and apparatus for outputting information
CN110401834A (en) * 2019-08-06 2019-11-01 杭州微帧信息科技有限公司 A kind of adaptive video coding method based on deep learning
CN111083483A (en) * 2019-12-31 2020-04-28 北京奇艺世纪科技有限公司 Video coding code rate determining method and device, electronic equipment and storage medium
CN111107395A (en) * 2019-12-31 2020-05-05 广州市百果园网络科技有限公司 Video transcoding method, device, server and storage medium
CN111327950A (en) * 2020-03-05 2020-06-23 腾讯科技(深圳)有限公司 Video transcoding method and device
CN111757118A (en) * 2020-06-29 2020-10-09 北京百度网讯科技有限公司 Video transcoding processing method, device, equipment and medium
CN111757118B (en) * 2020-06-29 2023-04-21 北京百度网讯科技有限公司 Video transcoding processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN108174290B (en) 2019-05-24

Similar Documents

Publication Publication Date Title
CN108174290B (en) Method and apparatus for handling video
US10897620B2 (en) Method and apparatus for processing a video
CN108062780B (en) Method for compressing image and device
CN108427939A (en) model generating method and device
CN108038546A (en) Method and apparatus for compressing neutral net
CN109308490A (en) Method and apparatus for generating information
CN108345387A (en) Method and apparatus for output information
CN109389072A (en) Data processing method and device
CN109377508A (en) Image processing method and device
CN108960110A (en) Method and apparatus for generating information
CN109410253A (en) Method and apparatus for generating information
CN108595448A (en) Information-pushing method and device
CN108182472A (en) For generating the method and apparatus of information
CN110166796A (en) Processing method, device, computer-readable medium and the electronic equipment of video frame
CN107451785A (en) Method and apparatus for output information
CN109543068A (en) Method and apparatus for generating the comment information of video
CN111046757A (en) Training method and device for face portrait generation model and related equipment
CN110248195B (en) Method and apparatus for outputting information
CN108268936A (en) For storing the method and apparatus of convolutional neural networks
CN109495767A (en) Method and apparatus for output information
CN109102484A (en) Method and apparatus for handling image
CN108171167A (en) For exporting the method and apparatus of image
CN111523400A (en) Video representative frame extraction method and device
CN114710667A (en) Rapid prediction method and device for CU partition in H.266/VVC screen content frame
CN110009101A (en) Method and apparatus for generating quantization neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant