CN104837019B

CN104837019B - Optimized AVS-to-HEVC video transcoding method based on support vector machines

Info

Publication number: CN104837019B
Application number: CN201510215888.5A
Authority: CN
Inventors: 解蓉; 罗瑞; 张文军; 张良
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2015-04-30
Filing date: 2015-04-30
Publication date: 2018-01-02
Anticipated expiration: 2035-04-30
Also published as: CN104837019A

Abstract

An optimized AVS-to-HEVC video transcoding method based on a support vector machine (SVM). Feature vectors are collected from the AVS bitstream and learned with an SVM to obtain a trained model, which classifies the extracted AVS feature vectors into two classes according to whether the coding unit (CU) at the corresponding position in HEVC is split or not split. In the transcoding stage, the trained model predicts whether each CU needs to be split. When the current CU is predicted to need splitting, the 2N×2N mode and the SKIP mode are each evaluated at the current HEVC depth and the best prediction mode is chosen from these two; when the current CU is predicted not to need splitting, optimal mode selection is carried out according to the standard HEVC encoding process. The invention applies the basic idea of machine learning by dividing the whole transcoding process into a training stage and a transcoding stage: the trained model obtained by learning predicts the CU partitioning in HEVC and is combined with a fast mode decision algorithm, which both increases transcoding speed and preserves the overall quality of the transcoded video.

Description

Optimized AVS-to-HEVC video transcoding method based on support vector machines

Technical field

The present invention relates to a technique in the field of video signal processing, specifically an optimized AVS-to-HEVC video transcoding method based on a support vector machine.

Background

Video transcoding converts an already compressed bitstream into a target bitstream that meets given requirements by decoding and re-encoding it. With the wide adoption and rapid development of multimedia technology and the Internet, transmitting all kinds of video data over networks has become a trend in network technology. Many video coding standards now exist, including MPEG-4, MPEG-2, H.264, AVS and HEVC. Because video resources are diverse, because terminal devices differ in display capability, storage capacity and bitstream processing power, and because users' demands on video vary with the situation, efficient transcoding that adapts video to different hardware and network transmission environments has long attracted wide attention in industry. HEVC is currently the newest video coding standard; its compression efficiency is about 50% higher than that of standards such as H.264, and it will be applied more and more widely. AVS is a standard developed independently in China, with coding performance comparable to H.264 and lower encoder complexity, and it has significant influence in video applications. The many existing compressed AVS bitstreams will coexist with standards such as HEVC for a long time, so video conversion from AVS to the HEVC standard has become an important research direction.

Machine learning is applied in many research fields, including artificial intelligence, machine translation, data mining, text recognition and commerce. With the development of digital video and transmission networks, it is also increasingly applied to research directions such as video search, video analysis and video encoding and transcoding, and applications of machine learning algorithms to video transcoding are growing in number. The support vector machine is an important machine learning method. Decoded information is extracted from the first several frames of the input bitstream, and an SVM learns the correspondence between the AVS information and the HEVC coding modes; during subsequent encoding, the HEVC coding mode can then be predicted directly from the AVS decoded information, without a complete iterative traversal, which reduces the complexity of video conversion. How to use SVM learning to obtain an accurate trained model, and thereby predict accurately, reduce conversion complexity and increase transcoding speed, has therefore become an important topic of current research.

A search of the prior art found Chinese patent document CN104320667A, published 2015.01.28, which discloses a multi-pass optimized encoding system comprising several parallel encoders, a look-ahead buffer and a secondary encoder; the input of the look-ahead buffer is connected to the outputs of the parallel encoders, and the output of the look-ahead buffer is connected to the input of the secondary encoder. It also discloses the corresponding method, which comprises three steps: a first encoding stage, an optimization selection stage and a second encoding stage. In the first encoding stage, several parallel encoders encode simultaneously; the look-ahead buffer then performs optimization selection on the results of the first encoding stage to obtain the optimal encoding path; and the secondary encoder performs the second encoding along the optimal path obtained in the optimization selection stage, yielding the final, optimal encoding result. That technique is described as offering higher performance, quality and bandwidth efficiency and better encoding/transcoding results, as being easy to configure and very flexible, and as suiting both high-quality 4K and ultra-high-definition applications and highly bandwidth-efficient mobile video applications. However, its input is the source stream, so it does not exploit the coding information contained in an input compressed bitstream, and for an encoder as complex as HEVC, with its many coding paths, the implementation complexity would also be relatively high.

Summary of the invention

The present invention addresses the above deficiencies of the prior art by proposing an optimized AVS-to-HEVC video transcoding method based on a support vector machine. It adopts a simplified video transcoding framework: in the training stage, feature vectors extracted from the AVS bitstream are used to train a model; in the transcoding stage, this model is applied to decide the coding unit partitioning in HEVC. Combined with a fast mode decision algorithm, this reduces the complexity of coding unit partitioning and mode selection in the re-encoding stage, greatly increasing video transcoding speed while keeping the quality loss of the transcoded video limited.

The present invention is achieved through the following technical solution:

The present invention collects feature vectors from the AVS bitstream and learns from them with a support vector machine to obtain a trained model. The extracted AVS feature vectors are divided into two classes according to whether the CU at the corresponding position in HEVC is split or not split, and in the transcoding stage the trained model is used to predict whether a CU needs to be split.

The AVS feature vectors are collected from the AVS bitstream and include information such as the macroblock coding mode, the motion vectors and the transform coefficients.

The macroblock coding mode refers to the mode information of each macroblock in AVS.

The motion vector refers to the average motion vector magnitude of a macroblock in AVS.

The transform coefficient refers to the number of nonzero coefficients among the discrete cosine transform (DCT) coefficients of the AVS coding.

The correspondence refers to the trained model, that is, the mapping, obtained by learning between the AVS feature vectors and whether the coding unit in HEVC needs to be split.

The prediction works as follows: when the current CU is predicted to need splitting, the 2N×2N mode and the SKIP mode are each evaluated at the current HEVC depth and the best prediction mode is chosen from these two; when the current CU is predicted not to need splitting, optimal mode selection is carried out according to the standard HEVC encoding process.

Technical effect

Compared with the prior art, the present invention combines the basic idea of machine learning with the support vector machine learning method: a trained model is learned from the video and then used to predict the HEVC coding modes in the subsequent transcoding process, which reduces transcoding complexity and increases transcoding speed. The whole transcoding process is divided into two stages, a training stage and a transcoding stage. In the training stage, the AVS feature vectors and the CU partition information at the corresponding positions in HEVC are extracted, and the correspondence between the two, that is, the trained model, is obtained by training with a support vector machine algorithm and tool. In the transcoding stage, the model from the training stage is used to predict, from the AVS feature vectors, whether the CU at the corresponding HEVC position is split at depths 0 and 1. If the current CU is predicted to need splitting, only SKIP and 2N×2N mode selection is performed for the CU at the current depth; if the current CU is predicted not to need splitting, the optimal mode is selected according to the standard HEVC process. By combining machine learning with fast mode selection, a trained model matched to the characteristics of each sequence can be obtained adaptively, and the computational complexity of transcoding is greatly reduced while the quality loss of the transcoded video remains limited, which increases transcoding speed and saves transcoding time.

Brief description of the drawings

Fig. 1 is a diagram of the transcoding framework of the present invention;

Fig. 2 is a flow chart of the present invention.

Embodiment

The embodiments of the present invention are described in detail below. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and a specific operating procedure are given, but the scope of protection of the present invention is not limited to the following embodiment.

Embodiment 1

As shown in Fig. 2, the present embodiment comprises the following four steps:

Step 1: Collect the feature vectors of the AVS bitstream. Specifically, collect from the AVS bitstream the information corresponding to the partitioning of the CU at the corresponding position in HEVC at depths 0 and 1.

The macroblock coding mode is handled as follows: when the depth in the HEVC corresponding to AVS is 0, each CU contains 16 macroblocks and therefore 16 mode features; when the depth is 1, each CU contains 4 macroblocks and correspondingly 4 mode features.

The motion vector is handled as follows: in AVS, motion vectors are stored in units of 8 × 8 blocks, so a macroblock contains 4 basic motion vector units. The magnitude of each motion vector is computed, and the average of the four magnitudes in a macroblock is used as one feature. Likewise, a CU contains 16 motion features at depth 0 and 4 motion features at depth 1. The average motion vector magnitude $v_{avg}$ is $v_{avg} = \frac{1}{N}\sum_{i=1}^{N}\sqrt{v_x(i)^2 + v_y(i)^2}$, where $N$ is the number of 8 × 8 blocks contained in the current coding unit, and $v_x(i)$ and $v_y(i)$ are the horizontal and vertical components of the motion vector of the $i$-th 8 × 8 block in the coding unit.
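The averaging of motion vector magnitudes can be illustrated with a short Python sketch. The function name `mean_mv_magnitude` and the sample vectors are hypothetical and serve only to show the arithmetic; the patent does not prescribe an implementation.

```python
import math

def mean_mv_magnitude(motion_vectors):
    """Average magnitude of the 8x8-block motion vectors of one unit.

    motion_vectors: list of (v_x, v_y) pairs, one per 8x8 block.
    Returns 0.0 for an empty list (an assumption made for this sketch).
    """
    if not motion_vectors:
        return 0.0
    total = sum(math.hypot(vx, vy) for vx, vy in motion_vectors)
    return total / len(motion_vectors)

# One AVS macroblock holds four 8x8 blocks, so N = 4 here.
# The vectors below are made-up illustration values.
macroblock_mvs = [(3, 4), (6, 8), (0, 0), (0, 5)]
print(mean_mv_magnitude(macroblock_mvs))  # magnitudes 5, 10, 0, 5 -> 5.0
```

Applying the function once per macroblock yields the 16 motion features of a depth-0 CU or the 4 motion features of a depth-1 CU.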

The transform coefficient is handled as follows: in AVS, the DCT also operates in units of 8 × 8 blocks. The number of nonzero coefficients among all DCT coefficients in a macroblock is used as one feature, so a CU contains 16 DCT coefficient features at depth 0 and 4 DCT coefficient features at depth 1.
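As a rough sketch of how the three feature groups might be assembled into one vector per CU, consider the following Python. The names `MacroblockInfo` and `build_feature_vector`, the integer encoding of the coding mode, and the ordering of features inside the vector are all assumptions made for illustration; the text only fixes the counts (16 or 4 features of each kind). The 64×64 and 32×32 CU sizes in the comments follow from 16×16 AVS macroblocks.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MacroblockInfo:
    mode: int            # AVS macroblock coding mode as an integer label (assumed encoding)
    mv_magnitude: float  # mean magnitude of the macroblock's four 8x8 motion vectors
    nonzero_dct: int     # number of nonzero DCT coefficients in the macroblock

# Depth 0 covers 16 macroblocks (a 64x64 CU), depth 1 covers 4 (a 32x32 CU).
MACROBLOCKS_PER_CU = {0: 16, 1: 4}

def build_feature_vector(macroblocks: List[MacroblockInfo], depth: int) -> List[float]:
    """Concatenate mode, motion and DCT features of the macroblocks under one CU."""
    expected = MACROBLOCKS_PER_CU[depth]
    if len(macroblocks) != expected:
        raise ValueError(f"depth {depth} needs {expected} macroblocks, got {len(macroblocks)}")
    modes = [float(mb.mode) for mb in macroblocks]
    motion = [float(mb.mv_magnitude) for mb in macroblocks]
    dct = [float(mb.nonzero_dct) for mb in macroblocks]
    return modes + motion + dct  # 48 values at depth 0, 12 values at depth 1
```

Under these assumptions a depth-0 CU produces a 48-dimensional vector and a depth-1 CU a 12-dimensional one, so a separate model per depth would be a natural, though not stated, design.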

Step 2: Use a support vector machine to learn from the AVS feature vectors obtained in Step 1 and obtain a trained model for classifying the HEVC coding units at depths 0 and 1. The extracted AVS feature vectors are divided into two classes, namely whether the CU at the corresponding position in HEVC is split or not split.
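The two-class training in Step 2 can be sketched in plain Python. The patent does not name a kernel, a library or any hyperparameters, so the linear SVM below, trained by sub-gradient descent on the regularised hinge loss, is only a stand-in; `train_linear_svm`, `predict_split`, the learning rate, the regularisation weight and the toy two-dimensional features are all hypothetical.

```python
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)

def _score(model: Model, x: Sequence[float]) -> float:
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_linear_svm(samples, labels, epochs=500, lr=0.01, reg=0.01) -> Model:
    """Minimise the regularised hinge loss by sub-gradient descent.

    labels: +1 means "CU is split", -1 means "CU is not split".
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * _score((weights, bias), x)
            # L2 regularisation shrinks the weights on every step.
            weights = [w * (1.0 - lr * reg) for w in weights]
            if margin < 1.0:
                # Sample is inside the margin or misclassified: move towards it.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

def predict_split(model: Model, x: Sequence[float]) -> bool:
    """True if the model predicts that the CU should be split."""
    return _score(model, x) >= 0.0

# Toy 2-D features (mean MV magnitude, scaled nonzero-DCT count); made-up values.
split_cus = [[4.0, 3.0], [5.0, 4.0], [4.5, 3.5]]
unsplit_cus = [[0.5, 0.2], [1.0, 0.5], [0.2, 0.1]]
model = train_linear_svm(split_cus + unsplit_cus, [1, 1, 1, -1, -1, -1])
print(predict_split(model, [6.0, 5.0]), predict_split(model, [0.1, 0.1]))
```

A real transcoder would more likely use an established SVM tool and the full 48- or 12-dimensional vectors; the sketch only shows the shape of the training and prediction calls.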

Step 3: Use the trained model obtained in Step 2 to predict directly whether the current CU needs to be split. When the prediction is that no split is needed, partitioning terminates and no further mode selection is carried out; otherwise the process enters the next HEVC depth iteration and performs Step 4.

Step 4: Evaluate the 2N×2N mode and the SKIP mode at the current HEVC depth and choose the best prediction mode from these two; when the current CU is predicted not to need splitting, carry out optimal mode selection according to the standard HEVC encoding process. This reduces the computational complexity of mode selection and increases the speed of the whole transcoding process.

The 2N×2N mode means that the current coding unit is not split further;

The SKIP mode means that the current coding block does not need to encode prediction residual information;

The best prediction mode refers to the coding mode with the lowest rate-distortion cost, chosen by comparing the rate-distortion costs of the 2N×2N and SKIP modes;

Optimal mode selection refers to the process of comparing the rate-distortion costs of all coding modes in HEVC and selecting the one with the lowest cost.
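The mode decision described in Step 4 can be sketched as follows. This is a simplified illustration, not the HM encoder's logic: `select_mode`, the `rd_cost` callback and the toy cost table are hypothetical, and the list of candidate modes is supplied by the caller rather than taken from the HEVC specification.

```python
from typing import Callable, Iterable, Tuple

FAST_MODES = ("2Nx2N", "SKIP")  # the only modes checked when a split is predicted

def select_mode(predicted_split: bool,
                rd_cost: Callable[[str], float],
                all_modes: Iterable[str]) -> Tuple[str, float]:
    """Return (best mode, its rate-distortion cost) for the current CU depth.

    rd_cost is called once per candidate mode, so restricting the candidate
    list is what saves computation.
    """
    candidates = FAST_MODES if predicted_split else tuple(all_modes)
    costs = {mode: rd_cost(mode) for mode in candidates}
    best = min(costs, key=costs.get)
    return best, costs[best]

# Made-up rate-distortion costs for a handful of modes.
toy_costs = {"SKIP": 120.0, "2Nx2N": 95.0, "2NxN": 90.0, "Nx2N": 101.0}
print(select_mode(True, toy_costs.__getitem__, toy_costs))   # fast path: two modes only
print(select_mode(False, toy_costs.__getitem__, toy_costs))  # full search over all modes
```

With the toy table, the fast path never evaluates `2NxN` even though it has the lowest cost, which is the kind of bounded quality loss the method accepts in exchange for speed.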

In summary, the advantages of this embodiment are:

1) It takes the characteristics of the video itself into account: a model suited to the current sequence can be trained for each sequence.

2) The CU partitioning is predicted from the trained model, which reduces the computation spent determining the optimal CU partitioning during HEVC encoding in transcoding; provided prediction accuracy is maintained, this greatly reduces transcoding computational complexity, saves transcoding time and improves transcoding efficiency.

3) It is combined with a fast mode decision algorithm, which further removes unnecessary computation and increases transcoding speed.

The proposed algorithm is implemented on the basis of the HEVC reference software HM10.1. A full-decode, full-re-encode AVS-to-HEVC transcoder is used as the reference transcoder for comparison, and the effectiveness of the algorithm is assessed with three metrics: ΔT, ΔPSNR and ΔBR. They are calculated as follows:

$\Delta\mathrm{PSNR} = \mathrm{PSNR}_{ref} - \mathrm{PSNR}_{prop}$

ΔT, ΔPSNR and ΔBR denote the percentage of encoding time saved, the PSNR decrease and the bitrate increase rate, respectively.
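A small Python sketch shows how the three metrics could be computed. Only the ΔPSNR formula survives in the text; the expressions used here for ΔT and ΔBR are assumptions inferred from their verbal definitions (time saved and bitrate increase, each relative to the reference transcoder). The function name, the dictionary keys and the sample numbers are hypothetical and are not taken from Table 1.

```python
def transcoding_metrics(ref, prop):
    """Compare the proposed transcoder (prop) with the reference one (ref).

    Each argument is a dict with keys "time" (seconds), "psnr" (dB) and
    "bitrate" (kbit/s). Returns (delta_t, delta_psnr, delta_br).
    """
    # Assumed form: percentage of encoding time saved relative to the reference.
    delta_t = 100.0 * (ref["time"] - prop["time"]) / ref["time"]
    # Given in the text: PSNR of the reference minus PSNR of the proposed method.
    delta_psnr = ref["psnr"] - prop["psnr"]
    # Assumed form: percentage bitrate increase relative to the reference.
    delta_br = 100.0 * (prop["bitrate"] - ref["bitrate"]) / ref["bitrate"]
    return delta_t, delta_psnr, delta_br

# Made-up measurements for one sequence, for illustration only.
reference = {"time": 100.0, "psnr": 38.5, "bitrate": 1000.0}
proposed = {"time": 40.0, "psnr": 38.25, "bitrate": 1020.0}
print(transcoding_metrics(reference, proposed))
```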

The experimental results are shown in Table 1.

Table 1. Experimental results of the SVM-based AVS-to-HEVC transcoding algorithm.

The test results in Table 1 show that, for 1920 × 1080 sequences, the SVM-based transcoding algorithm increases transcoding speed by 63.68% on average, with an average PSNR decrease of 0.083 dB and a bitrate increase of only 1.82%; for 1280 × 720 sequences, transcoding speed increases by 71.78% on average, with an average PSNR decrease of 0.09 dB and a bitrate increase of 1.55%. The experimental results show that the proposed algorithm greatly increases video transcoding speed while keeping the loss in video quality limited.

Claims

1. An optimized AVS-to-HEVC video transcoding method based on a support vector machine, characterized in that feature vectors of the AVS bitstream are collected and learned with a support vector machine to obtain a trained model; the extracted AVS feature vectors are divided into two classes according to whether the CU at the corresponding position in HEVC is split or not split; in the transcoding stage the trained model is used to predict whether a CU needs to be split; when the current CU is predicted to need splitting, the 2N×2N mode and the SKIP mode are each evaluated at the current HEVC depth and the best prediction mode is chosen from these two; when the current CU is predicted not to need splitting, optimal mode selection is carried out according to the standard HEVC encoding process;

the AVS feature vectors are collected from the AVS bitstream and include: the macroblock coding mode, the motion vectors and the transform coefficients;

the best prediction mode refers to the coding mode with the lowest rate-distortion cost, chosen by comparing the rate-distortion costs of the 2N×2N and SKIP modes.

2. The method according to claim 1, characterized in that the macroblock coding mode refers to the mode information of each macroblock in AVS; the motion vector refers to the average motion vector magnitude of a macroblock in AVS; and the transform coefficient refers to the number of nonzero coefficients among the discrete cosine transform coefficients of the AVS coding.

3. The method according to claim 1 or 2, characterized in that, for the macroblock coding mode: when the depth in the HEVC corresponding to AVS is 0, each CU contains 16 macroblocks and therefore 16 mode features, and when the depth is 1, each CU contains 4 macroblocks and correspondingly 4 mode features; for the motion vector: in AVS, motion vectors are stored in units of 8 × 8 blocks, a macroblock contains 4 basic motion vector units, the magnitude of each motion vector is computed, and the average of the four magnitudes in a macroblock is used as one feature, so that a CU contains 16 motion features at depth 0 and 4 motion features at depth 1; for the transform coefficient: in AVS, the DCT also operates in units of 8 × 8 blocks, and the number of nonzero coefficients among all DCT coefficients in a macroblock is used as one feature, so that a CU contains 16 DCT coefficient features at depth 0 and 4 DCT coefficient features at depth 1.

4. The method according to claim 1, characterized in that the optimal mode selection refers to the process of comparing the rate-distortion costs of all coding modes in HEVC and selecting the one with the lowest cost.