WO2018192235A1 - Coding unit depth determination method and apparatus - Google Patents

Coding unit depth determination method and apparatus

Info

Publication number
WO2018192235A1
Authority
WO
WIPO (PCT)
Prior art keywords
coding
unit
coding unit
processed
coded
Prior art date
Application number
PCT/CN2017/115175
Other languages
English (en)
French (fr)
Inventor
张宏顺
林四新
程曦铭
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority to KR1020197027603A priority Critical patent/KR102252816B1/ko
Priority to EP17906258.3A priority patent/EP3614666A4/en
Priority to JP2019527221A priority patent/JP6843239B2/ja
Publication of WO2018192235A1 publication Critical patent/WO2018192235A1/zh
Priority to US16/366,595 priority patent/US10841583B2/en

Classifications

    All classifications fall under H04N19/00 (Methods or arrangements for coding, decoding, compressing or decompressing digital video signals):
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/61 Using transform coding in combination with predictive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/172 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/82 Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop
    • H04N19/86 Pre-processing or post-processing involving reduction of coding artifacts, e.g. of blockiness
    • H04N19/96 Tree coding, e.g. quad-tree coding

Definitions

  • the present application relates to the field of video coding technologies, and in particular, to a coding unit depth determination method and apparatus.
  • The coding process of the HEVC (High Efficiency Video Coding) standard is introduced in conjunction with FIG. 1.
  • In intra- or inter-frame prediction, a predicted value is obtained and subtracted from the input video frame to yield a residual.
  • The residual is subjected to a DCT (Discrete Cosine Transform) and quantized to obtain residual coefficients, which are sent to an entropy coding module for encoding, and a video bitstream is output.
  • At the same time, the residual coefficients are inverse quantized and inverse transformed to obtain the residual value of the reconstructed image; this residual value is added to the intra- or inter-frame predicted value to obtain the reconstructed image.
  • After in-loop filtering, a reconstructed frame is obtained; the reconstructed frame is added to the reference frame sequence and serves as a reference frame for the next input frame.
  • An input video frame is divided into a series of Coding Tree Units (CTUs).
  • Each CTU starts from the Largest Coding Unit (LCU) and is divided layer by layer, in quadtree form, into Coding Units (CUs) of different sizes.
  • Depth 0 corresponds to the LCU, whose size is generally 64×64; depths 1-3 correspond to CU sizes of 32×32, 16×16, and 8×8, respectively.
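The depth-to-size relationship above can be sketched with a small helper (a hypothetical illustration, not part of any HEVC reference implementation):

```python
# Depth 0 is the 64x64 LCU; each additional quadtree depth level halves the CU side.
def cu_size(depth: int, lcu_size: int = 64) -> int:
    """Side length of a CU at the given quadtree depth (hypothetical helper)."""
    if not 0 <= depth <= 3:
        raise ValueError("HEVC CU depth ranges from 0 to 3")
    return lcu_size >> depth

sizes = {d: cu_size(d) for d in range(4)}  # depth -> side length
```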
  • Existing HEVC encoders adopt a full-traversal approach when selecting the best mode during CU depth division.
  • FIG. 2 illustrates an example of CU partitioning in the optimal mode.
  • The left side of FIG. 2 shows a specific division; the right side shows the quadtree corresponding to that division.
  • Each node in the quadtree indicates, following the division order marked by the arrows in the left diagram, whether the corresponding CU block needs further division: 1 means division is needed, 0 means it is not.
  • Some CU blocks find the optimal mode after one layer of division and need no further division or rate-distortion cost computation and comparison; in FIG. 2, the second CU block in the first layer of the quadtree has a node value of 0, indicating that no further division is needed.
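The split-flag quadtree of FIG. 2 can be modeled as a nested structure; the sketch below (a hypothetical representation, not the patent's data layout) counts the leaf CUs that result from a given set of split decisions:

```python
# A node is 0 (leaf: no further division) or a list of four child nodes (meaning
# node value 1: the CU is divided into four sub-CUs). Depth-3 CUs (8x8) are
# always leaves.
def count_leaf_cus(node, depth=0, max_depth=3):
    if node == 0 or depth == max_depth:
        return 1
    return sum(count_leaf_cus(child, depth + 1, max_depth) for child in node)

# An LCU where three first-layer CUs are leaves (node value 0, like the second
# CU block in FIG. 2) and the third first-layer CU is divided once more:
tree = [0, 0, [0, 0, 0, 0], 0]
```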
  • As a result, the encoding prediction process takes a very long time and consumes a large amount of computing resources.
  • The present application provides a coding unit depth determination method and apparatus to solve the problem that the existing full-traversal approach to determining coding unit depth incurs a long coding prediction time and heavy computing resource consumption.
  • A first aspect of the present application provides a coding unit depth determination method, including:
  • The prediction model is pre-trained using training samples labeled with classification results, and the training samples include coding information features of the set type.
  • A second aspect of the present application further provides a coding unit depth determination apparatus, including:
  • a residual coefficient determining unit, configured to determine a residual coefficient of a current optimal mode of the coding unit to be processed;
  • a feature acquiring unit, configured to, when the residual coefficient is not zero, acquire coding information features of the set type from the coding unit to be processed and from the neighboring coding tree units of the coding tree unit in which it is located, composing a prediction feature vector sample;
  • a model prediction unit, configured to input the prediction feature vector sample into a pre-trained prediction model to obtain the prediction result output by the model, where the prediction result indicates whether the coding unit to be processed needs depth division.
  • The prediction model is pre-trained using training samples labeled with classification results, and the training samples include coding information features of the set type.
  • A third aspect of the embodiments of the present application further provides a computer-readable storage medium storing program instructions that, when executed by a processor, cause the processor to perform the foregoing method.
  • In the coding unit depth determination method, a prediction model is first trained using training samples labeled with classification results, where the training samples include coding information features of the set type; the residual coefficient of the current optimal mode of the coding unit to be processed is then determined.
  • If the residual coefficient is not zero, the coding unit to be processed is not a skip coding unit and coding depth prediction is needed; coding information features of the set type are then acquired from the coding unit to be processed and the neighboring coding tree units of its coding tree unit, composed into a prediction feature vector sample, and input into the prediction model, so that the machine-learning prediction model predicts whether the coding unit to be processed needs depth division.
  • When the prediction result indicates that the coding unit to be processed does not need depth division, the depth division and the rate-distortion cost computation and comparison can be skipped entirely; compared with the prior art, this greatly reduces the coding prediction time, saves computing resources, and lowers computational complexity.
  • Figure 1 is a schematic diagram of a HEVC coding framework
  • FIG. 2 illustrates a schematic diagram of CU partitioning of an optimal mode
  • FIG. 3 is a schematic structural diagram of server hardware according to an embodiment of the present application.
  • FIG. 4 is a flowchart of a method for determining a depth of a coding unit according to an embodiment of the present application
  • FIG. 5 is a flowchart of another method for determining a coding unit depth according to an embodiment of the present application.
  • FIG. 6 is a flowchart of still another method for determining a depth of a coding unit according to an embodiment of the present disclosure
  • FIG. 7 is a flowchart of a method for determining a first average cost disclosed in an embodiment of the present application.
  • FIG. 8 illustrates a schematic diagram of CU partitioning of each neighbor coding tree unit of a Current CTU
  • FIG. 9 is a flowchart of a method for determining whether a coding unit to be processed needs to perform depth division according to an embodiment of the present disclosure
  • FIG. 10 is a schematic structural diagram of a coding unit depth determining apparatus according to an embodiment of the present application.
  • The embodiment of the present application provides a coding unit depth determination scheme that can be applied to a video encoder, where the video encoder is implemented on a server.
  • The server may be a processing device such as a desktop or notebook computer. Before the coding unit depth determination method of the present application is introduced, the hardware structure of the server is described. As shown in FIG. 3, the server may include:
  • processor 1, communication interface 2, memory 3, communication bus 4, and display screen 5;
  • where the processor 1, the communication interface 2, the memory 3, and the display screen 5 communicate with each other via the communication bus 4.
  • As shown in FIG. 4, the method includes:
  • Step S100: determine a residual coefficient of a current optimal mode of the coding unit to be processed.
  • Specifically, a candidate mv (motion vector) list is constructed according to the standard protocol; each mv in the list is traversed, motion compensation is performed to obtain a predicted value, and the SSD between the predicted value and the coding unit to be processed is calculated.
  • The residual corresponding to the optimal (minimum-SSD) mv is transformed and quantized to obtain the residual coefficient. If the residual coefficient is 0, the coding unit to be processed is a skip block; otherwise it is a merge block.
  • When the residual coefficient is zero, the coding unit to be processed is a skip block and CU partitioning can end directly; otherwise, the coding unit to be processed needs partition prediction.
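A minimal sketch of this skip/merge decision follows; the flat pixel lists, the `ssd` helper, and the divide-by-qstep "quantization" are simplified stand-ins for the real transform and quantization, so treat it as an illustration of the control flow only:

```python
# Pick the best mv by SSD, quantize the residual, and classify the CU as a skip
# block (all-zero residual coefficients) or a merge block.
def ssd(block_a, block_b):
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))

def classify_cu(cu_pixels, candidate_predictions, qstep=10):
    # each candidate corresponds to one mv's motion-compensated prediction
    best_pred = min(candidate_predictions, key=lambda p: ssd(cu_pixels, p))
    residual = [a - b for a, b in zip(cu_pixels, best_pred)]
    coeffs = [int(r / qstep) for r in residual]  # crude transform+quantization stand-in
    return "skip" if all(c == 0 for c in coeffs) else "merge"
```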
  • The image of the video frame to be processed may be stored into the memory 3 through the communication interface 2 in advance.
  • The processor 1 retrieves the stored video frame through the communication bus 4, divides it into a plurality of coding units, determines the coding unit to be processed from among them, and determines the residual coefficient of its current optimal mode.
  • The communication interface 2 may be an interface of a communication module, such as an interface of a GSM module.
  • The processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • Step S110: when the residual coefficient is not zero, acquire coding information features of the set type from the coding unit to be processed and from the neighboring coding tree units of the coding tree unit in which it is located, composing a prediction feature vector sample.
  • The types of coding information features acquired in this step are the same as the types used in the training samples during prediction model training.
  • Specifically, a template for each type of coding information feature may be preset; according to the templates, the coding information features are then acquired from the coding unit to be processed and from the neighboring coding tree units of its coding tree unit.
  • The acquired coding information features constitute the prediction feature vector sample.
  • The objects from which the coding information features are acquired are the coding unit CU to be processed and the neighboring coding tree units of the coding tree unit CTU in which it is located.
  • The feature templates may be stored in the memory 3 in advance; the processor 1 then acquires the coding information features according to the templates and composes them into a prediction feature vector sample.
  • Step S120: input the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the model, where the prediction result indicates whether the coding unit to be processed needs depth division.
  • The prediction model is pre-trained using training samples labeled with classification results, and the training samples include coding information features of the set type.
  • The prediction model may be stored in the memory 3 in advance.
  • The processor 1 inputs the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the model, which can be displayed on the display screen 5.
  • The prediction model may be an SVM (Support Vector Machine) model, a neural network model, or another machine learning model.
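Since the excerpt names an SVM as one candidate model, the prediction step can be sketched as a linear decision function over the feature vector; the weights and bias below are illustrative placeholders, not parameters of the application's trained model:

```python
# +1 means the coding unit needs depth division, -1 means it does not, matching
# the first/second label values used when the training samples are marked.
def svm_predict(feature_vector, weights, bias):
    score = sum(w * x for w, x in zip(weights, feature_vector)) + bias
    return 1 if score >= 0 else -1

def needs_depth_division(feature_vector, weights, bias):
    return svm_predict(feature_vector, weights, bias) == 1
```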
  • The present application may perform depth determination only on coding units that belong to non-I-frame video images; that is, the coding unit to be processed belongs to a non-I-frame video image.
  • another coding unit depth determining method is introduced. As shown in FIG. 5, the method includes:
  • Step S200: determine a residual coefficient of a current optimal mode of the coding unit to be processed.
  • Step S210: when the residual coefficient is not zero, determine whether the coded depth of the coding unit to be processed is zero; if so, perform step S220.
  • If the coded depth of the coding unit to be processed is zero, the coding unit to be processed is the largest coding unit LCU, that is, the coding tree unit CTU has not been divided.
  • In that case, the following operation of using the prediction model to predict whether the coding unit to be processed needs depth division is performed.
  • Step S220: acquire coding information features of the set type from the coding unit to be processed and from the neighboring coding tree units of the coding tree unit in which it is located, composing a prediction feature vector sample.
  • Step S230: input the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the model, where the prediction result indicates whether the coding unit to be processed needs depth division.
  • The prediction model is pre-trained using training samples labeled with classification results, and the training samples include coding information features of the set type.
  • Compared with the previous embodiment, this embodiment adds a judgment condition for performing coded depth prediction with the prediction model: the model prediction process is carried out only when the coded depth of the coding unit to be processed is determined to be zero.
  • Because prediction with the prediction model is itself relatively complex, the present application can also predict by other means; for details, refer to the related description below.
  • Next, the prediction model is introduced.
  • The present application may set the prediction model to include a P-frame prediction model and a B-frame prediction model.
  • The training samples used in pre-training the P-frame prediction model are coding information features of the set type extracted from coding units belonging to P-frame video images.
  • The training samples used in pre-training the B-frame prediction model are coding information features of the set type extracted from coding units belonging to B-frame video images.
  • In step S230, the prediction feature vector sample is input into the pre-trained prediction model to obtain the prediction result output by the model; the specific implementation includes the following steps:
  • Using different prediction models for the coding units to be processed in B-frame and P-frame video images improves the accuracy of the prediction result.
  • Next, the training samples used in prediction model training are introduced.
  • The coding information features of the set type used in training the prediction model of the present application can include:
  • The neighboring coding tree units of the current coding unit may be the upper neighboring coding tree unit and the left neighboring coding tree unit of the coding tree unit in which the current coding unit is located, and coding information feature 5 may specifically include:
  • The above coding information feature 6 may specifically include:
  • depth information (above_depth) of the upper neighboring coding tree unit of the current coding unit.
  • The types of coding information features used in training the prediction model are consistent with the types of coding information features acquired when model prediction is performed on the coding unit to be processed.
  • In sample collection, the present application can select video bitstream sequences of different scenes, extract the above types of coding information features offline for the training coding units included in the sequences, and record whether each training coding unit actually underwent depth division during real encoding; if so, the classification result of the training coding unit is labeled with the first label value, and otherwise with the second label value.
  • The first label value may be 1 and the second label value may be -1.
  • The coding information features acquired for each training coding unit are composed into a training feature vector, and the training feature vector together with the classification result of the training coding unit forms a training sample.
  • Note that the B-frame prediction model and the P-frame prediction model are trained separately, so the coding information features of B-frames and P-frames are also extracted separately. Moreover, in this embodiment, features may be extracted only from training coding units whose coded depth is 0, and the trained prediction model likewise predicts only coding units to be processed whose coded depth is zero.
  • For training, an SVM model can be selected and trained offline using third-party open-source software.
  • The standardization of the training samples in this step unifies the data format and improves prediction accuracy.
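The offline sample preparation described above can be sketched as follows; the `(features, was_divided)` record format and the zero-mean/unit-variance standardization are assumptions, since the excerpt names the labels but not the exact normalization used:

```python
# Label each training coding unit +1 if it was actually depth-divided during real
# encoding, -1 otherwise, then standardize features column-wise before SVM training.
def make_samples(records):
    # records: list of (feature_vector, was_divided) pairs collected offline
    return [(vec, 1 if divided else -1) for vec, divided in records]

def standardize(vectors):
    n, dim = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dim)]
    stds = [max((sum((v[i] - means[i]) ** 2 for v in vectors) / n) ** 0.5, 1e-9)
            for i in range(dim)]
    return [[(v[i] - means[i]) / stds[i] for i in range(dim)] for v in vectors]
```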
  • another coding unit depth determining method is introduced. As shown in FIG. 6, the method includes:
  • Step S300 determining a residual coefficient of a current optimal mode of the coding unit to be processed
  • Step S310: when the residual coefficient is not zero, determine whether the coded depth of the coding unit to be processed is zero; if so, perform step S320; if not, perform step S340.
  • If the coded depth of the coding unit to be processed is zero, the coding unit to be processed is the largest coding unit LCU, that is, the coding tree unit CTU has not been divided.
  • In that case, the following operation of using the prediction model to predict whether the coding unit to be processed needs depth division is performed.
  • If the coded depth is not zero, another method is used to predict the coded depth.
  • Step S320: acquire coding information features of the set type from the coding unit to be processed and from the neighboring coding tree units of the coding tree unit in which it is located, composing a prediction feature vector sample.
  • Step S330: input the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the model, where the prediction result indicates whether the coding unit to be processed needs depth division.
  • The prediction model is pre-trained using training samples labeled with classification results, and the training samples include coding information features of the set type.
  • Step S340: determine, among the neighboring coding tree units of the coding tree unit in which the coding unit to be processed is located, the average cost of coded coding units having the same coded depth as the coding unit to be processed, as the first average cost.
  • Step S350: determine the average cost of coded coding units of the same coded depth within the coding tree unit in which the coding unit to be processed is located, as the second average cost.
  • Step S360: determine, according to the first average cost and the second average cost, whether the coding unit to be processed needs depth division.
  • Compared with the foregoing embodiments, this embodiment adds a process of predicting the coding depth of the to-be-processed coding unit when its coding depth is determined not to be zero: whether the to-be-processed coding unit needs depth division is predicted from the average cost of coding units of the same coding depth in its own coding tree unit and in the neighboring coding tree units. Because the pixel distributions of neighboring coding tree units within one video frame do not differ greatly, this prediction based on already-coded neighboring coding trees is quite accurate,
  • and the depth division and the rate-distortion cost calculation and comparison for the to-be-processed coding unit become unnecessary.
  • Compared with the prior art, the coding prediction time is greatly reduced, computing resources are saved, and computational complexity is lowered.
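The decision flow this embodiment describes — skip blocks end division immediately, depth-0 units go to the trained model, deeper units use the cost-threshold test — can be sketched as follows. This is a minimal illustration: all function and field names are ours, not the patent's, and the model is passed in as a black-box callable.

```python
def needs_depth_split(cu, model, first_avg_cost=None, second_avg_cost=None):
    """Decide whether a coding unit needs further depth division.

    cu: dict with "residual_coeff", "depth", and either "features"
    (depth 0) or "best_mode_cost" (depth > 0). Illustrative only.
    """
    if cu["residual_coeff"] == 0:
        # Zero residual: skip block, CU division ends here.
        return False
    if cu["depth"] == 0:
        # Depth 0: the machine-learned prediction model decides.
        return bool(model(cu["features"]))
    # Depth > 0: weighted cost threshold (4:3 weighting from the text).
    threshold = (first_avg_cost * 4 + second_avg_cost * 3) / 7
    return cu["best_mode_cost"] >= threshold
```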
  • The implementation of step S340 is described below with reference to FIG. 7. The process may include:
  • Step S400 Determine, from each neighboring coding tree unit of the coding tree unit where the coding unit to be processed is located, an average cost of coding units having the same coding depth as the coding unit to be processed;
  • Step S410 Determine, according to an azimuth relationship between each of the neighboring coding tree units and a coding tree unit where the coding unit to be processed is located, a weight value of each of the neighboring coding tree units;
  • For ease of description, the coding tree unit where the to-be-processed coding unit is located is denoted the Current CTU.
  • The neighboring coding tree units of the Current CTU may include: the left neighbor Left CTU, the upper-left neighbor AboveLeft CTU, the upper neighbor Above CTU, and the upper-right neighbor AboveRight CTU.
  • Figure 8 illustrates the various neighboring coding tree units of the Current CTU.
  • In one optional correspondence, the weight ratio of the neighbor CTUs is: Left CTU : Above CTU : AboveLeft CTU : AboveRight CTU = 2:2:1:1.
  • Step S420 Determine, according to a weight value of each of the neighboring coding tree units and an average cost thereof, a weighted average cost of each of the neighboring coding tree units as a first average cost.
  • Specifically, the average cost of each neighboring coding tree unit is multiplied by its corresponding weight value, and the products are summed to obtain the weighted average cost, which serves as the first average cost.
  • Taking the situation illustrated in FIG. 8 as an example, the determination of the first average cost is explained. Assume the coding depth of the to-be-processed coding unit is 1.
  • As shown in FIG. 8, the Left CTU includes 4 CU32*32s with coding depth 1,
  • the AboveLeft CTU includes 3 CU32*32s with coding depth 1,
  • the Above CTU includes 0 CU32*32s with coding depth 1,
  • and the AboveRight CTU includes 2 CU32*32s with coding depth 1.
  • The positions of the four possible depth-1 CU32*32s within a CTU are marked 0, 1, 2, and 3 in clockwise order starting from the upper-left corner.
  • left_depth1_cost = left_depth1_cost0 + left_depth1_cost1 + left_depth1_cost2 + left_depth1_cost3;
  • aboveleft_depth1_cost = aboveleft_depth1_cost0 + aboveleft_depth1_cost2 + aboveleft_depth1_cost3;
  • aboveright_depth1_cost = aboveright_depth1_cost1 + aboveright_depth1_cost2;
  • Taking the first formula as an example:
  • left_depth1_cost represents the average cost of the CUs with coding depth 1 in the left neighbor CTU,
  • and left_depth1_cost0 represents the cost of the CU whose position marker is 0 among the depth-1 CUs of the left neighbor CTU.
  • The weighted average cost of the depth-1 CUs across all neighbor CTUs is:
  • Avg_depth1_cost = (left_depth1_cost*2 + aboveleft_depth1_cost*1 + aboveright_depth1_cost*1) / (left_depth1_num*2 + aboveleft_depth1_num*1 + aboveright_depth1_num*1)
  • where left_depth1_num, aboveleft_depth1_num, and aboveright_depth1_num respectively represent the number of CUs with coding depth 1 in the left, upper-left, and upper-right neighbor CTUs. The Above CTU contains no depth-1 CUs in this example and therefore does not appear in the formula; the computation for coding depths 2 and 3 is the same.
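As a concrete numeric sketch of the weighted average above (weights Left:Above:AboveLeft:AboveRight = 2:2:1:1; the per-CTU costs and counts below are invented for illustration):

```python
# Each entry: summed cost of its depth-1 CUs, number of depth-1 CUs, weight.
neighbours = {
    "left":        {"cost": 400.0, "num": 4, "weight": 2},
    "above":       {"cost":   0.0, "num": 0, "weight": 2},  # no depth-1 CUs
    "above_left":  {"cost": 330.0, "num": 3, "weight": 1},
    "above_right": {"cost": 210.0, "num": 2, "weight": 1},
}

def weighted_avg_cost(ctus):
    """First average cost: weighted costs divided by weighted CU counts."""
    num = sum(c["cost"] * c["weight"] for c in ctus.values())
    den = sum(c["num"] * c["weight"] for c in ctus.values())
    return num / den

avg_depth1_cost = weighted_avg_cost(neighbours)  # (800+330+210)/(8+3+2)
```

The Above CTU contributes zero to both numerator and denominator here, matching its absence from the formula in the text.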
  • Next, the implementation of step S360 — determining, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division — is described with reference to FIG. 9. The process may include:
  • Step S500: Determine a cost threshold according to the first average cost and the second average cost;
  • Specifically, different weight values may be set for the first average cost and the second average cost; the two costs are then weighted and added, and the result may be used as the cost threshold.
  • Optionally, since the neighbor CTUs have all been fully coded, the weight value of the first average cost may be set greater than that of the second average cost.
  • Step S510: Determine whether the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold; if yes, execute step S520; if not, execute step S530;
  • Step S520: Determine that the to-be-processed coding unit does not need depth division;
  • Step S530: Determine that the to-be-processed coding unit needs depth division.
  • Specifically, if the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold, the present application considers that the to-be-processed coding unit no longer needs depth division; otherwise, the to-be-processed coding unit still needs depth division.
  • Still taking coding depth 1 for the to-be-processed coding unit, the explanation continues with the example of FIG. 8:
  • Define the average cost of the already-coded depth-1 coding units in the coding tree unit where the to-be-processed coding unit is located as Avg_curr_CU_depth1; that is, the second average cost is Avg_curr_CU_depth1.
  • Set the weight-value ratio of the first average cost to the second average cost to 4:3. The cost threshold is then expressed as:
  • Threshold_depth1 = (Avg_depth1_cost*4 + Avg_curr_CU_depth1*3)/(3+4)
  • Define the cost of the current optimal mode of the to-be-processed coding unit as curr_cost_depth1. If it is determined that curr_cost_depth1 < Threshold_depth1, the to-be-processed coding unit is considered not to need depth division; otherwise, depth division is required.
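The threshold test can be sketched as follows (4:3 weighting as in the text; the numeric values in the example are illustrative only):

```python
def threshold_depth1(avg_neighbour_cost, avg_curr_cost, w_first=4, w_second=3):
    """Cost threshold: weighted mean of the first and second average costs."""
    return (avg_neighbour_cost * w_first + avg_curr_cost * w_second) / (w_first + w_second)

def needs_split(curr_cost, avg_neighbour_cost, avg_curr_cost):
    """True when the current best-mode cost is not below the threshold."""
    return curr_cost >= threshold_depth1(avg_neighbour_cost, avg_curr_cost)
```

For example, with Avg_depth1_cost = 100 and Avg_curr_CU_depth1 = 79 the threshold is (400 + 237)/7 = 91, so a current-mode cost of 90 stops the division while 95 continues it.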
  • Experimental verification of the method provided in this application against the prior art shows that, compared with the existing full-traversal method, the coding speed of the method of the present application is improved by 94% while the compression ratio decreases by 3.1%.
  • A small loss in compression ratio is thus traded for a large gain in coding speed, so the coding speed of the video encoder is greatly accelerated and the computational complexity is greatly reduced.
  • the coding unit depth determining apparatus provided in the embodiment of the present application is described below, and the coding unit depth determining apparatus described below and the coding unit depth determining method described above may refer to each other.
  • FIG. 10 is a schematic structural diagram of a coding unit depth determining apparatus according to an embodiment of the present application.
  • the device includes:
  • a residual coefficient determining unit 11 configured to determine a residual coefficient of a current optimal mode of the coding unit to be processed
  • the feature acquiring unit 12 is configured to: when the residual coefficient is not zero, obtain coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, and form a prediction feature vector sample;
  • the model prediction unit 13 is configured to input the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
  • the prediction model is pre-trained by using training samples marked with classification results, and the training samples include coding information features of the set type.
  • With the coding unit depth determining apparatus provided in this embodiment, a prediction model is trained in advance with training samples marked with classification results, each training sample including coding information features of the set types; the residual coefficient of the current optimal mode of the to-be-processed coding unit is then determined.
  • When the residual coefficient is not zero, the to-be-processed coding unit is not a skip coding unit and coding depth prediction is needed; the coding information features of the set types are then obtained from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit,
  • formed into a prediction feature vector sample and input into the prediction model, and the machine-learning prediction model predicts whether the to-be-processed coding unit needs depth division.
  • When the prediction result indicates that the to-be-processed coding unit does not need depth division,
  • the depth division and the rate-distortion cost calculation and comparison for the to-be-processed coding unit can be skipped; compared with the prior art, the coding prediction time is greatly reduced,
  • computing resources are saved, and computational complexity is lowered.
  • Optionally, the residual coefficient determining unit may be specifically configured to determine the residual coefficient of the current optimal mode of a to-be-processed coding unit of a non-I-frame video image.
  • the apparatus of the present application may further include:
  • a coding depth determining unit configured to determine whether the coding depth of the to-be-processed coding unit is zero;
  • on this basis, the feature acquiring unit is specifically configured to: when the determination result of the coding depth determining unit is yes, extract the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  • the apparatus of the present application may further include:
  • a neighbor average cost determining unit configured to: when it is determined that the coding depth of the to-be-processed coding unit is not zero, determine, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit,
  • as the first average cost;
  • a self average cost determining unit configured to determine the average cost of the already-coded coding units of the same coding depth in the coding tree unit where the to-be-processed coding unit is located, as the second average cost;
  • the depth division determining unit is configured to determine, according to the first average cost and the second average cost, whether the coding unit to be processed needs to perform depth division.
  • Optionally, the prediction model may include a P-frame prediction model and a B-frame prediction model, where the training samples used in pre-training the P-frame prediction model are the coding information features of the set types extracted from coding units belonging to P-frame video images,
  • and the training samples used in pre-training the B-frame prediction model are the coding information features of the set types extracted from coding units belonging to B-frame video images.
  • On this basis, the model prediction unit may include:
  • a frame type determining unit configured to determine whether the type of the video frame image to which the to-be-processed coding unit belongs is a P frame or a B frame;
  • a P-frame model prediction unit configured to: when the frame type determining unit determines a P frame, input the prediction feature vector sample into the P-frame prediction model to obtain the prediction result output by the P-frame prediction model;
  • a B-frame model prediction unit configured to: when the frame type determining unit determines a B frame, input the prediction feature vector sample into the B-frame prediction model to obtain the prediction result output by the B-frame prediction model.
  • Optionally, the feature acquiring unit may include:
  • a first feature acquiring unit configured to acquire the cost, quantization coefficient, distortion, and variance of the to-be-processed coding unit;
  • a second feature acquiring unit configured to acquire the cost and depth information of the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  • Optionally, the neighbor average cost determining unit may include:
  • a first neighbor average cost determining subunit configured to determine, from each neighboring coding tree unit of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit;
  • a second neighbor average cost determining subunit configured to determine the weight value of each neighboring coding tree unit according to the orientation relationship between that neighboring coding tree unit and the coding tree unit where the to-be-processed coding unit is located;
  • a third neighbor average cost determining subunit configured to determine, according to the weight value and average cost of each neighboring coding tree unit, the weighted average cost of the neighboring coding tree units, as the first average cost.
  • Optionally, the depth division determining unit may include:
  • a cost threshold determining unit configured to determine a cost threshold according to the first average cost and the second average cost;
  • a cost threshold comparison unit configured to determine whether the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold; if yes, determine that the to-be-processed coding unit does not need depth division; otherwise, determine that the to-be-processed coding unit needs depth division.
  • the embodiment of the present application further discloses a video encoder, where the video encoder includes the foregoing coding unit depth determining apparatus.
  • the video encoder may also include the prediction model described above. Compared with the existing video encoder, the video encoder disclosed in the present application greatly increases the encoding speed and greatly reduces the computational complexity.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application discloses a coding unit depth determining method and apparatus. A prediction model is trained in advance with training samples marked with classification results, each training sample including coding information features of set types. When the residual coefficient of the current optimal mode of a to-be-processed coding unit is determined not to be zero, coding depth prediction is judged necessary: coding information features of the set types are obtained from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where it is located, formed into a prediction feature vector sample, and input into the prediction model, and the machine-learning prediction model predicts whether the to-be-processed coding unit needs depth division. When the prediction result indicates that the to-be-processed coding unit does not need depth division, the depth division and the rate-distortion cost calculation and comparison can be skipped; compared with the prior art, the coding prediction time is greatly reduced and computing resources are saved.

Description

Coding unit depth determining method and apparatus
This application claims priority to Chinese Patent Application No. 2017102667988, entitled "Coding unit depth determining method and apparatus", filed with the Chinese Patent Office on April 21, 2017, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the technical field of video coding, and more specifically, to a coding unit depth determining method and apparatus.
Background
In recent years, digital video has become the dominant media content in many consumer applications, and the demand for higher resolution and better video quality keeps growing. To meet this demand, the new-generation international video coding standard HEVC (High Efficiency Video Coding) was formulated. Compared with the H.264/AVC standard, HEVC achieves higher coding compression performance.
The coding process of the HEVC standard is introduced with reference to FIG. 1. A frame of the original video sequence, together with buffered reference frames, undergoes intra-frame or inter-frame prediction to obtain a prediction value. The prediction value is subtracted from the input video frame to obtain a residual; the residual is subjected to the DCT (Discrete Cosine Transform) and quantization to obtain residual coefficients, which are then sent to the entropy coding module for coding, and the video bitstream is output. Meanwhile, the residual coefficients undergo inverse quantization and inverse transform to obtain the residual values of the reconstructed image; these are added to the intra- or inter-frame prediction values to obtain the reconstructed image, which, after deblocking filtering and loop filtering, yields the reconstructed frame. The reconstructed frame serves as a reference frame for the next input frame and is added to the reference frame sequence.
In the HEVC standard, an input video frame is divided into a series of coding tree units (CTUs). During intra- or inter-frame prediction, each CTU starts from the largest coding unit (LCU) and is divided layer by layer, in quadtree form, into coding units (CUs) of different sizes. The layer of depth 0 is the LCU, generally 64*64 in size; the layers of depths 1-3 are 32*32, 16*16, and 8*8 respectively. To achieve the best coding performance, existing HEVC uses a full-traversal approach when selecting the best mode during CU depth division: the rate-distortion cost is computed for all modes at all CU depths and compared layer by layer, and the mode with the minimum rate-distortion cost is selected. FIG. 2 illustrates a best-mode CU division: the left diagram shows the specific division, the right diagram shows the corresponding quadtree, and the leaf nodes of the quadtree indicate, in the division order shown by the arrows in the left diagram, whether each of the four CU blocks at each layer needs further division, where 1 means division is needed and 0 means it is not.
As can be seen from FIG. 2, some CU blocks find the optimal mode after only one layer of division and need not be divided further with the rate-distortion cost computed and compared; for example, the 2nd CU block in layer 1 of the quadtree in FIG. 2 has node value 0, indicating that no further division is needed. Clearly, under the existing full-traversal algorithm, the coding prediction process is extremely time-consuming and consumes a great deal of computing resources.
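The depth-to-size relationship described above (the depth-0 LCU is 64*64 and each quadtree level halves the side length) can be expressed as a one-line helper; the function name is ours, for illustration only:

```python
def cu_size(depth, lcu_size=64):
    """Side length of a CU at the given quadtree depth (depth 0 = LCU)."""
    return lcu_size >> depth

sizes = [cu_size(d) for d in range(4)]  # depths 0-3
```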
Summary
In view of this, the present application provides a coding unit depth determining method and apparatus, to solve the problems of long coding prediction time and heavy consumption of computing resources in the existing full-traversal method for determining coding unit depth.
A first aspect of the present application provides a coding unit depth determining method, including:
determining a residual coefficient of a current optimal mode of a to-be-processed coding unit;
when the residual coefficient is not zero, obtaining coding information features of set types from the to-be-processed coding unit and from neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
inputting the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
wherein the prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
A second aspect of the present application further provides a coding unit depth determining apparatus, including:
a residual coefficient determining unit configured to determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
a feature acquiring unit configured to: when the residual coefficient is not zero, obtain coding information features of set types from the to-be-processed coding unit and from neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
a model prediction unit configured to input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
wherein the prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
A third aspect of the embodiments of the present application further provides a computer-readable storage medium storing program instructions, a processor performing any one of the above methods when executing the stored program instructions.
With the coding unit depth determining method provided in the embodiments of the present application, a prediction model is trained in advance with training samples marked with classification results, each training sample including coding information features of the set types. When the residual coefficient of the current optimal mode of the to-be-processed coding unit is determined not to be zero, the to-be-processed coding unit is not a skip coding unit and coding depth prediction is needed; the coding information features of the set types are then obtained from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit, formed into a prediction feature vector sample, and input into the prediction model, and the machine-learning prediction model predicts whether the to-be-processed coding unit needs depth division. When the prediction result indicates that the to-be-processed coding unit does not need depth division, the depth division and the rate-distortion cost calculation and comparison can be skipped; compared with the prior art, the coding prediction time is greatly reduced, computing resources are saved, and computational complexity is lowered.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art may derive other drawings from them.
FIG. 1 is a schematic diagram of the HEVC coding framework;
FIG. 2 illustrates a CU division of a best mode;
FIG. 3 is a schematic diagram of a server hardware structure disclosed in an embodiment of the present application;
FIG. 4 is a flowchart of a coding unit depth determining method disclosed in an embodiment of the present application;
FIG. 5 is a flowchart of another coding unit depth determining method disclosed in an embodiment of the present application;
FIG. 6 is a flowchart of yet another coding unit depth determining method disclosed in an embodiment of the present application;
FIG. 7 is a flowchart of a first-average-cost determining method disclosed in an embodiment of the present application;
FIG. 8 illustrates the CU division of each neighboring coding tree unit of a Current CTU;
FIG. 9 is a flowchart of a method for determining whether a to-be-processed coding unit needs depth division, disclosed in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a coding unit depth determining apparatus disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
An embodiment of the present application provides a coding unit depth determining solution, which may be applied to a video encoder implemented on a server. The server hardware may be a processing device such as a computer or a notebook. Before the coding unit depth determining method of the present application is introduced, the server hardware structure is first introduced. As shown in FIG. 3, the server may include:
a processor 1, a communication interface 2, a memory 3, a communication bus 4, and a display screen 5;
where the processor 1, the communication interface 2, the memory 3, and the display screen 5 communicate with one another through the communication bus 4.
Next, the coding unit depth determining method of the present application is introduced in combination with the server hardware structure. As shown in FIG. 4, the method includes:
Step S100: Determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
Specifically, for the to-be-processed coding unit, a candidate mv (motion vector) list is constructed according to the standard protocol. Each mv in the list is traversed and motion-compensated to obtain a prediction value; the sum of squared differences (SSD) between the prediction value and the original pixels of the to-be-processed coding unit is computed, the number of bits coded for the index of the corresponding mv is estimated, and the mv with the minimum rate-distortion cost rdcost is found, which is the optimal-mode mv, where:
rdcost = SSD + λ*bits (λ is a constant);
Further, the SSD computation result corresponding to the optimal mv is transformed and quantized to obtain the residual coefficient. A residual coefficient of 0 indicates that the to-be-processed coding unit is a skip block; otherwise it is a merge block.
It can be understood that if the residual coefficient is zero, the to-be-processed coding unit is a skip block and the CU division can be ended directly; otherwise, the to-be-processed coding unit CU needs division prediction.
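The best-mode search of step S100 (rdcost = SSD + λ*bits, minimized over the candidate mv list) can be sketched as follows; the candidate list, pixel values, and bit counts are placeholders for illustration, not the codec's actual values:

```python
def ssd(pred, orig):
    """Sum of squared differences between predicted and original pixels."""
    return sum((p - o) ** 2 for p, o in zip(pred, orig))

def best_mode(candidates, orig, lam=10.0):
    """Return the candidate minimizing rdcost = SSD + lambda * bits."""
    return min(candidates, key=lambda c: ssd(c["pred"], orig) + lam * c["bits"])

orig_pixels = [4, 4, 4, 4]
candidates = [
    {"mv": (0, 0), "pred": [4, 4, 4, 5], "bits": 2},  # SSD=1, rdcost=21
    {"mv": (1, 0), "pred": [4, 4, 4, 4], "bits": 4},  # SSD=0, rdcost=40
]
optimal = best_mode(candidates, orig_pixels)
```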
In specific implementation, the to-be-processed video frame image may be stored in the memory 3 in advance through the communication interface 2. During coding, the processor 1 obtains the stored to-be-processed video frame image from the memory through the communication bus 4, divides it into multiple coding units, determines the to-be-processed coding unit among them, and determines the residual coefficient of the current optimal mode of the to-be-processed coding unit.
Optionally, the communication interface 2 may be an interface of a communication module, such as an interface of a GSM module.
Optionally, the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Step S110: When the residual coefficient is not zero, obtain coding information features of set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
Specifically, the types of coding information features obtained in this step are the same as the types of the training samples used in training the prediction model. The present application may pre-configure a feature template for each type, and then obtain the coding information features from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit according to the templates; the obtained features form the prediction feature vector sample. The objects from which features are obtained are: the to-be-processed coding unit CU, and the neighboring coding tree units of the coding tree unit CTU where the to-be-processed coding unit CU is located.
In specific implementation, the feature templates of each type may be stored in the memory 3 in advance, and the processor 1 may then obtain the coding information features from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit according to the templates, forming the prediction feature vector sample.
Step S120: Input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division.
The prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
In specific implementation, the prediction model may be pre-stored in the memory 3. During prediction, the processor 1 inputs the prediction feature vector sample into the pre-trained prediction model, obtains the prediction result output by the prediction model, and outputs it for display on the display screen 5.
The prediction model may be an SVM (Support Vector Machine) model, a neural-network machine-learning model, or the like.
With the coding unit depth determining method provided in this embodiment, a prediction model is trained in advance with training samples marked with classification results, each sample including coding information features of the set types. When the residual coefficient of the current optimal mode of the to-be-processed coding unit is determined not to be zero, the to-be-processed coding unit is not a skip coding unit and coding depth prediction is needed; the coding information features of the set types are then obtained from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit, formed into a prediction feature vector sample, and input into the prediction model, and the machine-learning prediction model predicts whether the to-be-processed coding unit needs depth division. When the prediction result indicates that depth division is not needed, the depth division and the rate-distortion cost calculation and comparison can be skipped; compared with the prior art, the coding prediction time is greatly reduced, computing resources are saved, and computational complexity is lowered.
Optionally, since I frames account for only a small proportion of the whole coding process, the present application may perform depth determination only for to-be-processed coding units belonging to non-I-frame video images; that is, the above to-be-processed coding unit belongs to a non-I-frame video image.
In another embodiment of the present application, another coding unit depth determining method is introduced. As shown in FIG. 5, the method includes:
Step S200: Determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
Step S210: When the residual coefficient is not zero, determine whether the coding depth of the to-be-processed coding unit is zero; if yes, execute step S220;
Specifically, if the coding depth of the to-be-processed coding unit is zero, the to-be-processed coding unit is the largest coding unit LCU; that is, the coding tree unit CTU has not been divided.
In this embodiment, the following operation of predicting, with the prediction model, whether the to-be-processed coding unit needs depth division is executed only when the coding depth of the to-be-processed coding unit is determined to be zero.
It should be noted that, for a to-be-processed coding unit whose coding depth is not zero, the computation involved in predicting with the prediction model is also relatively complex; the present application may predict in other ways, as described in detail later in this specification.
Step S220: Obtain coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
Step S230: Input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division.
The prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
Compared with the previous embodiment, this embodiment adds a judgment condition for coding depth prediction with the prediction model: the model prediction process is executed only when the coding depth of the to-be-processed coding unit is determined to be zero. For a to-be-processed coding unit whose coding depth is not zero, the present application may predict in other ways, as described in detail later in this specification.
In another embodiment of the present application, the prediction model is introduced.
Since the error accumulation periods of B frames and P frames in a video bitstream differ, to make the prediction results of the prediction model more accurate, the present application may set the prediction model to include a P-frame prediction model and a B-frame prediction model,
where:
the training samples used in pre-training the P-frame prediction model are coding information features of the set types extracted from coding units belonging to P-frame video images;
the training samples used in pre-training the B-frame prediction model are coding information features of the set types extracted from coding units belonging to B-frame video images.
In the above step S230, the process of inputting the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the prediction model may be implemented by the following steps:
S1. Determine whether the type of the video frame image to which the to-be-processed coding unit belongs is a P frame or a B frame;
S2. If it is a P frame, input the prediction feature vector sample into the P-frame prediction model to obtain the prediction result output by the P-frame prediction model;
S3. If it is a B frame, input the prediction feature vector sample into the B-frame prediction model to obtain the prediction result output by the B-frame prediction model.
By predicting with different prediction models for to-be-processed coding units of B-frame and P-frame video images, the present application improves the accuracy of the prediction results.
Next, the building of the prediction model of the present application is introduced.
1. Training feature acquisition
First, the training samples used in training the prediction model are introduced. Defining the to-be-processed coding unit as the current coding unit, and the neighboring coding tree units of the coding tree unit where it is located as the neighboring coding tree units of the current coding unit, the set types of coding information features used in training the prediction model in the present application may include:
1. The cost of the current coding unit (curr_merge_rdcost)
2. The distortion of the current coding unit (curr_merge_distortion)
3. The quantization coefficient of the current coding unit (curr_qp)
4. The variance of the current coding unit (curr_var)
5. The cost of the neighboring coding tree units of the current coding unit (around_rdcost)
6. The depth information of the neighboring coding tree units of the current coding unit (around_depth).
The neighboring coding tree units of the current coding unit may be the upper and left neighboring coding tree units of the coding tree unit where the current coding unit is located; coding information feature 5 may then specifically include:
51. The cost of the left neighboring coding tree unit of the current coding unit (left_rdcost)
52. The cost of the upper neighboring coding tree unit of the current coding unit (above_rdcost)
Coding information feature 6 may specifically include:
61. The depth information of the left neighboring coding tree unit of the current coding unit (left_depth)
62. The depth information of the upper neighboring coding tree unit of the current coding unit (above_depth).
It should be noted that the types of coding information features used in training the prediction model must be consistent with the types obtained when performing model prediction for the to-be-processed coding unit.
On this basis, the present application may select video bitstream sequences of different scenes, extract the above types of coding information features offline for the training coding units included in the sequences, and record whether each training coding unit underwent depth division in the actual coding process; if yes, the classification result of the training coding unit is marked with a first marker value, otherwise with a second marker value. The first marker value may be 1 and the second marker value may be -1.
The coding information features of each type obtained for a training coding unit form a training feature vector, and the training feature vector together with the classification result of the training coding unit forms a training sample.
It should be noted that the B-frame prediction model and the P-frame prediction model are trained separately, so the coding information features of B frames and P frames are also extracted separately. Moreover, this embodiment may extract only training coding units whose coding depth is 0, and the trained prediction model likewise predicts only for to-be-processed coding units whose coding depth is zero.
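Assembling one training sample as described — a fixed-order feature vector plus a ±1 label — can be sketched as follows; the feature names follow the list above, while the numeric values are invented for illustration:

```python
FEATURES = ["curr_merge_rdcost", "curr_merge_distortion", "curr_qp", "curr_var",
            "left_rdcost", "above_rdcost", "left_depth", "above_depth"]

def make_sample(info, was_split):
    """Feature vector in FEATURES order, labelled 1 (split) or -1 (no split)."""
    return [float(info[name]) for name in FEATURES], 1 if was_split else -1

info = {"curr_merge_rdcost": 1200, "curr_merge_distortion": 900, "curr_qp": 32,
        "curr_var": 45, "left_rdcost": 1100, "above_rdcost": 1300,
        "left_depth": 1, "above_depth": 2}
sample, label = make_sample(info, was_split=True)
```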
2. Model training
This embodiment may use SVM model training, with third-party open-source software, trained offline.
S1. Assemble the training samples. Training samples whose classification result is "needs depth division" and samples whose result is "does not need depth division" are obtained in a 1:1 ratio and interleaved to form the whole training sample set.
S2. Standardize the training samples. The assembled training samples are standardized by mapping them into the interval [-1, 1].
The standardization in this step unifies the data format and can improve prediction accuracy.
S3. Train the model. The third-party open-source software is invoked with an RBF kernel; the training samples belonging to B frames and those belonging to P frames are trained separately, finally yielding a B-frame prediction model and a P-frame prediction model, denoted mode_B_cu64*64 and mode_P_cu64*64.
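Step S2's mapping into [-1, 1] can be sketched as a per-column min-max scaling; the subsequent RBF-kernel fit is only indicated in a comment, since the patent does not name the third-party software (mentioning scikit-learn there is our assumption, not the patent's):

```python
def scale_to_unit(column):
    """Linearly map one feature column onto [-1, 1] (min-max normalisation)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: centre it at 0
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]

# After scaling every column, an RBF-kernel SVM would be fit offline, e.g.
# sklearn.svm.SVC(kernel="rbf").fit(X_scaled, y), once on B-frame samples
# and once on P-frame samples, yielding mode_B_cu64*64 and mode_P_cu64*64.
qp_column = [22, 27, 32, 37]
scaled = scale_to_unit(qp_column)
```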
In yet another embodiment of the present application, yet another coding unit depth determining method is introduced. As shown in FIG. 6, the method includes:
Step S300: Determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
Step S310: When the residual coefficient is not zero, determine whether the coding depth of the to-be-processed coding unit is zero; if yes, execute step S320; if not, execute step S340;
Specifically, if the coding depth of the to-be-processed coding unit is zero, the to-be-processed coding unit is the largest coding unit LCU; that is, the coding tree unit CTU has not been divided.
In this embodiment, when the coding depth of the to-be-processed coding unit is determined to be zero, the following operation of predicting with the prediction model whether it needs depth division is executed. When its coding depth is determined not to be zero, another method is used for predicting the coding depth.
Step S320: Obtain coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
Step S330: Input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
The prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
The above steps S300-S330 correspond one-to-one to steps S200-S230 in the foregoing embodiment and are not repeated here.
Step S340: Determine, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit, as the first average cost;
Step S350: Determine the average cost of the already-coded coding units of the same coding depth in the coding tree unit where the to-be-processed coding unit is located, as the second average cost;
Step S360: Determine, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division.
Compared with the foregoing embodiments, this embodiment adds a process of predicting the coding depth of the to-be-processed coding unit when its coding depth is determined not to be zero: whether the to-be-processed coding unit needs depth division is predicted from the average cost of coding units of the same coding depth in its own coding tree unit and in the neighboring coding tree units. Because the pixel distributions of neighboring coding tree units within one video frame do not differ greatly, whether the to-be-processed coding unit needs depth division can be predicted based on the average cost of same-depth coding units in the already-coded neighboring coding trees; the accuracy of the prediction result is relatively high, and the depth division and the rate-distortion cost calculation and comparison for the to-be-processed coding unit are unnecessary. Compared with the prior art, the coding prediction time is greatly reduced, computing resources are saved, and computational complexity is lowered.
Further, the implementation of the above step S340 is introduced; see FIG. 7 for details. The process may include:
Step S400: Determine, from each neighboring coding tree unit of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit;
Step S410: Determine the weight value of each neighboring coding tree unit according to the orientation relationship between that neighboring coding tree unit and the coding tree unit where the to-be-processed coding unit is located;
Specifically, for ease of description, the coding tree unit where the to-be-processed coding unit is located is defined as the Current CTU; the neighboring coding tree units of the Current CTU may include: the left neighbor Left CTU, the upper-left neighbor AboveLeft CTU, the upper neighbor Above CTU, and the upper-right neighbor AboveRight CTU.
FIG. 8 illustrates the various neighboring coding tree units of the Current CTU.
It can be understood that the orientation relationships of the Current CTU with the various neighbor CTUs differ, so the weight values of the neighbor CTUs also differ.
In one optional correspondence, the weight ratio of the neighbor CTUs is:
Left CTU : Above CTU : AboveLeft CTU : AboveRight CTU = 2:2:1:1.
Step S420: Determine, according to the weight value and average cost of each neighboring coding tree unit, the weighted average cost of the neighboring coding tree units, as the first average cost.
Specifically, the average cost of each neighboring coding tree unit is multiplied by its corresponding weight value, and the products are summed to obtain the weighted average cost, which serves as the first average cost.
Taking the situation illustrated in FIG. 8 as an example, the determination of the first average cost is explained:
Assume the coding depth of the to-be-processed coding unit is 1. As can be seen from FIG. 8, the Left CTU includes 4 CU32*32s of coding depth 1, the AboveLeft CTU includes 3 CU32*32s of coding depth 1, the Above CTU includes 0 CU32*32s of coding depth 1, and the AboveRight CTU includes 2 CU32*32s of coding depth 1.
Define the position markers of the four depth-1 CU32*32s in a CTU as 0, 1, 2, and 3 in clockwise order starting from the upper-left corner.
Then, with reference to FIG. 8:
left_depth1_cost=left_depth1_cost0+left_depth1_cost1+left_depth1_cost2+left_depth1_cost3;
aboveleft_depth1_cost=aboveleft_depth1_cost0+aboveleft_depth1_cost2+aboveleft_depth1_cost3;
aboveright_depth1_cost=aboveright_depth1_cost1+aboveright_depth1_cost2;
Taking the first formula as an example: left_depth1_cost represents the average cost of the CUs of coding depth 1 in the left neighbor CTU, and left_depth1_cost0 represents the cost of the CU whose position marker is 0 among the depth-1 CUs of the left neighbor CTU.
Further, the weighted average cost of the depth-1 CUs in all neighbor CTUs is:
Avg_depth1_cost=(left_depth1_cost*2+aboveleft_depth1_cost*1+aboveright_depth1_cost*1)/(left_depth1_num*2+aboveleft_depth1_num*1+aboveright_depth1_num*1)
where left_depth1_num, aboveleft_depth1_num, and aboveright_depth1_num respectively represent the number of CUs of coding depth 1 in the left, upper-left, and upper-right neighbor CTUs.
It can be understood that the above explanation uses coding depth 1 only; the computation for coding depths 2 and 3 is the same.
Still further, for the above step S360, the implementation of determining, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division is introduced; see FIG. 9 for details. The process may include:
Step S500: Determine a cost threshold according to the first average cost and the second average cost;
Specifically, different weight values may be set for the first average cost and the second average cost; the two are then weighted and added, and the result may serve as the cost threshold.
Optionally, since the neighbor CTUs have all been fully coded, the weight value of the first average cost may be set greater than that of the second average cost.
Step S510: Determine whether the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold; if yes, execute step S520; if not, execute step S530;
Step S520: Determine that the to-be-processed coding unit does not need depth division;
Step S530: Determine that the to-be-processed coding unit needs depth division.
Specifically, if the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold, the present application considers that the to-be-processed coding unit no longer needs depth division; otherwise, the to-be-processed coding unit still needs depth division.
Still taking coding depth 1 for the to-be-processed coding unit, the explanation continues with the example of FIG. 8:
Define the average cost of the already-coded coding units of coding depth 1 in the coding tree unit where the to-be-processed coding unit is located as Avg_curr_CU_depth1; that is, the second average cost is Avg_curr_CU_depth1.
Set the weight-value ratio of the first average cost to the second average cost to 4:3. The cost threshold is then expressed as:
Threshold_depth1=(Avg_depth1_cost*4+Avg_curr_CU_depth1*3)/(3+4)
Define the cost of the current optimal mode of the to-be-processed coding unit as curr_cost_depth1; if it is determined that curr_cost_depth1<Threshold_depth1, the to-be-processed coding unit is considered not to need depth division; otherwise, depth division is required.
Experimental verification with the method provided above in the present application and the prior art shows that, compared with the existing full-traversal method, the coding speed of the method of the present application is improved by 94% while the compression ratio decreases by 3.1%. The present application thus trades a small decrease in compression ratio for a large improvement in coding speed, greatly accelerating the coding speed of the video encoder and greatly reducing the computational complexity.
The coding unit depth determining apparatus provided in the embodiments of the present application is described below; the apparatus described below and the method described above may be referred to in correspondence with each other.
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a coding unit depth determining apparatus disclosed in an embodiment of the present application.
As shown in FIG. 10, the apparatus includes:
a residual coefficient determining unit 11 configured to determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
a feature acquiring unit 12 configured to: when the residual coefficient is not zero, obtain coding information features of set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
a model prediction unit 13 configured to input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
wherein the prediction model is pre-trained with training samples marked with classification results, the training samples including the coding information features of the set types.
With the coding unit depth determining apparatus provided in this embodiment, a prediction model is trained in advance with training samples marked with classification results, each sample including coding information features of the set types. When the residual coefficient of the current optimal mode of the to-be-processed coding unit is determined not to be zero, the to-be-processed coding unit is not a skip coding unit and coding depth prediction is needed; the coding information features of the set types are then obtained from the to-be-processed coding unit and the neighboring coding tree units of its coding tree unit, formed into a prediction feature vector sample, and input into the prediction model, and the machine-learning prediction model predicts whether the to-be-processed coding unit needs depth division. When the prediction result indicates that depth division is not needed, the depth division and the rate-distortion cost calculation and comparison can be skipped; compared with the prior art, the coding prediction time is greatly reduced, computing resources are saved, and computational complexity is lowered.
Optionally, the residual coefficient determining unit may be specifically configured to determine the residual coefficient of the current optimal mode of a to-be-processed coding unit of a non-I-frame video image.
Optionally, the apparatus of the present application may further include:
a coding depth determining unit configured to determine whether the coding depth of the to-be-processed coding unit is zero;
on this basis, the feature acquiring unit is specifically configured to: when the determination result of the coding depth determining unit is yes, extract the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
Optionally, the apparatus of the present application may further include:
a neighbor average cost determining unit configured to: when it is determined that the coding depth of the to-be-processed coding unit is not zero, determine, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit, as the first average cost;
a self average cost determining unit configured to determine the average cost of the already-coded coding units of the same coding depth in the coding tree unit where the to-be-processed coding unit is located, as the second average cost;
a depth division determining unit configured to determine, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division.
Optionally, the prediction model may include a P-frame prediction model and a B-frame prediction model, where the training samples used in pre-training the P-frame prediction model are the coding information features of the set types extracted from coding units belonging to P-frame video images, and the training samples used in pre-training the B-frame prediction model are the coding information features of the set types extracted from coding units belonging to B-frame video images. On this basis, the model prediction unit may include:
a frame type determining unit configured to determine whether the type of the video frame image to which the to-be-processed coding unit belongs is a P frame or a B frame;
a P-frame model prediction unit configured to: when the frame type determining unit determines a P frame, input the prediction feature vector sample into the P-frame prediction model to obtain the prediction result output by the P-frame prediction model;
a B-frame model prediction unit configured to: when the frame type determining unit determines a B frame, input the prediction feature vector sample into the B-frame prediction model to obtain the prediction result output by the B-frame prediction model.
Optionally, the feature acquiring unit may include:
a first feature acquiring unit configured to acquire the cost, quantization coefficient, distortion, and variance of the to-be-processed coding unit;
a second feature acquiring unit configured to acquire the cost and depth information of the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
Optionally, the neighbor average cost determining unit may include:
a first neighbor average cost determining subunit configured to determine, from each neighboring coding tree unit of the coding tree unit where the to-be-processed coding unit is located, the average cost of coding units having the same coding depth as the to-be-processed coding unit;
a second neighbor average cost determining subunit configured to determine the weight value of each neighboring coding tree unit according to the orientation relationship between that neighboring coding tree unit and the coding tree unit where the to-be-processed coding unit is located;
a third neighbor average cost determining subunit configured to determine, according to the weight value and average cost of each neighboring coding tree unit, the weighted average cost of the neighboring coding tree units, as the first average cost.
Optionally, the depth division determining unit may include:
a cost threshold determining unit configured to determine a cost threshold according to the first average cost and the second average cost;
a cost threshold comparison unit configured to determine whether the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold; if yes, determine that the to-be-processed coding unit does not need depth division; otherwise, determine that the to-be-processed coding unit needs depth division.
An embodiment of the present application further discloses a video encoder, including the above coding unit depth determining apparatus.
Further, the video encoder may also include the prediction model introduced above. Compared with existing video encoders, the video encoder disclosed in the present application greatly increases the coding speed and greatly reduces the computational complexity.
Finally, it should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to mutually.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

  1. A coding unit depth determining method, comprising:
    determining a residual coefficient of a current optimal mode of a to-be-processed coding unit;
    when the residual coefficient is not zero, obtaining coding information features of set types from the to-be-processed coding unit and from neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
    inputting the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
    wherein the prediction model is pre-trained with training samples marked with classification results, the training samples comprising the coding information features of the set types.
  2. The method according to claim 1, wherein the to-be-processed coding unit belongs to a non-I-frame video image.
  3. The method according to claim 1, wherein before the extracting of the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, the method further comprises:
    determining whether the coding depth of the to-be-processed coding unit is zero, and if yes, performing the step of extracting the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  4. The method according to claim 3, further comprising:
    when it is determined that the coding depth of the to-be-processed coding unit is not zero, determining, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, an average cost of coding units having the same coding depth as the to-be-processed coding unit, as a first average cost;
    determining an average cost of already-coded coding units of the same coding depth in the coding tree unit where the to-be-processed coding unit is located, as a second average cost;
    determining, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division.
  5. The method according to any one of claims 1-4, wherein the prediction model comprises a P-frame prediction model and a B-frame prediction model, the training samples used in pre-training the P-frame prediction model being the coding information features of the set types extracted from coding units belonging to P-frame video images, and the training samples used in pre-training the B-frame prediction model being the coding information features of the set types extracted from coding units belonging to B-frame video images;
    the inputting of the prediction feature vector sample into the pre-trained prediction model to obtain the prediction result output by the prediction model comprises:
    determining whether the type of the video frame image to which the to-be-processed coding unit belongs is a P frame or a B frame;
    if a P frame, inputting the prediction feature vector sample into the P-frame prediction model to obtain the prediction result output by the P-frame prediction model;
    if a B frame, inputting the prediction feature vector sample into the B-frame prediction model to obtain the prediction result output by the B-frame prediction model.
  6. The method according to any one of claims 1-4, wherein the obtaining of the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located comprises:
    acquiring the cost, quantization coefficient, distortion, and variance of the to-be-processed coding unit;
    acquiring the cost and depth information of the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  7. The method according to claim 4, wherein the determining, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, of the average cost of coding units having the same coding depth as the to-be-processed coding unit, as the first average cost, comprises:
    determining, from each neighboring coding tree unit of the coding tree unit where the to-be-processed coding unit is located, an average cost of coding units having the same coding depth as the to-be-processed coding unit;
    determining a weight value of each neighboring coding tree unit according to the orientation relationship between that neighboring coding tree unit and the coding tree unit where the to-be-processed coding unit is located;
    determining, according to the weight value and average cost of each neighboring coding tree unit, a weighted average cost of the neighboring coding tree units, as the first average cost.
  8. The method according to claim 4 or 7, wherein the determining, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division comprises:
    determining a cost threshold according to the first average cost and the second average cost;
    determining whether the cost of the current optimal mode of the to-be-processed coding unit is less than the cost threshold;
    if yes, determining that the to-be-processed coding unit does not need depth division; otherwise, determining that the to-be-processed coding unit needs depth division.
  9. A coding unit depth determining apparatus, comprising:
    a residual coefficient determining unit configured to determine a residual coefficient of a current optimal mode of a to-be-processed coding unit;
    a feature acquiring unit configured to: when the residual coefficient is not zero, obtain coding information features of set types from the to-be-processed coding unit and from neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, to form a prediction feature vector sample;
    a model prediction unit configured to input the prediction feature vector sample into a pre-trained prediction model to obtain a prediction result output by the prediction model, the prediction result indicating whether the to-be-processed coding unit needs depth division;
    wherein the prediction model is pre-trained with training samples marked with classification results, the training samples comprising the coding information features of the set types.
  10. The apparatus according to claim 9, wherein the residual coefficient determining unit is specifically configured to determine the residual coefficient of the current optimal mode of a to-be-processed coding unit of a non-I-frame video image.
  11. The apparatus according to claim 9, further comprising:
    a coding depth determining unit configured to determine whether the coding depth of the to-be-processed coding unit is zero;
    the feature acquiring unit being specifically configured to: when the determination result of the coding depth determining unit is yes, extract the coding information features of the set types from the to-be-processed coding unit and from the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  12. The apparatus according to claim 11, further comprising:
    a neighbor average cost determining unit configured to: when it is determined that the coding depth of the to-be-processed coding unit is not zero, determine, in the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located, an average cost of coding units having the same coding depth as the to-be-processed coding unit, as a first average cost;
    a self average cost determining unit configured to determine an average cost of already-coded coding units of the same coding depth in the coding tree unit where the to-be-processed coding unit is located, as a second average cost;
    a depth division determining unit configured to determine, according to the first average cost and the second average cost, whether the to-be-processed coding unit needs depth division.
  13. The apparatus according to any one of claims 9-12, wherein the prediction model comprises a P-frame prediction model and a B-frame prediction model, the training samples used in pre-training the P-frame prediction model being the coding information features of the set types extracted from coding units belonging to P-frame video images, and the training samples used in pre-training the B-frame prediction model being the coding information features of the set types extracted from coding units belonging to B-frame video images;
    the model prediction unit comprises:
    a frame type determining unit configured to determine whether the type of the video frame image to which the to-be-processed coding unit belongs is a P frame or a B frame;
    a P-frame model prediction unit configured to: when the frame type determining unit determines a P frame, input the prediction feature vector sample into the P-frame prediction model to obtain the prediction result output by the P-frame prediction model;
    a B-frame model prediction unit configured to: when the frame type determining unit determines a B frame, input the prediction feature vector sample into the B-frame prediction model to obtain the prediction result output by the B-frame prediction model.
  14. The apparatus according to any one of claims 9-12, wherein the feature acquiring unit comprises:
    a first feature acquiring unit configured to acquire the cost, quantization coefficient, distortion, and variance of the to-be-processed coding unit;
    a second feature acquiring unit configured to acquire the cost and depth information of the neighboring coding tree units of the coding tree unit where the to-be-processed coding unit is located.
  15. The apparatus according to claim 12, wherein the neighbor average cost determining unit comprises:
    a first neighbor average cost determining subunit configured to determine, from each neighboring coding tree unit of the coding tree unit where the to-be-processed coding unit is located, an average cost of coding units having the same coding depth as the to-be-processed coding unit;
    a second neighbor average cost determining subunit configured to determine a weight value of each neighboring coding tree unit according to the orientation relationship between that neighboring coding tree unit and the coding tree unit where the to-be-processed coding unit is located;
    a third neighbor average cost determining subunit configured to determine, according to the weight value and average cost of each neighboring coding tree unit, a weighted average cost of the neighboring coding tree units, as the first average cost.
  16. A computer-readable storage medium storing program instructions, wherein a processor performs the method according to any one of claims 1 to 8 when executing the stored program instructions.
PCT/CN2017/115175 2017-04-21 2017-12-08 Coding unit depth determining method and apparatus WO2018192235A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020197027603A KR102252816B1 (ko) 2017-04-21 2017-12-08 부호화유닛 심도 확정 방법 및 장치
EP17906258.3A EP3614666A4 (en) 2017-04-21 2017-12-08 METHOD AND DEVICE FOR DETERMINING DEPTH OF CODING UNIT
JP2019527221A JP6843239B2 (ja) 2017-04-21 2017-12-08 符号化ユニットの深さ特定方法及び装置
US16/366,595 US10841583B2 (en) 2017-04-21 2019-03-27 Coding unit depth determining method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710266798.8 2017-04-21
CN201710266798.8A CN108737841B (zh) 2017-04-21 2017-04-21 Coding unit depth determining method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/366,595 Continuation US10841583B2 (en) 2017-04-21 2019-03-27 Coding unit depth determining method and apparatus

Publications (1)

Publication Number Publication Date
WO2018192235A1 true WO2018192235A1 (zh) 2018-10-25

Family

ID=63856188

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/115175 WO2018192235A1 (zh) 2017-04-21 2017-12-08 编码单元深度确定方法及装置

Country Status (6)

Country Link
US (1) US10841583B2 (zh)
EP (1) EP3614666A4 (zh)
JP (1) JP6843239B2 (zh)
KR (1) KR102252816B1 (zh)
CN (1) CN108737841B (zh)
WO (1) WO2018192235A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109862354A (zh) * 2019-02-18 2019-06-07 南京邮电大学 一种基于残差分布的hevc快速帧间深度划分方法
CN115278260A (zh) * 2022-07-15 2022-11-01 重庆邮电大学 基于空时域特性的vvc快速cu划分方法及存储介质
WO2023051583A1 (zh) * 2021-09-30 2023-04-06 中兴通讯股份有限公司 视频编码单元划分方法及装置、计算机设备和计算机可读存储介质

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11128871B2 (en) 2018-04-25 2021-09-21 Panasonic Intellectual Property Corporation Of America Encoder for adaptively determining information related to splitting based on characteristics of neighboring samples
GB2578769B (en) 2018-11-07 2022-07-20 Advanced Risc Mach Ltd Data processing systems
GB2583061B (en) * 2019-02-12 2023-03-15 Advanced Risc Mach Ltd Data processing systems
CN109889842B (zh) * 2019-02-21 2022-02-08 North China University of Technology KNN-classifier-based CU partitioning algorithm for virtual reality video
CN110581990B (zh) * 2019-09-25 2021-07-27 Hangzhou Arcvideo Technology Co., Ltd. Fast recursive TU algorithm for HEVC 4K and 8K ultra-high-definition coding
CN113593539B (zh) * 2020-04-30 2024-08-02 Alibaba Group Holding Limited Streaming end-to-end speech recognition method and apparatus, and electronic device
CN112866692B (zh) * 2021-01-18 2022-04-26 Beijing University of Posts and Telecommunications HEVC-based coding unit partitioning method and apparatus, and electronic device
CN112866693B (zh) * 2021-03-25 2023-03-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Coding unit (CU) partitioning method and apparatus, electronic device, and storage medium
CN113691808A (zh) * 2021-07-01 2021-11-23 Hangzhou Weiming Information Technology Co., Ltd. Neural-network-based inter-frame coding unit size partitioning method
CN113382245A (zh) * 2021-07-02 2021-09-10 University of Science and Technology of China Image partitioning method and apparatus
CN114157863B (zh) * 2022-02-07 2022-07-22 Zhejiang Smart Video Security Innovation Center Co., Ltd. Digital-retina-based video coding method and system, and storage medium
CN116170594B (zh) * 2023-04-19 2023-07-14 University of Science and Technology of China Coding method and apparatus based on rate-distortion cost prediction

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106162167A (zh) * 2015-03-26 2016-11-23 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Learning-based high efficiency video coding method

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI280803B (en) * 2005-07-20 2007-05-01 Novatek Microelectronics Corp Method and apparatus for motion estimation
US8913662B2 (en) * 2011-01-06 2014-12-16 Qualcomm Incorporated Indicating intra-prediction mode selection for video coding using CABAC
US8964852B2 (en) * 2011-02-23 2015-02-24 Qualcomm Incorporated Multi-metric filtering
US9247258B2 (en) * 2011-10-26 2016-01-26 Qualcomm Incorporated Unified design for picture partitioning schemes
CN102420990B (zh) * 2011-12-15 2013-07-10 Beijing University of Technology Fast coding method for multi-view video
KR20140056599A (ko) * 2012-10-30 2014-05-12 Gwangju Institute of Science and Technology Method and apparatus for HEVC prediction mode decision
CN103067704B (zh) * 2012-12-12 2015-12-09 Huazhong University of Science and Technology Video coding method and system based on early skipping at the coding unit level
US9674542B2 (en) * 2013-01-02 2017-06-06 Qualcomm Incorporated Motion vector prediction for video coding
US10021414B2 (en) * 2013-01-04 2018-07-10 Qualcomm Incorporated Bitstream constraints and motion vector restriction for inter-view or inter-layer reference pictures
CN103533349A (zh) * 2013-09-26 2014-01-22 Electric Power Research Institute of Guangdong Power Grid Co. Support-vector-machine-based fast inter-prediction macroblock mode selection method for B-frames
CN104853191B (zh) * 2015-05-06 2017-09-05 Ningbo University Fast HEVC coding method
CN105306947B (zh) * 2015-10-27 2018-08-07 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Machine-learning-based video transcoding method
CN105430407B (zh) * 2015-12-03 2018-06-05 Tongji University Fast inter-mode decision method for H.264-to-HEVC transcoding
CN105721865A (zh) * 2016-02-01 2016-06-29 Tongji University Fast decision algorithm for HEVC inter-frame coding unit partitioning


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIQUAN SHEN; ZHAOYANG ZHANG; PING AN: "Fast CU size decision and mode decision algorithm for HEVC intra coding", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, vol. 59, no. 1, 4 April 2013 (2013-04-04), pages 207-213, XP011499424 *
See also references of EP3614666A4 *


Also Published As

Publication number Publication date
KR20190117708A (ko) 2019-10-16
US10841583B2 (en) 2020-11-17
JP6843239B2 (ja) 2021-03-17
EP3614666A4 (en) 2020-04-08
US20190222842A1 (en) 2019-07-18
KR102252816B1 (ko) 2021-05-18
JP2020500482A (ja) 2020-01-09
EP3614666A1 (en) 2020-02-26
CN108737841A (zh) 2018-11-02
CN108737841B (zh) 2020-11-24

Similar Documents

Publication Publication Date Title
WO2018192235A1 (zh) 2018-10-25 Coding unit depth determining method and apparatus
US11070803B2 (en) 2021-07-20 Method and apparatus for determining coding cost of coding unit and computer-readable storage medium
KR100952340B1 (ko) 2010-04-09 Method and apparatus for coding mode decision using spatio-temporal complexity
RU2559738C2 (ru) 2015-08-10 Method and apparatus for encoding/decoding a motion vector
WO2022104498A1 (zh) 2022-05-27 Intra prediction method, encoder, decoder, and computer storage medium
JP5882228B2 (ja) 2016-03-09 Video encoding apparatus, and video decoding method and apparatus
WO2018010492A1 (zh) 2018-01-18 Fast decision method for intra prediction modes in video coding
CN107846593B (zh) 2020-01-17 Rate-distortion optimization method and apparatus
WO2019148906A1 (zh) 2019-08-08 Video coding method, computer device, and storage medium
CN105430391B (zh) 2018-11-27 Fast intra coding unit selection method based on a logistic regression classifier
CN101888546B (zh) 2016-03-30 Motion estimation method and apparatus
CN112291562B (zh) 2021-09-24 Fast CU partitioning and intra mode decision method for H.266/VVC
TWI722465B (zh) 2021-03-21 Boundary enhancement of sub-blocks
WO2006133613A1 (en) 2006-12-21 Method for reducing image block effects
CN114900691B (zh) 2022-08-26 Coding method, encoder, and computer-readable storage medium
WO2020248715A1 (zh) 2020-12-17 High-efficiency-video-coding-based coding management method and apparatus
CN109688411B (zh) 2020-01-10 Video coding rate-distortion cost estimation method and apparatus
JP2011239365A (ja) 2011-11-24 Moving picture coding apparatus, control method therefor, and computer program
JP4216769B2 (ja) 2009-01-28 Moving picture coding method, moving picture coding apparatus, moving picture coding program, and computer-readable recording medium storing the program
WO2021031225A1 (zh) 2021-02-25 Motion vector derivation method and apparatus, and electronic device
CN113273194A (zh) 2021-08-17 Image component prediction method, encoder, decoder, and storage medium
Chen et al. CNN-based fast HEVC quantization parameter mode decision
CN114666579A (zh) 2022-06-24 Video coding method and apparatus, electronic device, and storage medium
WO2020258052A1 (zh) 2020-12-30 Image component prediction method and apparatus, and computer storage medium
CN114143536A (zh) 2022-03-04 Video coding method for SHVC spatially scalable frames

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17906258; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2019527221; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20197027603; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 2017906258; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2017906258; Country of ref document: EP; Effective date: 20191121)