WO2024077767A1 - Learning model-oriented coding decision processing method and apparatus, and device - Google Patents

Learning model-oriented coding decision processing method and apparatus, and device

Info

Publication number
WO2024077767A1
Authority
WO
WIPO (PCT)
Prior art keywords
coding
rate
distortion
decision
coding unit
Application number
PCT/CN2022/139790
Other languages
English (en)
Chinese (zh)
Inventor
高伟
袁航
李革
Original Assignee
北京大学深圳研究生院
Application filed by 北京大学深圳研究生院
Publication of WO2024077767A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria

Definitions

  • the present application relates to the field of video coding technology, and in particular to a coding decision processing method, device and equipment for a learning model.
  • each frame of the video is determined as a video picture, and then each video picture is encoded.
  • the selection of the encoding mode is an important task in encoding each video picture.
  • the encoder will pre-encode all candidate modes when making encoding mode decisions, and then calculate the rate-distortion cost (Rate-Distortion Cost, RDC) corresponding to each mode, where the rate-distortion cost directly reflects the encoding cost corresponding to the use of the encoding mode. Therefore, the encoding mode with the smallest RDC will be selected as the best mode.
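The exhaustive mode decision described above can be sketched as follows. This is a minimal illustration rather than the encoder's actual implementation: the Lagrangian form J = D + λ·R is the conventional definition of rate-distortion cost and is assumed here because the text does not spell it out, and the mode names, the multiplier and all numbers are hypothetical.

```python
from typing import Dict, Tuple


def rd_cost(distortion: float, rate_bits: float, lagrange_multiplier: float) -> float:
    """Conventional Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lagrange_multiplier * rate_bits


def select_best_mode(
    candidates: Dict[str, Tuple[float, float]], lagrange_multiplier: float
) -> Tuple[str, Dict[str, float]]:
    """Compute the RD cost of every candidate mode and return the cheapest one."""
    costs = {
        mode: rd_cost(distortion, rate_bits, lagrange_multiplier)
        for mode, (distortion, rate_bits) in candidates.items()
    }
    best_mode = min(costs, key=costs.get)
    return best_mode, costs


# Hypothetical (distortion, rate) pairs obtained by pre-encoding each candidate mode.
candidates = {"split": (40.0, 120.0), "non_split": (90.0, 30.0)}
best_mode, costs = select_best_mode(candidates, lagrange_multiplier=0.5)
print(best_mode, costs)
```

Because every candidate must be pre-encoded before its cost is known, the search cost grows with the number of candidate modes, which is the motivation for replacing the search with a learned decision.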
  • however, learning models are currently trained only to improve the accuracy of candidate mode selection; a higher selection accuracy does not by itself reduce the encoder's cost of encoding the picture.
  • for example, the 1% of wrong decisions made by a learning model with a 99% encoding mode selection accuracy may cause a large increase in encoding cost, and the resulting encoding cost may exceed that of a learning model with a 97% encoding mode selection accuracy.
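The point that accuracy and encoding cost can diverge can be illustrated with a small calculation. The numbers below are invented purely for illustration (they are not results from the application): each list holds the extra rate-distortion cost incurred by a model on 100 coding units, with 0.0 meaning the best mode was chosen.

```python
from typing import List


def accuracy(extra_costs: List[float]) -> float:
    """Share of coding units decided correctly (zero extra cost)."""
    return sum(1 for cost in extra_costs if cost == 0.0) / len(extra_costs)


def total_extra_cost(extra_costs: List[float]) -> float:
    """Total rate-distortion cost added by the wrong decisions."""
    return sum(extra_costs)


# Hypothetical models evaluated on 100 coding units each.
model_a = [0.0] * 99 + [500.0]             # 99% accurate, one very costly mistake
model_b = [0.0] * 97 + [20.0, 30.0, 10.0]  # 97% accurate, three cheap mistakes

print(accuracy(model_a), total_extra_cost(model_a))
print(accuracy(model_b), total_extra_cost(model_b))
```

Under these assumed numbers the more accurate model is the more expensive one, which is the situation the text describes.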
  • the present application provides a coding decision processing method, device and equipment for a learning model. The coding rate-distortion difference of each coding unit is calculated from the training rate-distortion loss value under the training decision mode and the optimal rate-distortion loss value of that coding unit; the rate-distortion loss reference value is calculated from the coding rate-distortion difference; and the loss function of the coding decision model is calculated from the rate-distortion loss reference value. This addresses the problem in the related art that the encoder's cost of encoding the picture is not truly reduced even though the accuracy of candidate mode selection improves. Because the rate-distortion cost of picture coding is added to the loss function when the initial coding decision model is trained, the target coding decision model obtained by training can reduce the rate-distortion cost of encoding the picture while improving the accuracy of the coding decision, thereby optimizing video coding.
  • Some embodiments of the present application provide a coding decision processing method for a learning model, the method comprising: dividing a training picture into a plurality of coding units, and determining an optimal decision mode corresponding to each coding unit; for each coding unit, performing the following processing: inputting the coding unit into an initial coding decision model to obtain a training decision mode of the coding decision model, determining a training rate-distortion loss value of the coding unit under the training decision mode, and calculating a coding rate-distortion difference of the coding unit according to the optimal rate-distortion loss value and the training rate-distortion loss value corresponding to the coding unit; determining a rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit; calculating a loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit; and training the initial coding decision model according to the obtained loss function value to obtain a trained target coding decision model.
  • the training picture can be obtained in the following manner: obtain a training video set, wherein the training video set includes multiple training videos; for each training video, obtain an image frame, thereby obtaining multiple training pictures.
  • the following processing can also be performed: determine multiple optional decision modes of the coding unit, and calculate the rate-distortion loss value of the coding unit in each optional decision mode, wherein the step of determining the training rate-distortion loss value of the coding unit in the training decision mode includes: determining a decision mode corresponding to the training decision mode from the optional decision modes, wherein the training decision mode is one of the optional decision modes; and determining the rate-distortion loss value of the decision mode corresponding to the training decision mode in the optional decision modes as the training rate-distortion loss value.
  • the optimal decision mode may be one of the optional decision modes.
  • the coding rate-distortion difference of each coding unit is calculated by the following formula:
  • $$RDCG_{i,j} = J_i(C_j) - J_i(C_{best}), \quad C_j \in C_{modes}$$
  • $C_{modes}$ represents all optional decision modes corresponding to the coding unit.
  • $C_j$ represents the training decision mode corresponding to the coding unit; the training decision mode is the j-th decision mode in $C_{modes}$.
  • $J_i(C_j)$ represents the training rate-distortion loss value of the i-th coding unit in the training decision mode.
  • $C_{best}$ represents the best decision mode corresponding to the coding unit.
  • $J_i(C_{best})$ represents the best rate-distortion loss value of the i-th coding unit in the best decision mode.
  • $RDCG_{i,j}$ represents the coding rate-distortion difference of the i-th coding unit in the j-th decision mode.
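The coding rate-distortion difference can be sketched in a few lines. The sketch assumes that the difference is taken as the training loss value minus the best loss value, and that the best decision mode is the optional mode with the smallest rate-distortion loss value; the function name, mode names and costs are hypothetical.

```python
from typing import Dict


def coding_rd_difference(mode_costs: Dict[str, float], training_mode: str) -> float:
    """Difference between the RD loss of the model's decision and the best RD loss."""
    if training_mode not in mode_costs:
        raise KeyError(f"{training_mode!r} is not one of the optional decision modes")
    best_cost = min(mode_costs.values())  # assumed: best mode = lowest RD loss
    return mode_costs[training_mode] - best_cost


# Hypothetical RD loss values of one coding unit under each optional decision mode.
mode_costs = {"QT": 100.0, "BT": 112.5, "Non-Split": 130.0}
print(coding_rd_difference(mode_costs, "BT"))
print(coding_rd_difference(mode_costs, "QT"))
```

With this sign convention a correct decision yields zero and any wrong decision yields a positive value that grows with the cost of the mistake.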
  • the step of determining the rate-distortion loss reference value of each coding unit may include: determining the maximum rate-distortion difference among all coding units under the target constraint and the minimum rate-distortion difference among all coding units under the target constraint; calculating a first difference between the coding rate-distortion difference of each coding unit and the minimum rate-distortion difference among all coding units under the target constraint; calculating a second difference between the maximum rate-distortion difference among all coding units under the target constraint and the minimum rate-distortion difference among all coding units under the target constraint; and determining the ratio of the first difference to the second difference as the rate-distortion loss reference value of each coding unit.
  • the rate-distortion loss reference value of each coding unit is calculated by the following formula:
  • $$D_{i,j,s} = \frac{RDCG_{i,j} - RDCG_{Min,s}}{RDCG_{Max,s} - RDCG_{Min,s}}$$
  • $D_{i,j,s}$ represents the rate-distortion loss reference value of the i-th coding unit in the j-th decision mode under the target constraint condition S.
  • $RDCG_{i,j}$ represents the coding rate-distortion difference of the i-th coding unit in the j-th decision mode.
  • $RDCG_{Max,s}$ represents the maximum rate-distortion difference among all coding units under the target constraint condition S.
  • $RDCG_{Min,s}$ represents the minimum rate-distortion difference among all coding units under the target constraint condition S.
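The ratio of the first difference to the second difference is a min-max normalization, which can be sketched as below. The function name is hypothetical, and the handling of the case in which all differences are equal (a zero denominator) is an assumption, since the text does not address it.

```python
from typing import List


def rd_loss_reference_values(rdcg_values: List[float]) -> List[float]:
    """Min-max normalize the coding RD differences of the units under one constraint."""
    rdcg_min = min(rdcg_values)
    rdcg_max = max(rdcg_values)
    span = rdcg_max - rdcg_min  # the "second difference"
    if span == 0.0:
        # Assumption: with identical differences no unit stands out, so return zeros.
        return [0.0 for _ in rdcg_values]
    # (value - rdcg_min) is the "first difference" of each coding unit.
    return [(value - rdcg_min) / span for value in rdcg_values]


# Hypothetical coding RD differences of three coding units under the same constraint.
print(rd_loss_reference_values([0.0, 12.5, 50.0]))
```

The normalization maps every value into the range from 0 to 1, so the penalty added to the loss stays on a comparable scale regardless of the absolute size of the rate-distortion costs.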
  • the step of calculating the loss function value of the initial coding decision model may include: determining the original loss function value and the loss function parameter value of the initial coding decision model; calculating the loss function value of the initial coding decision model according to the original loss function value, the loss function parameter value and the rate-distortion loss reference value of each coding unit.
  • the loss function value of the initial coding decision model is calculated by the following formula:
  • $$loss_{opt} = loss_{Org} + \lambda \cdot D_{i,j,s}$$
  • $loss_{opt}$ represents the loss function value of the initial coding decision model.
  • $\lambda$ represents the loss function parameter value.
  • $D_{i,j,s}$ represents the rate-distortion loss reference value of each coding unit under the target constraint condition S.
  • $loss_{Org}$ represents the original loss function value.
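A minimal sketch of this combination follows. It assumes that the loss function parameter value multiplies the rate-distortion loss reference value before being added to the original loss, and it uses cross-entropy as a stand-in for the original loss, which the text does not name; all names and numbers are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(probabilities: Sequence[float], target_index: int) -> float:
    """Stand-in for the original loss (assumption: the text does not name it)."""
    return -math.log(probabilities[target_index])


def optimized_loss(original_loss: float, loss_parameter: float, rd_reference: float) -> float:
    """Original loss plus the weighted rate-distortion loss reference value."""
    return original_loss + loss_parameter * rd_reference


# Two wrong decisions with identical predicted probabilities (hypothetical values).
original = cross_entropy([0.3, 0.7], target_index=0)
cheap_mistake = optimized_loss(original, loss_parameter=0.5, rd_reference=0.1)
costly_mistake = optimized_loss(original, loss_parameter=0.5, rd_reference=0.9)
print(cheap_mistake < costly_mistake)
```

An accuracy-only loss would treat both mistakes identically; the added term makes the mistake with the larger rate-distortion penalty contribute more to the loss.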
  • Some other embodiments of the present application further provide a coding decision processing device for a learning model, the device comprising: a training picture determination module, configured to divide the training picture into a plurality of coding units, and determine the best decision mode corresponding to each coding unit;
  • the coding unit calculation module is configured to perform the following processing for each coding unit: input the coding unit into the initial coding decision model to obtain the training decision mode of the coding decision model, determine the training rate distortion loss value of the coding unit in the training decision mode, and calculate the coding rate distortion difference of the coding unit according to the optimal rate distortion loss value and the training rate distortion loss value corresponding to the coding unit;
  • a rate-distortion loss reference value calculation module configured to determine a rate-distortion loss reference value of each coding unit according to a coding rate-distortion difference of each coding unit;
  • a loss function value calculation module is configured to calculate a loss function value of an initial coding decision model according to a rate-distortion loss reference value of each coding unit;
  • the target coding decision model training module is configured to train the initial coding decision model according to the obtained loss function value to obtain a trained target coding decision model.
  • the training pictures may be obtained in the following manner: obtaining a training video set, wherein the training video set includes a plurality of training videos; and obtaining an image frame for each training video, thereby obtaining a plurality of training pictures.
  • the coding unit calculation module is also configured to perform the following processing: determine multiple optional decision modes of the coding unit, and calculate the rate-distortion loss value of the coding unit under each optional decision mode, wherein the step of determining the training rate-distortion loss value of the coding unit under the training decision mode includes: determining a decision mode corresponding to the training decision mode from the optional decision modes, and the training decision mode is one of the optional decision modes; determining the rate-distortion loss value of the decision mode corresponding to the training decision mode in the optional decision modes as the training rate-distortion loss value.
  • the rate-distortion loss reference value calculation module determines the rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit, including: determining the maximum rate-distortion difference among all coding units under the target constraint and the minimum rate-distortion difference among all coding units under the target constraint; calculating a first difference between the coding rate-distortion difference of each coding unit and the minimum rate-distortion difference among all coding units under the target constraint; calculating a second difference between the maximum rate-distortion difference among all coding units under the target constraint and the minimum rate-distortion difference among all coding units under the target constraint; and determining the ratio of the first difference to the second difference as the rate-distortion loss reference value of each coding unit.
  • the step in which the loss function value calculation module calculates the loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit includes: determining the original loss function value and the loss function parameter value of the initial coding decision model; calculating the loss function value of the initial coding decision model according to the original loss function value, the loss function parameter value and the rate-distortion loss reference value of each coding unit.
  • Some further embodiments of the present application also provide an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the coding decision processing method for the learning model as described above are performed.
  • Some other embodiments of the present application further provide a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed by a processor, the steps of the above-mentioned coding decision processing method for the learning model are executed.
  • the coding decision processing method, device and equipment for a learning model provided in the embodiments of the present application calculate the coding rate-distortion difference of each coding unit from the training rate-distortion loss value under the training decision mode and the optimal rate-distortion loss value of that coding unit, calculate the rate-distortion loss reference value from the coding rate-distortion difference, and calculate the loss function of the coding decision model from the rate-distortion loss reference value, thereby addressing the problem that the encoder's cost of encoding the picture is not truly reduced even though the accuracy of candidate mode selection improves.
  • because the rate-distortion cost of picture encoding is added to the loss function of the learning model when the initial coding decision model is trained, the target coding decision model obtained by training can reduce the rate-distortion cost of encoding the picture while improving the accuracy of the coding decision, thereby optimizing video encoding.
  • FIG. 1 is a flow chart of a coding decision processing method for a learning model provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a training picture provided in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the structure of a coding decision processing device for a learning model provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the present application can be applied to the field of video coding technology.
  • the embodiments of the present application provide a coding decision processing method, device and equipment for a learning model to solve the problem in the related technology that the coding cost of the encoder for the picture is not truly reduced while improving the accuracy of candidate mode selection.
  • the rate-distortion cost of picture encoding is added to the loss function of the learning model, and the initial coding decision model is trained, ensuring that the target coding decision model obtained by training can improve the coding decision accuracy while reducing the rate-distortion cost of encoding the picture, thereby achieving the effect of optimizing video encoding.
  • Figure 1 is a flow chart of a coding decision processing method for a learning model provided in an embodiment of the present application.
  • the coding decision processing method for a learning model provided in an embodiment of the present application may include:
  • the training images may be obtained in the following manner: obtaining a training video set, wherein the training video set includes a plurality of training videos; and for each training video, projecting the training video to obtain a plurality of training images.
  • the training video can be a three-dimensional dynamic point cloud video.
  • the three-dimensional dynamic point cloud video can first be projected into three two-dimensional videos that record different information, namely, a two-dimensional occupancy video, a two-dimensional geometry video, and a two-dimensional attribute video, and then each frame image of the above three two-dimensional videos is acquired to obtain a two-dimensional point cloud occupancy picture, a two-dimensional point cloud geometry picture, and a two-dimensional point cloud attribute picture.
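The projection of one point cloud frame into occupancy, geometry and attribute pictures can be sketched as follows. This is a deliberately simplified illustration: it uses a single orthographic projection along the z axis and keeps the nearest point per pixel, whereas the text does not state which projection is used; the point layout and all names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int, int, int]  # (x, y, z, attribute value), hypothetical layout
Picture = List[List[int]]


def project_frame(points: List[Point], width: int, height: int) -> Tuple[Picture, Picture, Picture]:
    """Project one point cloud frame onto the x-y plane (assumed orthographic)."""
    occupancy = [[0] * width for _ in range(height)]
    geometry = [[0] * width for _ in range(height)]
    attribute = [[0] * width for _ in range(height)]
    for x, y, z, value in points:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore points outside the picture
        # Keep the nearest point (smallest depth) when several hit the same pixel.
        if occupancy[y][x] == 0 or z < geometry[y][x]:
            occupancy[y][x] = 1      # pixel is covered by a point
            geometry[y][x] = z       # depth of the covering point
            attribute[y][x] = value  # e.g. a colour component of that point
    return occupancy, geometry, attribute


# Hypothetical frame with three points, two of which project to the same pixel.
frame = [(0, 0, 5, 200), (0, 0, 3, 90), (1, 1, 7, 30)]
occupancy, geometry, attribute = project_frame(frame, width=2, height=2)
print(occupancy, geometry, attribute)
```

Applying such a projection to every frame of the dynamic point cloud yields the three two-dimensional videos from which the training pictures are taken.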
  • Example 1: Please refer to Figure 2, which is a schematic diagram of the training picture provided in an embodiment of the present application.
  • the training picture 201 includes a first coding unit 202, a second coding unit 203 and a target constraint condition 204.
  • the training picture may be divided into coding units of the same size, for example, the first coding unit 202 and the second coding unit 203, or the training picture may be divided into coding units of different sizes.
  • the best decision mode of the first coding unit 202 is the division mode.
  • the best decision mode of the second coding unit 203 is the non-division mode.
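Dividing a training picture into coding units of the same size can be sketched as below. The picture is represented as a plain list of pixel rows, and the function name and unit size are hypothetical; division into units of different sizes is not covered by this sketch.

```python
from typing import List, Tuple

Picture = List[List[int]]


def split_into_coding_units(picture: Picture, unit_size: int) -> List[Tuple[int, int, Picture]]:
    """Return (top, left, block) for every unit_size x unit_size block of the picture."""
    height = len(picture)
    width = len(picture[0])
    units = []
    for top in range(0, height, unit_size):
        for left in range(0, width, unit_size):
            # Slicing clips automatically at the right and bottom picture borders.
            block = [row[left:left + unit_size] for row in picture[top:top + unit_size]]
            units.append((top, left, block))
    return units


# Hypothetical 4x4 picture whose pixel value encodes its position.
picture = [[row * 4 + col for col in range(4)] for row in range(4)]
units = split_into_coding_units(picture, unit_size=2)
print(len(units), units[0], units[-1])
```

Each resulting unit would then be labelled with its best decision mode, such as division or non-division in Example 1.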
  • Example 2: When intra-frame prediction is performed on a training picture, there are 65 or 66 decision modes for the intra-frame prediction, such as brightness decisions, angle decisions and chrominance decisions.
  • the decision modes of each coding unit are correspondingly more diverse and are not enumerated here.
  • For each coding unit, perform the following processing: input the coding unit into the initial coding decision model to obtain the training decision mode of the coding decision model, determine the training rate-distortion loss value of the coding unit under the training decision mode, and calculate the coding rate-distortion difference of the coding unit according to the optimal rate-distortion loss value and the training rate-distortion loss value corresponding to the coding unit.
  • for each coding unit, the following processing may also be performed: determining multiple optional decision modes of the coding unit, and calculating the rate-distortion loss value of the coding unit in each optional decision mode.
  • the best decision mode and the training decision mode are both optional decision modes.
  • the step of determining the training rate-distortion loss value of the encoding unit under the training decision mode includes: determining a decision mode corresponding to the training decision mode from the optional decision modes, and the training decision mode is one of the optional decision modes; and determining the rate-distortion loss value of the decision mode corresponding to the training decision mode in the optional decision modes as the training rate-distortion loss value.
  • the coding rate-distortion difference of each coding unit can be calculated by the following formula:
  • $$RDCG_{i,j} = J_i(C_j) - J_i(C_{best}), \quad C_j \in C_{modes}$$
  • $C_{modes}$ represents all optional decision modes corresponding to the coding unit.
  • $C_j$ represents the training decision mode corresponding to the coding unit; the training decision mode is the j-th decision mode in $C_{modes}$.
  • $J_i(C_j)$ represents the training rate-distortion loss value of the i-th coding unit in the training decision mode.
  • $C_{best}$ represents the best decision mode corresponding to the coding unit.
  • $J_i(C_{best})$ represents the best rate-distortion loss value of the i-th coding unit in the best decision mode.
  • $RDCG_{i,j}$ represents the coding rate-distortion difference of the i-th coding unit in the j-th decision mode.
  • S103: Determine a rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit.
  • the step of determining the rate-distortion loss reference value of each coding unit may include: determining the maximum rate-distortion difference and the minimum rate-distortion difference among all coding units under the target constraint; calculating a first difference between the coding rate-distortion difference of each coding unit and that minimum; calculating a second difference between that maximum and that minimum; and determining the ratio of the first difference to the second difference as the rate-distortion loss reference value of each coding unit.
  • continuing Example 1, the target constraint condition 204 of the second coding unit 203 consists of all coding units within a certain range around the second coding unit 203. The maximum and minimum rate-distortion differences among the coding units in the target constraint condition 204 are determined, and the rate-distortion loss reference value is then obtained as the ratio of the first difference to the second difference, as described above.
  • alternatively, the target constraint may consist of the coding units that fall in the same brightness range as the coding unit in question; the maximum and minimum rate-distortion differences among those coding units are determined, and the rate-distortion loss reference value is calculated in the same way.
  • in this way, the rate-distortion loss reference value indicates how large the coding rate-distortion difference of a coding unit is relative to the other coding units under the same target constraint.
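Applying a target constraint before normalizing can be sketched for the brightness-range case. The sketch assumes that brightness ranges are fixed-width bins over each unit's mean brightness and that a group with identical differences maps to zero; neither detail is specified in the text, and all names and numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reference_values_by_brightness(
    units: List[Tuple[float, float]], bin_width: float
) -> List[float]:
    """units holds (mean brightness, coding RD difference) for each coding unit."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for index, (brightness, _) in enumerate(units):
        groups[int(brightness // bin_width)].append(index)  # assumed fixed-width bins

    references = [0.0] * len(units)
    for indices in groups.values():
        values = [units[i][1] for i in indices]
        low, high = min(values), max(values)
        span = high - low
        for i in indices:
            # Normalize only against the units that share the same brightness range.
            references[i] = (units[i][1] - low) / span if span > 0 else 0.0
    return references


# Hypothetical units: three dark ones and two bright ones.
units = [(20.0, 0.0), (30.0, 10.0), (40.0, 40.0), (200.0, 5.0), (210.0, 25.0)]
print(reference_values_by_brightness(units, bin_width=64.0))
```

A spatial constraint, such as the neighbourhood 204 in Example 1, would differ only in how the groups are formed.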
  • the rate-distortion loss reference value of each coding unit can be calculated by the following formula:
  • $$D_{i,j,s} = \frac{RDCG_{i,j} - RDCG_{Min,s}}{RDCG_{Max,s} - RDCG_{Min,s}}$$
  • $D_{i,j,s}$ represents the rate-distortion loss reference value of the i-th coding unit in the j-th decision mode under the target constraint condition S.
  • $RDCG_{i,j}$ represents the coding rate-distortion difference of the i-th coding unit in the j-th decision mode.
  • $RDCG_{Max,s}$ represents the maximum rate-distortion difference among all coding units under the target constraint condition S.
  • $RDCG_{Min,s}$ represents the minimum rate-distortion difference among all coding units under the target constraint condition S.
  • S104: Calculate a loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit.
  • the step of calculating the loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit may include: determining the original loss function value and the loss function parameter value of the initial coding decision model; and calculating the loss function value of the initial coding decision model according to the original loss function value, the loss function parameter value and the rate-distortion loss reference value of each coding unit.
  • the loss function value calculated in the present application thus builds on the original loss function value.
  • the influence of a wrong decision on the coding loss is calculated and added to the loss function value, which avoids training the initial coding decision model for accuracy alone without regard to reducing the rate-distortion loss.
  • the loss function value of the initial coding decision model can be calculated by the following formula:
  • $$loss_{opt} = loss_{Org} + \lambda \cdot D_{i,j,s}$$
  • $loss_{opt}$ represents the loss function value of the initial coding decision model.
  • $\lambda$ represents the loss function parameter value.
  • $D_{i,j,s}$ represents the rate-distortion loss reference value of each coding unit under the target constraint condition S.
  • $loss_{Org}$ represents the original loss function value.
  • according to the obtained loss function value, the initial coding decision model is trained to obtain a trained target coding decision model.
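The steps of the method can be tied together in a sketch that evaluates the loss for a small batch of coding units. It is not the applicants' implementation: the model is a hand-written stub, cross-entropy is assumed as the original loss, the whole batch is treated as one target constraint, the per-unit losses are averaged, and the gradient update that would follow is omitted. All names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CodingUnit:
    features: List[float]         # hypothetical model input
    mode_costs: Dict[str, float]  # RD loss value of each optional decision mode


def batch_loss(
    units: List[CodingUnit],
    predict: Callable[[List[float]], Dict[str, float]],
    loss_parameter: float,
) -> float:
    original_losses, rdcg_values = [], []
    for unit in units:
        probabilities = predict(unit.features)
        training_mode = max(probabilities, key=probabilities.get)
        best_mode = min(unit.mode_costs, key=unit.mode_costs.get)
        # Original loss: cross-entropy against the best mode (assumption).
        original_losses.append(-math.log(probabilities[best_mode]))
        # Coding RD difference between the model's decision and the best mode.
        rdcg_values.append(unit.mode_costs[training_mode] - unit.mode_costs[best_mode])
    low, high = min(rdcg_values), max(rdcg_values)
    span = high - low
    references = [(v - low) / span if span > 0 else 0.0 for v in rdcg_values]
    losses = [o + loss_parameter * d for o, d in zip(original_losses, references)]
    return sum(losses) / len(losses)  # averaging over the batch is an assumption


def stub_model(features: List[float]) -> Dict[str, float]:
    """Hand-written stand-in for the initial coding decision model."""
    if features[0] > 0.5:
        return {"QT": 0.7, "Non-Split": 0.3}
    return {"QT": 0.2, "Non-Split": 0.8}


batch = [
    CodingUnit([0.9], {"QT": 100.0, "Non-Split": 130.0}),  # correct decision
    CodingUnit([0.1], {"QT": 80.0, "Non-Split": 120.0}),   # costly mistake
    CodingUnit([0.8], {"QT": 95.0, "Non-Split": 90.0}),    # cheap mistake
]
print(batch_loss(batch, stub_model, loss_parameter=1.0))
```

In an actual training setup the same quantity would be computed inside an automatic-differentiation framework so that the model parameters can be updated from it.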
  • $J_i(\text{QT})$ is the rate-distortion loss when the coding unit uses the QT mode for coding.
  • $J_i(\text{Non-Split})$ is the rate-distortion loss when the coding unit uses the Non-Split mode for coding.
  • loss opt1 and loss opt2 are:
  • the coding decision processing method for the learning model provided in the embodiment of the present application calculates the coding rate-distortion difference of each coding unit from the training rate-distortion loss value under the training decision mode and the optimal rate-distortion loss value of that coding unit, calculates the rate-distortion loss reference value from the coding rate-distortion difference, and calculates the loss function of the coding decision model from the rate-distortion loss reference value, thereby addressing the problem that the encoder's cost of encoding the picture is not truly reduced even though the accuracy of candidate mode selection improves.
  • because the rate-distortion cost of picture encoding is added to the loss function of the learning model when the initial coding decision model is trained, the target coding decision model obtained by training can reduce the rate-distortion cost of encoding the picture while improving the accuracy of the coding decision, thereby optimizing video encoding.
  • the embodiment of the present application also provides a coding decision processing device for a learning model corresponding to the coding decision processing method for a learning model. Since the principle of solving the problem by the device in the embodiment of the present application is similar to the above-mentioned coding decision processing method for a learning model in the embodiment of the present application, the implementation of the device can refer to the implementation of the method, and the repeated parts will not be repeated.
  • Figure 3 is a schematic diagram of the structure of a coding decision processing device for a learning model provided in an embodiment of the present application.
  • the coding decision processing device 300 for a learning model includes:
  • the training picture determination module 301 is configured to divide the training picture into a plurality of coding units and determine the best decision mode corresponding to each coding unit.
  • the coding unit calculation module 302 is configured to perform the following processing for each coding unit: input the coding unit into the initial coding decision model, obtain the training decision mode of the coding decision model, determine the training rate distortion loss value of the coding unit under the training decision mode, and calculate the coding rate distortion difference of the coding unit according to the optimal rate distortion loss value and the training rate distortion loss value corresponding to the coding unit.
  • the rate-distortion loss reference value calculation module 303 is configured to determine a rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit.
  • the loss function value calculation module 304 is configured to calculate the loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit.
  • the target coding decision model training module 305 is configured to train the initial coding decision model according to the obtained loss function value to obtain a trained target coding decision model.
  • the coding decision processing device for the learning model provided in the embodiment of the present application calculates the coding rate-distortion difference of each coding unit from the training rate-distortion loss value under the training decision mode and the optimal rate-distortion loss value of that coding unit, calculates the rate-distortion loss reference value from the coding rate-distortion difference, and calculates the loss function of the coding decision model from the rate-distortion loss reference value. This addresses the problem in the related art that the encoder's cost of encoding the picture is not truly reduced even though the accuracy of candidate mode selection improves. Because the rate-distortion cost of picture encoding is added to the loss function of the learning model when the initial coding decision model is trained, the target coding decision model obtained by training can reduce the rate-distortion cost of encoding the picture while improving the accuracy of the coding decision, thereby optimizing video encoding.
  • Fig. 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • The electronic device 400 includes a processor 410, a memory 420 and a bus 430.
  • The memory 420 stores machine-readable instructions executable by the processor 410.
  • The processor 410 communicates with the memory 420 through the bus 430.
  • When the machine-readable instructions are executed by the processor 410, the steps of the coding decision processing method for the learning model in the method embodiment shown in Fig. 1 above can be executed.
  • For the specific implementation, refer to the method embodiment; it is not repeated here.
  • An embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored.
  • When the computer program is executed by a processor, it can execute the steps of the coding decision processing method for the learning model in the method embodiment shown in Fig. 1 above.
  • For the specific implementation, refer to the method embodiment; it is not repeated here.
  • The disclosed systems, devices and methods can be implemented in other ways.
  • The device embodiments described above are merely schematic.
  • For example, the division of the units is only a division by logical function; other divisions are possible in an actual implementation.
  • As a further example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • In addition, the mutual coupling, direct coupling or communication connection shown or discussed can be made through communication interfaces, and the indirect coupling or communication connection between devices or units can be electrical, mechanical or of another form.
  • The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • The technical solution of the present application, in essence, or the part of it that contributes to the related art, or a part of the technical solution, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that enable a computer device (which can be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method described in each embodiment of the present application.
  • The aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
  • The present application provides a coding decision processing method, device and equipment for a learning model, wherein the method includes: inputting each coding unit into the initial coding decision model, obtaining the training decision mode of the coding decision model, determining the training rate-distortion loss value of the coding unit in the training decision mode, and calculating the coding rate-distortion difference of the coding unit according to the optimal rate-distortion loss value and the training rate-distortion loss value corresponding to the coding unit; determining the rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit; calculating the loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit; and training the initial coding decision model according to the obtained loss function value to obtain a trained target coding decision model.
  • The trained target coding decision model can improve coding decision accuracy while reducing the rate-distortion cost of picture encoding.
  • The coding decision processing method, device and equipment for learning models of the present application are reproducible and can be used in a variety of industrial applications.
  • For example, the coding decision processing method, device and equipment for learning models of the present application can be used in any field of technology that requires video coding.
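The loss computation summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the formula of the application: the text here does not give the exact mapping from the coding rate-distortion difference to the rate-distortion loss reference value, nor the form of the loss function. The sketch assumes, hypothetically, that the reference value is the difference normalized by the optimal cost and that the loss is a cross-entropy term weighted by `1 + reference value`. The function names `rd_difference`, `rd_reference_value` and `loss_value` are invented for the example.

```python
import math


def rd_difference(rd_costs, chosen_mode):
    """Coding rate-distortion difference of one coding unit:
    cost under the model's chosen (training) mode minus the optimal cost."""
    return rd_costs[chosen_mode] - min(rd_costs)


def rd_reference_value(rd_costs, chosen_mode):
    """Hypothetical reference value: the RD difference relative to the
    optimal cost (assumes all RD costs are strictly positive)."""
    return rd_difference(rd_costs, chosen_mode) / min(rd_costs)


def loss_value(batch):
    """Mean RD-weighted cross-entropy over a batch of coding units.

    batch: list of (probs, rd_costs) pairs, where probs are the model's
    probabilities over candidate modes and rd_costs are the RD costs of
    those same modes. The weighting scheme is an assumption for illustration.
    """
    total = 0.0
    for probs, rd_costs in batch:
        # Training decision mode: the mode the model currently prefers.
        chosen = max(range(len(probs)), key=probs.__getitem__)
        # Optimal mode: the mode with the lowest RD cost.
        best = min(range(len(rd_costs)), key=rd_costs.__getitem__)
        ref = rd_reference_value(rd_costs, chosen)
        cross_entropy = -math.log(probs[best])
        # Units whose wrong decision costs more contribute more to the loss.
        total += (1.0 + ref) * cross_entropy
    return total / len(batch)


batch = [
    ([0.5, 0.25, 0.25], [100.0, 120.0, 150.0]),  # model picks the optimal mode
    ([0.25, 0.5, 0.25], [100.0, 150.0, 200.0]),  # model picks a costlier mode
]
print(loss_value(batch))
```

In this sketch a coding unit whose chosen mode is already optimal has a reference value of zero and contributes a plain cross-entropy term, while a unit whose chosen mode costs 50% more than the optimum has its term scaled by 1.5, so that training pressure follows the rate-distortion cost rather than classification accuracy alone.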

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application relates to a learning-model-oriented coding decision processing method and apparatus, and a device. The method comprises: inputting each coding unit into an initial coding decision model to obtain a training decision mode of the coding decision model, determining a training rate-distortion loss value of the coding unit in the training decision mode, and calculating a coding rate-distortion difference of the coding unit according to an optimal rate-distortion loss value and the training rate-distortion loss value corresponding to the coding unit; determining a rate-distortion loss reference value of each coding unit according to the coding rate-distortion difference of each coding unit; calculating a loss function value of the initial coding decision model according to the rate-distortion loss reference value of each coding unit; and training the initial coding decision model according to the obtained loss function value to obtain a trained target coding decision model. The target coding decision model obtained by training thus reduces the rate-distortion cost of picture encoding while improving coding decision accuracy.
PCT/CN2022/139790 2022-10-14 2022-12-16 Learning-model-oriented coding decision processing method and apparatus, and device WO2024077767A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211256606.2 2022-10-14
CN202211256606.2A CN115334308B (zh) 2022-10-14 2022-10-14 一种面向学习模型的编码决策处理方法、装置及设备

Publications (1)

Publication Number Publication Date
WO2024077767A1 (fr) 2024-04-18

Family

ID=83913463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139790 WO2024077767A1 (fr) Learning-model-oriented coding decision processing method and apparatus, and device

Country Status (2)

Country Link
CN (1) CN115334308B (fr)
WO (1) WO2024077767A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170594B (zh) * 2023-04-19 2023-07-14 中国科学技术大学 一种基于率失真代价预测的编码方法和装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104394409A (zh) * 2014-11-21 2015-03-04 西安电子科技大学 基于空域相关性的hevc预测模式快速选择方法
CN106713935A (zh) * 2017-01-09 2017-05-24 杭州电子科技大学 一种基于贝叶斯决策的hevc块划分快速方法
CN113767400A (zh) * 2019-03-21 2021-12-07 谷歌有限责任公司 使用率失真成本作为深度学习的损失函数
CN114745551A (zh) * 2021-01-07 2022-07-12 腾讯科技(深圳)有限公司 处理视频帧图像的方法及电子设备
WO2022159151A1 (fr) * 2021-01-19 2022-07-28 Tencent America LLC Compression d'image neuronale avec prédiction intra adaptative

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190045195A1 (en) * 2018-03-30 2019-02-07 Intel Corporation Reduced Partitioning and Mode Decisions Based on Content Analysis and Learning
CN109769119B (zh) * 2018-12-18 2021-01-19 中国科学院深圳先进技术研究院 一种低复杂度视频信号编码处理方法
CN110139098B (zh) * 2019-04-09 2023-01-06 中南大学 基于决策树的高效率视频编码器帧内快速算法选择方法
CN111355956B (zh) * 2020-03-09 2023-05-09 蔡晓刚 一种hevc帧内编码中基于深度学习的率失真优化快速决策系统及其方法
CN113242429B (zh) * 2021-05-11 2023-12-05 杭州网易智企科技有限公司 视频编码模式决策方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN115334308B (zh) 2022-12-27
CN115334308A (zh) 2022-11-11

Similar Documents

Publication Publication Date Title
US10425643B2 (en) Method and system for view optimization of a 360 degrees video
JP2020512772A (ja) Vrビデオ用に画像解像度を最適化してビデオストリーミングの帯域幅を最適化する画像処理のための方法及び装置
US10692249B2 (en) Octree traversal for anchor point cloud compression
US10230957B2 (en) Systems and methods for encoding 360 video
WO2024077767A1 (fr) Learning-model-oriented coding decision processing method and apparatus, and device
KR102472971B1 (ko) 인공지능 모델을 이용한 동영상 인코딩 최적화 방법, 시스템, 및 컴퓨터 프로그램
WO2019076344A1 (fr) Procédé et appareil de sélection de bloc de référence pour une unité de codage, dispositif électronique et support de stockage
CN110620924A (zh) 编码数据的处理方法、装置、计算机设备及存储介质
WO2018058476A1 (fr) Procédé et dispositif de correction d'image
US20190333190A1 (en) Systems and methods for distortion removal at multiple quality levels
CN113012073A (zh) 视频质量提升模型的训练方法和装置
US10395337B2 (en) Image processing apparatus, image processing method, and storage medium
CN112437301A (zh) 一种面向视觉分析的码率控制方法、装置、存储介质及终端
CN110611842B (zh) 基于虚拟机的视频传输管理方法及相关装置
CN114827567B (zh) 视频质量分析方法、设备和可读介质
KR102402643B1 (ko) 3차원 모델링의 색상 최적화 처리 시스템
CN112715029A (zh) Ai编码设备及其操作方法和ai解码设备及其操作方法
US20220030233A1 (en) Interpolation filtering method and apparatus for intra-frame prediction, medium, and electronic device
CN108805943B (zh) 图片转码方法和装置
CN112669240A (zh) 高清图像修复方法、装置、电子设备和存储介质
CN109982093B (zh) 视频解码错误补偿方法及装置、存储介质、终端
KR20220003087A (ko) Vr 영상 품질 평가 방법 및 장치
US11622118B2 (en) Determination of coding modes for video content using order of potential coding modes and block classification
KR102461031B1 (ko) 네트워크 기반의 영상 처리 방법 및 이를 위한 장치
US20230245281A1 (en) Visual effects processing framework

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application Ref document number: 22961945; Country of ref document: EP; Kind code of ref document: A1