WO2023147780A1 - Method, device and electronic equipment for screening encoding modes of video frames - Google Patents

Method, device and electronic equipment for screening encoding modes of video frames

Info

Publication number
WO2023147780A1
Authority
WO
WIPO (PCT)
Prior art keywords
mode
affine
current
rate
distortion
Prior art date
Application number
PCT/CN2023/074598
Other languages
English (en)
French (fr)
Inventor
张鹏
陈长鑫
向国庆
黄晓峰
严伟
范益波
Original Assignee
杭州未名信科科技有限公司
浙江省北大信息技术高等研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州未名信科科技有限公司 and 浙江省北大信息技术高等研究院
Publication of WO2023147780A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria

Definitions

  • the present application relates to the field of video encoding and decoding, and in particular, relates to a method, device and electronic equipment for screening encoding modes of video frames.
  • The main function of video codec technology is to pursue the highest possible video reconstruction quality and the highest possible compression ratio within the available computing resources.
  • One example is Advanced Video Coding (AVS).
  • Affine-based motion compensation is a displacement transformation model for irregular motion such as fade-in, fade-out, rotation, and scaling; it addresses the inaccurate motion compensation of translational transformation models.
  • Affine-based motion compensation includes the Affine Merge mode (affine merging mode) and the affine motion estimation mode (Affine Motion Estimation), both of which are part of the inter-frame mode selection process.
  • Usually, affine motion estimation and ordinary motion estimation are evaluated together to calculate the rate-distortion cost (RDCost).
  • As a newer inter-frame prediction function, affine motion estimation increases the time complexity of the video frame encoder and requires more hardware resources.
  • In the AVS3 standard, the affine motion estimation process is required in the inter-frame mode decision for every coding unit (Coding Unit, CU) with a size greater than or equal to 16*16; because affine motion estimation has higher computational complexity than the other inter-frame modes, it greatly increases the encoding time, which makes determining the inter-frame prediction mode of the CU inefficient.
  • Embodiments of the present application provide a coding mode screening method, device, and electronic device for video frames, so as to at least solve the technical problem in the related art that the efficiency of determining an inter-frame prediction mode of a CU is low.
  • a method for screening coding modes of video frames, including: judging, according to the coding data of the current coding unit CU, whether the coding mode in the set of adjacent CUs of the current CU is an affine mode
  • each CU in the affine CU subset is a coded unit; if there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set, determine the first rate-distortion cost of the current CU in the affine mode, and obtain the second rate-distortion rate cost set; wherein the second rate-distortion rate cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard; in the case that there is no affine CU subset whose encoding mode is
  • a video frame coding mode screening device, including: a first judging unit, configured to judge, according to the coding data of the current coding unit CU, whether there is an affine CU subset whose encoding mode is an affine mode in the adjacent CU set of the current CU, where each CU in the affine CU subset is a coded unit; a first determination unit, configured to, in the case that there is an affine CU subset whose encoding mode is an affine mode in the set of adjacent CUs, determine the first rate-distortion cost of the current CU in the affine mode, and obtain a second rate-distortion rate cost set; wherein the second rate-distortion rate cost set is a set of the rate-distortion costs of the current CU in each non
  • a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute, when run, the above-mentioned encoding mode screening method for video frames.
  • an electronic device, including a memory and a processor, where the memory stores a computer program and the processor is configured to execute the above-mentioned encoding mode screening method for video frames through the computer program.
  • CU and the optimal mode of the above-mentioned parent CU is the direct SKIP mode; the first rate-distortion cost of the current CU in the affine mode and the second rate-distortion rate cost set in the non-affine modes are then obtained, and the prediction mode corresponding to the minimum rate-distortion cost is used as the target encoding prediction mode of the current CU.
  • Because the affine motion estimation process is simplified, this scheme not only reduces the coding time occupied by affine motion estimation in the whole inter-frame mode decision-making process, but also improves the efficiency of determining the prediction mode of the CU, which solves the technical problem in the related art that the efficiency of determining the inter-frame prediction mode of the CU is low.
  • FIG. 1 is a schematic diagram of an application environment of an optional coding mode screening method for video frames according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an application environment of another optional coding mode screening method for video frames according to an embodiment of the present invention
  • FIG. 3 is a schematic flowchart of an optional encoding mode screening method for video frames according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of adjacent spatial domain CUs of a current CU according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a time-domain CU of another current CU according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of another optional method for screening encoding modes of video frames according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an optional encoding mode screening device for video frames according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an optional electronic device according to an embodiment of the present application.
  • a method for screening coding modes of video frames is provided.
  • the above method for screening coding modes of video frames may be applied to, but is not limited to, the hardware environment shown in FIG. 1.
  • the hardware environment includes: a terminal device 102 for human-computer interaction with a user, a network 104 and a server 106 . Human-computer interaction can be performed between the user 108 and the terminal device 102 , and the terminal device 102 runs an application client for encoding mode screening of video frames.
  • the terminal device 102 includes a human-computer interaction screen 1022, a processor 1024 and a memory 1026.
  • the human-computer interaction screen 1022 is used to present an interface for video frame processing; the processor 1024 is used to judge whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU, and/or to judge whether the current CU has a parent CU and the optimal mode of the parent CU is SKIP mode.
  • the memory 1026 is used for storing the encoded data of the current coding unit CU.
  • the server 106 includes a database 1062 and a processing engine 1064.
  • the database 1062 is used to store the coding data of the current coding unit CU and the coding mode of the AVS3.
  • the processing engine 1064 judges, according to the encoded data of the current coding unit CU, whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU, where each CU in the affine CU subset is a coded unit; in the case where there is an affine CU subset whose encoding mode is an affine mode in the set of adjacent CUs, it determines the first rate-distortion cost of the current CU in the affine mode, and obtains a second rate-distortion rate cost set; wherein the second rate-distortion rate cost set is a set
  • the foregoing method for screening a coding mode of a video frame in the present application may be applied to FIG. 2 .
  • human-computer interaction can be performed between the user 202 and the user equipment 204 .
  • the user equipment 204 includes a memory 206 and a processor 208 therein.
  • the user equipment 204 may, but is not limited to, refer to and execute the operations performed by the terminal equipment 102 above, so as to obtain the target coding prediction mode of the current CU.
  • the terminal device 102 and the user device 204 may be, but not limited to, terminals such as mobile phones, tablet computers, laptops, PCs, etc.
  • the network 104 may include but not limited to a wireless network or a wired network.
  • the wireless network includes: WIFI and other networks for realizing wireless communication.
  • the above-mentioned wired network may include but not limited to: a wide area network, a metropolitan area network, and a local area network.
  • the above server 106 may include, but is not limited to, any hardware device capable of computing.
  • the above server may be a single server, or a server cluster composed of multiple servers, or a cloud server. The foregoing is only an example, and no limitation is set in this embodiment.
  • the encoding mode screening method of the video frame includes:
  • each CU in the above-mentioned affine CU subset is a coded unit.
  • a temporal CU in the reference frame image Col_pic of the current frame image Cur_pic is located adjacent to the current CU
  • the current CU is adjacent to the 6 spatial CUs {A, B, D, G, C, F}, giving a total of 7 adjacent CUs that form the adjacent CU set {A, B, D, G, C, F, T}.
  • CodingStructure is a data structure in the encoder of AVS3. According to CodingStructure, the CU information of all positions in a frame can be obtained through coordinate operations.
  • the CU information includes encoding information of the CU.
  • regarding an affine CU subset whose encoding mode is affine mode in the adjacent CU set of the current CU: for example, when the encoding mode of CU A and CU B in {A, B, D, G, C, F, T} is affine mode, {A, B} is an affine CU subset.
  • the SKIP mode is an inter-frame preset mode of the non-affine mode in AVS3.
  • the motion vector information of the current CU can be preliminarily predicted through the affine mode, and the above-mentioned first rate-distortion cost determined.
  • the optimal mode of the above-mentioned parent CU is the prediction mode that minimizes the rate-distortion cost in each inter-frame prediction mode
  • the optimal mode is SKIP mode
  • the current CU has a parent CU and the optimal mode of the parent CU is a SKIP mode, acquire the second rate-distortion rate cost set; wherein the non-affine mode includes the SKIP mode;
  • the prediction mode corresponding to the minimum rate-distortion cost is used as the target coding prediction mode, where the target coding prediction mode is the optimal coding mode.
  • CU and the optimal mode of the above-mentioned parent CU is the direct SKIP mode; the first rate-distortion cost of the current CU in the affine mode and the second rate-distortion rate cost set in the non-affine modes are then obtained, and the prediction mode corresponding to the minimum rate-distortion cost is used as the target encoding prediction mode of the current CU.
  • this scheme not only reduces the coding time occupied by affine motion estimation in the entire inter-frame mode decision-making process, but also improves the efficiency of determining the prediction mode of the CU, which solves the technical problem in the related art that the efficiency of determining the inter-frame prediction mode of the CU is low.
  • in the above step S302, the above-mentioned affine mode includes an affine merge (Affine Merge) mode, and the above-mentioned judging whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU includes: judging whether there is an affine CU subset whose encoding mode is Affine Merge mode in the adjacent CU set of the current CU.
  • the encoding mode is Affine Merge mode
  • {A, B} is the affine CU subset in Affine Merge mode.
  • the above-mentioned affine mode includes an affine skip mode
  • the above-mentioned judging whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU including: judging whether there is an affine CU subset whose encoding mode is the above-mentioned Affine Skip mode in the adjacent CU set of the above-mentioned current CU.
  • in step S304, if there is an affine CU subset whose encoding mode is an affine mode in the above-mentioned set of adjacent CUs, determining the first rate-distortion cost of the current CU in the above-mentioned affine mode includes:
  • the preset threshold may be, but is not limited to, 30 degrees.
  • the above-mentioned non-affine mode also includes an integer-pixel prediction mode, a sub-pixel prediction mode, and a SKIP mode; in the above-mentioned step S304, acquiring the second rate-distortion rate cost set includes: obtaining the rate-distortion cost of the current CU in each of the integer-pixel prediction mode, the sub-pixel prediction mode, the SKIP mode, and the direct mode, to obtain the second rate-distortion rate cost set.
  • the prediction mode corresponding to the minimum rate-distortion cost is used as the optimal prediction mode.
  • before determining the first rate-distortion cost of the current CU in the affine mode, the method further includes: predicting the motion vector information of the current CU by the following formula:
  • MV x is the motion vector of the control point of the current CU in the horizontal direction
  • MV y is the motion vector of the control point of the current CU in the vertical direction
  • a, b, c, d, e, f are adjustment parameters.
  • given the motion vector information of the control point and the above formula, the motion vector information of all sub-blocks in the current CU can be obtained.
  • the above-mentioned coding mode screening method for video frames includes the following steps:
  • Step 1: Start the motion estimation of the current CU;
  • Step 2: At the division depth of the current CU, determine whether, among the coded CU blocks adjacent to the current CU, there is a CU whose optimal mode is the affine mode (Affine), the affine merge mode (Affine Merge), or the affine skip mode (Affine Skip). If so, go to Step 4; if not, go to Step 3;
  • Step 3: Determine whether the current CU has a parent CU and the optimal mode of the parent CU is Skip mode. If yes, skip Step 5 and go to Step 6; if not, proceed to Step 5;
  • Step 4: Determine whether the difference between the MV angles of the current CU and the adjacent CU blocks exceeds the threshold. If yes, go to Step 5; if not, go to Step 6;
  • Step 5: Using the motion prediction formula of the CU, perform the affine motion estimation process within 3 iterations;
  • Step 6: Carry out the prediction of the non-affine inter-frame modes in the AVS3 standard;
  • Step 7: Select the optimal mode according to the rate-distortion cost;
  • Step 8: End the motion estimation of the current CU.
  • an apparatus for screening a coding mode of a video frame for implementing the method for screening a coding mode of a video frame.
  • the device includes:
  • the first judging unit 702 is configured to judge, according to the encoded data of the current coding unit CU, whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU, where each CU in the affine CU subset is a coded unit.
  • a temporal CU in the reference frame image Col_pic of the current frame image Cur_pic is located at the lower right adjacent to the current CU, and the current CU is adjacent to the 6 spatial CUs {A, B, D, G, C, F}, giving a total of 7 adjacent CUs that form the adjacent CU set {A, B, D, G, C, F, T}.
  • CodingStructure is a data structure in the encoder of AVS3. According to CodingStructure, the CU information of all positions in a frame can be obtained through coordinate operations.
  • the CU information includes encoding information of the CU.
  • regarding an affine CU subset whose encoding mode is affine mode in the adjacent CU set of the current CU: for example, when the encoding mode of CU A and CU B in {A, B, D, G, C, F, T} is affine mode, {A, B} is an affine CU subset.
  • the SKIP mode is an inter-frame preset mode of the non-affine mode in AVS3.
  • the first determining unit 704 is configured to determine a first rate-distortion cost of the current CU in the affine mode when there is an affine CU subset whose encoding mode is an affine mode in the set of adjacent CUs, and Acquiring a second set of rate-distortion rate costs; wherein, the second set of rate-distortion rate costs is a set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard.
  • the motion vector information of the current CU can be preliminarily predicted through the affine mode, and the above-mentioned first rate-distortion cost determined.
  • the optimal mode of the above-mentioned parent CU is the prediction mode that minimizes the rate-distortion cost in each inter-frame prediction mode
  • the optimal mode is SKIP mode
  • the second judging unit 706 is configured to judge whether the current CU has a parent CU and the optimal mode of the parent CU is Direct SKIP mode.
  • the first acquisition unit 708 is configured to acquire the second rate-distortion rate cost set when the current CU has a parent CU and the optimal mode of the parent CU is the SKIP mode; wherein the non-affine mode includes the above-mentioned SKIP mode.
  • the second determining unit 710 is configured to determine a first rate-distortion cost of the current CU in the affine mode when the current CU does not have a parent CU and the optimal mode of the parent CU is the SKIP mode.
  • the third determination unit 712 is configured to use the first rate-distortion cost and the prediction mode corresponding to the smallest rate-distortion cost in the second rate-distortion cost set as the target encoding prediction mode of the current CU.
  • the prediction mode corresponding to the minimum rate-distortion cost is used as the target coding prediction mode, where the target coding prediction mode is the optimal coding mode.
  • CU and the optimal mode of the above-mentioned parent CU is the direct SKIP mode; the first rate-distortion cost of the current CU in the affine mode and the second rate-distortion rate cost set in the non-affine modes are then obtained, and the prediction mode corresponding to the minimum rate-distortion cost is used as the target encoding prediction mode of the current CU.
  • this scheme not only reduces the coding time occupied by affine motion estimation in the entire inter-frame mode decision-making process, but also improves the efficiency of determining the prediction mode of the CU, which solves the technical problem in the related art that the efficiency of determining the inter-frame prediction mode of the CU is low.
  • the above-mentioned affine mode includes an affine merge (Affine Merge) mode
  • the above-mentioned first judging unit 702 specifically includes:
  • the first judging module is configured to judge whether there is an affine CU subset whose encoding mode is Affine Merge mode in the adjacent CU set of the current CU.
  • the above-mentioned affine mode includes an affine skip (Affine Skip) mode
  • the above-mentioned first judging unit 702 also includes:
  • the second judging module is used to judge whether there is an affine CU subset whose encoding mode is Affine Skip mode in the adjacent CU set of the current CU.
  • the above-mentioned first determining unit 704 specifically includes:
  • a third judging module, configured to determine the angle difference between the motion vector of each CU in the adjacent CU set and the motion vector of the current CU when there is an affine CU subset whose encoding mode is an affine mode in the adjacent CU set;
  • the first determining module is configured to determine a first rate-distortion cost of the current CU in the affine mode when there is a CU whose angle difference with the current CU is greater than a preset threshold.
  • the above-mentioned non-affine mode also includes an integer-pixel prediction mode, a sub-pixel prediction mode, and a direct (Direct) mode;
  • the above-mentioned first acquisition unit 708 specifically includes:
  • the obtaining module is configured to separately obtain the rate-distortion cost of the current CU in the integer-pixel prediction mode, the sub-pixel prediction mode, the SKIP mode and the direct mode, and obtain a second rate-distortion rate cost set.
  • the above-mentioned encoding mode screening device for video frames further includes:
  • the prediction unit is used to predict the motion vector information of the above-mentioned current CU by the following formula:
  • MV x is the motion vector of the control point of the current CU in the horizontal direction
  • MV y is the motion vector of the control point of the current CU in the vertical direction
  • a, b, c, d, e, f are adjustment parameters.
  • an electronic device for implementing the encoding mode screening method for video frames above, where the electronic device may be the terminal device or the server shown in FIG. 1 .
  • This embodiment is described by taking the electronic device as a server as an example.
  • the electronic device includes a memory 802 and a processor 804, the memory 802 stores a computer program, and the processor 804 is configured to execute the steps in any one of the above method embodiments through the computer program.
  • the foregoing electronic device may be located in at least one network device among multiple network devices in the computer network.
  • the above-mentioned processor may be configured to execute the following steps through a computer program:
  • the mode is an affine CU subset of the affine mode, and each CU in the above-mentioned affine CU subset is a coded unit;
  • the current CU has a parent CU and the optimal mode of the parent CU is a SKIP mode, acquire the second rate-distortion rate cost set; wherein the non-affine mode includes the SKIP mode;
  • FIG. 8 is only schematic, and the electronic device can also be a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a handheld computer, a mobile Internet device (Mobile Internet Devices, MID), a PAD, or other terminal equipment.
  • FIG. 8 does not limit the structure of the above-mentioned electronic device.
  • the electronic device may also include more or fewer components than those shown in FIG. 8 (such as a network interface, etc.), or have a different configuration from that shown in FIG. 8.
  • the memory 802 can be used to store software programs and modules, such as program instructions/modules corresponding to the video frame encoding mode screening method and device in the embodiment of the present application, and the processor 804 runs the software programs and modules stored in the memory 802 , so as to perform various functional applications and data processing, that is, to realize the above-mentioned coding mode screening method for video frames.
  • the memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 802 may further include memory that is remotely located relative to the processor 804, and these remote memories may be connected to the terminal through a network.
  • the memory 802 may be specifically, but not limited to, used for storing information such as a reference space CU of the current coding unit CU.
  • the memory 802 may include, but is not limited to, the first judging unit 702, the first determining unit 704, the second judging unit 706, the first acquisition unit 708, the second determination unit 710 and the third determination unit 712.
  • it may also include, but not limited to, other module units in the above-mentioned device for screening encoding modes of video frames, and details will not be described in this example.
  • the above-mentioned transmission device 810 is configured to receive or send data via a network.
  • the specific examples of the above-mentioned network may include a wired network and a wireless network.
  • the transmission device 810 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and routers through a network cable so as to communicate with the Internet or a local area network.
  • the transmission device 810 is a radio frequency (Radio Frequency, RF) module, which is used to communicate with the Internet in a wireless manner.
  • the above-mentioned electronic device further includes: a display 808 for displaying the reference space CU information of the above-mentioned current coding unit CU; and a connection bus 810 for connecting various module components in the above-mentioned electronic device.
  • the above-mentioned terminal device or server may be a node in a distributed system, wherein the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by multiple nodes connected through network communication.
  • nodes can form a peer-to-peer (P2P, Peer To Peer) network, and any form of computing equipment, such as servers, terminals and other electronic devices, can become a node in the blockchain system by joining the peer-to-peer network.
  • the present application also provides a computer program product or computer program, the computer program product or computer program comprising computer instructions stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the above method for screening coding modes of video frames.
  • the computer program is configured to execute the steps in any one of the above method embodiments when running.
  • the above-mentioned computer-readable storage medium may be configured to store a computer program for performing the following steps:
  • the current CU has a parent CU and the optimal mode of the parent CU is a SKIP mode, acquire the second rate-distortion rate cost set; wherein the non-affine mode includes the SKIP mode;
  • the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • if the integrated units in the above embodiments are realized in the form of software function units and sold or used as independent products, they can be stored in the above computer-readable storage medium.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that make one or more computer devices (which may be personal computers, servers or network devices, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the disclosed client can be implemented in other ways.
  • the device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation: multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
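
As an illustration only, Steps 1 to 8 listed earlier in this section can be read as a gate that decides whether affine motion estimation (Step 5) is run for the current CU. The Python sketch below is not taken from the application or from any AVS3 encoder: the `CU` class, the mode labels in `AFFINE_MODES`, the helper names `mv_angle_diff_deg` and `should_run_affine_me`, and the use of one representative motion vector per CU are all assumptions made for the example. The 30-degree default follows the example threshold mentioned in the text.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

MotionVector = Tuple[float, float]

# Hypothetical labels for the affine modes named in Step 2.
AFFINE_MODES = frozenset({"AFFINE", "AFFINE_MERGE", "AFFINE_SKIP"})


@dataclass
class CU:
    """Minimal stand-in for an already-coded CU (assumed structure)."""
    best_mode: str
    mv: MotionVector


def mv_angle_diff_deg(a: MotionVector, b: MotionVector) -> float:
    """Absolute angle between two motion vectors, folded into [0, 180] degrees."""
    diff = math.degrees(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    return abs((diff + 180.0) % 360.0 - 180.0)


def should_run_affine_me(
    cur_mv: MotionVector,
    neighbours: Sequence[CU],
    parent_best_mode: Optional[str],
    threshold_deg: float = 30.0,
) -> bool:
    """Return True when Step 5 (affine motion estimation) should be executed."""
    # Step 2: is any adjacent coded CU in an affine mode?
    if any(n.best_mode in AFFINE_MODES for n in neighbours):
        # Step 4: run affine ME only if some neighbour's MV direction differs
        # from the current CU's by more than the threshold.
        return any(mv_angle_diff_deg(cur_mv, n.mv) > threshold_deg for n in neighbours)
    # Step 3: with no affine neighbour, skip affine ME when the parent CU
    # exists and its best mode is SKIP; otherwise run it.
    return parent_best_mode != "SKIP"
```

In this reading, `parent_best_mode=None` stands for "the current CU has no parent CU". Whichever branch is taken, the non-affine modes of Step 6 are still evaluated, so the gate only removes the affine search, never the baseline candidates.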

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

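The present application discloses a method, device and electronic equipment for screening encoding modes of video frames. The method includes: judging, according to the coding data of the current coding unit (CU), whether there is an affine CU subset whose coding mode is an affine mode in the adjacent CU set of the current CU; determining a first rate-distortion cost of the current CU in the affine mode, and acquiring a second rate-distortion rate cost set; in the case that there is no affine CU subset whose coding mode is an affine mode in the adjacent CU set, and the current CU has a parent CU and the optimal mode of the parent CU is the SKIP mode, acquiring the second rate-distortion rate cost set; and using the prediction mode corresponding to the minimum rate-distortion cost among the first rate-distortion cost and the second rate-distortion rate cost set as the target coding prediction mode of the current CU. The present application solves the technical problem in the related art that the efficiency of determining the inter-frame prediction mode of a CU is low.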

Description

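Method, device and electronic equipment for screening encoding modes of video frames
Technical Field
The present application relates to the field of video encoding and decoding, and in particular to a method, device and electronic equipment for screening encoding modes of video frames.
Background
The main function of video codec technology is to pursue the highest possible video reconstruction quality and the highest possible compression ratio within the available computing resources. One example is Advanced Video Coding (AVS).
Affine-based motion compensation is a displacement transformation model for irregular motion such as fade-in, fade-out, rotation, and scaling; it addresses the inaccurate motion compensation of translational transformation models. Affine-based motion compensation includes the Affine Merge mode (affine merging mode) and the affine motion estimation mode (Affine Motion Estimation), both of which are part of the inter-frame mode selection process. Usually, affine motion estimation and ordinary motion estimation are evaluated together to calculate the rate-distortion cost RDCost. As a newer inter-frame prediction function, affine motion estimation increases the time complexity of the video frame encoder and requires more hardware resources.
In the current AVS3 (third-generation Advanced Video Coding) standard, the affine motion estimation process is required in the inter-frame mode decision for every coding unit (Coding Unit, CU) with a size greater than or equal to 16*16. Because affine motion estimation has higher computational complexity than the other inter-frame modes, it greatly increases the encoding time, which makes determining the inter-frame prediction mode of the CU inefficient.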
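Summary
Embodiments of this application provide a method, an apparatus, and an electronic device for screening encoding modes of video frames, to at least solve the technical problem in the related art that determining the inter prediction mode of a CU is inefficient.
According to one aspect of the embodiments, a method for screening encoding modes of video frames is provided, including: determining, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit; when the neighboring CU set contains such an affine CU subset, determining a first rate-distortion cost of the current CU in the affine mode and obtaining a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard; when the neighboring CU set contains no such affine CU subset, determining whether the current CU has a parent CU whose best mode is SKIP mode; when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, obtaining the second rate-distortion cost set, where the non-affine modes include SKIP mode; when the current CU does not have a parent CU whose best mode is SKIP mode, determining the first rate-distortion cost of the current CU in the affine mode; and taking the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
According to another aspect of the embodiments, an apparatus for screening encoding modes of video frames is also provided, including: a first judging unit, configured to determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit; a first determining unit, configured to, when the neighboring CU set contains such an affine CU subset, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard; a second judging unit, configured to, when the neighboring CU set contains no such affine CU subset, determine whether the current CU has a parent CU whose best mode is SKIP mode; a first obtaining unit, configured to obtain the second rate-distortion cost set when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, where the non-affine modes include SKIP mode; a second determining unit, configured to determine the first rate-distortion cost of the current CU in the affine mode when the current CU does not have a parent CU whose best mode is SKIP mode; and a third determining unit, configured to take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
According to yet another aspect of the embodiments, a computer-readable storage medium is also provided. The storage medium stores a computer program that is configured to perform the above method for screening encoding modes of video frames when run.
According to yet another aspect of the embodiments, an electronic device is also provided, including a memory and a processor. The memory stores a computer program, and the processor is configured to perform the above method for screening encoding modes of video frames by means of the computer program.
In the embodiments of this application, the encoding data of the current coding unit (CU) is used to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, and whether the current CU has a parent CU whose best mode is SKIP mode. The first rate-distortion cost of the current CU in the affine mode and the second rate-distortion cost set in the non-affine modes are then obtained, and the prediction mode with the smallest rate-distortion cost is taken as the target coding prediction mode of the current CU. Because the affine motion estimation process is simplified, the scheme reduces the encoding time that affine motion estimation takes up within inter mode decision and makes determining the prediction mode of a CU more efficient, which solves the technical problem in the related art that determining the inter prediction mode of a CU is inefficient.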
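Brief Description of the Drawings
The drawings described here provide a further understanding of this application and form part of it. The illustrative embodiments of this application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of an application environment of an optional method for screening encoding modes of video frames according to an embodiment;
Fig. 2 is a schematic diagram of an application environment of another optional method for screening encoding modes of video frames according to an embodiment;
Fig. 3 is a schematic flowchart of an optional method for screening encoding modes of video frames according to an embodiment;
Fig. 4 is a schematic diagram of the spatial CUs adjacent to the current CU according to an embodiment;
Fig. 5 is a schematic diagram of the temporal CU of the current CU according to an embodiment;
Fig. 6 is a schematic flowchart of another optional method for screening encoding modes of video frames according to an embodiment;
Fig. 7 is a schematic structural diagram of an optional apparatus for screening encoding modes of video frames according to an embodiment;
Fig. 8 is a schematic structural diagram of an optional electronic device according to an embodiment of this application.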
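Detailed Description
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the drawings distinguish similar objects and do not necessarily describe a particular order or sequence. Data used in this way may be interchanged where appropriate, so that the embodiments described here can be carried out in orders other than those illustrated or described. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to one aspect of the embodiments, a method for screening encoding modes of video frames is provided. As an optional implementation, the method may be applied, without limitation, in the hardware environment shown in Fig. 1. The environment includes a terminal device 102 that interacts with a user, a network 104, and a server 106. A user 108 can interact with the terminal device 102, which runs a client application for screening encoding modes of video frames. The terminal device 102 includes a human-computer interaction screen 1022, a processor 1024, and a memory 1026. The screen 1022 presents the video frame processing interface. The processor 1024 determines whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, and/or whether the current CU has a parent CU whose best mode is SKIP mode. The memory 1026 stores the encoding data of the current coding unit (CU).
In addition, the server 106 includes a database 1062 and a processing engine 1064. The database 1062 stores the encoding data of the current CU and the AVS3 encoding modes. The processing engine 1064 determines, from the encoding data of the current CU, whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit; when the neighboring CU set contains such an affine CU subset, it determines a first rate-distortion cost of the current CU in the affine mode and obtains a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard; when the neighboring CU set contains no such affine CU subset, it determines whether the current CU has a parent CU whose best mode is SKIP mode; when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, it obtains the second rate-distortion cost set, where the non-affine modes include SKIP mode; when the current CU does not have a parent CU whose best mode is SKIP mode, it determines the first rate-distortion cost of the current CU in the affine mode; and it takes the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
As another optional implementation, the method may be applied in the environment of Fig. 2. As shown in Fig. 2, a user 202 can interact with a user device 204, which contains a memory 206 and a processor 208. In this embodiment the user device 204 may, without limitation, perform the operations described above for the terminal device 102 to obtain the target coding prediction mode of the current CU.
Optionally, the terminal device 102 and the user device 204 may be, without limitation, a mobile phone, tablet, laptop, PC, or similar terminal. The network 104 may include, without limitation, a wireless network or a wired network. The wireless network includes Wi-Fi and other networks that provide wireless communication; the wired network may include, without limitation, wide area networks, metropolitan area networks, and local area networks. The server 106 may include, without limitation, any hardware device capable of computation; it may be a single server, a cluster of servers, or a cloud server. The above is only an example, and this embodiment places no limitation on it.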
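Optionally, in one or more embodiments, as shown in Fig. 3, the method for screening encoding modes of video frames includes:
S302: determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit.
In this embodiment, as shown in Figs. 4 and 5 and following the AVS3 standard, Col_pic is the reference picture of the current picture Cur_pic. The one co-located temporal CU, T, of the current CU (Cur) lies adjacent to the bottom right of the current CU. Together with the six spatial CUs {A, B, D, G, C, F} adjacent to the current CU, this gives seven neighboring CUs, which form the neighboring CU set {A, B, D, G, C, F, T}.
CodingStructure is a data structure inside the AVS3 encoder. Through coordinate operations on CodingStructure, the CU information at any position in a frame can be obtained. The CU information includes the encoding information of the CU.
The neighboring CU set of the current CU is then checked for an affine CU subset whose encoding mode is an affine mode. For example, if CU A and CU B in {A, B, D, G, C, F, T} were encoded in an affine mode, {A, B} is the affine CU subset.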
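The subset check in S302 can be sketched as follows. This is a minimal illustration, not encoder code: the `CodedCU` class, the mode strings, and the use of `None` for a neighbor that is not yet encoded are all hypothetical stand-ins for whatever the encoder's CodingStructure actually exposes.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical mode labels; a real encoder would use its own enum.
AFFINE_MODES = {"AFFINE", "AFFINE_MERGE", "AFFINE_SKIP"}


@dataclass
class CodedCU:
    mode: str  # best prediction mode chosen when this CU was encoded


def affine_subset(neighbors: Dict[str, Optional[CodedCU]]) -> Dict[str, CodedCU]:
    """Return the neighbors that are already encoded and used an affine mode."""
    return {
        name: cu
        for name, cu in neighbors.items()
        if cu is not None and cu.mode in AFFINE_MODES
    }


# Six spatial neighbors plus the co-located temporal neighbor T (made-up modes).
neighbors = {
    "A": CodedCU("AFFINE_MERGE"), "B": CodedCU("AFFINE"), "D": CodedCU("SKIP"),
    "G": None, "C": CodedCU("DIRECT"), "F": CodedCU("INTER"), "T": CodedCU("SKIP"),
}
subset = affine_subset(neighbors)
print(sorted(subset))  # ['A', 'B']
```

An empty result corresponds to the "no affine CU subset" branch handled in S306.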
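In this embodiment, SKIP mode is a non-affine inter prediction mode in AVS3.
S304: when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard.
When the neighboring CU set contains such an affine CU subset, the motion vector information of the current CU can be tentatively predicted in the affine mode, and the first rate-distortion cost is obtained from that motion vector information.
Note that the best mode of the parent CU is the prediction mode with the smallest rate-distortion cost among the inter prediction modes. When the current CU has a parent CU whose best mode is SKIP mode, it can be tentatively concluded that the best mode of the current CU is not an affine mode. The rate-distortion cost of the current CU in each non-affine mode of the AVS3 standard is then obtained in turn, giving the second rate-distortion cost set.
S306: when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determine whether the current CU has a parent CU whose best mode is SKIP mode.
S308: when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, obtain the second rate-distortion cost set, where the non-affine modes include SKIP mode.
S310: when the current CU does not have a parent CU whose best mode is SKIP mode, determine the first rate-distortion cost of the current CU in the affine mode.
S312: take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
In this embodiment, the rate-distortion cost of the current CU in the affine mode is compared with its rate-distortion costs in the non-affine modes, and the prediction mode with the smallest rate-distortion cost is taken as the target coding prediction mode, which is the best encoding mode.
In the embodiments of this application, the encoding data of the current coding unit (CU) is used to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, and whether the current CU has a parent CU whose best mode is SKIP mode. The first rate-distortion cost of the current CU in the affine mode and the second rate-distortion cost set in the non-affine modes are then obtained, and the prediction mode with the smallest rate-distortion cost is taken as the target coding prediction mode of the current CU. Because the affine motion estimation process is simplified, the scheme reduces the encoding time that affine motion estimation takes up within inter mode decision and makes determining the prediction mode of a CU more efficient, which solves the technical problem in the related art that determining the inter prediction mode of a CU is inefficient.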
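In one or more embodiments, in step S302 the affine mode includes Affine Merge mode, and determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode includes: determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Merge mode.
In this embodiment, as shown in Figs. 4 and 5, if for example CU A and CU B in the neighboring CU set {A, B, D, G, C, F, T} were encoded in Affine Merge mode, {A, B} is the affine CU subset for Affine Merge mode.
In one or more embodiments, in step S302 the affine mode includes Affine Skip mode, and determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode includes: determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Skip mode.
Following the AVS3 standard, and as shown in Figs. 4 and 5, if for example CU A and CU B in {A, B, D, G, C, F, T} were encoded in Affine Skip mode, {A, B} is the affine CU subset for Affine Skip mode.
In one or more embodiments, in step S304, determining the first rate-distortion cost of the current CU in the affine mode when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode includes:
when the neighboring CU set contains such an affine CU subset, evaluating the angle difference between the motion vector of each CU in the neighboring CU set and the motion vector of the current CU; and, when any of these angle differences is greater than a preset threshold, determining the first rate-distortion cost of the current CU in the affine mode.
As shown in Figs. 4 and 5, if for example the angle difference between CU A in the neighboring CU set {A, B, D, G, C, F, T} and the current CU is greater than the preset threshold, the motion vector information of the current CU is estimated and the first rate-distortion cost of the current CU in the affine mode is determined. The preset threshold may be, but is not limited to, 30 degrees.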
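The angle test can be illustrated with a short sketch. The text does not say how the angle difference is computed or which motion vector represents the current CU before motion estimation, so the `atan2`-based definition, the function names, and the sample vectors below are assumptions made purely for illustration; only the 30-degree default comes from the text.

```python
import math
from typing import Iterable, Tuple

MV = Tuple[float, float]  # (horizontal, vertical) motion vector components


def angle_between(mv_a: MV, mv_b: MV) -> float:
    """Unsigned angle between two motion vectors, in degrees within [0, 180]."""
    ang_a = math.atan2(mv_a[1], mv_a[0])
    ang_b = math.atan2(mv_b[1], mv_b[0])
    diff = abs(math.degrees(ang_a - ang_b)) % 360.0
    return min(diff, 360.0 - diff)


def exceeds_angle_threshold(cur_mv: MV, neighbor_mvs: Iterable[MV],
                            threshold_deg: float = 30.0) -> bool:
    """True if any neighbor's MV direction differs from cur_mv by more than the threshold."""
    return any(angle_between(cur_mv, mv) > threshold_deg for mv in neighbor_mvs)


print(exceeds_angle_threshold((4.0, 0.0), [(4.0, 0.5), (0.0, 3.0)]))  # True: (0, 3) is 90 degrees away
print(exceeds_angle_threshold((4.0, 0.0), [(4.0, 0.5)]))              # False: about 7 degrees away
```

Note that `math.atan2(0, 0)` returns 0 in Python, so a zero motion vector is treated here as pointing along the horizontal axis; a real encoder would likely handle that case explicitly.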
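In one or more embodiments, the non-affine modes further include integer-pixel prediction mode, fractional-pixel prediction mode, and Direct mode. In step S304, obtaining the second rate-distortion cost set includes: obtaining the rate-distortion cost of the current CU in integer-pixel prediction mode, fractional-pixel prediction mode, SKIP mode, and Direct mode respectively, giving the second rate-distortion cost set.
In this embodiment, once the first rate-distortion cost and the second rate-distortion cost set have been obtained, the prediction mode with the smallest rate-distortion cost is taken as the best prediction mode.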
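The final selection is an argmin over the candidate costs. The sketch below assumes the costs have already been computed elsewhere; the mode labels and the numeric values are hypothetical, and `affine_cost=None` stands for the case where affine motion estimation was skipped.

```python
from typing import Dict, Optional


def select_best_mode(non_affine_costs: Dict[str, float],
                     affine_cost: Optional[float] = None) -> str:
    """Return the mode with the smallest rate-distortion cost among the candidates."""
    candidates = dict(non_affine_costs)
    if affine_cost is not None:
        # The first rate-distortion cost only exists when affine ME was run.
        candidates["AFFINE"] = affine_cost
    return min(candidates, key=candidates.get)


# Made-up costs for the four non-affine modes named in the text.
costs = {"INTEGER_PEL": 120.5, "FRACTIONAL_PEL": 118.0, "SKIP": 131.2, "DIRECT": 125.0}
print(select_best_mode(costs, affine_cost=110.3))  # AFFINE
print(select_best_mode(costs))                     # FRACTIONAL_PEL
```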
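In one or more embodiments, before the first rate-distortion cost of the current CU in the affine mode is determined, the method further includes predicting the motion vector information of the current CU by the following formula:
where MVx is the motion vector of the control point of the current CU in the horizontal direction, MVy is the motion vector of the control point of the current CU in the vertical direction, and a, b, c, d, e, f are adjustment parameters.
In this embodiment, the motion vector information of all sub-blocks in the current CU can be obtained from the known control-point motion vector information and the above formula.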
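The formula itself is not reproduced in this text. One form that is consistent with the six parameters named above is the general six-parameter affine motion model, shown here as an assumption rather than as the application's exact expression. The coordinates (x, y), and the assignment of each parameter to a particular term, are hypothetical.

```latex
\begin{cases}
MV_x = a\,x + b\,y + e \\
MV_y = c\,x + d\,y + f
\end{cases}
```

Under this reading, (x, y) is a position inside the current CU, a to d capture rotation, scaling, and shear, and e and f are the translational offsets. Evaluating the model at each sub-block position would yield that sub-block's motion vector.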
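Based on the above embodiments, in one or more embodiments, as shown in Fig. 6, the method for screening encoding modes of video frames includes the following steps:
Step 1: start motion estimation for the current CU;
Step 2: at the partition depth of the current CU, determine whether any already-encoded CU adjacent to the current CU has Affine mode, Affine Merge mode, or Affine Skip mode as its best mode. If so, go to step 4; if not, go to step 3;
Step 3: determine whether the current CU has a parent CU whose best mode is Skip mode. If so, skip step 5 and go to step 6; if not, continue with step 5;
Step 4: determine whether the angle difference between the motion vectors (MVs) of the current CU and the adjacent CUs exceeds the threshold α. If so, go to step 5; if not, go to step 6;
Step 5: using the CU motion prediction formula, run the affine motion estimation process for at most 3 iterations;
Step 6: perform prediction in the non-affine inter modes of the AVS3 standard;
Step 7: select the best mode according to rate-distortion cost;
Step 8: end motion estimation for the current CU.
In this embodiment, simplifying the affine motion estimation process reduces the encoding time spent on this computationally expensive process.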
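The gating logic of steps 2 to 4 reduces to a small decision function that says whether step 5 (affine motion estimation) runs. The sketch below is an illustration under assumptions: the function name and parameters are hypothetical, `parent_best_mode=None` stands for "no parent CU", and the three inputs are taken as already computed.

```python
from typing import Optional


def should_run_affine_me(has_affine_neighbor: bool,
                         mv_angle_exceeds_threshold: bool,
                         parent_best_mode: Optional[str]) -> bool:
    """Decide whether step 5 (affine motion estimation) is performed."""
    if has_affine_neighbor:
        # Step 4: with an affine neighbor, run affine ME only if the MV angle
        # difference exceeds the threshold.
        return mv_angle_exceeds_threshold
    # Step 3: without an affine neighbor, skip affine ME only when a parent CU
    # exists and its best mode is SKIP (None != "SKIP" covers "no parent").
    return parent_best_mode != "SKIP"


print(should_run_affine_me(True, True, None))       # True
print(should_run_affine_me(True, False, "SKIP"))    # False
print(should_run_affine_me(False, False, "SKIP"))   # False
print(should_run_affine_me(False, False, None))     # True
```

In every case step 6 (the non-affine inter modes) still runs, so the function only controls whether an affine cost joins the comparison in step 7.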
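It should be noted that, for simplicity, each of the foregoing method embodiments is described as a series of actions. Those skilled in the art should understand, however, that this application is not limited by the order of actions described, because under this application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by this application.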
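According to another aspect of the embodiments, an apparatus for screening encoding modes of video frames is also provided, for implementing the above method. As shown in Fig. 7, the apparatus includes:
a first judging unit 702, configured to determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit.
In this embodiment, as shown in Figs. 4 and 5 and following the AVS3 standard, Col_pic is the reference picture of the current picture Cur_pic. The one co-located temporal CU, T, of the current CU (Cur) lies adjacent to the bottom right of the current CU. Together with the six spatial CUs {A, B, D, G, C, F} adjacent to the current CU, this gives seven neighboring CUs, which form the neighboring CU set {A, B, D, G, C, F, T}.
CodingStructure is a data structure inside the AVS3 encoder. Through coordinate operations on CodingStructure, the CU information at any position in a frame can be obtained. The CU information includes the encoding information of the CU.
The neighboring CU set of the current CU is then checked for an affine CU subset whose encoding mode is an affine mode. For example, if CU A and CU B in {A, B, D, G, C, F, T} were encoded in an affine mode, {A, B} is the affine CU subset.
In this embodiment, SKIP mode is a non-affine inter prediction mode in AVS3.
a first determining unit 704, configured to, when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard.
When the neighboring CU set contains such an affine CU subset, the motion vector information of the current CU can be tentatively predicted in the affine mode, and the first rate-distortion cost is obtained from that motion vector information.
Note that the best mode of the parent CU is the prediction mode with the smallest rate-distortion cost among the inter prediction modes. When the current CU has a parent CU whose best mode is SKIP mode, it can be tentatively concluded that the best mode of the current CU is not an affine mode. The rate-distortion cost of the current CU in each non-affine mode of the AVS3 standard is then obtained in turn, giving the second rate-distortion cost set.
a second judging unit 706, configured to, when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determine whether the current CU has a parent CU whose best mode is SKIP mode.
a first obtaining unit 708, configured to obtain the second rate-distortion cost set when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, where the non-affine modes include SKIP mode.
a second determining unit 710, configured to determine the first rate-distortion cost of the current CU in the affine mode when the current CU does not have a parent CU whose best mode is SKIP mode.
a third determining unit 712, configured to take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
In this embodiment, the rate-distortion cost of the current CU in the affine mode is compared with its rate-distortion costs in the non-affine modes, and the prediction mode with the smallest rate-distortion cost is taken as the target coding prediction mode, which is the best encoding mode.
In the embodiments of this application, the encoding data of the current coding unit (CU) is used to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, and whether the current CU has a parent CU whose best mode is SKIP mode. The first rate-distortion cost of the current CU in the affine mode and the second rate-distortion cost set in the non-affine modes are then obtained, and the prediction mode with the smallest rate-distortion cost is taken as the target coding prediction mode of the current CU. Because the affine motion estimation process is simplified, the scheme reduces the encoding time that affine motion estimation takes up within inter mode decision and makes determining the prediction mode of a CU more efficient, which solves the technical problem in the related art that determining the inter prediction mode of a CU is inefficient.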
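In one or more embodiments, the affine mode includes Affine Merge mode, and the first judging unit 702 specifically includes:
a first judging module, configured to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Merge mode.
In one or more embodiments, the affine mode includes Affine Skip mode, and the first judging unit 702 further includes:
a second judging module, configured to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Skip mode.
In one or more embodiments, the first determining unit 704 specifically includes:
a third judging module, configured to, when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, evaluate the angle difference between the motion vector of each CU in the neighboring CU set and the motion vector of the current CU;
a first determining module, configured to determine the first rate-distortion cost of the current CU in the affine mode when any of these angle differences is greater than a preset threshold.
In one or more embodiments, the non-affine modes further include integer-pixel prediction mode, fractional-pixel prediction mode, and Direct mode, and the first obtaining unit 708 specifically includes:
an obtaining module, configured to obtain the rate-distortion cost of the current CU in integer-pixel prediction mode, fractional-pixel prediction mode, SKIP mode, and Direct mode respectively, giving the second rate-distortion cost set.
In one or more embodiments, the apparatus for screening encoding modes of video frames further includes:
a prediction unit, configured to predict the motion vector information of the current CU by the following formula:
where MVx is the motion vector of the control point of the current CU in the horizontal direction, MVy is the motion vector of the control point of the current CU in the vertical direction, and a, b, c, d, e, f are adjustment parameters.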
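According to yet another aspect of the embodiments, an electronic device for implementing the above method for screening encoding modes of video frames is also provided. The electronic device may be the terminal device or the server shown in Fig. 1; this embodiment takes a server as the example. As shown in Fig. 8, the electronic device includes a memory 802 and a processor 804. The memory 802 stores a computer program, and the processor 804 is configured to perform the steps of any of the above method embodiments by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one of multiple network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to perform the following steps by means of the computer program:
S1: determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit;
S2: when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard;
S3: when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determine whether the current CU has a parent CU whose best mode is SKIP mode;
S4: when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, obtain the second rate-distortion cost set, where the non-affine modes include SKIP mode;
S5: when the current CU does not have a parent CU whose best mode is SKIP mode, determine the first rate-distortion cost of the current CU in the affine mode;
S6: take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
Optionally, a person of ordinary skill in the art will understand that the structure shown in Fig. 8 is only illustrative. The electronic device may also be a terminal device such as a smartphone (for example an Android or iOS phone), a tablet, a palmtop computer, a mobile internet device (MID), or a PAD. Fig. 8 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components than shown in Fig. 8 (such as a network interface), or have a configuration different from that shown in Fig. 8.
The memory 802 may store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for screening encoding modes of video frames in the embodiments of this application. By running the software programs and modules stored in the memory 802, the processor 804 performs various functional applications and data processing, that is, it implements the above method. The memory 802 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 802 may further include memory located remotely from the processor 804, and such remote memory may be connected to the terminal over a network. Examples of such networks include, without limitation, the internet, intranets, local area networks, mobile communication networks, and combinations of them. The memory 802 may specifically, without limitation, store information such as the reference spatial CUs of the current coding unit (CU). As an example, as shown in Fig. 8, the memory 802 may include, without limitation, the first judging unit 702, the first determining unit 704, the second judging unit 706, the first obtaining unit 708, the second determining unit 710, and the third determining unit 712 of the above apparatus. It may also include, without limitation, other module units of the above apparatus, which are not described again in this example.
Optionally, the transmission device 810 receives or sends data over a network. Specific examples of the network include wired and wireless networks. In one example, the transmission device 810 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and a router by a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 810 is a radio frequency (RF) module, which communicates with the internet wirelessly.
In addition, the electronic device further includes a display 808, which displays the reference spatial CU information of the current coding unit (CU), and a connection bus 810, which connects the module components of the electronic device.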
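In other embodiments, the terminal device or server may be a node in a distributed system. The distributed system may be a blockchain system, formed by multiple nodes connected through network communication. The nodes may form a peer-to-peer (P2P) network, and any form of computing device, such as a server, a terminal, or another electronic device, can become a node of the blockchain system by joining the peer-to-peer network.
In one or more embodiments, this application also provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above method for screening encoding modes of video frames. The computer program is configured to perform the steps of any of the above method embodiments when run.
Optionally, in this embodiment, the computer-readable storage medium may be configured to store a computer program for performing the following steps:
S1: determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit;
S2: when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard;
S3: when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determine whether the current CU has a parent CU whose best mode is SKIP mode;
S4: when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, obtain the second rate-distortion cost set, where the non-affine modes include SKIP mode;
S5: when the current CU does not have a parent CU whose best mode is SKIP mode, determine the first rate-distortion cost of the current CU in the affine mode;
S6: take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
Optionally, in this embodiment, a person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, which may include a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and so on. The serial numbers of the embodiments above are for description only and do not indicate the relative merit of the embodiments.
If the integrated units in the above embodiments are implemented as software functional units and sold or used as independent products, they may be stored in the above computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause one or more computer devices (which may be personal computers, servers, network devices, and so on) to perform all or some of the steps of the methods described in the embodiments of this application.
In the above embodiments of this application, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, see the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be made through some interfaces, and the indirect coupling or communication connection between units or modules may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred implementations of this application. It should be pointed out that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of this application, and these improvements and refinements should also be regarded as falling within the scope of protection of this application.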

Claims (10)

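  1. A method for screening encoding modes of video frames, characterized by comprising:
    determining, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit;
    when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determining a first rate-distortion cost of the current CU in the affine mode and obtaining a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard;
    when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determining whether the current CU has a parent CU whose best mode is SKIP mode;
    when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, obtaining the second rate-distortion cost set, where the non-affine modes include the SKIP mode;
    when the current CU does not have a parent CU whose best mode is SKIP mode, determining the first rate-distortion cost of the current CU in the affine mode;
    taking the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
  2. The method according to claim 1, characterized in that the affine mode includes Affine Merge mode, and determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode comprises:
    determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Merge mode.
  3. The method according to claim 1, characterized in that the affine mode includes Affine Skip mode, and determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode comprises:
    determining whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is the Affine Skip mode.
  4. The method according to claim 1, characterized in that determining the first rate-distortion cost of the current CU in the affine mode when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode comprises:
    when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, evaluating the angle difference between the motion vector of each CU in the neighboring CU set and the motion vector of the current CU;
    when any of these angle differences is greater than a preset threshold, determining the first rate-distortion cost of the current CU in the affine mode.
  5. The method according to any one of claims 1 to 4, characterized in that the non-affine modes further include integer-pixel prediction mode, fractional-pixel prediction mode, and Direct mode; and obtaining the second rate-distortion cost set comprises: obtaining the rate-distortion cost of the current CU in integer-pixel prediction mode, fractional-pixel prediction mode, SKIP mode, and Direct mode respectively, giving the second rate-distortion cost set.
  6. The method according to claim 1, characterized in that, before the first rate-distortion cost of the current CU in the affine mode is determined, the method further comprises:
    predicting the motion vector information of the current CU by the following formula:
    where MVx is the motion vector of the control point of the current CU in the horizontal direction, MVy is the motion vector of the control point of the current CU in the vertical direction, and a, b, c, d, e, f are adjustment parameters.
  7. An apparatus for screening encoding modes of video frames, characterized by comprising:
    a first judging unit, configured to determine, from the encoding data of the current coding unit (CU), whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is an affine mode, where every CU in the affine CU subset is an already-encoded unit;
    a first determining unit, configured to, when the neighboring CU set contains an affine CU subset whose encoding mode is an affine mode, determine a first rate-distortion cost of the current CU in the affine mode and obtain a second rate-distortion cost set, where the second rate-distortion cost set is the set of rate-distortion costs of the current CU in each non-affine mode of the AVS3 standard;
    a second judging unit, configured to, when the neighboring CU set contains no affine CU subset whose encoding mode is an affine mode, determine whether the current CU has a parent CU whose best mode is SKIP mode;
    a first obtaining unit, configured to obtain the second rate-distortion cost set when the current CU has a parent CU and the best mode of the parent CU is SKIP mode, where the non-affine modes include the SKIP mode;
    a second determining unit, configured to determine the first rate-distortion cost of the current CU in the affine mode when the current CU does not have a parent CU whose best mode is SKIP mode;
    a third determining unit, configured to take the prediction mode with the smallest rate-distortion cost among the first rate-distortion cost and the second rate-distortion cost set as the target coding prediction mode of the current CU.
  8. The apparatus according to claim 7, characterized in that the affine mode includes Affine Merge mode, and the first judging unit comprises:
    a first judging module, configured to determine whether the neighboring CU set of the current CU contains an affine CU subset whose encoding mode is Affine Merge mode.
  9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored program, where the program performs the method according to any one of claims 1 to 6 when run.
  10. An electronic device, including a memory and a processor, characterized in that the memory stores a computer program and the processor is configured to perform the method according to any one of claims 1 to 6 by means of the computer program.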
PCT/CN2023/074598 2022-02-07 2023-02-06 Method, apparatus and electronic device for screening encoding modes of video frames WO2023147780A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210116116.6A CN114157868B (zh) 2022-02-07 2022-02-07 Method, apparatus and electronic device for screening encoding modes of video frames
CN202210116116.6 2022-02-07

Publications (1)

Publication Number Publication Date
WO2023147780A1 true WO2023147780A1 (zh) 2023-08-10

Family

ID=80450372

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074598 WO2023147780A1 (zh) 2022-02-07 2023-02-06 Method, apparatus and electronic device for screening encoding modes of video frames

Country Status (2)

Country Link
CN (1) CN114157868B (zh)
WO (1) WO2023147780A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114157868B (zh) * 2022-02-07 2022-07-19 杭州未名信科科技有限公司 Method, apparatus and electronic device for screening encoding modes of video frames

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
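WO2018117334A1 (ko) * 2016-12-21 2018-06-28 전자부품연구원 High-efficiency video coding mode decision method and decision apparatus
US20180270500A1 (en) * 2017-03-14 2018-09-20 Qualcomm Incorporated Affine motion information derivation
CN109788287A (zh) * 2017-11-10 2019-05-21 腾讯科技(深圳)有限公司 Video encoding method, apparatus, computer device and storage medium
CN111698502A (zh) * 2020-06-19 2020-09-22 中南大学 Affine motion estimation acceleration method, device and storage medium based on VVC coding
CN114157868A (zh) * 2022-02-07 2022-03-08 杭州未名信科科技有限公司 Method, apparatus and electronic device for screening encoding modes of video frames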

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10958934B2 (en) * 2018-07-27 2021-03-23 Tencent America LLC History-based affine merge and motion vector prediction
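CN110933427B (zh) * 2018-09-19 2023-05-12 北京字节跳动网络技术有限公司 Mode-dependent adaptive motion vector resolution for affine mode coding
CN111698515B (zh) * 2019-03-14 2023-02-14 华为技术有限公司 Inter prediction method and related apparatus
CN115280779A (zh) * 2020-03-20 2022-11-01 北京达佳互联信息技术有限公司 Method and apparatus for prediction refinement for affine motion compensation
CN111988607B (zh) * 2020-08-07 2023-03-24 北京奇艺世纪科技有限公司 Coding unit processing method, apparatus, electronic device and storage medium
CN112911308B (zh) * 2021-02-01 2022-07-01 重庆邮电大学 Fast motion estimation method for H.266/VVC and storage medium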

Also Published As

Publication number Publication date
CN114157868B (zh) 2022-07-19
CN114157868A (zh) 2022-03-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23749359

Country of ref document: EP

Kind code of ref document: A1