WO2021169817A1 - Video processing method and electronic device - Google Patents

Video processing method and electronic device

Info

Publication number
WO2021169817A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
screen video
resolution
coding
computer
Prior art date
Application number
PCT/CN2021/076414
Other languages
English (en)
French (fr)
Inventor
黎凌宇
王悦
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2021169817A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N 19/96: Tree coding, e.g. quad-tree coding
    • H04N 21/2343: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display

Definitions

  • The embodiments of the present disclosure relate to the field of coding technology and, in particular, to a video processing method and an electronic device.
  • Owing to advantages such as high efficiency, video image signals have become the most important way for people to obtain information in daily life.
  • Screen video is video content captured directly from the image display of computers, mobile phones, and other terminals. It mainly includes computer graphics, text documents, natural video, mixed graphic-and-text images, and computer-generated images. Screen video coding has broad application prospects in desktop sharing, video conferencing, online education, cloud gaming, and other fields.
  • HEVC SCC (Screen Content Coding) is an extension proposal for screen video content on top of HEVC/H.265.
  • The HEVC SCC coding tools mainly include intra block copy (IBC), hash-based motion search (hash motion estimation, hashme for short), palette coding, adaptive color transform (ACT), and so on.
  • Existing video coding usually enables the above coding tools directly for all regions, which involves a large amount of data processing, and the coding effect for non-screen-video regions is not good.
  • Moreover, the coding effects of the above coding tools are similar, yet existing video coding runs all of them and thus performs repeated calculations, which causes computational redundancy.
  • The embodiments of the present disclosure provide a video processing method and an electronic device to overcome the problems of existing video coding: the large amount of data processing, the poor coding effect, and the computational redundancy.
  • In a first aspect, embodiments of the present disclosure provide a video processing method, including:
  • determining whether an image block is screen video content according to the color histogram of the luminance component of the image block after resolution reduction, and/or the prediction modes of adjacent coded or decoded image blocks of the image block;
  • if the image block is screen video content, performing, in a first screen video coding mode, a conversion between the coding unit of the image block and the bit representation of the coding unit;
  • if the conversion fails, performing again, in a second screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.
  • In a second aspect, embodiments of the present disclosure provide a video processing device, including:
  • a determining module, configured to determine whether an image block is screen video content according to the color histogram of the luminance component of the image block after resolution reduction, and/or the prediction modes of adjacent coded or decoded image blocks of the image block;
  • an execution module, configured to, if the image block is screen video content, perform, in the first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit;
  • the execution module being further configured to, if the conversion fails, perform again, in the second screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.
  • In a third aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory;
  • the memory stores computer-executable instructions;
  • the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor executes the video processing method described in the first aspect and its various possible designs.
  • In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium that stores computer-executable instructions. When a processor executes the computer-executable instructions, the video processing method described in the first aspect and its various possible designs is implemented.
  • In a fifth aspect, embodiments of the present disclosure provide a computer program product, including computer program instructions that cause a computer to execute the video processing method described in the first aspect and its various possible designs.
  • In a sixth aspect, embodiments of the present disclosure provide a computer program that, when run on a computer, causes the computer to execute the video processing method described in the first aspect and its various possible designs.
  • The video processing method and electronic device provided by the embodiments of the present disclosure use reduced-resolution image blocks for subsequent processing, thereby reducing the number of samples and the amount of subsequent data processing. The embodiments further determine whether the reduced-resolution image block is screen video content and, if so, perform screen video coding; that is, the screen video coding tools are enabled only for screen video areas, avoiding the problem that enabling them for all areas yields a poor coding effect in non-screen-video areas.
  • FIG. 1 is a schematic diagram of the architecture of a video processing system provided by an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of another video processing method provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of still another video processing method provided by an embodiment of the disclosure.
  • FIG. 5 is a schematic diagram of reducing resolution through averaging provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of reducing resolution through downsampling according to an embodiment of the disclosure.
  • FIG. 7 is a schematic flowchart of another video processing method provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart of another video processing method provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic flowchart of another video processing method provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a video processing device provided by an embodiment of the disclosure.
  • FIG. 11 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the disclosure.
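The two resolution-reduction schemes of FIGS. 5 and 6 can be sketched as follows. This is an illustrative assumption of how averaging and downsampling might work on a luma block; the 2:1 factor and block size are not fixed by the disclosure.

```python
def reduce_by_average(block):
    """Halve resolution by averaging each non-overlapping 2x2 window (FIG. 5 style)."""
    h, w = len(block), len(block[0])
    return [
        [
            (block[y][x] + block[y][x + 1] +
             block[y + 1][x] + block[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

def reduce_by_subsampling(block):
    """Halve resolution by keeping every other sample in each direction (FIG. 6 style)."""
    return [row[::2] for row in block[::2]]

luma = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
print(reduce_by_average(luma))      # [[10, 20], [30, 40]]
print(reduce_by_subsampling(luma))  # [[10, 20], [30, 40]]
```

Either scheme quarters the number of samples, which is the stated benefit of working on the reduced-resolution block.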
  • Video coding usually refers to processing a sequence of pictures that form a video or video sequence.
  • In the field of video coding, the terms "picture", "frame", and "image" can be used as synonyms.
  • the video encoding in the present disclosure is performed on the source side, and usually includes processing (for example, by compressing) the original video picture to reduce the amount of data required to represent the video picture (thus storing and/or transmitting more efficiently).
  • Video decoding is performed on the destination side and usually involves inverse processing relative to the encoder to reconstruct the video picture.
  • the term "block” may be a part of a picture or frame.
  • VVC: Versatile Video Coding
  • VCEG: ITU-T Video Coding Experts Group
  • MPEG: ISO/IEC Moving Picture Experts Group
  • JCT-VC: Joint Collaborative Team on Video Coding
  • a coding tree unit is split into multiple coding units (Coding Unit, CU for short) by using a quad-tree structure represented as a coding tree.
  • A CU is a coding unit, usually corresponding to an A×B rectangular area containing A×B luminance pixels and the corresponding chrominance pixels, where A is the width of the rectangle and B is the height; A and B may be the same or different.
  • The values of A and B are usually integer powers of 2, such as 128, 64, 32, 16, 8, or 4.
  • A coding unit can obtain a reconstructed image of an A×B rectangular area through decoding processing.
  • The decoding processing usually includes prediction, inverse quantization, inverse transformation, and so on, to generate a predicted image and a residual; the predicted image and the residual are superimposed to obtain the reconstructed image.
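The superposition step above can be sketched minimally: the predicted image and the residual are added sample by sample, with the result clipped to the valid sample range (an 8-bit depth is assumed here for illustration).

```python
def reconstruct(pred, residual, bit_depth=8):
    """Superimpose the predicted image and the residual, clipping to [0, 2^bit_depth - 1]."""
    lo, hi = 0, (1 << bit_depth) - 1
    return [
        [min(hi, max(lo, p + r)) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(pred, residual)
    ]

pred     = [[100, 200], [50, 250]]
residual = [[10, 60], [-60, 10]]
print(reconstruct(pred, residual))  # [[110, 255], [0, 255]]
```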
  • A CTU is a coding tree unit; an image is composed of multiple CTUs.
  • A CTU usually corresponds to a square image area and contains the luminance pixels and chrominance pixels in this image area (or it may contain only luminance pixels, or only chrominance pixels).
  • The CTU also contains syntax elements, which indicate how to divide the CTU into at least one CU and how to decode each coding unit to obtain a reconstructed image.
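The quad-tree division of a CTU into CUs described above can be sketched as a recursion: a per-area split decision (modeled here as a predicate; in a real bitstream it comes from the CTU's syntax elements) either keeps the area as one CU or divides it into four quadrants. The sizes and predicate are illustrative assumptions.

```python
def split_ctu(x, y, size, min_size, should_split):
    """Return the list of (x, y, size) CUs covering the square area at (x, y)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        cus = []
        for dy in (0, half):
            for dx in (0, half):
                cus += split_ctu(x + dx, y + dy, half, min_size, should_split)
        return cus
    return [(x, y, size)]

# Split a 64x64 CTU, then split only its top-left 32x32 quadrant one level further.
cus = split_ctu(0, 0, 64, 8,
                lambda x, y, s: (x, y, s) in {(0, 0, 64), (0, 0, 32)})
print(len(cus))  # 7 CUs: four 16x16 plus three 32x32
```

The CUs always tile the CTU exactly, so the areas of the leaves sum to 64×64 regardless of how the predicate splits.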
  • Existing screen video content is video content captured directly from the image display of computers, mobile phones, and other terminals, mainly including computer graphics, text documents, natural video, mixed graphic-and-text images, computer-generated images, and so on.
  • HEVC SCC is an extension proposal for screen video content on top of HEVC/H.265.
  • The HEVC SCC coding tools mainly include IBC, hashme, palette, ACT, and so on.
  • Existing video coding usually enables the above coding tools directly for all regions, which involves a large amount of data processing, and the coding effect for non-screen-video regions is not good.
  • The coding effects of the above coding tools are similar, yet existing video coding runs all of them and performs repeated calculations, which causes computational redundancy.
  • The present disclosure provides a video processing method that uses reduced-resolution image blocks for subsequent processing, thereby reducing the number of samples and the amount of subsequent data processing, and that enables the screen video coding tools only for screen video areas, avoiding the problem that enabling them for all areas yields a poor coding effect in non-screen-video areas.
  • In addition, after confirming that the image block is screen video content, the method first attempts the conversion between the coding unit of the image block and the bit representation of the coding unit in the first screen video coding mode; if that conversion fails, the process is repeated in the second screen video coding mode, thereby avoiding the repeated calculations, and hence the computational redundancy, of running multiple screen video coding tools.
  • The video processing method provided by the present disclosure can be applied to the video processing system architecture shown in FIG. 1.
  • The video processing system 10 includes a source device 12 and a target device 14. The source device 12 includes an image acquisition device 121, a preprocessor 122, an encoder 123, and a communication interface 124.
  • the target device 14 includes a display device 141, a processor 142, a decoder 143, and a communication interface 144.
  • the source device 12 sends the encoded data 13 obtained by encoding to the target device 14.
  • the method of the present disclosure is applied to the encoder 123.
  • The source device 12 may be referred to as a video encoding device or video encoding apparatus.
  • The target device 14 may be referred to as a video decoding device or video decoding apparatus.
  • The source device 12 and the target device 14 may both be examples of video coding devices.
  • The source device 12 and the target device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example a notebook or laptop computer, mobile phone, smartphone, tablet computer, video camera, desktop computer, set-top box, television, display device, digital media player, video game console, video streaming device (such as a content service server or content distribution server), broadcast receiver device, or broadcast transmitter device, and may use any type of operating system or none at all.
  • source device 12 and target device 14 may be equipped for wireless communication. Therefore, the source device 12 and the target device 14 may be wireless communication devices.
  • The video processing system 10 shown in FIG. 1 is only an example, and the technology of the present disclosure can be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices.
  • the data can be retrieved from local storage, streamed on the network, etc.
  • the video encoding device can encode data and store the data to the memory, and/or the video decoding device can retrieve the data from the memory and decode the data.
  • encoding and decoding are performed by devices that do not communicate with each other but only encode data to and/or retrieve data from the memory and decode the data.
  • the encoder 123 of the video processing system 10 may also be referred to as a video encoder, and the decoder 143 may also be referred to as a video decoder.
  • The picture acquisition device 121 may include or be any type of picture capture device, for example one for capturing a real-world picture, and/or any type of device for generating pictures or comments (for screen content encoding, some text on the screen is also considered part of the picture or image to be encoded), for example a computer graphics processor for generating a computer animation picture, or any type of device for obtaining and/or providing a real-world picture or a computer animation picture (for example screen content or a virtual reality (VR) picture), and/or any combination thereof (for example an augmented reality (AR) picture).
  • the picture is or can be regarded as a two-dimensional array or matrix of sampling points with brightness values.
  • The sampling points in the array may also be called pixels or pels (picture elements).
  • the number of sampling points of the array in the horizontal and vertical directions (or axis) defines the size and/or resolution of the picture.
  • three color components are usually used, that is, pictures can be represented as or contain three sample arrays.
  • RGB Red Green Blue
  • a picture includes corresponding red, green, and blue sample arrays.
  • Each pixel is usually represented in a luminance/chrominance format or color space, for example YUV, including the luminance (luma) component indicated by Y (sometimes also by L) and the two chrominance (chroma) components indicated by U and V (sometimes also by Cb and Cr).
  • the luminance component Y represents luminance or gray level intensity (for example, the two are the same in a grayscale picture), and the two chrominance components U and V represent chrominance or color information components.
  • a picture in the YUV format includes a luminance sample array of the luminance component (Y), and two chrominance sample arrays of the chrominance component (U and V).
  • Pictures in RGB format can be converted or transformed to YUV format, and vice versa; this process is also known as color transformation or conversion.
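The RGB-to-YUV conversion mentioned above can be illustrated with the common BT.601 full-range coefficients. The disclosure does not fix a particular conversion matrix, so the matrix below is an illustrative assumption, not the encoder's mandated one.

```python
def rgb_to_yuv(r, g, b):
    """BT.601 full-range RGB -> YUV: Y carries luminance, U and V carry chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return y, u, v

# Pure white: all luminance, (near-)zero chrominance, matching the description
# that Y represents gray-level intensity while U and V carry color information.
y, u, v = rgb_to_yuv(255, 255, 255)
print(round(y), round(u), round(v))  # 255 0 0
```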
  • The picture acquisition device 121 may be, for example, a camera for capturing pictures, or a memory such as a picture memory that includes or stores previously captured or generated pictures, and/or any type of (internal or external) interface for acquiring or receiving pictures.
  • the camera may be, for example, an integrated camera that is local or integrated in the source device, and the memory may be local or, for example, an integrated memory that is integrated in the source device.
  • the interface may be, for example, an external interface for receiving pictures from an external video source.
  • The external video source is, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device; the external picture generating device is, for example, an external computer graphics processor, computer, or server.
  • the interface can be any type of interface according to any proprietary or standardized interface protocol, such as a wired or wireless interface, and an optical interface.
  • the interface for acquiring the picture data 125 in FIG. 1 may be the same interface as the communication interface 124 or a part of the communication interface 124.
  • the picture data 125 (for example, video data) may be referred to as original picture or original picture data.
  • the pre-processor 122 is used to receive the picture data 125 and perform pre-processing on the picture data 125 to obtain a pre-processed picture (or pre-processed picture data) 126.
  • the preprocessing performed by the preprocessor 122 may include trimming, color format conversion (for example, conversion from RGB to YUV), toning or denoising. It can be understood that the pre-processor 122 may be an optional component.
  • the encoder 123 (eg, a video encoder) is used to receive pre-processed pictures (or pre-processed picture data) 126 and provide encoded picture data 127.
  • The communication interface 124 of the source device 12 can be used to receive the encoded picture data 127 and transmit it to other devices, for example the target device 14 or any other device, for storage or direct reconstruction, or to store the encoded data 13 and/or process the encoded picture data 127 before transmitting the encoded data 13 to other devices, such as the target device 14 or any other device, for decoding or storage.
  • the communication interface 144 of the target device 14 is used, for example, to directly receive the encoded picture data 127 or the encoded data 13 from the source device 12 or any other source. Any other source is, for example, a storage device, and the storage device is, for example, an encoded picture data storage device.
  • the communication interface 124 and the communication interface 144 can be used to directly communicate through the direct communication link between the source device 12 and the target device 14 or through any type of network to transmit or receive the encoded picture data 127 or the encoded data 13
  • the link is, for example, a direct wired or wireless connection, and any type of network is, for example, a wired or wireless network or any combination thereof, or any type of private network and public network, or any combination thereof.
  • the communication interface 124 may be used, for example, to encapsulate the encoded picture data 127 into a suitable format, such as a packet, for transmission on a communication link or communication network.
  • the communication interface 144 forming the corresponding part of the communication interface 124 may be used, for example, to decapsulate the encoded data 13 to obtain the encoded picture data 127.
  • Both the communication interface 124 and the communication interface 144 can be configured as one-way communication interfaces, as indicated by the arrow pointing from the source device 12 to the target device 14 for the encoded picture data 127 in FIG. 1, or as two-way communication interfaces, and can be used, for example, to send and receive messages to establish a connection and to confirm and exchange any other information related to the communication link and/or data transmission, such as the transmission of encoded picture data.
  • the decoder 143 is used to receive encoded picture data 127 and provide decoded picture data (or decoded picture) 145.
  • the processor 142 of the target device 14 is used to post-process decoded picture data (or decoded picture) 145, for example, a decoded picture, to obtain post-processed picture data 146, for example, a post-processed picture.
  • The post-processing performed by the processor 142 may include, for example, color format conversion (for example, conversion from YUV to RGB), toning, trimming, or resampling, or any other processing for preparing the decoded picture data (or decoded picture) 145 for display by, for example, the display device 141.
  • The display device 141 of the target device 14 is used to receive the post-processed picture data 146 to display the picture to, for example, a user or viewer.
  • the display device 141 may be or may include any type of display for presenting the reconstructed picture, for example, an integrated or external display or monitor.
  • The display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro-LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
  • Although FIG. 1 depicts the source device 12 and the target device 14 as separate devices, a device embodiment may also include both the source device 12 and the target device 14, or the functionality of both, that is, the source device 12 or corresponding functionality and the target device 14 or corresponding functionality.
  • the same hardware and/or software may be used, or separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or the corresponding functionality and the target device 14 or the corresponding functionality.
  • the functionality of different units or the existence and (accurate) division of the functionality of the source device 12 and/or the target device 14 shown in FIG. 1 may vary according to actual devices and applications.
  • Each of the encoder 123 (e.g., a video encoder) and the decoder 143 (e.g., a video decoder) may be implemented as any of various suitable circuits, for example one or more microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), discrete logic, hardware, or any combination thereof.
  • If the technology is implemented partially in software, the device can store the software's instructions in a suitable non-transitory computer-readable storage medium, and can use one or more processors to execute the instructions in hardware to carry out the technology of the present disclosure.
  • Each of the encoder 123 and the decoder 143 may be included in one or more encoders or decoders, and either may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.
  • the decoder 143 may be used to perform the reverse process.
  • the decoder 143 can be used to receive and parse such syntax elements, and decode related video data accordingly.
  • the encoder 123 may entropy encode one or more defined syntax elements into an encoded video bitstream. In such instances, the decoder 143 can parse such syntax elements and decode related video data accordingly.
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
  • the execution subject of the embodiment of the present disclosure may be the encoder in the foregoing embodiment.
  • the method may include:
  • S201 Determine whether the image block is screen video content according to the color histogram of the brightness component of the image block after the resolution is reduced, and/or the prediction mode of the adjacent coded or decoded image block of the image block.
  • Generally, one image block corresponds to one coding unit (CU), and in some cases the image block may also be referred to as a CU. As described above, one image is composed of multiple coding tree units (CTUs), and one CTU can be split into multiple CUs; that is, one CTU can be split into multiple image blocks.
  • In one possible implementation, determining whether the image block is screen video content may include: determining whether the image block is screen video content according to characteristics of the color histogram of the luminance component, such as the maximum and minimum non-zero values; and/or determining whether the image block is screen video content according to whether the prediction mode of an adjacent coded or decoded image block of the image block is a preset mode.
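The histogram criterion of S201 can be sketched as follows. The intuition is that screen content (text, graphics) concentrates its luma samples in a few histogram bins, while camera-captured content spreads them across many. The bin-count threshold below is a hypothetical parameter chosen for illustration; the disclosure does not specify concrete threshold values.

```python
def luma_histogram(samples, bins=256):
    """Color histogram of the luminance component: count of samples per luma value."""
    hist = [0] * bins
    for s in samples:
        hist[s] += 1
    return hist

def looks_like_screen_content(samples, max_nonzero_bins=8):
    """Heuristic: few occupied luma bins suggests screen content (hypothetical threshold)."""
    hist = luma_histogram(samples)
    return sum(1 for h in hist if h > 0) <= max_nonzero_bins

text_like   = [0] * 50 + [255] * 14  # two-tone block, e.g. black text on white
camera_like = list(range(64))        # smoothly varying natural luma
print(looks_like_screen_content(text_like))    # True
print(looks_like_screen_content(camera_like))  # False
```

In the same spirit, the maximum and minimum non-zero histogram values mentioned above could feed additional checks, and the neighbor-prediction-mode criterion is a separate signal that can be combined with this one.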
  • The way of determining whether the image block is screen video content described in the embodiments of the present disclosure may also exist independently of the method shown in FIG. 2.
  • In this way, the screen video coding tools can subsequently be enabled only for screen video areas, avoiding the problems that enabling them for all areas wastes computing power and that the coding effect for non-screen-video areas is poor.
  • S202: If the image block is screen video content, perform, in the first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit. The first screen video coding mode can be set according to actual conditions, for example the IBC coding mode or the hashme mode.
  • the encoder first tries the first screen video processing mode, collects information required by the first screen video processing mode, and performs encoding and decoding of the first screen video processing mode.
  • The foregoing conversion between the coding unit of the image block and the bit representation of the coding unit may be either the encoding or the decoding process of the image block: converting the coding unit into its bit representation is the encoding process of the image block, and converting the bit representation back into the coding unit is the decoding process of the image block.
  • step S203 is performed.
  • The second screen video coding mode is different from the first screen video coding mode; after encoding in the first screen video coding mode fails, another screen video coding mode is used for encoding.
  • The embodiments of the present disclosure use reduced-resolution image blocks for subsequent processing, thereby reducing the number of samples and the amount of subsequent data processing. The embodiments further determine whether the reduced-resolution image block is screen video content and, only if it is, perform screen video coding; that is, the screen video coding tool is enabled only for screen video areas, which avoids enabling the tool for all areas and obtaining poor coding results in non-screen video areas. In addition, after the image block is confirmed to be screen video content, the conversion between the coding unit of the image block and the bit representation of the coding unit is first attempted in the first screen video coding mode; if it fails, the process is repeated in the second screen video coding mode, which avoids the computational redundancy of running multiple screen video coding tools over the same block.
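  • The two-stage fallback described above can be sketched in plain Python. All names here (`process_image_block`, the callback parameters) are illustrative placeholders, not interfaces from the patent; the real encoder operates on coding units, not Python lists:

```python
def process_image_block(block, is_screen_content, encode_first, encode_second):
    """Sketch of the decision flow: encode `block` with screen-video tools
    only if it is classified as screen video content.

    Tries the first screen-video coding mode (e.g. IBC/hashme); if that
    conversion fails, falls back to the second mode (e.g. palette).
    Returns a short status string purely for illustration.
    """
    if not is_screen_content(block):
        return "skip"          # screen-video tools stay off for natural video
    if encode_first(block):
        return "first-mode"    # first screen-video mode succeeded
    if encode_second(block):
        return "second-mode"   # fallback avoids re-running the first tool
    return "failed"
```

For example, a block classified as screen content whose first-mode conversion fails would take the `"second-mode"` path, matching steps S202 to S203 above.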
  • FIG. 3 is a schematic flowchart of another video processing method proposed by an embodiment of the disclosure.
  • the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 3, the method includes:
  • S301 Perform resolution reduction processing on the brightness component of the coding tree unit to be processed to obtain a reduced resolution image block.
  • The foregoing coding tree unit to be processed may be determined according to actual conditions, which is not particularly limited in the embodiments of the present disclosure.
  • The luminance component is subjected to resolution reduction processing to obtain a reduced-resolution image block, for example of size (N/2)x(N/2).
  • A code control preprocessing lookahead tool can be set in the encoder to perform resolution reduction on the luminance component of the coding tree unit to be processed, where the lookahead tool can be configured with the process used for reducing the resolution of the CTU luminance component.
  • The specific process here can be set according to the situation, which is not particularly limited in the embodiments of the present disclosure.
  • Before reducing the resolution of the luminance component of the coding tree unit to be processed, the encoder may also determine whether the code control preprocessing lookahead tool is turned on. If it is turned on, the luminance component directly undergoes resolution reduction processing; if it is not turned on, the lookahead tool is first turned on, and then the luminance component of the coding tree unit to be processed is reduced in resolution.
  • S302 Determine whether the image block is screen video content according to the color histogram of the brightness component of the image block after the resolution is reduced, and/or the prediction mode of the adjacent coded or decoded image block of the image block.
  • steps S302-S304 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.
  • The embodiment of the present disclosure reduces the number of samples and the amount of subsequent data processing by performing resolution reduction on the CTU to be processed. It then determines whether the reduced-resolution image block is screen video content and, if so, performs screen video coding; that is, the screen video coding tool is enabled only for screen video areas, avoiding the poor coding results that enabling it for all areas would produce in non-screen video areas.
  • FIG. 4 is a schematic flowchart of another video processing method proposed by an embodiment of the disclosure.
  • the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1.
  • the method includes:
  • S401 Decrease the resolution of the brightness component of the coding tree unit to be processed to a preset resolution, and obtain a reduced-resolution image block.
  • the above-mentioned preset resolution can be set according to actual conditions, which is not particularly limited in the embodiment of the present disclosure.
  • the aforementioned preset resolution is an average resolution of the luminance component of the aforementioned coding tree unit to be processed.
  • the foregoing reduction of the resolution of the luminance component of the coding tree unit to be processed to the preset resolution includes:
  • The resolution of the luminance component of the coding tree unit to be processed is reduced to the above-mentioned average resolution.
  • For example, the luminance component of the coding tree unit to be processed is shown on the left side of FIG. 5, and the resulting reduced-resolution image block is shown on the right side of FIG. 5.
  • S402 Down-sampling the resolution of the brightness component of the coding tree unit to be processed to obtain a reduced-resolution image block.
  • For example, if the size of the luminance component of the coding tree unit to be processed is M*N, it is down-sampled by a factor of s to obtain a reduced-resolution image block of size (M/s)*(N/s). Here s should be a common divisor of M and N, and its specific value can be set according to the actual situation.
  • For example, the luminance component of the coding tree unit to be processed is shown on the left side of FIG. 6, and the resulting reduced-resolution image block is shown on the right side of FIG. 6.
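  • As a rough illustration of the down-sampling in step S402, the following sketch average-pools an M*N luma block by factor s. The averaging filter is an assumption for illustration; the patent does not fix the exact down-sampling filter, and a real encoder would implement this in fixed point:

```python
def downsample_luma(luma, s):
    """Average-pool an M x N luma block (list of rows of integers) by
    factor s, producing an (M/s) x (N/s) block.

    As the text notes, s must be a common divisor of M and N.
    """
    m, n = len(luma), len(luma[0])
    assert m % s == 0 and n % s == 0, "s must divide both M and N"
    out = []
    for i in range(0, m, s):
        row = []
        for j in range(0, n, s):
            # Collect the s x s window and replace it by its integer mean.
            window = [luma[i + di][j + dj] for di in range(s) for dj in range(s)]
            row.append(sum(window) // (s * s))
        out.append(row)
    return out
```

For a 4x4 block down-sampled with s=2, each 2x2 window collapses to one sample, yielding the 2x2 reduced-resolution block used for the screen-content check.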
  • Step S401 and step S402 are alternative steps, and an embodiment of the present disclosure may adopt either one of them to obtain a reduced-resolution image block.
  • The embodiment of the present disclosure may also set the specific manner of performing resolution reduction on the luminance component of the coding tree unit to be processed according to actual conditions, which is not particularly limited in the embodiments of the present disclosure.
  • S403 Determine whether the image block is screen video content according to the color histogram of the brightness component of the image block after the resolution is reduced, and/or the prediction mode of the adjacent coded or decoded image block of the image block.
  • steps S403-S405 are implemented in the same manner as the foregoing steps S201-S203, and will not be repeated here.
  • The embodiments of the present disclosure can perform resolution reduction on the CTU to be processed either by reducing to a preset resolution or by down-sampling, thereby reducing the number of samples and the amount of subsequent data processing. They then determine whether the reduced-resolution image block is screen video content and, if so, perform screen video coding; that is, the screen video coding tool is enabled only for screen video areas, avoiding the poor coding results that enabling it for all areas would produce in non-screen video areas.
  • Furthermore, if conversion in the first screen video coding mode fails, the above process is repeated in the second screen video coding mode, avoiding the computational redundancy of running multiple screen video coding tools over the same block.
  • FIG. 7 is a schematic flowchart of another video processing method proposed by an embodiment of the disclosure.
  • the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 7, the method includes:
  • S701 Determine whether the image block is screen video content according to the number of non-zero values in the color histogram of the brightness component of the image block after the resolution is reduced, and/or the maximum and minimum values of the non-zero values .
  • the image block is screen video content.
  • the above-mentioned preset difference threshold may be set according to actual conditions, which is not particularly limited in the embodiment of the present disclosure.
  • the image block is the screen video content.
  • The preset non-zero values can be set according to the actual situation; for example, the 5 largest values in hist[i] are determined and summed to obtain top5sum.
  • The above-mentioned preset multiple may be set according to actual conditions, which is not particularly limited in the embodiment of the present disclosure. For example, if the above top5sum is greater than beta times the size of the current image block, it is determined that the image block is screen video content, where beta is the preset multiple.
  • the sum of multiple preset non-zero values in the color histogram is determined. If the sum of the foregoing multiple preset non-zero values is greater than a preset multiple of the size of the foregoing image block, it is determined that the foregoing image block is screen video content.
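  • The top5sum criterion above can be sketched as follows. The default values of `beta` and `top_k` are illustrative assumptions, not values from the patent, which leaves both to be set according to the actual situation:

```python
def is_screen_content_hist(luma_flat, beta=0.6, top_k=5):
    """Classify a reduced-resolution luma block via its color histogram.

    Builds hist[i] over the 256 luma levels, sums the top_k largest bins
    (e.g. top5sum), and flags the block as screen content when that sum
    exceeds beta times the block size. Screen content (text, UI) tends to
    concentrate its samples in a few luma levels, so the test passes.
    """
    hist = [0] * 256
    for v in luma_flat:
        hist[v] += 1
    top_sum = sum(sorted(hist, reverse=True)[:top_k])
    return top_sum > beta * len(luma_flat)
```

A flat gray block (one dominant luma level) is flagged as screen content, while a block whose samples spread over many levels, as in natural video, is not.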
  • the embodiment of the present disclosure may also set the foregoing specific method for determining whether the image block is the screen video content according to the actual situation, which is not particularly limited in the embodiment of the present disclosure.
  • In this way, the embodiments of the present disclosure determine that the image block is screen video content, so that the screen video coding tool is subsequently enabled only for screen video areas, avoiding the waste of computing power and the poor coding results for non-screen video areas that arise when the tool is enabled for all areas.
  • steps S702-S703 are implemented in the same manner as the foregoing steps S202-S203, and will not be repeated here.
  • The embodiment of the present disclosure performs resolution reduction on the CTU to be processed, thereby reducing the number of samples and the amount of subsequent data processing. It then determines whether the reduced-resolution image block is screen video content and, if so, performs screen video coding; that is, the screen video coding tool is enabled only for screen video areas, avoiding the poor coding results that enabling it for all areas would produce in non-screen video areas.
  • FIG. 8 is a schematic flowchart of another video processing method proposed by an embodiment of the disclosure.
  • the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 8, the method includes:
  • S801 Determine whether the optimal prediction mode of the adjacent coded or decoded image block of the reduced-resolution image block is a preset intra-frame prediction mode.
  • the above-mentioned adjacent coded or decoded image blocks may be image blocks on the left, top, and top left adjacent to the above-mentioned image blocks.
  • the foregoing preset intra prediction mode can be set according to actual conditions, which is not particularly limited in the embodiment of the present disclosure.
  • Specifically, the encoder judges whether the optimal prediction mode of an image block adjacent to the above image block is one of the 35 intra prediction modes; if not, it determines that the image block is screen video content.
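  • The neighbor-mode criterion can be sketched as below. HEVC defines 35 intra prediction modes (indices 0 to 34: planar, DC, and 33 angular modes); the exact preset mode set is left open by the text, so treating all 35 as the preset is an assumption for illustration, as is checking all available neighbors with `any`:

```python
# Assumed preset: the 35 HEVC intra prediction modes (planar, DC, 33 angular).
HEVC_INTRA_MODES = set(range(35))

def is_screen_content_by_neighbors(neighbor_best_modes):
    """Flag the block as screen content if any adjacent coded/decoded
    block's optimal prediction mode falls outside the preset intra modes,
    e.g. a neighbor that was coded with a screen-video tool rather than
    normal intra prediction.

    `neighbor_best_modes` would hold the best modes of the left, top,
    and top-left neighbors mentioned in the text.
    """
    return any(m not in HEVC_INTRA_MODES for m in neighbor_best_modes)
```

So neighbors coded with ordinary intra modes (say modes 0, 10, 26) leave the flag off, while any mode index outside 0 to 34 turns it on.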
  • the embodiment of the present disclosure may also set the foregoing specific method for determining whether the image block is the screen video content according to the actual situation, which is not particularly limited in the embodiment of the present disclosure.
  • In this way, the embodiments of the present disclosure determine that the image block is screen video content, so that the screen video coding tool is subsequently enabled only for screen video areas, avoiding the waste of computing power and the poor coding results for non-screen video areas that arise when the tool is enabled for all areas.
  • steps S803-S804 are implemented in the same manner as the foregoing steps S202-S203, and will not be repeated here.
  • The embodiment of the present disclosure performs resolution reduction on the CTU to be processed, thereby reducing the number of samples and the amount of subsequent data processing. It then determines whether the reduced-resolution image block is screen video content and, if so, performs screen video coding; that is, the screen video coding tool is enabled only for screen video areas, avoiding the poor coding results that enabling it for all areas would produce in non-screen video areas.
  • the above-mentioned second screen video coding mode is a palette coding mode.
  • Optionally, before performing the conversion between the coding unit of the image block and the bit representation of the coding unit in the second screen video coding mode, resolution reduction processing may be performed on the chrominance component of the image block.
  • FIG. 9 is a schematic flowchart of another video processing method proposed by an embodiment of the disclosure.
  • the execution subject of this embodiment may be the encoder in the embodiment shown in FIG. 1. As shown in Figure 9, the method includes:
  • S901 Determine whether the image block is screen video content according to the color histogram of the brightness component of the image block after the resolution is reduced, and/or the prediction mode of the adjacent coded or decoded image block of the image block.
  • steps S901-S902 are implemented in the same manner as the foregoing steps S201-S202, and will not be repeated here.
  • the code control preprocessing lookahead tool is used to perform resolution reduction processing on the chrominance component of the above image block.
  • the resolution of the chrominance component of the image block may be reduced to a preset resolution, or the resolution of the chrominance component of the image block may be down-sampled.
  • the above-mentioned preset resolution can be set according to actual conditions, which is not particularly limited in the embodiment of the present disclosure.
  • the above-mentioned preset resolution is the average resolution of the chrominance components of the above-mentioned image block.
  • the number of samples is further reduced, and the amount of subsequent data processing is reduced.
  • Specifically, the above-mentioned second screen video coding mode may be a coding mode related to luminance/chrominance or color, such as the palette coding mode.
  • Because the palette coding mode depends on luminance/chrominance or color, resolution reduction is performed on the chrominance component of the image block in this mode to further reduce the number of samples and the amount of subsequent data processing.
  • That is, if the second screen video coding mode is the palette coding mode, the chrominance component of the image block is reduced in resolution; otherwise, the chrominance component of the image block is not reduced in resolution.
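  • The conditional chroma reduction above can be sketched as follows. The function and parameter names are illustrative, and representing the mode by the string `"palette"` is an assumption for this sketch; `downsample` stands for any resolution-reduction function:

```python
def prepare_second_mode_input(luma, chroma, second_mode, downsample):
    """Before retrying in the second screen-video coding mode, optionally
    reduce the chroma resolution.

    Per the text, a luma/chroma/color-related mode such as palette benefits
    from reduced chroma, while other modes leave the chroma untouched.
    """
    if second_mode == "palette":      # color-related mode: shrink chroma too
        chroma = downsample(chroma)
    return luma, chroma
```

For example, with a toy `downsample` that keeps every other sample, only the palette path shrinks the chroma; any other second mode passes it through unchanged.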
  • The embodiment of the present disclosure performs resolution reduction on the CTU to be processed, thereby reducing the number of samples and the amount of subsequent data processing. It determines whether the reduced-resolution image block is screen video content and, if so, performs screen video coding; that is, the screen video coding tool is enabled only for screen video areas, avoiding the poor coding results that enabling it for all areas would produce in non-screen video areas.
  • In addition, the embodiments of the present disclosure can also perform resolution reduction on the chrominance component of the image block, further reducing the number of samples and the amount of subsequent data processing.
  • FIG. 10 is a schematic structural diagram of a video processing device provided in an embodiment of the disclosure.
  • the video processing device 100 includes: a determination module 1001, an execution module 1002, and a processing module 1003.
  • The determining module 1001 is configured to determine whether the image block is screen video content according to the color histogram of the brightness component of the reduced-resolution image block and/or the prediction mode of an adjacent coded or decoded image block of the image block.
  • the execution module 1002 is configured to perform the conversion between the coding unit of the image block and the bit representation of the coding unit in the first screen video coding mode if the image block is the screen video content.
  • the execution module 1002 is further configured to perform the conversion between the coding unit of the image block and the bit representation of the coding unit again in the second screen video coding mode if the conversion fails.
  • the processing module 1003 is configured to perform resolution reduction processing on the brightness component of the coding tree unit to be processed to obtain the image block with the reduced resolution.
  • the processing module 1003 performs resolution reduction processing on the luminance component of the coding tree unit to be processed, including:
  • the code control preprocessing lookahead tool is used to perform resolution reduction processing on the luminance component of the coding tree unit to be processed.
  • the processing module 1003 performs resolution reduction processing on the luminance component of the coding tree unit to be processed, including:
  • the determining module 1001 determines whether the image block is screen video content according to the color histogram of the brightness component of the reduced-resolution image block, including:
  • the image block is the screen video content.
  • The determining module 1001 determines whether the image block is screen video content according to the number of non-zero values in the color histogram and/or the maximum and minimum non-zero values, including:
  • the image block is the screen video content.
  • the determining module 1001 determines whether the image block is screen video content according to the prediction mode of the adjacent encoded or decoded image block of the image block, including:
  • If the optimal prediction mode of the adjacent coded or decoded image block is not the preset intra prediction mode, it is determined that the image block is the screen video content.
  • the second screen video coding mode is a palette coding mode.
  • Before the execution module 1002 re-executes the conversion between the coding unit of the image block and the bit representation of the coding unit in the second screen video coding mode, the method further includes:
  • the chrominance component of the image block is reduced in resolution.
  • the first screen encoding mode is an intra-block copy IBC mode or a hash-based motion search hashme mode.
  • the preset resolution is an average resolution of the luminance component of the coding tree unit to be processed
  • the processing module 1003 reducing the resolution of the luminance component of the coding tree unit to be processed to a preset resolution includes:
  • The resolution of the luminance component of the coding tree unit to be processed is reduced to the average resolution.
  • the device provided in the embodiment of the present disclosure can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and the embodiments of the present disclosure will not be repeated here.
  • The electronic device 1100 may include a processing device (such as a central processing unit or a graphics processor) 1101, which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage device 1108 into a random access memory (RAM) 1103. The RAM 1103 also stores various programs and data required for the operation of the electronic device 1100.
  • the processing device 1101, the ROM 1102, and the RAM 1103 are connected to each other through a bus 1104.
  • An input/output (Input/Output, I/O) interface 1105 is also connected to the bus 1104.
  • The following devices can be connected to the I/O interface 1105: input devices 1106 such as a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; output devices such as an LCD, speaker, or vibrator; the storage device 1108; and the communication device 1109.
  • the communication device 1109 may allow the electronic device 1100 to perform wireless or wired communication with other devices to exchange data.
  • Although FIG. 11 shows an electronic device 1100 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 1109, or installed from the storage device 1108, or installed from the ROM 1102.
  • When the computer program is executed by the processing device 1101, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • Computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein.
  • This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device .
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • the foregoing computer-readable medium carries one or more programs, and when the foregoing one or more programs are executed by the electronic device, the electronic device is caused to execute the method shown in the foregoing embodiment.
  • the computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof.
  • The above-mentioned programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagram may represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagram and/or flowchart, and combinations of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure can be implemented in software or hardware. Wherein, the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the first obtaining unit can also be described as "a unit for obtaining at least two Internet Protocol addresses.”
  • exemplary types of hardware logic components include: Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), Application Specific Standard Parts (ASSP), System on Chip (System on Chip, SOC), Complex Programmable Logic Device (CPLD), etc.
  • a machine-readable medium may be a tangible medium, which may contain or store a program for use by the instruction execution system, apparatus, or device or in combination with the instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • the machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any suitable combination of the foregoing.
  • machine-readable storage media would include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • a video processing method including:
  • the image block is the screen video content, in the first screen video coding mode, performing conversion between the coding unit of the image block and the bit representation of the coding unit;
  • the step of converting between the coding unit of the image block and the bit representation of the coding unit is performed again.
  • the method further includes:
  • the performing resolution reduction processing on the luminance component of the coding tree unit to be processed includes:
  • the code control preprocessing lookahead tool is used to perform resolution reduction processing on the luminance component of the coding tree unit to be processed.
  • the performing resolution reduction processing on the luminance component of the coding tree unit to be processed includes:
  • the determining whether the image block is screen video content according to the color histogram of the brightness component of the reduced-resolution image block includes:
  • The determining whether the image block is screen video content according to the number of non-zero values in the color histogram and/or the maximum and minimum non-zero values includes:
  • the image block is the screen video content.
  • the determining whether the image block is screen video content according to the prediction mode of the adjacent encoded or decoded image block of the image block includes:
  • If the optimal prediction mode of the adjacent coded or decoded image block is not the preset intra prediction mode, it is determined that the image block is the screen video content.
  • the second screen video encoding mode is a palette encoding mode.
  • the method further includes:
  • the first screen encoding mode is an intra-block copy IBC mode or a hash-based motion search hashme mode.
  • the preset resolution is an average resolution of the luminance component of the coding tree unit to be processed
  • the reducing the resolution of the luminance component of the coding tree unit to be processed to a preset resolution includes:
  • The resolution of the luminance component of the coding tree unit to be processed is reduced to the average resolution.
  • a video processing device including:
  • The determining module is used to determine whether the image block is screen video content according to the color histogram of the brightness component of the reduced-resolution image block and/or the prediction mode of an adjacent coded or decoded image block of the image block;
  • the execution module is configured to, if the image block is the screen video content, perform the conversion between the coding unit of the image block and the bit representation of the coding unit in the first screen video coding mode;
  • the execution module is further configured to, if the conversion fails, in the second screen video coding mode, re-execute the step of conversion between the coding unit of the image block and the bit representation of the coding unit.
  • an electronic device including: at least one processor and a memory;
  • the memory stores computer execution instructions
  • the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor executes the video processing method described in the first aspect and various possible designs of the first aspect.
  • a computer-readable storage medium stores computer-executable instructions.
  • a processor executes the computer-executable instructions, The video processing method described in the above first aspect and various possible designs of the first aspect is implemented.
  • a computer program product including computer program instructions that cause a computer to execute the above-mentioned first aspect and various possible designs of the first aspect. The video processing method described.
  • a computer program is provided.
  • the computer program runs on a computer, the computer executes the above-mentioned first aspect and various possible possibilities of the first aspect. Design the described video processing method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of the present disclosure provide a video processing method and an electronic device. The method performs subsequent processing on reduced-resolution image blocks, which lowers the number of samples and the amount of subsequent data processing. The embodiments further determine whether a reduced-resolution image block is screen video content and perform screen video coding only if it is, i.e., the screen video coding tools are enabled only for screen video regions, avoiding the poor coding results that come from enabling them for all regions including non-screen-video regions. In addition, after the image block is determined to be screen video content, the conversion between the coding unit of the image block and the bit representation of the coding unit is first attempted in a first screen video coding mode; if it fails, the process is repeated in a second screen video coding mode, thereby avoiding the computational redundancy of running multiple screen video coding tools on the same block.

Description

Video processing method and electronic device

TECHNICAL FIELD

Embodiments of the present disclosure relate to the field of coding technology, and in particular to a video processing method and an electronic device.

BACKGROUND

With the development of information technology, video services such as high-definition television, web conferencing, Internet Protocol Television (IPTV), and three-dimensional (3D) television have developed rapidly, and video signals, being intuitive and efficient, have become the primary way people obtain information in daily life. Taking screen video as an example, screen video content is video captured directly from the display of a computer, mobile phone, or other terminal, and mainly includes computer graphics, text documents, natural video, mixed graphics-and-text images, and computer-generated images. Screen video coding has broad application prospects in desktop sharing, video conferencing, online education, cloud gaming, and other fields.

In the related art, HEVC SCC is an extension proposal for screen video content on top of HEVC/H.265. The HEVC SCC coding tools mainly include intra block copy (IBC), hash-based motion estimation (hashme), palette coding, and adaptive color transform (ACT).

However, existing video coding usually enables the above coding tools directly for all regions, which requires a large amount of data processing and codes non-screen-video regions poorly. Moreover, although these coding tools achieve similar coding effects, existing video coding runs each of them separately, causing redundant computation.
SUMMARY

Embodiments of the present disclosure provide a video processing method and an electronic device to overcome the heavy data processing, poor coding results, and redundant computation of existing video coding.

In a first aspect, an embodiment of the present disclosure provides a video processing method, including:

determining whether an image block is screen video content according to a color histogram of the luminance component of the reduced-resolution image block, and/or a prediction mode of an adjacent encoded or decoded image block of the image block;

if the image block is the screen video content, performing, in a first screen video coding mode, a conversion between a coding unit of the image block and a bit representation of the coding unit;

if the conversion fails, re-executing, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

In a second aspect, an embodiment of the present disclosure provides a video processing apparatus, including:

a determining module, configured to determine whether an image block is screen video content according to a color histogram of the luminance component of the reduced-resolution image block, and/or a prediction mode of an adjacent encoded or decoded image block of the image block;

an execution module, configured to, if the image block is the screen video content, perform, in a first screen video coding mode, a conversion between a coding unit of the image block and a bit representation of the coding unit;

the execution module being further configured to, if the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor and a memory; the memory stores computer-executable instructions; and the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the video processing method described in the first aspect and the various possible designs of the first aspect.

In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the video processing method described in the first aspect and the various possible designs of the first aspect.

In a fifth aspect, an embodiment of the present disclosure provides a computer program product including computer program instructions that cause a computer to perform the video processing method described in the first aspect and the various possible designs of the first aspect.

In a sixth aspect, an embodiment of the present disclosure provides a computer program which, when run on a computer, causes the computer to perform the video processing method described in the first aspect and the various possible designs of the first aspect.

According to the video processing method and electronic device provided by the embodiments of the present disclosure, subsequent processing is performed on reduced-resolution image blocks, which lowers the number of samples and the amount of subsequent data processing. The embodiments determine whether the reduced-resolution image block is screen video content and perform screen video coding only if it is, i.e., the screen video coding tools are enabled only for screen video regions, avoiding the poor coding results of enabling them for all regions including non-screen-video regions. In addition, after the image block is determined to be screen video content, the conversion between the coding unit of the image block and the bit representation of the coding unit is first attempted in a first screen video coding mode; if it fails, the process is repeated in a second screen video coding mode, avoiding the computational redundancy of running multiple screen video coding tools.
BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic architecture diagram of a video processing system according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of a video processing method according to an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of another video processing method according to an embodiment of the present disclosure;

FIG. 4 is a schematic flowchart of still another video processing method according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of resolution reduction by average resolution according to an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of resolution reduction by downsampling according to an embodiment of the present disclosure;

FIG. 7 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure;

FIG. 8 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure;

FIG. 9 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure;

FIG. 10 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present disclosure;

FIG. 11 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION

To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.

First, the terms used in the present disclosure are explained:

Video coding: generally refers to processing a sequence of pictures that form a video or video sequence. In the field of video coding, the terms "picture", "frame", and "image" may be used as synonyms. In the present disclosure, video encoding is performed at the source side and typically includes processing (e.g., compressing) original video pictures to reduce the amount of data needed to represent them (for more efficient storage and/or transmission). Video decoding is performed at the destination side and typically includes inverse processing relative to the encoder to reconstruct the video pictures.

As used in the present disclosure, the term "block" may be a part of a picture or frame. For ease of description, the embodiments of the present disclosure are described with reference to Versatile Video Coding (VVC) or to High-Efficiency Video Coding (HEVC), developed by the Joint Collaboration Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). In HEVC, a coding tree unit is split into multiple coding units (CUs) using a quadtree structure denoted as a coding tree. A CU, i.e., a coding unit, usually corresponds to an A×B rectangular region containing A×B luminance pixels and the corresponding chrominance pixels, where A is the width and B the height of the rectangle; A and B may be the same or different and are usually integer powers of 2, such as 128, 64, 32, 16, 8, or 4. A coding unit can be decoded to obtain a reconstructed image of an A×B rectangular region; decoding usually includes prediction, inverse quantization, and inverse transform, producing a predicted image and a residual, which are added to obtain the reconstructed image. A CTU is a coding tree unit: an image consists of multiple CTUs, and a CTU usually corresponds to a square image region containing the luminance and chrominance pixels of that region (or only luminance pixels, or only chrominance pixels). A CTU also contains syntax elements indicating how to divide the CTU into at least one CU and how to decode each coding unit to obtain the reconstructed image.
Existing screen video content is video captured directly from the display of a computer, mobile phone, or other terminal, and mainly includes computer graphics, text documents, natural video, mixed graphics-and-text images, and computer-generated images. In the related art, HEVC SCC is an extension proposal for screen video content on top of HEVC/H.265, and its coding tools mainly include IBC, hashme, palette, and ACT.

However, existing video coding usually enables the above coding tools directly for all regions, which requires a large amount of data processing and codes non-screen-video regions poorly. Moreover, although these coding tools achieve similar coding effects, existing video coding runs each of them separately, causing redundant computation.

In view of the above problems, the present disclosure provides a video processing method that performs subsequent processing on reduced-resolution image blocks, lowering the number of samples and the amount of subsequent data processing, and that enables the screen video coding tools only for screen video regions, avoiding the poor coding results of enabling them for all regions including non-screen-video regions. In addition, after an image block is determined to be screen video content, the conversion between the coding unit of the image block and the bit representation of the coding unit is first attempted in a first screen video coding mode; if it fails, the process is repeated in a second screen video coding mode, avoiding the computational redundancy of running multiple screen video coding tools.
The video processing method provided by the present disclosure is applicable to the video processing system shown in FIG. 1. As shown in FIG. 1, the video processing system 10 includes a source device 12 and a destination device 14. The source device 12 includes a picture acquisition apparatus 121, a preprocessor 122, an encoder 123, and a communication interface 124. The destination device 14 includes a display device 141, a processor 142, a decoder 143, and a communication interface 144. The source device 12 sends encoded data 13 to the destination device 14. The method of the present disclosure is applied to the encoder 123.

The source device 12 may be referred to as a video encoding device or apparatus, and the destination device 14 as a video decoding device or apparatus; both may be instances of video coding devices or apparatuses.

The source device 12 and the destination device 14 may each be any of a wide range of devices, including any kind of handheld or stationary device, e.g., a notebook or laptop computer, mobile phone, smartphone, tablet computer, camera, desktop computer, set-top box, television, display device, digital media player, video game console, video streaming device (such as a content service server or content delivery server), broadcast receiver device, or broadcast transmitter device, and may use no operating system or any kind of operating system.

In some cases, the source device 12 and the destination device 14 may be equipped for wireless communication and may therefore be wireless communication devices.

In some cases, the video processing system 10 shown in FIG. 1 is merely an example, and the techniques of the present disclosure are applicable to video coding settings (e.g., video encoding or video decoding) that do not necessarily involve any data communication between the encoding and decoding devices. In other examples, data may be retrieved from local memory, streamed over a network, and so on. A video encoding device may encode data and store it in memory, and/or a video decoding device may retrieve data from memory and decode it. In some examples, encoding and decoding are performed by devices that do not communicate with each other but simply encode data to memory and/or retrieve and decode data from memory.

In some cases, the encoder 123 of the video processing system 10 may also be called a video encoder, and the decoder 143 a video decoder.

In some cases, the picture acquisition apparatus 121 may include or be any kind of picture capture device, e.g., for capturing real-world pictures, and/or any kind of device for generating pictures or comments (for screen content coding, some text on the screen is also considered part of the picture or image to be encoded), e.g., a computer graphics processor for generating computer-animated pictures, or any kind of device for acquiring and/or providing real-world pictures or computer-animated pictures (e.g., screen content, virtual reality (VR) pictures), and/or any combination thereof (e.g., augmented reality (AR) pictures). A picture is, or can be regarded as, a two-dimensional array or matrix of sample points with luminance values. A sample point in the array may also be called a pixel (picture element, pel). The numbers of sample points in the horizontal and vertical directions (or axes) of the array define the size and/or resolution of the picture. To represent color, three color components are usually used, i.e., the picture may be represented as or contain three sample arrays. In the RGB (Red Green Blue) format or color space, a picture includes corresponding red, green, and blue sample arrays. In video coding, however, each pixel is usually represented in a luminance/chrominance format or color space, e.g., YUV (luma and chroma), which includes a luminance (luma) component indicated by Y (sometimes also L) and two chrominance (chroma) components indicated by U and V (sometimes Cb and Cr). The luminance component Y represents brightness or gray-level intensity (e.g., both are the same in a grayscale picture), while the two chrominance components U and V represent the chrominance or color information. Accordingly, a picture in YUV format includes a luminance sample array of the luminance component (Y) and two chrominance sample arrays of the chrominance components (U and V). A picture in RGB format can be converted or transformed into YUV format and vice versa; this process is also known as color transformation or conversion.
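The RGB-to-YUV conversion described above can be sketched as follows; the BT.601 full-range coefficients used here are a common illustrative choice, not one mandated by the text:

```python
def rgb_to_yuv(r, g, b):
    """Convert one full-range RGB pixel to YUV.

    Y is the luminance (luma) component; U and V are the chrominance
    (color-difference) components, here derived with BT.601 weights
    (an assumption for illustration).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)  # blue-difference chroma
    v = 0.877 * (r - y)  # red-difference chroma
    return y, u, v
```

For a gray pixel (r = g = b) both chroma components are zero, which matches the description above of Y alone carrying the gray-level intensity.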
The picture acquisition apparatus 121 may be, for example, a camera for capturing pictures, a memory such as a picture memory including or storing previously captured or generated pictures, and/or any kind of (internal or external) interface for acquiring or receiving pictures. The camera may be, for example, a local camera or a camera integrated in the source device, and the memory may be local or, e.g., integrated in the source device. The interface may be, for example, an external interface that receives pictures from an external video source, such as an external picture capture device (e.g., a camera), an external memory, or an external picture generation device (e.g., an external computer graphics processor, computer, or server). The interface may be any kind of interface according to any proprietary or standardized interface protocol, e.g., a wired or wireless interface or an optical interface. The interface for acquiring the picture data 125 in FIG. 1 may be the same interface as the communication interface 124 or a part of it. The picture data 125 (e.g., video data) may be referred to as an original picture or original picture data.

In some cases, the preprocessor 122 is configured to receive the picture data 125 and preprocess it to obtain a preprocessed picture (or preprocessed picture data) 126. The preprocessing performed by the preprocessor 122 may include trimming, color format conversion (e.g., from RGB to YUV), color grading, or denoising. It should be understood that the preprocessor 122 may be an optional component.

In some cases, the encoder 123 (e.g., a video encoder) is configured to receive the preprocessed picture (or preprocessed picture data) 126 and provide encoded picture data 127.

In some cases, the communication interface 124 of the source device 12 may be configured to receive the encoded picture data 127 and transmit it to another device, e.g., the destination device 14 or any other device, for storage or direct reconstruction, or to process the encoded picture data 127 before correspondingly storing the encoded data 13 and/or transmitting it to another device, e.g., the destination device 14 or any other device for decoding or storage. The communication interface 144 of the destination device 14 is configured to receive the encoded picture data 127 or the encoded data 13, e.g., directly from the source device 12 or from any other source such as a storage device, e.g., an encoded-picture-data storage device.

The communication interface 124 and the communication interface 144 may be used to transmit or receive the encoded picture data 127 or the encoded data 13 over a direct communication link between the source device 12 and the destination device 14, e.g., a direct wired or wireless connection, or over any kind of network, e.g., a wired or wireless network or any combination thereof, or any kind of private or public network or any combination thereof. The communication interface 124 may, for example, encapsulate the encoded picture data 127 into a suitable format, e.g., packets, for transmission over a communication link or network. The communication interface 144, forming the counterpart of the communication interface 124, may, for example, decapsulate the encoded data 13 to obtain the encoded picture data 127. Both interfaces may be configured as unidirectional communication interfaces, as indicated in FIG. 1 by the arrow for the encoded picture data 127 pointing from the source device 12 to the destination device 14, or as bidirectional communication interfaces, and may, for example, send and receive messages to establish a connection, to acknowledge, and to exchange any other information related to the communication link and/or to data transmission such as the transmission of encoded picture data.

In some cases, the decoder 143 is configured to receive the encoded picture data 127 and provide decoded picture data (or a decoded picture) 145.

In some cases, the processor 142 of the destination device 14 is configured to post-process the decoded picture data (or decoded picture) 145 to obtain post-processed picture data 146, e.g., a post-processed picture. The post-processing performed by the processor 142 may include, for example, color format conversion (e.g., from YUV to RGB), color grading, trimming, or resampling, or any other processing, e.g., to prepare the decoded picture data (or decoded picture) 145 for display by the display device 141.

In some cases, the display device 141 of the destination device 14 is configured to receive the post-processed picture data 146 to display the picture, e.g., to a user or viewer. The display device 141 may be or include any kind of display for presenting the reconstructed picture, e.g., an integrated or external display or monitor. For example, the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro-LED (light emitting diode) display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other kind of display.

In addition, although FIG. 1 depicts the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both devices or the functionality of both, i.e., the source device 12 or its corresponding functionality and the destination device 14 or its corresponding functionality. In such embodiments, the source device 12 or its functionality and the destination device 14 or its functionality may be implemented using the same hardware and/or software, separate hardware and/or software, or any combination thereof. The existence and (exact) division of the functionality of the different units, or of the source device 12 and/or destination device 14 shown in FIG. 1, may vary depending on the actual device and application.

In some cases, the encoder 123 (e.g., a video encoder) and the decoder 143 (e.g., a video decoder) may each be implemented as any of a variety of suitable circuits, e.g., one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partially in software, a device may store the software instructions in a suitable non-transitory computer-readable storage medium and execute them in hardware using one or more processors to perform the techniques of the present disclosure. Any of the foregoing (including hardware, software, combinations of hardware and software, etc.) may be regarded as one or more processors. Each of the encoder 123 and the decoder 143 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.

It should be understood that, for each of the examples described above with reference to the encoder 123, the decoder 143 may be configured to perform the reverse process. With regard to signaling syntax elements, the decoder 143 may be configured to receive and parse such syntax elements and decode the related video data accordingly. In some examples, the encoder 123 may entropy-encode one or more defined syntax elements into the encoded video bitstream; in such examples, the decoder 143 may parse the syntax elements and decode the related video data accordingly.
The technical solutions of the present disclosure, and how they solve the above technical problems, are described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present disclosure are described below with reference to the accompanying drawings.

FIG. 2 is a schematic flowchart of a video processing method according to an embodiment of the present disclosure; the execution subject of this embodiment may be the encoder in the above embodiment. As shown in FIG. 2, the method may include:

S201: Determine whether an image block is screen video content according to a color histogram of the luminance component of the reduced-resolution image block, and/or a prediction mode of an adjacent encoded or decoded image block of the image block.

Here, one image block corresponds to one coding unit (CU), and in some cases an image block may also be called a CU. As noted above, an image consists of multiple coding tree units (CTUs), and one CTU can be split into multiple CUs, i.e., into multiple image blocks.

Optionally, determining whether the image block is screen video content may include: determining whether the image block is screen video content according to the number of non-zero values in the color histogram, the maximum and minimum of the non-zero values, and so on.

In this embodiment, whether the image block is screen video content may also be determined according to whether the prediction mode of an adjacent encoded or decoded image block of the image block is a preset mode.

In addition, the way of determining whether the image block is screen video content described in this embodiment may exist independently of the method shown in FIG. 2.

Here, by determining whether the image block is screen video content, the screen video coding tools can subsequently be enabled only for screen video regions, avoiding the wasted computation and the poor coding of non-screen-video regions that result from enabling the tools for all regions.

S202: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

Here, the first screen video coding mode may be set according to the actual situation, e.g., the IBC coding mode or the hashme mode, in which the conversion between the coding unit of the image block and the bit representation of the coding unit is performed.

Exemplarily, the encoder first tries the first screen video coding mode, collects the information that mode requires, and performs encoding/decoding in that mode.

The conversion between the coding unit of the image block and the bit representation of the coding unit may be the encoding process or the decoding process of the image block: e.g., converting the coding unit into its bit representation is the encoding process of the image block, and converting the bit representation into the coding unit is the decoding process of the image block.

In addition, if the conversion between the coding unit of the image block and the bit representation of the coding unit cannot be performed successfully in the first screen video coding mode, another screen video coding mode is tried, i.e., step S203 is performed.

S203: If the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

The second screen video coding mode differs from the first. After encoding in the first screen video coding mode fails, another screen video coding mode is used.

Exemplarily, if the conversion fails, the palette coding mode may be tried to perform the conversion between the coding unit of the image block and the bit representation of the coding unit; if it fails again, further coding modes may be tried until the conversion succeeds, which solves the problem that multiple screen coding tools with similar effects cause redundant computation when all are run.
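The try-then-fall-back control flow of steps S202-S203 can be sketched as follows; the mode names and the `encode_cu` callback are illustrative placeholders, not an API from the text:

```python
def encode_screen_block(cu, encode_cu, modes=("ibc", "hashme", "palette")):
    """Try screen video coding modes in order and return (mode, bits) for
    the first mode whose CU <-> bit-representation conversion succeeds.

    encode_cu(cu, mode) is assumed to return the encoded bits on success
    and None on failure, so later modes run only when earlier ones fail,
    avoiding redundant computation across tools with similar effects.
    """
    for mode in modes:
        bits = encode_cu(cu, mode)
        if bits is not None:
            return mode, bits
    raise RuntimeError("no screen video coding mode succeeded")
```

The design point is that at most one fallback chain runs per block, rather than every tool being evaluated for every block.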
As can be seen from the above, this embodiment performs subsequent processing on reduced-resolution image blocks, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only if the reduced-resolution image block is determined to be screen video content, i.e., the screen video coding tools are enabled only for screen video regions, avoiding the poor coding results of enabling them for all regions; and, after the image block is determined to be screen video content, it first attempts the conversion between the coding unit of the image block and the bit representation of the coding unit in the first screen video coding mode and, on failure, repeats the process in the second screen video coding mode, avoiding the computational redundancy of running multiple screen video coding tools.
In addition, the embodiments of the present disclosure can perform resolution reduction on the luminance component of a coding tree unit to be processed to obtain the reduced-resolution image block. FIG. 3 is a schematic flowchart of another video processing method according to an embodiment of the present disclosure; the execution subject may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 3, the method includes:

S301: Perform resolution reduction on the luminance component of a coding tree unit to be processed to obtain the reduced-resolution image block.

Here, the coding tree unit to be processed may be determined according to the actual situation, and this embodiment places no particular restriction on it.

Exemplarily, assuming the CTU luminance component is of size N×N, resolution reduction is performed on it to obtain a reduced-resolution image block, e.g., of size (N/2)×(N/2).

Specifically, a rate-control preprocessing lookahead tool may be provided in the encoder to perform the resolution reduction on the luminance component of the coding tree unit to be processed; the lookahead tool may be configured with the processing procedure for reducing the resolution of the CTU luminance component, which may be set as needed and is not particularly restricted here.

In addition, before performing the resolution reduction on the luminance component of the coding tree unit to be processed, the encoder may check whether the rate-control preprocessing lookahead tool is enabled; if it is, the tool is used to perform the resolution reduction; if not, the tool is first enabled and then used to perform the resolution reduction.

Here, performing resolution reduction on the CTU to be processed lowers the number of samples and the amount of subsequent data processing.

S302: Determine whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block.

S303: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

S304: If the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

Steps S302-S304 are implemented in the same way as steps S201-S203 above and are not repeated here.

As can be seen from the above, this embodiment performs resolution reduction on the CTU to be processed, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only when the reduced-resolution image block is screen video content, enabling the screen video coding tools only for screen video regions; and it attempts the first screen video coding mode before falling back to the second, avoiding the redundant computation of running multiple screen video coding tools.
In addition, when performing the above resolution reduction on the luminance component of the coding tree unit to be processed, the embodiments of the present disclosure can reduce the resolution to a preset resolution or by downsampling. FIG. 4 is a schematic flowchart of still another video processing method according to an embodiment of the present disclosure; the execution subject may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 4, the method includes:

S401: Reduce the resolution of the luminance component of the coding tree unit to be processed to a preset resolution to obtain the reduced-resolution image block.

Here, the preset resolution may be set according to the actual situation and is not particularly restricted in this embodiment.

Optionally, the preset resolution is the average resolution of the luminance component of the coding tree unit to be processed. In that case, reducing the resolution of the luminance component of the coding tree unit to be processed to the preset resolution includes: reducing the resolution of the luminance component of the coding tree unit to be processed to the average resolution.

Exemplarily, as shown in FIG. 5, the luminance component of the coding tree unit to be processed is shown on the left of FIG. 5; after its resolution is reduced to the average resolution, the reduced-resolution image block obtained is shown on the right of FIG. 5.

S402: Downsample the resolution of the luminance component of the coding tree unit to be processed to obtain the reduced-resolution image block.

Downsampling principle: the luminance component of the coding tree unit to be processed is of size M×N; s-fold downsampling yields an image of resolution (M/s)×(N/s), where s should be a common divisor of M and N, and its specific value may be set according to the actual situation.

Exemplarily, as shown in FIG. 6, the luminance component of the coding tree unit to be processed is shown on the left of FIG. 6; after its resolution is downsampled, the reduced-resolution image block obtained is shown on the right of FIG. 6.
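The s-fold downsampling principle above (an M×N luma plane reduced to (M/s)×(N/s)) can be sketched with a block-average filter; averaging each s×s block is one possible filter, since the text does not prescribe a specific one:

```python
import numpy as np

def downsample_luma(luma, s):
    """Reduce an M x N luma plane to (M//s) x (N//s) by averaging each
    s x s block; s must be a common divisor of M and N, as in the text."""
    m, n = luma.shape
    assert m % s == 0 and n % s == 0, "s must divide both M and N"
    # Split the plane into (m//s, s, n//s, s) blocks, then average each block.
    return luma.reshape(m // s, s, n // s, s).mean(axis=(1, 3))
```

For example, a 64×64 CTU luma plane with s = 2 yields a 32×32 block, quartering the number of samples the later screen-content decision has to scan.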
Steps S401 and S402 are parallel alternatives; either one may be used in this embodiment to obtain the reduced-resolution image block.

In addition, besides steps S401 and S402, the specific way of performing resolution reduction on the luminance component of the coding tree unit to be processed may be set according to the actual situation, and this embodiment places no particular restriction on it.

S403: Determine whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block.

S404: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

S405: If the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

Steps S403-S405 are implemented in the same way as steps S201-S203 above and are not repeated here.

As can be seen from the above, this embodiment can reduce the resolution of the CTU to be processed to a preset resolution or by downsampling, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only for blocks determined to be screen video content, enabling the screen video coding tools only for screen video regions; and it attempts the first screen video coding mode before falling back to the second, avoiding the redundant computation of running multiple screen video coding tools.
In addition, the embodiments of the present disclosure can determine whether the image block is screen video content according to the number of non-zero values in the color histogram, the maximum and minimum of the non-zero values, and so on. FIG. 7 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure; the execution subject may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 7, the method includes:

S701: Determine whether the image block is screen video content according to the number of non-zero values in the color histogram of the luminance component of the reduced-resolution image block, and/or the maximum and minimum of the non-zero values.

Exemplarily, the color histogram of the luminance component of the reduced-resolution image block is determined, e.g., hist[i] (for an 8-bit depth, i = 0, 1, 2, ..., 255).

Whether the image block is screen video content is then determined according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values.

Specifically, if the number of non-zero values counted in the color histogram is greater than zero and less than or equal to a preset count threshold, the image block is determined to be screen video content. The preset count threshold may be set according to the actual situation and is not particularly restricted here. For example, the number of non-zero entries in hist[i] is counted as numDiffLuma; if 0 < numDiffLuma <= alpha, the image block is determined to be screen video content, where alpha is the preset count threshold.

Also, if the difference between the maximum and the minimum indices corresponding to the non-zero values is greater than or equal to a preset difference threshold, the image block is determined to be screen video content. The preset difference threshold may be set according to the actual situation and is not particularly restricted here. For example, the maximum and minimum indices i corresponding to non-zero values in hist[i] are recorded as vMax and vMin, respectively; if vMax - vMin >= gama, the image block is determined to be screen video content, where gama is the preset difference threshold.

Furthermore, if the sum of a plurality of preset non-zero values is greater than a preset multiple of the size of the image block, the image block is determined to be screen video content. The preset non-zero values may be set according to the actual situation, e.g., the 5 largest values in hist[i] are determined and summed as top5sum. The preset multiple may also be set according to the actual situation and is not particularly restricted here. For example, if top5sum is greater than beta times the size of the current image block, the image block is determined to be screen video content, where beta is the preset multiple.

In addition, if the difference between the maximum and the minimum of the non-zero values is less than the preset difference threshold, the sum of the plurality of preset non-zero values in the color histogram is determined; if that sum is greater than the preset multiple of the size of the image block, the image block is determined to be screen video content.
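The histogram criteria of step S701 can be combined as follows; the threshold values alpha, gama, and beta are illustrative defaults, since the text leaves them to be set according to the actual situation:

```python
def is_screen_content(hist, block_size, alpha=50, gama=48, beta=0.6):
    """Classify a block from the color histogram of its reduced-resolution
    luma component, per the criteria of step S701.

    hist[i] counts samples with luma value i (256 bins for 8-bit depth);
    the default thresholds are assumptions for illustration.
    """
    nonzero = [i for i, c in enumerate(hist) if c > 0]
    num_diff_luma = len(nonzero)
    # Criterion 1: 0 < numDiffLuma <= alpha (few distinct luma values).
    if 0 < num_diff_luma <= alpha:
        return True
    # Criterion 2: vMax - vMin >= gama (wide span of used luma indices).
    if nonzero and nonzero[-1] - nonzero[0] >= gama:
        return True
    # Criterion 3: top5sum > beta * block size (a few bins dominate).
    top5sum = sum(sorted(hist, reverse=True)[:5])
    return top5sum > beta * block_size
```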
In addition, besides step S701, the specific way of determining whether the image block is screen video content may be set according to the actual situation, and this embodiment places no particular restriction on it.

By determining whether the image block is screen video content, this embodiment enables the screen video coding tools only for screen video regions, avoiding the wasted computation and the poor coding of non-screen-video regions that result from enabling the tools for all regions.

S702: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

S703: If the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

Steps S702-S703 are implemented in the same way as steps S202-S203 above and are not repeated here.

As can be seen from the above, this embodiment performs resolution reduction on the CTU to be processed, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only for blocks determined to be screen video content, enabling the screen video coding tools only for screen video regions; and it attempts the first screen video coding mode before falling back to the second, avoiding the redundant computation of running multiple screen video coding tools.
In addition, the embodiments of the present disclosure can determine whether the image block is screen video content according to the prediction mode of an adjacent encoded or decoded image block of the image block. FIG. 8 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure; the execution subject may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 8, the method includes:

S801: Determine whether the optimal prediction mode of an adjacent encoded or decoded image block of the reduced-resolution image block is a preset intra prediction mode.

S802: If the optimal prediction mode of the adjacent encoded or decoded image block is not the preset intra prediction mode, determine that the image block is screen video content.

The adjacent encoded or decoded image blocks may be the left, above, and above-left neighbors of the image block. The preset intra prediction mode may be set according to the actual situation and is not particularly restricted here.

Exemplarily, the encoder determines whether the optimal prediction mode of the left, above, and above-left neighbors of the image block is one of the 35 intra modes; if not, the image block is determined to be screen video content.
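The neighbor-based check of steps S801-S802 can be sketched as follows; representing the preset intra prediction modes as the 35 HEVC intra mode indices 0-34 is an illustrative assumption taken from the example in the text, as is the accessor for the neighbors' best modes:

```python
# The 35 HEVC intra prediction modes (planar, DC, 33 angular), standing in
# for the "preset intra prediction mode" set from the text.
INTRA_MODES = frozenset(range(35))

def is_screen_content_by_neighbors(neighbor_best_modes):
    """Decide from the best prediction modes of the left, above, and
    above-left already encoded/decoded blocks (None if unavailable).

    Per the text, the block is taken as screen video content when the
    available neighbors' best modes are not preset intra modes.
    """
    known = [m for m in neighbor_best_modes if m is not None]
    return bool(known) and all(m not in INTRA_MODES for m in known)
```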
In addition, besides steps S801-S802, the specific way of determining whether the image block is screen video content may be set according to the actual situation, and this embodiment places no particular restriction on it.

By determining whether the image block is screen video content, this embodiment enables the screen video coding tools only for screen video regions, avoiding the wasted computation and the poor coding of non-screen-video regions that result from enabling the tools for all regions.

S803: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

S804: If the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

Steps S803-S804 are implemented in the same way as steps S202-S203 above and are not repeated here.

As can be seen from the above, this embodiment performs resolution reduction on the CTU to be processed, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only for blocks determined to be screen video content, enabling the screen video coding tools only for screen video regions; and it attempts the first screen video coding mode before falling back to the second, avoiding the redundant computation of running multiple screen video coding tools.
In addition, in the embodiments of the present disclosure the second screen video coding mode is the palette coding mode, and before the conversion between the coding unit of the image block and the bit representation of the coding unit is performed in the second screen video coding mode, resolution reduction can also be performed on the chrominance component of the image block. FIG. 9 is a schematic flowchart of yet another video processing method according to an embodiment of the present disclosure; the execution subject may be the encoder in the embodiment shown in FIG. 1. As shown in FIG. 9, the method includes:

S901: Determine whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block.

S902: If the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

Steps S901-S902 are implemented in the same way as steps S201-S202 above and are not repeated here.

S903: If the conversion fails, in the second screen video coding mode, perform resolution reduction on the chrominance component of the image block and re-execute the step of converting between the coding unit of the image block and the bit representation of the coding unit.

Exemplarily, the rate-control preprocessing lookahead tool is used to perform the resolution reduction on the chrominance component of the image block.

Specifically, the resolution of the chrominance component of the image block may be reduced to a preset resolution, or the resolution of the chrominance component of the image block may be downsampled. The preset resolution may be set according to the actual situation and is not particularly restricted here; e.g., it may be the average resolution of the chrominance component of the image block.

Here, performing resolution reduction on the chrominance component of the image block further lowers the number of samples and the amount of subsequent data processing.

In this embodiment, the second screen video coding mode may be a coding mode related to luminance/chrominance or color, e.g., the palette coding mode. Since palette coding is related to luminance/chrominance or color, resolution reduction is performed on the chrominance component of the image block in this mode, further lowering the number of samples and the amount of subsequent data processing. Exemplarily, the resolution reduction on the chrominance component of the image block is performed in the palette coding mode but not in the IBC or hashme coding modes.

This embodiment performs resolution reduction on the CTU to be processed, lowering the number of samples and the amount of subsequent data processing; it performs screen video coding only for blocks determined to be screen video content, enabling the screen video coding tools only for screen video regions; and it attempts the first screen video coding mode before falling back to the second, avoiding the redundant computation of running multiple screen video coding tools. Beyond this, before performing the conversion between the coding unit of the image block and the bit representation of the coding unit, this embodiment can perform resolution reduction on the chrominance component of the image block, further lowering the number of samples and the amount of subsequent data processing.
Corresponding to the video processing method of the above embodiments, FIG. 10 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. As shown in FIG. 10, the video processing apparatus 100 includes: a determining module 1001, an execution module 1002, and a processing module 1003.

The determining module 1001 is configured to determine whether an image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block.

The execution module 1002 is configured to, if the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit.

The execution module 1002 is further configured to, if the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

In one possible design, the processing module 1003 is configured to perform resolution reduction on the luminance component of a coding tree unit to be processed to obtain the reduced-resolution image block.

In one possible design, the processing module 1003 performing the resolution reduction on the luminance component of the coding tree unit to be processed includes: using a rate-control preprocessing lookahead tool to perform the resolution reduction on the luminance component of the coding tree unit to be processed.

In one possible design, the processing module 1003 performing the resolution reduction on the luminance component of the coding tree unit to be processed includes: reducing the resolution of the luminance component of the coding tree unit to be processed to a preset resolution; or downsampling the resolution of the luminance component of the coding tree unit to be processed.

In one possible design, the determining module 1001 determining whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block includes: determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values.

In one possible design, the determining module 1001 determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values includes:

if the number of non-zero values is greater than zero and less than or equal to a preset count threshold, determining that the image block is the screen video content;

and/or

if the difference between the maximum and the minimum of the non-zero values is greater than or equal to a preset difference threshold, determining that the image block is the screen video content;

and/or

if the sum of a plurality of preset non-zero values is greater than a preset multiple of the size of the image block, determining that the image block is the screen video content.

In one possible design, the determining module 1001 determining whether the image block is screen video content according to the prediction mode of the adjacent encoded or decoded image block includes: determining whether the optimal prediction mode of the adjacent encoded or decoded image block is a preset intra prediction mode; and, if the optimal prediction mode of the adjacent encoded or decoded image block is not the preset intra prediction mode, determining that the image block is the screen video content.

In one possible design, the second screen video coding mode is a palette coding mode.

In one possible design, before re-executing, in the second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit, the execution module 1002 is further configured to perform, for the second screen video coding mode, resolution reduction on the chrominance component of the image block.

In one possible design, the first screen coding mode is an intra block copy (IBC) mode or a hash-based motion search (hashme) mode.

In one possible design, the preset resolution is the average resolution of the luminance component of the coding tree unit to be processed, and the processing module 1003 reducing the resolution of the luminance component of the coding tree unit to be processed to the preset resolution includes: reducing the resolution of the luminance component of the coding tree unit to be processed to the average resolution.

The apparatus provided by this embodiment can be used to execute the technical solutions of the above method embodiments; its implementation principles and technical effects are similar and are not repeated here.
Referring to FIG. 11, the electronic device 1100 may include a processing apparatus (e.g., a central processing unit, a graphics processing unit, etc.) 1101, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage apparatus 1108 into a random access memory (RAM) 1103. The RAM 1103 also stores various programs and data required for the operation of the electronic device 1100. The processing apparatus 1101, the ROM 1102, and the RAM 1103 are connected to one another via a bus 1104, to which an input/output (I/O) interface 1105 is also connected.

Generally, the following apparatuses may be connected to the I/O interface 1105: an input apparatus 1106 including, e.g., a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output apparatus 1107 including, e.g., an LCD, speaker, or vibrator; a storage apparatus 1108 including, e.g., a magnetic tape or hard disk; and a communication apparatus 1109. The communication apparatus 1109 may allow the electronic device 1100 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 11 shows the electronic device 1100 with various apparatuses, it should be understood that not all of the illustrated apparatuses need be implemented or provided; more or fewer apparatuses may alternatively be implemented or provided.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via the communication apparatus 1109, or installed from the storage apparatus 1108 or the ROM 1102. When the computer program is executed by the processing apparatus 1101, the above functions defined in the methods of the embodiments of the present disclosure are performed.

It should be noted that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code; such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to a wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.

The computer-readable medium may be included in the electronic device, or it may exist separately without being assembled into the electronic device.

The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above embodiments.

Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings; for example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented in software or in hardware, and the name of a unit does not in some cases constitute a limitation on the unit itself; for example, a first acquisition unit may also be described as "a unit that acquires at least two Internet Protocol addresses".

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard parts (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium, and may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In a first aspect, according to one or more embodiments of the present disclosure, a video processing method is provided, including:

determining whether an image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block;

if the image block is the screen video content, performing, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit;

if the conversion fails, re-executing, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

According to one or more embodiments of the present disclosure, the method further includes: performing resolution reduction on the luminance component of a coding tree unit to be processed to obtain the reduced-resolution image block.

According to one or more embodiments of the present disclosure, the performing resolution reduction on the luminance component of the coding tree unit to be processed includes: using a rate-control preprocessing lookahead tool to perform the resolution reduction on the luminance component of the coding tree unit to be processed.

According to one or more embodiments of the present disclosure, the performing resolution reduction on the luminance component of the coding tree unit to be processed includes: reducing the resolution of the luminance component of the coding tree unit to be processed to a preset resolution; or downsampling the resolution of the luminance component of the coding tree unit to be processed.

According to one or more embodiments of the present disclosure, the determining whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block includes: determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values.

According to one or more embodiments of the present disclosure, the determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values includes:

if the number of non-zero values is greater than zero and less than or equal to a preset count threshold, determining that the image block is the screen video content;

and/or

if the difference between the maximum and the minimum of the non-zero values is greater than or equal to a preset difference threshold, determining that the image block is the screen video content;

and/or

if the sum of a plurality of preset non-zero values is greater than a preset multiple of the size of the image block, determining that the image block is the screen video content.

According to one or more embodiments of the present disclosure, the determining whether the image block is screen video content according to the prediction mode of the adjacent encoded or decoded image block includes: determining whether the optimal prediction mode of the adjacent encoded or decoded image block is a preset intra prediction mode; and, if the optimal prediction mode of the adjacent encoded or decoded image block is not the preset intra prediction mode, determining that the image block is the screen video content.

According to one or more embodiments of the present disclosure, the second screen video coding mode is a palette coding mode.

According to one or more embodiments of the present disclosure, before re-executing, in the second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit, the method further includes: performing, for the second screen video coding mode, resolution reduction on the chrominance component of the image block.

According to one or more embodiments of the present disclosure, the first screen coding mode is an intra block copy (IBC) mode or a hash-based motion search (hashme) mode.

According to one or more embodiments of the present disclosure, the preset resolution is the average resolution of the luminance component of the coding tree unit to be processed, and the reducing the resolution of the luminance component of the coding tree unit to be processed to the preset resolution includes: reducing the resolution of the luminance component of the coding tree unit to be processed to the average resolution.

In a second aspect, according to one or more embodiments of the present disclosure, a video processing apparatus is provided, including:

a determining module, configured to determine whether an image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block, and/or the prediction mode of an adjacent encoded or decoded image block of the image block;

an execution module, configured to, if the image block is the screen video content, perform, in a first screen video coding mode, the conversion between the coding unit of the image block and the bit representation of the coding unit;

the execution module being further configured to, if the conversion fails, re-execute, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.

In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including: at least one processor and a memory; the memory stores computer-executable instructions; and the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the video processing method described in the first aspect and the various possible designs of the first aspect.

In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided, storing computer-executable instructions which, when executed by a processor, implement the video processing method described in the first aspect and the various possible designs of the first aspect.

In a fifth aspect, according to one or more embodiments of the present disclosure, a computer program product is provided, including computer program instructions that cause a computer to perform the video processing method described in the first aspect and the various possible designs of the first aspect.

In a sixth aspect, according to one or more embodiments of the present disclosure, a computer program is provided which, when run on a computer, causes the computer to perform the video processing method described in the first aspect and the various possible designs of the first aspect.
The above description is merely a description of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, e.g., technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.

In addition, although the operations are depicted in a specific order, this should not be understood as requiring that the operations be performed in the specific order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment; conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.

Although the subject matter has been described in language specific to structural features and/or methodological logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above; rather, the specific features and actions described above are merely example forms of implementing the claims.

Claims (15)

  1. A video processing method, comprising:
    determining whether an image block is screen video content according to a color histogram of the luminance component of the reduced-resolution image block, and/or a prediction mode of an adjacent encoded or decoded image block of the image block;
    if the image block is the screen video content, performing, in a first screen video coding mode, a conversion between a coding unit of the image block and a bit representation of the coding unit;
    if the conversion fails, re-executing, in a second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit.
  2. The method according to claim 1, further comprising:
    performing resolution reduction on the luminance component of a coding tree unit to be processed to obtain the reduced-resolution image block.
  3. The method according to claim 2, wherein the performing resolution reduction on the luminance component of the coding tree unit to be processed comprises:
    using a rate-control preprocessing lookahead tool to perform the resolution reduction on the luminance component of the coding tree unit to be processed.
  4. The method according to claim 2 or 3, wherein the performing resolution reduction on the luminance component of the coding tree unit to be processed comprises:
    reducing the resolution of the luminance component of the coding tree unit to be processed to a preset resolution;
    or
    downsampling the resolution of the luminance component of the coding tree unit to be processed.
  5. The method according to any one of claims 1 to 4, wherein the determining whether the image block is screen video content according to the color histogram of the luminance component of the reduced-resolution image block comprises:
    determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values.
  6. The method according to claim 5, wherein the determining whether the image block is the screen video content according to the number of non-zero values in the color histogram, and/or the maximum and minimum of the non-zero values comprises:
    if the number of non-zero values is greater than zero and less than or equal to a preset count threshold, determining that the image block is the screen video content;
    and/or
    if the difference between the maximum and the minimum of the non-zero values is greater than or equal to a preset difference threshold, determining that the image block is the screen video content;
    and/or
    if the sum of a plurality of preset non-zero values is greater than a preset multiple of the size of the image block, determining that the image block is the screen video content.
  7. The method according to any one of claims 1 to 6, wherein the determining whether the image block is screen video content according to the prediction mode of the adjacent encoded or decoded image block comprises:
    determining whether the optimal prediction mode of the adjacent encoded or decoded image block is a preset intra prediction mode;
    if the optimal prediction mode of the adjacent encoded or decoded image block is not the preset intra prediction mode, determining that the image block is the screen video content.
  8. The method according to any one of claims 1 to 7, wherein the second screen video coding mode is a palette coding mode.
  9. The method according to any one of claims 1 to 8, wherein before re-executing, in the second screen video coding mode, the step of converting between the coding unit of the image block and the bit representation of the coding unit, the method further comprises:
    performing, for the second screen video coding mode, resolution reduction on the chrominance component of the image block.
  10. The method according to any one of claims 1 to 9, wherein the first screen coding mode is an intra block copy (IBC) mode or a hash-based motion search (hashme) mode.
  11. The method according to claim 4, wherein the preset resolution is an average resolution of the luminance component of the coding tree unit to be processed;
    the reducing the resolution of the luminance component of the coding tree unit to be processed to the preset resolution comprises:
    reducing the resolution of the luminance component of the coding tree unit to be processed to the average resolution.
  12. An electronic device, comprising: at least one processor and a memory;
    the memory stores computer-executable instructions;
    the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the video processing method according to any one of claims 1 to 11.
  13. A computer-readable storage medium, storing computer-executable instructions which, when executed by a processor, implement the video processing method according to any one of claims 1 to 11.
  14. A computer program product, comprising computer program instructions that cause a computer to perform the video processing method according to any one of claims 1 to 11.
  15. A computer program that causes a computer to perform the video processing method according to any one of claims 1 to 11.
PCT/CN2021/076414 2020-02-27 2021-02-09 Video processing method and electronic device WO2021169817A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010124003.1 2020-02-27
CN202010124003.1A CN111314701A (zh) 2020-02-27 2020-02-27 Video processing method and electronic device

Publications (1)

Publication Number Publication Date
WO2021169817A1 true WO2021169817A1 (zh) 2021-09-02

Family

ID=71161994

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076414 WO2021169817A1 (zh) 2020-02-27 2021-02-09 Video processing method and electronic device

Country Status (2)

Country Link
CN (1) CN111314701A (zh)
WO (1) WO2021169817A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111314701A (zh) * 2020-02-27 2020-06-19 北京字节跳动网络技术有限公司 视频处理方法及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105723676A (zh) * 2013-09-05 2016-06-29 Microsoft Technology Licensing, LLC Universal screen content codec
CN106063263A (zh) * 2014-03-13 2016-10-26 Huawei Technologies Co., Ltd. Improved screen content and mixed content coding
CN106534846A (zh) * 2016-11-18 2017-03-22 Tianjin University Screen content and natural content partitioning and fast coding method
US20180262760A1 (en) * 2017-03-10 2018-09-13 Intel Corporation Screen content detection for adaptive encoding
CN110312134A (zh) * 2019-08-06 2019-10-08 Hangzhou Visionular Information Technology Co., Ltd. Screen video coding method based on image processing and machine learning
CN111314701A (zh) * 2020-02-27 2020-06-19 Beijing Bytedance Network Technology Co., Ltd. Video processing method and electronic device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105723676A (zh) * 2013-09-05 2016-06-29 Microsoft Technology Licensing, LLC Universal screen content codec
CN106063263A (zh) * 2014-03-13 2016-10-26 Huawei Technologies Co., Ltd. Improved screen content and mixed content coding
CN106534846A (zh) * 2016-11-18 2017-03-22 Tianjin University Screen content and natural content partitioning and fast coding method
US20180262760A1 (en) * 2017-03-10 2018-09-13 Intel Corporation Screen content detection for adaptive encoding
CN110312134A (zh) * 2019-08-06 2019-10-08 Hangzhou Visionular Information Technology Co., Ltd. Screen video coding method based on image processing and machine learning
CN111314701A (zh) * 2020-02-27 2020-06-19 Beijing Bytedance Network Technology Co., Ltd. Video processing method and electronic device

Also Published As

Publication number Publication date
CN111314701A (zh) 2020-06-19

Similar Documents

Publication Publication Date Title
JP7106744B2 Encoder, decoder and corresponding methods using an IBC-dedicated buffer and default value refreshing for luma and chroma components
US11197010B2 Browser-based video decoder using multiple CPU threads
JP7205038B2 Encoder, decoder and corresponding methods using IBC search range optimization for arbitrary CTU size
WO2021042957A1 Image processing method and apparatus
WO2017129023A1 Decoding method, encoding method, decoding device, and encoding device
WO2020103800A1 Video decoding method and video decoder
WO2022068682A1 Image processing method and apparatus
US20220007034A1 Encoder, a decoder and corresponding methods related to intra prediction mode
KR20240042127A Method and apparatus for intra prediction
CN112673640A Encoder, decoder and corresponding methods using palette coding
WO2021147464A1 Video processing method and apparatus, and electronic device
WO2021147463A1 Video processing method and apparatus, and electronic device
JP2023126795A Method and apparatus for chroma intra prediction in video coding
WO2021169817A1 Video processing method and electronic device
US11095890B2 Memory-efficient filtering approach for image and video coding
US10225573B1 Video coding using parameterized motion models
KR102631517B1 Picture partitioning method and apparatus
CN110868590B Image partitioning method and apparatus
US20210337189A1 Prediction mode determining method and apparatus
WO2021168624A1 Video image coding method and device, and movable platform
CN111885389B Multimedia data coding method, apparatus, and storage medium
RU2801326C2 Encoder, decoder and corresponding methods using a dedicated IBC buffer and default value refreshing of luma and chroma components
RU2814812C2 Deriving chroma sample weights for geometric partitioning mode
WO2022111349A1 Image processing method, device, storage medium, and computer program product
WO2020119742A1 Block division method, video encoding/decoding method, and video codec

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21761622

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21761622

Country of ref document: EP

Kind code of ref document: A1