CN114125442B

CN114125442B - Screen video coding mode determining method, coding method, device and computing equipment

Info

Publication number: CN114125442B
Application number: CN202210109630.7A
Authority: CN
Inventors: 张涛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-01-29
Filing date: 2022-01-29
Publication date: 2022-05-03
Anticipated expiration: 2042-01-29
Also published as: CN114125442A

Abstract

The application discloses a method for determining a coding mode of a screen video, a coding method, an apparatus and a computing device. The determining method comprises the following steps: coding a current coding block of a current frame using a basic coding mode and using a first type of IBC coding mode, respectively, to obtain at least two coding costs; determining the coding mode corresponding to the minimum of the at least two coding costs as a candidate coding mode; and, when the candidate coding mode is the basic coding mode and the number of basic coding modes among the coding modes of the already coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold, skipping the coding of the current coding block with the remaining IBC coding modes in the IBC coding mode set and determining the basic coding mode as the coding mode of the current coding block, wherein the first type of IBC coding mode is the coding mode with the lowest coding complexity in the IBC coding mode set, and the coding complexity of the basic coding mode is lower than that of the first type of IBC coding mode.

Description

Screen video coding mode determining method, coding method, device and computing equipment

Technical Field

The present invention belongs to the technical field of video encoding and decoding, and in particular, to a method and an apparatus for determining an encoding mode of a screen video, a method and an apparatus for encoding a screen video, and a computing device.

Background

With the development of computer applications, screen videos are widely applied to application scenes such as video conferences, online education, remote desktops and the like. The screen video is a video obtained by capturing a screen content image of an electronic device such as a computer, a mobile phone, and the like, and has many obvious differences from a traditional natural video, for example, the screen content image in the screen video is a non-continuous tone content, and the natural image in the natural video is a continuous tone content; the local area of the screen video has the characteristics of less color number, a large number of sharp boundaries, a large number of flat areas, high-contrast characters, a large number of repeated textures and the like. Screen video content forms are also many, including PPT presentations, word documents, and the like.

Because of these inherent characteristics of screen video, international standards organizations have defined a standard for screen content compression, Screen Content Coding (SCC), which is an extension of the HEVC standard. Compared with HEVC, SCC adds tools suited to screen content, such as Intra Block Copy (IBC) and the palette mode, and can significantly improve compression performance.

Because of these new coding tools, when a prediction mode is selected, each coding block (CU) must be evaluated not only with the traditional HEVC coding modes but also with the IBC mode, the palette mode, and so on. This inevitably increases the complexity of screen video coding, poses greater challenges to network traffic and bandwidth, and affects the quality of the user's visual experience.

Therefore, in order to reduce the complexity of screen video coding in the context of SCC, the present application provides a method for efficiently and quickly determining a coding mode of a screen video and an efficient coding method, so as to improve the efficiency of screen video coding.

Disclosure of Invention

According to an aspect of the present application, there is provided a method for determining an encoding mode of a screen video, including: coding a current coding block of a current frame of a screen video to be coded using a first type of intra-frame block matching (IBC) coding mode in an IBC coding mode set and using a basic coding mode, respectively, to obtain at least two coding costs; determining the coding mode corresponding to the minimum of the at least two coding costs as a candidate coding mode; and, when the candidate coding mode is the basic coding mode and the number of basic coding modes among the coding modes of the already coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold, skipping the coding of the current coding block with the remaining IBC coding modes in the IBC coding mode set and determining the basic coding mode as the coding mode of the current coding block, wherein the first type of IBC coding mode is the coding mode with the lowest coding complexity in the IBC coding mode set, and the coding complexity of the basic coding mode is lower than that of the first type of IBC coding mode.

According to another aspect of the present application, there is provided a method for encoding of screen video, including: when each coding block in a current frame of a screen video to be coded is coded by adopting a search-based intra-frame block matching (IBC) coding mode, down-sampling the current frame to obtain a sampling frame corresponding to the current frame, pre-analyzing the coding mode of each sampling block in the sampling frame, and determining a first number of sampling blocks of which the coding mode is the search-based IBC coding mode from the sampling blocks of the sampling frame based on the result of the pre-analysis; and searching a block which is matched with the current coding block most from the coded area in the current frame as a prediction block of the current coding block based on the first quantity, wherein the block is used for coding the current coding block.

According to still another aspect of the present application, there is provided an apparatus for determining an encoding mode of a screen video, including: the cost calculation module is used for respectively coding a current coding block of a current frame of the screen video to be coded by using a first type of an intra-frame block matching (IBC) coding mode in an IBC coding mode set and a basic coding mode to obtain at least two coding costs; a candidate mode determining module, configured to determine, as a candidate coding mode, a coding mode corresponding to a smallest coding cost of the at least two coding costs; and a mode determining module, configured to skip coding of the current coding block using the remaining IBC coding modes in the IBC coding mode set and determine the base coding mode as the coding mode of the current coding block when the candidate coding mode is the base coding mode and the number of base coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold, where the first type of IBC coding mode is a coding mode with the lowest coding complexity in the IBC coding mode set, and the coding complexity of the base coding mode is lower than that of the first type of IBC coding mode.

According to still another aspect of the present application, there is provided an apparatus for encoding of screen video, including: a pre-analysis module, configured to, when each coding block in a current frame of a screen video to be coded is coded using a search-based intra block matching (IBC) coding mode, down-sample the current frame to obtain a sampling frame corresponding to the current frame and pre-analyze the coding mode of each sampling block in the sampling frame; a determination module, configured to determine, based on the result of the pre-analysis, a first number of sampling blocks of the sampling frame whose coding mode is the search-based IBC coding mode; and a search module, configured to search, based on the first number, the coded region in the current frame for the block that best matches the current coding block, to serve as a prediction block of the current coding block for coding the current coding block.

According to an aspect of the application, there is also provided a computing device comprising: a processor; and a memory having stored thereon a computer program which, when executed by the processor, implements the method as described above.

According to another aspect of the present application, there is also provided a computer readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method as described above.

According to yet another aspect of the present application, there is also provided a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method as described above.

According to the solutions of the present application, the coding of the current coding block with some coding modes can be skipped according to the characteristics of the current coding block, so that the coding efficiency of the screen video is improved while high coding performance and coding quality are maintained. In addition, when the current coding block is coded using the search-based IBC coding mode, the search time complexity can be reduced by adaptively changing the search step size in the horizontal and vertical directions based on the pre-analysis result, so that at least part of the search process is accelerated and the coding efficiency is improved while the coding quality and coding performance are maintained.

Drawings

Fig. 1 illustrates a simplified block diagram of a communication system provided by one embodiment of the present application.

Fig. 2 is a schematic diagram illustrating an intra block matching (IBC) encoding method used in intra prediction.

Fig. 3 shows a flow diagram of a method for determining an encoding mode of a screen video according to an embodiment of the present application.

Fig. 4 shows a schematic diagram of a current coding block and a matching prediction block during intra prediction.

Fig. 5A shows a flow diagram of a method of improving a search-based IBC coding mode according to an embodiment of the present application.

FIG. 5B shows a schematic diagram of a search area in a sample frame searching for matching blocks of sample blocks during a pre-analysis process.

Fig. 6A shows more details of the step of searching for the block that best matches the current coding block in step S530 of fig. 5A.

Fig. 6B, 6D-6F show schematic diagrams of searching a current coding block based on different search modes in a current frame.

Fig. 6C shows a flow chart for determining different search modes.

Fig. 7 shows an exemplary flowchart of step S530 of fig. 5A.

Fig. 8 shows a flow diagram of a method for encoding of screen video according to an embodiment of the present application.

Fig. 9 is a block diagram illustrating a structure of an apparatus for determining an encoding mode of a screen video according to an embodiment of the present application.

Fig. 10 is a block diagram illustrating an apparatus for encoding of screen video according to an embodiment of the present application.

FIG. 11 shows a block diagram of a computing device, according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, exemplary embodiments according to the present application will be described in detail below with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.

In the present specification and the drawings, steps and elements having substantially the same or similar characteristics are denoted by the same or similar reference numerals, and repeated description of the steps and elements will be omitted. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance or order.

The definitions of the coding-related terms referred to herein are the same as those in the video coding standards HEVC and SCC.

As shown in fig. 1, a simplified block diagram of a communication system provided by one embodiment of the present application is shown. The communication system 100 includes a plurality of devices that may communicate with each other over, for example, a network 150. By way of example, the communication system 100 includes a first device 110 and a second device 120 interconnected by a network 150. In the embodiment of fig. 1, first device 110 and second device 120 perform unidirectional data transfer. For example, the first device 110 may encode video data, such as a video picture stream captured by the first device 110, for transmission to the second device 120 over the network 150. The encoded video data is transmitted in the form of one or more encoded video streams. The second device 120 may receive the encoded video data from the network 150, decode the encoded video data to recover the video data, and display video pictures according to the recovered video data. Unidirectional data transmission is common in applications such as media services.

In another embodiment, the communication system 100 includes a third device 130 and a fourth device 140 that perform bi-directional transmission of encoded video data, which may occur, for example, during video conferencing, online education, remote desktop control, and the like. For bi-directional data transfer, each of the third device 130 and the fourth device 140 may encode video data (e.g., a stream of video pictures captured by the devices) for transmission over the network 150 to the other of the third device 130 and the fourth device 140. Each of third apparatus 130 and fourth apparatus 140 may also receive encoded video data transmitted by the other of third apparatus 130 and fourth apparatus 140, and may decode the encoded video data to recover the video data, and may display video pictures on an accessible display device according to the recovered video data.

In fig. 1, the devices (110-140) may be shown as servers, personal computers, smart phones, or other smart devices such as in-vehicle smart terminals, but the principles of the embodiments are not limited thereto. Embodiments are applicable to laptop computers, tablet computers, media players, and/or dedicated video conferencing equipment. Network (150) represents any number of networks that transport encoded video data between devices (110-140), including, for example, wired and/or wireless communication networks. The communication network (150) may exchange data in circuit-switched and/or packet-switched channels. Representative networks include telecommunications networks, local area networks, wide area networks, and/or the internet. For purposes of this discussion, the architecture and topology of the network (150) may be immaterial to the operation of the embodiments, unless otherwise stated below.

As described above, the newly proposed SCC standard introduces a new tool for encoding screen video, and thus it is required to reduce the encoding complexity of screen video encoding to improve the video encoding quality and efficiency.

A screen video contains a large number of repeated textures; for example, the same character may appear many times in one frame. Coding a screen video involves coding I frames, i.e., frames coded using intra-frame prediction, and for such content an intra prediction method based on the Intra Block Copy (IBC) coding mode (also called intra block matching) can predict efficiently.

As shown in fig. 2, a unit to be currently encoded (denoted as CU in fig. 2) (also referred to as a current encoding block herein) searches an already encoded region (a shaded region in fig. 2) in a current frame, and a block (a white region indicated by an arrow in fig. 2, and an arrow indicates a block vector) which is most matched with the unit to be currently encoded is used as a prediction block of the current encoding block CU to encode the current encoding block.

Furthermore, there may be various contents in the screen video, so that in intra prediction, it may be better to adopt a conventional coding mode (e.g., a Direct Current (DC) coding mode, a Planar (Planar) coding mode, a direction (e.g., 33 angular directions) based coding mode, etc.) or a Palette (Palette) coding mode with respect to the IBC coding mode for coding mode selection of different coding blocks of the current frame.

Among these coding modes, the conventional coding mode has a lower coding complexity than the IBC coding mode, and the Palette coding mode has the highest coding complexity. The IBC coding mode may further include multiple types, for example, a merge/skip-based IBC coding mode, a search-based IBC coding mode, and a hash-based IBC coding mode. Of these, the merge/skip-based IBC coding mode has the lowest coding complexity, and for screen video it is also the type most likely to be finally selected as the IBC coding mode of a coding block.

Generally, after acquiring a screen video to be encoded, an encoder first divides a current frame of the screen video into a number of Coding Tree Units (CTUs), each of which is divided into a plurality of Coding Units (CUs) (also referred to as coding blocks) having the same or different sizes, and the minimum size of each Coding Unit (CU) is, for example, 8 × 8. Each Coding Unit (CU) may be further divided into one or more coding units, i.e., multiple layers of coding units, based on different prediction partition modes, and may be prediction-encoded for each coding unit in each layer.

When performing I-frame coding, for each coding block, it is necessary to sequentially traverse a conventional intra-frame coding mode (referred to as IPM coding mode herein for short), an IBC coding mode (including multiple types), and optionally a Palette coding mode, and select a coding mode with the smallest coding cost (e.g., rate-distortion cost) as the coding mode of the current coding block. And then the determined coding mode of each coding block is provided to a decoder end together with the coding stream, and the decoder decodes the corresponding coding block according to the received coding mode and finally displays the coding block on a display device.

Therefore, if the coding mode suitable for the coding block can be quickly determined when the coding block is coded, and certain coding modes with low probability are skipped, the coding quality can be ensured, and the efficiency of coding the screen video can be improved.

Details of the manner of improving coding efficiency according to the embodiments of the present application are described in detail below with reference to fig. 3-9.

As shown in fig. 3, in step S310, a current coding block of a current frame of a screen video to be coded is coded by using a first type of intra block matching (IBC) coding mode in an IBC coding mode set and by using a base coding mode, respectively, so as to obtain at least two coding costs.

For example, the IBC coding mode of the first type is a coding mode of the set of IBC coding modes with the lowest coding complexity, and the base coding mode is a coding mode with a lower coding complexity than the IBC coding mode of the first type.

For example, the base coding mode may be a conventional coding mode, such as the aforementioned Direct Current (DC) coding mode, Planar (Planar) coding mode, direction-based (e.g., 33 angular directions) coding mode, and so on, i.e., when the coding mode determined by a certain coding block is one of the three coding modes, the coding mode of the coding block may be considered as the base coding mode.

For example, the set of intra block matching (IBC) encoding modes may include at least one of a merge/skip based IBC encoding mode, a search based IBC encoding mode, and a hash (hash) based IBC encoding mode. When the IBC coding mode set comprises the merge/skip-based IBC coding modes, the first type of IBC coding mode is the merge/skip-based IBC coding mode.

Alternatively, the coding cost may be characterized in a number of different forms. For example, and not by way of limitation, the coding cost obtained for each coding mode may be a rate-distortion cost (rdcost) indicating a value obtained by weighting the number of bits required to encode the current coding block using each coding mode and the coding quality loss (e.g., the residual of the coding block and the prediction block), a smaller value indicating that the coding mode is more efficient for the current coding block.
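As an illustration of the rate-distortion cost described above, the following minimal sketch uses the common Lagrangian form J = D + λ·R. The function names, the choice of the sum of absolute differences (SAD) as the distortion measure, and the value of the weight `lam` are assumptions made for illustration; the present application does not fix a particular weighting.

```python
from typing import List

Block = List[List[int]]  # a block as rows of pixel values


def sad(block: Block, prediction: Block) -> int:
    """Sum of absolute differences between a block and its prediction
    (used here as an assumed stand-in for the coding quality loss)."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def rd_cost(distortion: float, bits: int, lam: float) -> float:
    """Weighted combination of quality loss and bit count: J = D + lam * R.
    A smaller value means the mode is more efficient for the block."""
    return distortion + lam * bits


# Hypothetical example: compare two modes for one 2x2 block.
current = [[10, 12], [11, 13]]
pred_mode_a = [[10, 12], [11, 12]]   # close prediction, assumed to need 20 bits
pred_mode_b = [[10, 10], [10, 10]]   # coarse prediction, assumed to need 4 bits
cost_a = rd_cost(sad(current, pred_mode_a), bits=20, lam=0.5)
cost_b = rd_cost(sad(current, pred_mode_b), bits=4, lam=0.5)
```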

In step S320, the coding mode corresponding to the smallest coding cost of the at least two coding costs is determined as the candidate coding mode.

For example, the number of coding costs depends on the number of coding modes in the base coding mode; when the base coding mode comprises a Direct Current (DC) coding mode, a Planar (Planar) coding mode, a direction-based coding mode, at least three coding costs (depending on the number of directions) for coding with the base coding mode may be obtained, and furthermore one coding cost for coding with the IBC coding mode of the first type may be obtained. The smaller of these coding costs indicates a more efficient coding mode for the current coding block, and thus the coding mode is more suitable for the current coding block than other coding modes.

In step S330, when the candidate coding mode is the basic coding mode and the number of basic coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is greater than or equal to the preset threshold, the coding of the current coding block by using the remaining IBC coding modes in the IBC coding mode set is skipped, and the basic coding mode is determined as the coding mode of the current coding block.

For example, if the candidate coding mode is the base coding mode and the number of base coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is sufficiently large (i.e. greater than or equal to the preset threshold), it indicates that the current coding block has a high possibility of adopting the base coding mode, and thus, the subsequent coding of other IBC coding modes can be omitted.

Optionally, the surrounding coding blocks of the current coding block include: the encoding blocks at the left, upper left, upper right, and lower left of the current encoding block and the encoding blocks at the previous level of the current encoding block, wherein, as described above, the current frame may be iteratively divided into encoding blocks of different levels by different sizes, and the encoding blocks at the previous level of the current encoding block are divided into a plurality of encoding blocks including the current encoding block. For example, the size of the coding block at the first level may be 32 × 32, the coding block is divided into four coding blocks at the second level (the size may be 16 × 16), and then further divided into coding blocks at the third level (the size may be 8 × 8), and so on, and the coding block at the upper layer of the current coding block may be obtained correspondingly through the division relationship.

Alternatively, the preset threshold may be, but is not limited to, 3.
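The neighbor check of step S330 can be sketched as follows. This is a minimal illustration, assuming that each surrounding coding block is identified by a hypothetical position label and that an unavailable or not yet coded neighbor is represented by `None`; the mode names are also illustrative.

```python
from typing import Dict, FrozenSet, Optional

# Surrounding blocks named in the description (labels are hypothetical).
NEIGHBOR_POSITIONS = ("left", "upper_left", "upper_right", "lower_left", "parent")

# Modes treated as the "base coding mode" (DC, Planar, direction-based).
BASE_MODES: FrozenSet[str] = frozenset({"dc", "planar", "angular"})


def count_base_neighbors(neighbor_modes: Dict[str, Optional[str]]) -> int:
    """Count coded surrounding blocks whose coding mode is a base coding mode."""
    return sum(
        1 for pos in NEIGHBOR_POSITIONS if neighbor_modes.get(pos) in BASE_MODES
    )


def can_skip_remaining_ibc(
    candidate_mode: str,
    neighbor_modes: Dict[str, Optional[str]],
    threshold: int = 3,
) -> bool:
    """True when the candidate is a base mode and enough neighbors are base modes,
    so the remaining IBC coding modes need not be tried."""
    return (
        candidate_mode in BASE_MODES
        and count_base_neighbors(neighbor_modes) >= threshold
    )
```

With the example threshold of 3, a block whose candidate mode is `"dc"` and whose left, upper-left and parent blocks all use base modes would skip the remaining IBC coding modes.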

In addition, if the candidate coding mode is not the base coding mode, that is, the candidate coding mode is the first type of IBC coding mode, this indicates that the coding performance and coding quality of the base coding mode are inferior to those of the first type of IBC coding mode. Alternatively, if the candidate coding mode is the base coding mode but the number of base coding modes among the coded surrounding coding blocks is smaller than the preset threshold, the probability that the coding mode of the current coding block is the base coding mode is not high enough.

Therefore, different processing may be adopted for these two cases. For example, the first type of IBC coding mode may be used directly as the coding mode of the current coding block. Because the first type of IBC coding mode has the lowest coding complexity of all IBC coding modes, coding with it is efficient, and because the basic principles of the various IBC coding modes are similar to some extent, the coding quality does not differ greatly. This choice therefore suits applications whose requirements on coding performance and coding quality are not very high. In applications with higher requirements on coding performance and coding quality, however, the coding with the subsequent IBC coding modes cannot simply be omitted; the coding costs of the other IBC coding modes need to be determined so that the optimal IBC coding mode can be selected from the multiple IBC coding modes.

Alternatively, the coding mode of the current coding block may be determined as follows. For example, the method 300 may further include the following steps.

First, the current coding block is coded using the remaining IBC coding modes.

And then, determining the coding mode of the current coding block based on the coding cost of coding the current coding block by each coding mode in the basic coding mode and the IBC coding mode set.

For example, the remaining IBC coding modes may include a search-based IBC coding mode and a hash-based IBC coding mode. In that case, the coding mode corresponding to the minimum coding cost among the at least two coding costs obtained in step S310 and the coding costs of coding the current coding block with the search-based IBC coding mode and the hash-based IBC coding mode is determined as the coding mode of the current coding block.

Therefore, the coding mode suitable for the current coding block can be obtained, and higher coding performance and coding quality can be achieved.
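The overall decision flow of method 300, including the fallback to the remaining IBC coding modes, might be sketched as below. This is an illustrative outline only: the mode names, the use of callables to represent the (expensive) remaining IBC evaluations, and the tie-breaking behavior of `min` are assumptions, not details taken from the present application.

```python
from typing import Callable, Dict


def choose_coding_mode(
    first_pass_costs: Dict[str, float],
    base_neighbor_count: int,
    remaining_ibc: Dict[str, Callable[[], float]],
    threshold: int = 3,
    base_mode: str = "base",
) -> str:
    """Pick the coding mode of the current coding block.

    first_pass_costs: costs of the base mode and the first-type IBC mode (S310).
    base_neighbor_count: number of coded surrounding blocks using the base mode.
    remaining_ibc: lazily evaluated costs of the other IBC modes, so that
        skipped modes are never actually computed.
    """
    # S320: the mode with the smallest cost becomes the candidate.
    candidate = min(first_pass_costs, key=first_pass_costs.get)

    # S330: early termination, the remaining IBC modes are skipped.
    if candidate == base_mode and base_neighbor_count >= threshold:
        return base_mode

    # Otherwise evaluate the remaining IBC modes and take the overall minimum.
    all_costs = dict(first_pass_costs)
    for name, evaluate in remaining_ibc.items():
        all_costs[name] = evaluate()
    return min(all_costs, key=all_costs.get)
```

Passing the remaining modes as callables makes the saving explicit: when the early-termination condition holds, none of them is invoked.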

By the method for determining the coding mode of the screen video described with reference to fig. 3, a part of coding modes can be skipped to code the current coding block according to the characteristics of the current coding block (the above-mentioned at least two coding costs and the number of basic coding modes in the coding modes of the coded surrounding coding blocks of the current coding block), so that the coding efficiency of the screen video can be improved, and higher coding performance and coding quality can be ensured.

According to another aspect of the present application, as described above, the current coding block may need to be coded using the remaining IBC coding modes (e.g., a search-based IBC coding mode and a hash-based IBC coding mode) other than the first type of IBC coding mode. In this case, coding efficiency cannot be improved by skipping some of the IBC coding modes, so the coding process of the other IBC coding modes themselves may be improved instead.

For the search-based IBC coding mode, when a current coding block is coded by using the coding mode, the current coding block needs to be searched in the horizontal and vertical directions within a certain search range (as shown in fig. 4, the lower right square box represents the position of the current coding block, and the black solid circle represents the position that needs to be searched in the horizontal and vertical directions), so as to obtain a block (prediction block) that best matches the current coding block. For the hash-based IBC coding mode, the best matching block of the current coding block may be determined based on the hash value throughout the current frame, with a higher coding complexity than the search-based IBC coding mode. And aiming at different IBC coding modes, obtaining the coding cost of each IBC coding mode of the current coding block, and further determining the optimal IBC mode suitable for the current coding block.

When a current coding block is coded using a search-based IBC coding mode, the search step sizes in the horizontal and vertical directions may be the same or different. The search step size is the number of pixels by which the search position slides within the search range at each step. For example, a search step size of 1 means that every pixel position is searched in both the horizontal and vertical directions. Furthermore, a searched block that overlaps the current coding block is not matched against the current coding block, i.e., it is not used as a candidate prediction block. For example, when the current coding block is 8 × 8, the block obtained by the first search in the horizontal direction (e.g., the first black solid circle to the left of the current coding block in fig. 4) is at least 8 pixels away from the corresponding pixel position of the current coding block; the block obtained by the second search (e.g., the second black solid circle to the left of the current coding block in fig. 4) is at least 9 pixels away from the corresponding pixel position of the current coding block and 1 pixel away from the block obtained by the first search (because the search step size is 1). The vertical direction is similar. Candidate prediction blocks obtained by the search may therefore overlap one another. The relative position of the found prediction block (the best matching block) and the current coding block may be represented by a block vector indicating the horizontal offset x and the vertical offset y of the current coding block with respect to its best matching block. When searching only in the horizontal direction or only in the vertical direction, one of x and y in the block vector is 0, whereas both x and y in the block vector of a prediction block used for intra prediction with other IBC coding modes (e.g., the hash-based coding mode) may be nonzero, as shown in fig. 2.
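The horizontal and vertical search with a configurable step size can be sketched as follows. This is a simplified illustration under several assumptions: the frame is a list of pixel rows, matching uses the sum of absolute differences, candidates lie directly to the left of and directly above the current coding block, and the block vector is expressed as the offset from the current block to the candidate (negative values point left or up). The function and parameter names are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]


def block_sad(frame: Frame, x0: int, y0: int, x1: int, y1: int, size: int) -> int:
    """SAD between the size x size blocks whose top-left corners are
    (x0, y0) and (x1, y1)."""
    return sum(
        abs(frame[y0 + dy][x0 + dx] - frame[y1 + dy][x1 + dx])
        for dy in range(size)
        for dx in range(size)
    )


def search_block_vector(
    frame: Frame, x: int, y: int, size: int, search_range: int,
    step_x: int = 1, step_y: int = 1,
) -> Tuple[Optional[Tuple[int, int]], Optional[int]]:
    """Search left of and above the block at (x, y); return (block_vector, cost).

    Offsets start at `size`, so candidates never overlap the current block.
    """
    candidates = []
    for off in range(size, search_range + 1, step_x):   # horizontal direction
        if x - off >= 0:
            candidates.append((-off, 0))
    for off in range(size, search_range + 1, step_y):   # vertical direction
        if y - off >= 0:
            candidates.append((0, -off))

    best_bv, best_cost = None, None
    for bvx, bvy in candidates:
        cost = block_sad(frame, x, y, x + bvx, y + bvy, size)
        if best_cost is None or cost < best_cost:
            best_bv, best_cost = (bvx, bvy), cost
    return best_bv, best_cost
```

A larger `step_x` or `step_y` visits fewer candidate positions and therefore shortens the search, at the risk of missing the best match.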

According to some embodiments, it may be considered to increase the coding efficiency when a current coding block is coded using a search-based IBC coding mode by increasing the search speed.

As shown in fig. 5A, in step S510, a current frame is down-sampled to obtain a sampling frame corresponding to the current frame, and a coding mode of each sampling block in the sampling frame is pre-analyzed.

For example, the sample frame may be an image with reduced resolution from the current frame. Since the sampled frame can reflect many coding characteristics of the current frame, the coding characteristics of the current frame with the original resolution can be estimated or simulated by performing pre-analysis on the sampled frame, so as to quickly search for a prediction block matched with the current coding block in the current frame.

Optionally, the sampling frame may first be divided into a plurality of sample blocks; each sample block is then pre-coded using each coding mode in a coding mode set to obtain a coding cost corresponding to each coding mode, where the coding mode set includes a basic coding mode and a search-based IBC coding mode; finally, the coding mode of each sample block is determined based on the coding cost corresponding to each coding mode in the coding mode set. In the pre-analysis process, in order to reduce the coding complexity, only the search-based IBC coding mode is used among the IBC coding modes. In order to search more comprehensively for a block matching the sample block, the search for a block matching the current sample block may cover a search region in the coded area to the upper left of the current sample block, bounded by, for example, 64 pixels to the left of and 64 pixels above the current sample block, as shown by the shaded region in fig. 5B.

For example, for each sample block, the coding mode corresponding to the lowest coding cost is determined as the coding mode of the sample block. The size of each sample block may be, but is not limited to, 8 × 8.
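The per-block decision in the pre-analysis can be sketched as below. The mode labels `"base"` and `"ibc_search"` and the dictionary-of-costs representation are assumptions made for illustration; the cost values themselves would come from the pre-coding step, which is not modeled here.

```python
def pick_mode(costs):
    """Return the mode label with the lowest coding cost.

    `costs` maps a mode label to the cost obtained by pre-coding the
    sample block with that mode.
    """
    return min(costs, key=costs.get)


def pre_analyze(block_costs):
    """Decide a coding mode for every sample block of the sample frame."""
    return [pick_mode(costs) for costs in block_costs]
```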

In step S520, a first number of sample blocks of which the encoding mode is the search-based IBC encoding mode among the sample blocks of the sample frame is determined based on the result of the pre-analysis.

For example, based on the coding mode of each sample block in the sample frame, the number of sample blocks whose coding mode is the search-based IBC coding mode can be counted.

In step S530, a block that best matches the current coding block is searched from among the coded regions within the current frame based on the first number as a prediction block of the current coding block for coding the current coding block.

For example, if the first number is small, i.e., few sample blocks in the sample frame use the search-based IBC coding mode (e.g., only one of 100 sample blocks uses the search-based IBC coding mode), then, since the sample frame corresponds to the current frame, the number of coding blocks in the current frame that use the search-based IBC coding mode can also be considered small. The search-based IBC coding mode is then less important, so the search need not be performed as finely; otherwise, a relatively fine search is required. That is, the block that best matches the current coding block may be searched for in different ways depending on the first number.

Optionally, step S530 is described in more detail below in conjunction with FIGS. 6A-7.

As shown in fig. 6A, in step S530-1, in case that the first number satisfies the first threshold condition, a block (prediction block) that best matches the current coding block is searched in a first search step in the horizontal direction and a first search step in the vertical direction, where the first search step is 2 or more.

For example, the first threshold condition may be that a first ratio of the first number to the total number of sample blocks in the sample frame that were pre-coded using the search-based IBC coding mode is less than or equal to a first ratio threshold. For example, the first ratio threshold is 0.1, which can be obtained by training.
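A minimal sketch of the first threshold condition follows. It assumes that every sample block is pre-coded with the search-based IBC coding mode, so the total is simply the number of sample blocks; the mode label `"ibc_search"`, the function name, and the handling of an empty frame are likewise assumptions.

```python
def first_threshold_met(modes, ratio_threshold=0.1):
    """Check whether the share of sample blocks decided as search-based IBC
    is at most `ratio_threshold` (0.1 is the example value from the text).

    `modes` lists the coding mode chosen for each sample block.
    """
    if not modes:
        # Assumption: with no sample blocks, treat the condition as satisfied.
        return True
    num_ibc = sum(1 for mode in modes if mode == "ibc_search")
    return num_ibc / len(modes) <= ratio_threshold
```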

Alternatively, the first search step size may be 2, that is, every second pixel position in the horizontal direction and the vertical direction is searched to obtain a block (which does not overlap the current coding block) to be matched against the current coding block in order to determine the best matching block. As shown in fig. 6B, the black solid circles indicate positions to be searched in the horizontal and vertical directions and the black hollow circles indicate non-search positions. For example, a block horizontally separated by 12 pixels from the corresponding position of the current coding block (denoted by the block vector Bv(12,0)) is used as the prediction block of the current coding block (with the same size as the current coding block).
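The stepped search can be sketched for the horizontal direction as follows. The sum of absolute differences (SAD) is used here as a stand-in matching cost; the application does not state which cost is used, and a real encoder would more likely use a rate-distortion cost. The function names and the list-of-rows frame layout are hypothetical.

```python
def sad(frame, x0, y0, x1, y1, size):
    """Sum of absolute differences between two size x size blocks whose
    top-left corners are (x0, y0) and (x1, y1)."""
    return sum(abs(frame[y0 + j][x0 + i] - frame[y1 + j][x1 + i])
               for j in range(size) for i in range(size))


def search_left(frame, x, y, size, step):
    """Search candidates to the left of the block at (x, y), advancing by
    `step` pixels per candidate. Returns (cost, block_vector) of the best
    match, or None if no candidate fits inside the frame."""
    best = None
    dx = size  # nearest candidate that does not overlap the current block
    while x - dx >= 0:
        cost = sad(frame, x, y, x - dx, y, size)
        if best is None or cost < best[0]:
            best = (cost, (dx, 0))
        dx += step
    return best
```

A step of 2 visits roughly half as many candidates as a step of 1, at the risk of missing a better match at a skipped position.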

In step S530-2, in case the first number does not satisfy the first threshold condition, a block that best matches the current coding block is searched for in the coded region within the current frame based on the block vector of each of the first number of sample blocks.

Optionally, the block vector of a sample block indicates the horizontal and vertical offset values of the sample block relative to its best matching block found by the search during the pre-analysis.

Alternatively, to select the best matching block based on the block vectors, a second number of sample blocks, among the first number of sample blocks, for which the horizontal offset value indicated by the block vector is smaller than the vertical offset value may first be determined based on the result of the pre-analysis; then, a block (prediction block) that best matches the current coding block may be searched for in the coded region within the current frame based on the second number.

For example, if the second number is sufficiently large relative to the first number, the best matching blocks (prediction blocks) of the sample blocks are less likely to be found in the horizontal direction; if the second number is sufficiently small relative to the first number, the best matching blocks are less likely to be found in the vertical direction; if the second number is in a moderate range relative to the first number, the probabilities of finding the best matching blocks in the horizontal and vertical directions are comparable. Based on this analysis of the sample frame, the probabilities of finding a prediction block in the horizontal and vertical directions for each coding block in the current frame at the original resolution can be estimated correspondingly.
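The second number can be computed from the block vectors recorded during pre-analysis, as in the sketch below. The function name and the tuple representation of block vectors are assumptions; the comparison of absolute offsets follows the description of abs(bv.x) and abs(bv.y) given with fig. 7.

```python
def count_vertical_dominant(block_vectors):
    """Count block vectors whose horizontal offset magnitude is smaller than
    their vertical offset magnitude (the "second number").

    `block_vectors` holds one (x, y) tuple per sample block whose coding
    mode is the search-based IBC coding mode.
    """
    return sum(1 for bv_x, bv_y in block_vectors if abs(bv_x) < abs(bv_y))
```

Dividing this count by the number of vectors gives the second ratio that is compared against the second and third ratio thresholds.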

Thus, FIGS. 6C-6F show further details for a prediction block search based on the second number.

As shown in fig. 6C, in operation a, in case that the second number satisfies the second threshold condition, a block that best matches the current coding block is searched in a first search step in the horizontal direction and a second search step in the vertical direction, the second search step being smaller than the first search step.

Alternatively, the second threshold condition may be that a second ratio of the second number to the first number is greater than a second ratio threshold. For example, the second ratio threshold is 0.7, which can be obtained by training.

For example, as previously analyzed, this case corresponds to the second number being sufficiently large relative to the first number, i.e., the best matching blocks (prediction blocks) of the sample blocks are less likely to be found in the horizontal direction, so the first search step size in the horizontal direction is, for example, 2, and the second search step size in the vertical direction is, for example, 1. As shown in fig. 6D, the black solid circles indicate positions to be searched in the horizontal and vertical directions and the black hollow circles indicate non-search positions; every second pixel position is searched in the horizontal direction, and every pixel position is searched in the vertical direction.

In operation B, in case that the second number satisfies a third threshold condition, a block that best matches the current coding block is searched for in a second search step in the horizontal direction and a first search step in the vertical direction.

Alternatively, the third threshold condition may be that the second ratio of the second number to the first number is less than a third ratio threshold. For example, the third ratio threshold is 0.3, which may be obtained by training.

For example, as previously analyzed, this case corresponds to the second number being sufficiently small relative to the first number, i.e., the best matching blocks (prediction blocks) of the sample blocks are less likely to be found in the vertical direction, so the second search step size in the horizontal direction is, for example, 1, and the first search step size in the vertical direction is, for example, 2. As shown in fig. 6E, the black solid circles indicate positions to be searched in the horizontal and vertical directions and the black hollow circles indicate non-search positions; every pixel position is searched in the horizontal direction, and every second pixel position is searched in the vertical direction.

In operation C, in case that the second number satisfies neither the second threshold condition nor the third threshold condition, the block that best matches the current coding block is searched for in the second search step in the horizontal direction and the second search step in the vertical direction.

For example, as previously analyzed, this case corresponds to a case where the probabilities of finding the best matching blocks (prediction blocks) of the sample blocks in the horizontal direction and the vertical direction are comparable, and therefore the second search step (for example, a step size of 1) is used in both the horizontal direction and the vertical direction. As shown in fig. 6F, the black solid circles indicate positions to be searched in the horizontal and vertical directions and the black hollow circles indicate non-search positions; every pixel position is searched in both the horizontal direction and the vertical direction, which is the same as the search method shown in fig. 2.

In order to more clearly describe the above step S530, fig. 7 shows an exemplary flowchart of the step S530.

As shown in fig. 7, first, in flow 710, it is determined whether the first ratio of the first number (numIBC) of sample blocks of the sample frame whose coding mode is the search-based IBC coding mode (for the upper left search area) to the total number (numALL) of sample blocks of the sample frame pre-coded using the search-based IBC coding mode is less than or equal to the first ratio threshold. If the first ratio is less than or equal to the first ratio threshold, flow 720 is entered, i.e., the block (prediction block) that best matches the current coding block is searched for with the first search step (2) in the horizontal direction and the first search step (2) in the vertical direction.

If the first ratio is greater than the first ratio threshold, flow 730 is entered, i.e., a second number (numBV) of sample blocks, among the first number of sample blocks, for which the horizontal offset value (abs(bv.x)) indicated by the block vector is less than the vertical offset value (abs(bv.y)) is determined, where abs() denotes the absolute value and bv.x and bv.y are the x and y values of the block vector, respectively; flow 740 is then entered to determine whether a second ratio of the second number (numBV) to the first number (numIBC) is greater than a second ratio threshold (th2).

If the second ratio is greater than the second ratio threshold, proceed to flow 750, i.e. search for the block that best matches the current coding block with the first search step (2) in the horizontal direction and the second search step (1) in the vertical direction; if the second ratio is less than or equal to the second ratio threshold, flow 760 is reached, where it is determined whether the second ratio is less than a third ratio threshold.

If the second ratio is smaller than the third ratio threshold, proceed to flow 770, i.e. search for the block that best matches the current coding block with the second search step (1) in the horizontal direction and the first search step (2) in the vertical direction; and if the second ratio is greater than or equal to the third ratio threshold and less than or equal to the second ratio threshold, proceed to flow 780, i.e., search for the block that best matches the current coding block with the second search step (1) in the horizontal direction and the second search step (1) in the vertical direction.
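The decision flow of fig. 7 can be summarized in a short sketch. The function name, constant names, and the guard against zero counts are assumptions added for illustration; the default thresholds 0.1, 0.7 and 0.3 and the step sizes 2 and 1 are the example values given in the text.

```python
COARSE_STEP = 2  # "first search step" in the text
FINE_STEP = 1    # "second search step" in the text


def select_search_steps(num_ibc, num_all, num_bv, th1=0.1, th2=0.7, th3=0.3):
    """Return (horizontal_step, vertical_step) following flows 710-780.

    num_ibc: sample blocks decided as search-based IBC (numIBC)
    num_all: total sample blocks considered (numALL)
    num_bv:  IBC sample blocks with abs(bv.x) < abs(bv.y) (numBV)
    """
    # Flows 710/720: few IBC blocks, so a coarse search in both directions.
    # The zero checks are an assumption to avoid division by zero.
    if num_all == 0 or num_ibc == 0 or num_ibc / num_all <= th1:
        return (COARSE_STEP, COARSE_STEP)
    second_ratio = num_bv / num_ibc  # flows 730/740
    if second_ratio > th2:
        return (COARSE_STEP, FINE_STEP)  # flow 750: matches mostly vertical
    if second_ratio < th3:
        return (FINE_STEP, COARSE_STEP)  # flow 770: matches mostly horizontal
    return (FINE_STEP, FINE_STEP)        # flow 780: no dominant direction
```

For instance, with 50 of 100 sample blocks using search-based IBC and 40 of those having a mostly vertical block vector, the second ratio is 0.8, so the sketch returns a horizontal step of 2 and a vertical step of 1.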

Thus, an improved process for the search-based IBC coding mode has been described above with reference to fig. 5A-7. With this improvement, when the current coding block is coded using the search-based IBC coding mode, adaptively changing the search step sizes in the horizontal direction and the vertical direction based on the pre-analysis result reduces the search time complexity, which increases the search speed and in turn improves the coding efficiency while ensuring the coding quality and the coding performance.

In the above embodiment, although each sample block in the sample frame is pre-coded using a plurality of coding modes, the frame has already been down-sampled and the coding modes used in the pre-analysis process have low coding complexity, so the coding complexity of the pre-analysis process is far lower than that of coding the current frame at the original resolution in the usual manner. In addition, the total search time complexity is greatly reduced because the prediction block search for each coding block is shortened, which benefits both coding efficiency and coding complexity.

The improved method for search-based IBC coding described above with reference to fig. 5A-7 may also be applied to any scene that is coded using search-based IBC coding without the premise that it must be determined whether to skip part of IBC coding (as described with reference to fig. 3), and therefore, the present application also provides a method for coding screen video, which may be used in an application where a current frame of screen video to be coded is coded using a search-based intra block matching (IBC) coding mode, as shown in fig. 8.

As shown in fig. 8, in step S810, a current frame is down-sampled to obtain a sampling frame corresponding to the current frame, and a coding mode of each sampling block in the sampling frame is pre-analyzed.

In step S820, a first number of sample blocks whose coding mode is the search-based IBC coding mode among the sample blocks of the sample frame is determined based on the result of the pre-analysis.

In step S830, a block that best matches the current coding block is searched from among the coded regions within the current frame based on the first number as a prediction block of the current coding block for coding the current coding block.

Further details of the above steps are similar to those related to the improved search-based IBC coding mode described with reference to fig. 5A-7, and therefore will not be repeated here.

Also, by referring to the method shown in fig. 8, the search time complexity can be reduced by reducing the search process for each coding block, which is beneficial to the coding efficiency as well as the coding complexity.

According to another aspect of the present application, there is also provided an apparatus for determining an encoding mode of a screen video and an apparatus for encoding of a screen video.

As shown in fig. 9, the apparatus 900 includes a cost calculation module 910, a candidate mode determination module 920, and a mode determination module 930.

The cost calculation module 910 is configured to encode a current coding block of a current frame of a screen video to be encoded respectively by using a first type of intra block matching (IBC) coding mode in an IBC coding mode set and by using a basic coding mode, so as to obtain at least two coding costs.

The IBC coding mode of the first type is the coding mode with the lowest coding complexity in the IBC coding mode set, and the coding complexity of the basic coding mode is lower than that of the IBC coding mode of the first type.

The candidate mode determining module 920 is configured to determine the coding mode corresponding to the smallest coding cost of the at least two coding costs as the candidate coding mode.

The mode determining module 930 is configured to skip coding the current coding block by using the remaining IBC coding modes in the IBC coding mode set and determine the base coding mode as the coding mode of the current coding block, when the candidate coding mode is the base coding mode and the number of base coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold.

Optionally, the cost calculation module 910 is further configured to encode the current coding block by using the remaining IBC coding modes when the candidate coding mode is not the basic coding mode, or when the candidate coding mode is the basic coding mode but the number of basic coding modes is smaller than the preset threshold; and the mode determining module 930 is further configured to determine the coding mode of the current coding block based on the coding cost of coding the current coding block with the basic coding mode and with each coding mode in the IBC coding mode set.
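The cooperation of the three modules can be illustrated with the following sketch. The function signature, the mode labels, and the use of zero-argument callables to defer the costly encodings are assumptions for illustration; the label `"ibc_merge_skip"` reflects the embodiment in which the first type of IBC coding mode is the merge/skip-based mode.

```python
def decide_mode(base_cost, first_ibc_cost, neighbor_modes,
                remaining_ibc_encoders, threshold):
    """Decide the coding mode of the current coding block.

    remaining_ibc_encoders maps a mode label to a zero-argument callable that
    encodes the block with that mode and returns its cost, so skipped modes
    are never evaluated.
    """
    costs = {"base": base_cost, "ibc_merge_skip": first_ibc_cost}
    candidate = min(costs, key=costs.get)
    num_base_neighbors = sum(1 for mode in neighbor_modes if mode == "base")
    # Early termination: skip the remaining, more complex IBC modes.
    if candidate == "base" and num_base_neighbors >= threshold:
        return "base"
    # Otherwise evaluate the remaining IBC modes and pick the overall minimum.
    for mode, encode in remaining_ibc_encoders.items():
        costs[mode] = encode()
    return min(costs, key=costs.get)
```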

Optionally, when encoding the current coding block using a search-based IBC coding mode among the remaining IBC coding modes, the cost calculation module 910 may also pre-analyze the sample frame obtained by down-sampling the current frame in order to speed up at least part of the search process, as described above with reference to fig. 5A-7.

Further details of the operations involved in the various modules of the apparatus 900 have been described in detail above with reference to fig. 3-7 and therefore will not be repeated here. It should be noted that the apparatus 900 may also be divided into more or less modules according to different ways, and each module may further include further sub-modules to implement the required functions thereof, which is not limited in this application as long as the operations described above with reference to fig. 3-7 can be implemented by the combination of related hardware and/or software.

By the apparatus for determining the coding mode of the screen video described with reference to fig. 9, a part of coding modes can be skipped to code the current coding block according to the characteristics of the current coding block (the above-mentioned at least two coding costs and the number of basic coding modes in the coding modes of the coded surrounding coding blocks of the current coding block), so that the coding efficiency of the screen video can be improved, and higher coding performance and coding quality can be ensured. In addition, when the current coding block is coded by using the search-based IBC coding mode, the search time complexity can be reduced by adaptively changing the search step length in the horizontal direction and the vertical direction based on the pre-analysis result, so that the search process is accelerated, and the coding efficiency is improved while the coding quality and the coding performance are ensured.

Fig. 10 is a block diagram illustrating a structure of an apparatus for encoding of a screen video according to an embodiment of the present application.

As shown in fig. 10, the apparatus 1000 includes a pre-analysis module 1010, a determination module 1020, and a search module 1030. The apparatus 1000 is configured to encode each coding block of the current frame using the search-based IBC coding mode.

The pre-analysis module 1010 is configured to, when each coding block in a current frame of a screen video to be coded is coded by using a search-based intra block matching (IBC) coding mode, down-sample the current frame to obtain a sampling frame corresponding to the current frame, and pre-analyze a coding mode of each sampling block in the sampling frame.

The determining module 1020 is configured to determine, based on a result of the pre-analysis, a first number of sample blocks of the sample frame whose encoding mode is a search-based IBC encoding mode.

The searching module 1030 is configured to search, based on the first number, a block that best matches a current coding block from among coded regions in a current frame, as a prediction block of the current coding block, for coding the current coding block.

Further details of the operations involved in the various modules of the apparatus 1000 have been described in detail above with reference to fig. 5A-7 and therefore will not be repeated here. It should be noted that the apparatus 1000 may be divided into more or less modules according to different ways, and each module may further include further sub-modules to implement the required functions thereof, which is not limited in this application as long as the operations described above with reference to fig. 5A-7 can be implemented by the combination of related hardware and/or software.

With the apparatus described with reference to fig. 10, when a current coding block is coded in a search-based IBC coding mode, by adaptively changing a search step size in a horizontal direction and a vertical direction based on a first number in a pre-analysis result, complexity of a search time may be reduced, thereby accelerating a search process and improving coding efficiency while ensuring coding quality and coding performance.

According to yet another aspect of the present application, a computing device is also disclosed.

FIG. 11 shows a schematic block diagram of a computing device 1100 in accordance with embodiments of the present application.

As shown in fig. 11, the computing device 1100 includes a processor, memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computing device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the operations described in the steps of the method for determining an encoding mode of a screen video and the method for encoding of a screen video as described above. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the operations described in the steps of the same method for determining an encoding mode of a screen video and method for encoding of a screen video.

For example, a method for determining an encoding mode of a screen video may include: respectively coding a current coding block of a current frame of a screen video to be coded by using a basic coding mode and a first type of an intra-frame block matching (IBC) coding mode in an IBC coding mode set to obtain at least two coding costs; determining a coding mode corresponding to the minimum coding cost in at least two coding costs as a candidate coding mode; and under the condition that the candidate coding mode is a basic coding mode and the number of basic coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold value, skipping to code the current coding block by using the residual IBC coding modes in the IBC coding mode set and determining the basic coding mode as the coding mode of the current coding block, wherein the IBC coding mode of the first type is the coding mode with the lowest coding complexity in the IBC coding mode set, and the coding complexity of the basic coding mode is lower than that of the IBC coding mode of the first type. Further details of each step have been described in detail above and are therefore not repeated here.

For example, a method for encoding of screen video, comprising: when each coding block in a current frame of a screen video to be coded is coded by adopting a search-based intra-frame block matching (IBC) coding mode, down-sampling the current frame to obtain a sampling frame corresponding to the current frame, pre-analyzing the coding mode of each sampling block in the sampling frame, and determining a first number of sampling blocks of which the coding mode is the search-based IBC coding mode in the sampling blocks of the sampling frame based on the result of the pre-analysis; based on the first number, searching a block which is most matched with the current coding block from the coded area in the current frame as a prediction block of the current coding block, and coding the current coding block.

The processor may be an integrated circuit chip having signal processing capabilities. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor, which may be of the X86 or ARM architecture.

The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. It should be noted that the memories of the methods described herein are intended to comprise, without being limited to, these and any other suitable types of memories.

The display screen of the computing device can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computing device can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on a terminal shell, an external keyboard, a touch pad or a mouse and the like.

The computing device may be a terminal or a server. Among others, terminals may include, but are not limited to: smart phones, tablet computers, notebook computers, desktop computers, smart televisions, and the like; various clients (APPs) can be run in the terminal, such as a multimedia playing client, a social client, a browser client, an information flow client, an education client, and so on. The server may be the server described with reference to fig. 1, that is, an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform.

According to another aspect of the present application, there is also provided a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method for determining an encoding mode of a screen video and the method for encoding of a screen video as described above.

According to yet another aspect of the present application, there is also provided a computer program product comprising a computer program which when executed by a processor implements the steps of the method for determining an encoding mode of a screen video and the method for encoding of a screen video as described previously.

It is to be noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and apparatus according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises at least one executable instruction for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The exemplary embodiments of the present application, which are described in detail above, are illustrative only and not limiting. It will be appreciated by those skilled in the art that various modifications and combinations of the embodiments or features thereof may be made without departing from the principles and spirit of the application, and that such modifications are intended to be within the scope of the application.

Claims

1. A method for determining an encoding mode of a screen video, comprising:

respectively coding a current coding block of a current frame of a screen video to be coded by using a first type of an intra-frame block matching (IBC) coding mode in an IBC coding mode set and using a basic coding mode to obtain at least two coding costs;

determining the coding mode corresponding to the minimum coding cost in the at least two coding costs as a candidate coding mode; and

skipping encoding of the current coding block by using the residual IBC coding modes in the IBC coding mode set and determining the base coding mode as the coding mode of the current coding block under the condition that the candidate coding mode is the base coding mode and the number of the base coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is more than or equal to a preset threshold value,

wherein the first type of IBC coding mode is a coding mode with the lowest coding complexity in a set of IBC coding modes, and the coding complexity of the base coding mode is lower than that of the first type of IBC coding mode.

2. The method of claim 1, further comprising: in case the candidate coding mode is not a base coding mode, or the candidate coding mode is a base coding mode but the number of base coding modes is smaller than the preset threshold,

encoding the current coding block by using the remaining IBC coding modes; and

determining the coding mode of the current coding block based on the coding cost for coding the current coding block by the basic coding mode and each coding mode in the IBC coding mode set.

3. The method of claim 2, wherein the set of IBC coding modes comprises: at least one of merge/skip based IBC encoding mode, search based IBC encoding mode and hash based IBC encoding mode;

wherein the base coding mode includes a Direct Current (DC) coding mode, a Planar (Planar) coding mode, and a direction-based coding mode.

4. The method of claim 3, wherein the first type of IBC encoding mode is a merge/skip based IBC encoding mode,

determining the coding mode of the current coding block based on the basic coding mode and the coding cost for coding the current coding block by each coding mode in the IBC coding mode set, wherein the determining comprises the following steps:

determining, as the coding mode of the current coding block, the coding mode corresponding to the minimum coding cost among the at least two coding costs and the coding costs for coding the current coding block by using the search-based IBC coding mode and the hash-based IBC coding mode.

5. The method of claim 1, wherein the surrounding coding blocks of the current coding block comprise: the coding blocks to the left, upper left, upper right, and lower left of the current coding block, and the coding block one level above the current coding block,

wherein the current frame is iteratively divided into coding blocks of different levels by different sizes, and a coding block of a level above the current coding block is split into a plurality of coding blocks including the current coding block.
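
As a rough illustration of how the surrounding blocks might be tallied, the sketch below counts base-mode neighbors. It rests on simplifying assumptions that are not part of the claims: blocks are addressed on a uniform grid in block units, already-encoded modes are kept in a dictionary keyed by grid position, and the names `NEIGHBOR_OFFSETS` and `count_base_neighbors` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Grid offsets (dx, dy) in block units; y grows downward (assumed convention).
NEIGHBOR_OFFSETS = {
    "left": (-1, 0),
    "upper_left": (-1, -1),
    "upper_right": (1, -1),
    "lower_left": (-1, 1),
}


def count_base_neighbors(
    encoded_modes: Dict[Tuple[int, int], str],
    x: int,
    y: int,
    parent_mode: Optional[str],
) -> int:
    """Count surrounding blocks (plus the parent-level block) coded in base mode."""
    count = 0
    for dx, dy in NEIGHBOR_OFFSETS.values():
        # Positions missing from the dictionary are not yet encoded and are ignored.
        if encoded_modes.get((x + dx, y + dy)) == "base":
            count += 1
    # The block one level up in the partition tree counts as a neighbor too.
    if parent_mode == "base":
        count += 1
    return count
```

The returned count is what would be compared against the preset threshold.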

6. The method of claim 3, wherein encoding the current coding block using the search-based IBC coding mode among the remaining IBC coding modes comprises:

down-sampling the current frame to obtain a sample frame corresponding to the current frame, and pre-analyzing the coding mode of each sample block in the sample frame;

determining, based on a result of the pre-analysis, a first number of sample blocks in the sample frame whose coding mode is the search-based IBC coding mode; and

searching, based on the first number, the encoded region within the current frame for the block that best matches the current coding block, the block serving as a prediction block used to encode the current coding block.
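
The block search itself can be pictured with the following sketch. It is only an illustration under stated assumptions: the matching criterion is taken to be the sum of absolute differences (SAD), which the claims do not specify; the encoded region is simplified to the rows lying entirely above the current block; and `sad` and `search_best_block` are hypothetical helper names.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # rows of pixel values


def sad(frame: Frame, x0: int, y0: int, x1: int, y1: int, size: int) -> int:
    """Sum of absolute differences between two size-by-size blocks."""
    return sum(
        abs(frame[y0 + j][x0 + i] - frame[y1 + j][x1 + i])
        for j in range(size)
        for i in range(size)
    )


def search_best_block(
    frame: Frame, x: int, y: int, size: int, step_x: int, step_y: int
) -> Optional[Tuple[int, int]]:
    """Return the block vector (dx, dy) of the best match, or None if no candidate exists.

    Candidates are restricted to blocks lying fully above the current block,
    a simplified stand-in for the encoded region of the frame.
    """
    best_vector, best_cost = None, None
    for cy in range(0, y - size + 1, step_y):
        for cx in range(0, len(frame[0]) - size + 1, step_x):
            cost = sad(frame, x, y, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (cx - x, cy - y), cost
    return best_vector
```

Larger values of `step_x` and `step_y` visit fewer candidates, trading match quality for speed.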

7. The method of claim 6, wherein down-sampling the current frame to obtain the sample frame corresponding to the current frame and pre-analyzing the coding mode of each sample block in the sample frame comprises:

dividing the sample frame into a plurality of sample blocks;

pre-encoding each sample block using each coding mode in a coding mode set to obtain a coding cost corresponding to each coding mode, wherein the coding mode set comprises the base coding mode and the search-based IBC coding mode; and

determining the coding mode of each sample block based on the coding costs corresponding to the coding modes in the coding mode set for that sample block.
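
A minimal sketch of this pre-analysis is given below, under assumptions that go beyond the claims: down-sampling is done by plain decimation, the per-mode cost evaluators are supplied by the caller as hypothetical callables, and the helper names (`downsample`, `split_blocks`, `pre_analyze`, `first_number`) are introduced only for illustration.

```python
from typing import Callable, Dict, List

Frame = List[List[int]]  # rows of pixel values


def downsample(frame: Frame, factor: int = 2) -> Frame:
    """Decimate the frame by keeping every factor-th row and column."""
    return [row[::factor] for row in frame[::factor]]


def split_blocks(frame: Frame, size: int) -> List[Frame]:
    """Divide a frame into size-by-size sample blocks in raster order."""
    blocks = []
    for y in range(0, len(frame), size):
        for x in range(0, len(frame[0]), size):
            blocks.append([row[x:x + size] for row in frame[y:y + size]])
    return blocks


def pre_analyze(
    frame: Frame,
    block_size: int,
    cost_fns: Dict[str, Callable[[Frame], float]],
    factor: int = 2,
) -> List[str]:
    """Return, for each sample block, the mode with the smallest pre-encoding cost."""
    sampled = downsample(frame, factor)
    return [
        min(cost_fns, key=lambda mode: cost_fns[mode](block))
        for block in split_blocks(sampled, block_size)
    ]


def first_number(modes: List[str]) -> int:
    """Count the sample blocks whose chosen mode is the search-based IBC mode."""
    return sum(1 for m in modes if m == "ibc_search")
```

Because the analysis runs on a reduced frame with only two candidate modes, it is far cheaper than a full encode yet still hints at how much of the frame benefits from search-based IBC.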

8. The method of claim 7, wherein searching, based on the first number, the encoded region within the current frame for the block that best matches the current coding block comprises:

in a case where the first number satisfies a first threshold condition, searching for the block that best matches the current coding block with a first search step in the horizontal direction and the first search step in the vertical direction, wherein the first search step is greater than or equal to 2; and

in a case where the first number does not satisfy the first threshold condition, searching the encoded region within the current frame for the block that best matches the current coding block based on the block vector of each of the first number of sample blocks,

wherein the block vector of a sample block indicates a horizontal offset value and a vertical offset value of the sample block relative to its best-matching block found by the search.

9. The method of claim 8, wherein searching the encoded region within the current frame for the block that best matches the current coding block based on the block vector of each of the first number of sample blocks comprises:

determining, based on the result of the pre-analysis, a second number of sample blocks, among the first number of sample blocks, whose block vectors indicate a horizontal offset value smaller than the vertical offset value; and

searching, based on the second number, the encoded region within the current frame for the block that best matches the current coding block.

10. The method of claim 9, wherein searching, based on the second number, the encoded region within the current frame for the block that best matches the current coding block comprises:

in a case where the second number satisfies a second threshold condition, searching for the block that best matches the current coding block with the first search step in the horizontal direction and a second search step in the vertical direction, wherein the second search step is smaller than the first search step;

in a case where the second number satisfies a third threshold condition, searching for the block that best matches the current coding block with the second search step in the horizontal direction and the first search step in the vertical direction; and

in a case where the second number satisfies neither the second threshold condition nor the third threshold condition, searching for the block that best matches the current coding block with the second search step in both the horizontal direction and the vertical direction.
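
The step-selection logic across the first and second numbers can be sketched as follows. The claims leave the three threshold conditions unspecified, so the ratios used here (`first_ratio`, `low_ratio`, `high_ratio`) are purely hypothetical, as are the function name `choose_search_steps`, the default step values, and the choice to compare offsets by absolute value.

```python
from typing import List, Tuple


def choose_search_steps(
    block_vectors: List[Tuple[int, int]],
    total_blocks: int,
    first_step: int = 2,     # coarse step, must be >= 2
    second_step: int = 1,    # fine step, smaller than first_step
    first_ratio: float = 0.1,  # hypothetical first threshold condition
    low_ratio: float = 0.3,    # hypothetical third threshold condition
    high_ratio: float = 0.7,   # hypothetical second threshold condition
) -> Tuple[int, int]:
    """Return (horizontal_step, vertical_step) for the IBC block search.

    block_vectors holds one (dx, dy) per sample block that the pre-analysis
    assigned to the search-based IBC mode, so its length is the first number.
    """
    first_number = len(block_vectors)
    # Assumed first condition: few IBC sample blocks, so search coarsely both ways.
    if first_number == 0 or first_number < first_ratio * total_blocks:
        return first_step, first_step

    # Second number: blocks whose horizontal offset is smaller than the vertical one.
    second_number = sum(1 for dx, dy in block_vectors if abs(dx) < abs(dy))
    share = second_number / first_number

    if share > high_ratio:   # assumed second condition: coarse horizontal, fine vertical
        return first_step, second_step
    if share < low_ratio:    # assumed third condition: fine horizontal, coarse vertical
        return second_step, first_step
    return second_step, second_step  # neither condition: fine in both directions
```

The returned pair would then be passed to the block search as its horizontal and vertical step sizes.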

11. An apparatus for determining an encoding mode of a screen video, comprising:

a cost calculation module, configured to encode a current coding block of a current frame of a screen video to be encoded using, respectively, a base coding mode and a first type of intra block copy (IBC) coding mode in an IBC coding mode set, to obtain at least two coding costs;

a candidate mode determining module, configured to determine, as a candidate coding mode, a coding mode corresponding to a smallest coding cost of the at least two coding costs; and

a mode determining module, configured to skip coding of the current coding block using the remaining IBC coding modes in the IBC coding mode set when the candidate coding mode is the base coding mode and the number of base coding modes in the coding modes of the coded surrounding coding blocks of the current coding block is greater than or equal to a preset threshold, and determine the base coding mode as the coding mode of the current coding block,

wherein the first type of IBC coding mode is a coding mode with the lowest coding complexity in the set of IBC coding modes, and the coding complexity of the base coding mode is lower than that of the first type of IBC coding mode.

12. A computing device, comprising:

a processor; and

a memory having stored thereon a computer program which, when executed by the processor, implements the method of any of claims 1-10.