CN111669595A - Screen content coding method, device, equipment and medium - Google Patents


Info

Publication number
CN111669595A
CN111669595A (application number CN202010454534.7A)
Authority
CN
China
Prior art keywords
coding unit
reference block
target coding
image
target
Prior art date
Legal status
Pending
Application number
CN202010454534.7A
Other languages
Chinese (zh)
Inventor
陈玉
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010454534.7A priority Critical patent/CN111669595A/en
Publication of CN111669595A publication Critical patent/CN111669595A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96Tree coding, e.g. quad-tree coding

Abstract

The application belongs to the technical field of video coding and discloses a screen content coding method, apparatus, device, and medium. A target reference block matching the target coding unit to be coded is screened from the reference blocks in the coded region of the screen content image, and screen content coding is performed based on the target coding unit and the target reference block. This reduces the data computation overhead, expands the effective search area of the target coding unit, increases the number of candidate reference blocks, and improves screen content coding quality.

Description

Screen content coding method, device, equipment and medium
Technical Field
The present application relates to the field of screen content encoding, and in particular, to a method, an apparatus, a device, and a medium for screen content encoding.
Background
A screen content image is an image generated by an electronic device rather than captured by a conventional camera; it is captured from the image display unit of various devices (e.g., computers, mobile terminals). Examples include computer-rendered graphics and text, mixed images combining natural images with graphics and text, and computer-generated animations.
To improve the compression efficiency of screen content coding, the High Efficiency Video Coding (HEVC) Screen Content Coding (SCC) extension was introduced.
In HEVC SCC coding, a frame of the screen content image is usually divided into multiple Coding Tree Units (CTUs), each CTU is divided into multiple Coding Units (CUs), and the block-level coding mode Intra Block Copy (IBC) is used for screen content coding. In IBC, a matching target reference block is determined within the effective search area of the target coding unit to be processed, where the effective and ineffective search areas of the target coding unit are determined according to the HEVC SCC standard.
However, the existing IBC technique must determine whether each reference block in the screen content image lies within the effective search area of the target coding unit, which incurs high computation overhead; moreover, the number of reference blocks within the effective search area is small, so the Rate-Distortion (RD) quality of the screen content image is poor.
Therefore, a new technical solution is needed to optimize IBC to reduce the computation overhead and improve the screen content coding quality.
Disclosure of Invention
The embodiments of the present application provide a screen content coding method, apparatus, device, and medium for optimizing Intra Block Copy (IBC) when IBC is used to code screen content, reducing computation overhead and improving screen content coding quality.
In one aspect, a screen content encoding method is provided, including:
screening out a target reference block matched with a target coding unit to be coded in the screen content image from the reference blocks in the coded area of the screen content image;
and carrying out screen content coding based on the target coding unit and the target reference block.
In one aspect, there is provided a screen content encoding apparatus including:
the matching unit is used for screening out a target reference block matched with a target coding unit to be coded in the screen content image from the reference blocks in the coded area of the screen content image;
and the processing unit is used for carrying out screen content coding based on the target coding unit and the target reference block.
Preferably, the matching unit is configured to:
acquiring pixel information of a target coding unit;
for each reference block in the singly linked list, respectively perform the following steps until a reference block whose image similarity satisfies a preset matching condition is found:
obtain the pixel information of the reference block, determine the image similarity between the target coding unit and the reference block according to the pixel information of both, and, if the image similarity is not lower than a first similarity threshold, determine the reference block to be a reference block satisfying the preset matching condition.
Preferably, the matching unit is further configured to:
and when no reference block whose image similarity is not lower than the first similarity threshold exists, determining the reference block satisfying the preset matching condition according to the obtained position distance between the target coding unit and each reference block together with the corresponding image similarity.
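As a hedged sketch of the matching flow above (the `Block` representation, the similarity measure, the 0.95 threshold, and the distance-weighted fallback score are illustrative assumptions, not values fixed by the claims):

```python
from dataclasses import dataclass

@dataclass
class Block:
    x: int            # top-left position of the block
    y: int
    pixels: list      # flattened pixel values

def image_similarity(a: Block, b: Block) -> float:
    # Illustrative measure: 1 minus the normalized mean absolute difference.
    diff = sum(abs(p - q) for p, q in zip(a.pixels, b.pixels))
    return 1.0 - diff / (255.0 * len(a.pixels))

def position_distance(a: Block, b: Block) -> int:
    return abs(a.x - b.x) + abs(a.y - b.y)  # Manhattan distance

def match_reference_block(target: Block, ref_blocks, sim_threshold=0.95):
    """Traverse the singly linked candidate list in order; return the first
    block meeting the similarity threshold, else fall back to the best
    combined similarity/distance score (the 0.01 weight is assumed)."""
    best, best_score = None, float("-inf")
    for ref in ref_blocks:
        sim = image_similarity(target, ref)
        if sim >= sim_threshold:
            return ref  # preset matching condition met: stop early
        score = sim - 0.01 * position_distance(target, ref)
        if score > best_score:
            best, best_score = ref, score
    return best
```

The early exit mirrors the "until determining that the reference block ... exists" wording, and the fallback clause mirrors the preferred embodiment that combines position distance with image similarity.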
In one aspect, a control device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to perform the steps of any of the above-mentioned screen content encoding methods.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of any of the above-mentioned screen content encoding methods.
In the screen content coding method, apparatus, device, and medium provided by the embodiments of the present application, a target reference block matching the target coding unit to be coded is screened from all reference blocks in the coded region of the screen content image, and screen content coding is performed based on the target coding unit and the corresponding target reference block. This reduces the position-checking overhead, expands the effective search area of the target coding unit, increases the number of candidate reference blocks, and improves screen content coding quality. Furthermore, if the target reference block lies in the ineffective search area of the target coding unit, the search path to the target reference block is shortened, reducing computational complexity.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic view of an application scenario in an embodiment of the present application;
fig. 2 is a schematic flow chart of HEVC SCC coding according to the prior art;
FIG. 3 is a flowchart illustrating an implementation of a method for encoding screen content according to an embodiment of the present disclosure;
FIG. 4a is a schematic diagram of a quadtree partitioning according to an embodiment of the present disclosure;
FIG. 4b is a schematic diagram of a CTU according to an embodiment of the present application;
FIG. 4c is a diagram illustrating a multi-threaded content encoding process according to the prior art;
fig. 4d is a schematic diagram of search area division according to an embodiment of the present disclosure;
fig. 5 is a flowchart of an implementation of a target reference block determination method in an embodiment of the present application;
FIG. 6 is a diagram illustrating multi-threaded screen content encoding according to an embodiment of the present disclosure;
FIG. 7 is a diagram illustrating a structure of a screen content encoding apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a control device in an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions, and beneficial effects of the present application clearer, the application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the present application and are not intended to limit it.
First, some terms referred to in the embodiments of the present application are explained to facilitate understanding by those skilled in the art.
Terminal device: may be a mobile terminal, a fixed terminal, or a portable terminal, such as a mobile handset, station, unit, device, multimedia computer, multimedia tablet, Internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system device, personal navigation device, personal digital assistant, audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, gaming device, or any combination thereof, including the accessories and peripherals of these devices or any combination thereof. The terminal device may also support any type of user interface (e.g., a wearable device).
Server: may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms.
IBC (Intra Block Copy): a block-level coding mode. At the encoding end, a block matching technique is used to find the best matching block for each current CU, and the block vector between the current block and the position of its best matching block is calculated.
Inter-frame prediction: finds the position that best matches the current block in a reference frame, using information in the temporal domain; it includes Motion Estimation (ME) and Motion Compensation (MC).
Motion estimation: determines the image block in an already coded picture (the reference frame) that best corresponds to the CU currently being coded, and calculates the offset (motion vector) of that corresponding block.
Cloud computing: a computing model that distributes computing tasks over a resource pool formed by a large number of computers, enabling application systems to obtain computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". To the user, resources in the "cloud" appear infinitely expandable, available at any time and on demand, and paid for by use.
A distributed cloud storage system (hereinafter, the storage system) integrates a large number of storage devices of different types in a network (storage devices are also called storage nodes) through application software or application interfaces, using functions such as cluster application, grid technology, and a distributed storage file system, so that they work cooperatively and externally provide data storage and service access functions. At present, the storage system stores data as follows. Logical volumes are created, and each logical volume is allocated physical storage space at creation, which may comprise the disks of one or several storage devices. A client stores data on a logical volume, that is, on a file system; the file system divides the data into multiple parts, each of which is an object containing not only the data but also additional information such as a data identifier (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location of each object, so that when the client requests access to the data, the file system can let the client access it according to the recorded storage locations.
The process by which the storage system allocates physical storage space for a logical volume is as follows: physical storage space is divided in advance into stripes according to capacity estimates for the objects to be stored in the logical volume (the estimates usually leave a large margin over the actual object capacity) and the Redundant Array of Independent Disks (RAID) scheme; one logical volume can be understood as one stripe, and physical storage space is thereby allocated to the logical volume.
The design concept of the embodiment of the present application is described below.
HEVC SCC was developed to improve the compression efficiency of screen content coding. In HEVC SCC coding, the encoder obtains a corresponding singly linked list according to the image hash value of the target coding unit, where the singly linked list is built from the reference blocks in the coded region of the CTU, and all reference blocks in one singly linked list share the same image hash value.
It should be noted that, in practical applications, a singly linked list of this kind is formed by linking the positions of reference blocks. In the embodiments of the present application, for ease of description, the reference block corresponding to each position in the singly linked list is described as a reference block included in the list.
Then, the encoder determines whether each reference block in the singly linked list lies within the effective search area of the target coding unit, until the target reference block matching the target coding unit is obtained.
The deeper the singly linked list, the greater the computational overhead of determining whether its reference blocks lie within the effective search area; moreover, the number of reference blocks within the effective search area is limited, so screen content coding quality is poor.
Obviously, the conventional technology provides no technical scheme that can optimize IBC in HEVC SCC, reduce computational overhead, and improve screen content coding quality. Therefore, a screen content coding solution is needed that optimizes IBC to reduce computation overhead and improve screen content coding quality.
The applicant carefully analyzed the principle by which the effective and ineffective search areas are divided in the conventional technique and found that the criterion for the effective search area is set on the ideal premise of an unlimited number of encoding threads, so that multithreaded parallel processing is not affected by the effective search area. In practical applications, however, the number of encoding threads is usually limited; if a reference block in the ineffective search area lies within the coded region, parallel processing by the encoding threads is not affected, so the effective search area can be expanded into the coded region.
In view of the above analysis and consideration, the present embodiment provides a screen content encoding scheme, in which, among all reference blocks in an encoded region of a screen content image, a target reference block matching a target coding unit to be encoded in the screen content image is screened out, and screen content encoding is performed based on the target coding unit and the corresponding target reference block.
To further illustrate the technical solutions provided by the embodiments of the present application, a detailed description is given below with reference to the accompanying drawings and specific embodiments. Although the following embodiments and figures present the method as a sequence of steps, the method may include more or fewer steps based on routine or non-inventive effort. For steps with no necessary logical causal relationship, the execution order is not limited to that provided in the embodiments; in an actual process or control device, the steps may be executed in the order shown in the embodiments or figures, or in parallel.
Fig. 1 is a schematic diagram of an application scenario. An application scenario related to screen content coding is described below with reference to fig. 1.
The network shown in fig. 1 includes a terminal device 10A, a terminal device 10B, a network 20, and a server 30, wherein the terminal device 10A and the terminal device 10B are communicatively connected to the server 30 through the network 20.
In fig. 1, a user using a terminal device 10A wants to push a piece of video being viewed to a terminal device 10B. In this scenario, the terminal device 10A corresponds to the encoding side and the terminal device 10B corresponds to the decoding side. The terminal device 10A needs to encode the video to obtain an encoded video stream, the terminal device 10A uploads the video stream to the server 30, the server 30 forwards the video stream to the terminal device 10B, and the terminal device 10B decodes the video stream to realize normal playing of the video on the terminal device 10B.
Fig. 2 is a schematic diagram of the HEVC SCC coding flow in the prior art. In HEVC SCC encoding, a frame of the screen content image is read from a frame buffer and sent to the encoder, and a predicted value is obtained by intra-frame or inter-frame prediction. The predicted value is subtracted from the input data to obtain a residual, which then undergoes Discrete Cosine Transform (DCT) and quantization to produce residual coefficients; the residual coefficients are sent to the entropy coding module to output the code stream. Meanwhile, after inverse quantization and inverse transformation of the residual coefficients, the residual of the reconstructed image is obtained and added to the predicted value from intra-frame or inter-frame prediction to form the reconstructed image. After in-loop filtering, the reconstructed image enters the reference frame queue as a reference image for the next frame, which is then coded in turn. The in-loop filtering may include Deblocking Filtering (DBF) and Sample Adaptive Offset (SAO).
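The residual path of the flow just described can be sketched in miniature as follows (the DCT, entropy coding, and in-loop filtering stages are omitted, and scalar quantization of the raw residual is a simplifying assumption):

```python
def encode_block(orig, pred, q_step=8):
    """Residual path of Fig. 2 in miniature: residual -> quantized
    coefficients (for entropy coding) -> reconstructed block (which
    would feed the reference frame queue)."""
    residual = [o - p for o, p in zip(orig, pred)]
    coeffs = [round(r / q_step) for r in residual]         # quantize
    recon_residual = [c * q_step for c in coeffs]          # dequantize
    recon = [p + r for p, r in zip(pred, recon_residual)]  # reconstruct
    return coeffs, recon
```

Note that the decoder-side reconstruction is performed inside the encoder as well, so that prediction for subsequent blocks references the same (lossy) pixels the decoder will have.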
Intra-frame prediction uses IBC to search for a matching target reference block for each target coding unit to be coded and calculates the block vector between the two positions; the referenced information is in the spatial domain.
In the IBC technique, whether each reference block lies within the effective search area of the target coding unit must be determined, so the computational complexity and overhead are high; moreover, the number of reference blocks within the effective search area is limited, so the RD quality of IBC is poor and so is the screen content coding quality.
In the embodiment of the present application, the execution subject may be a control device for screen content encoding. The control device can be a server or a terminal device. Alternatively, the control device may be a cloud server providing cloud storage and cloud computing. The screen content coding scheme provided in the embodiment of the application is applied to a scene in which HEVC SCC is adopted to code a screen content image.
Fig. 3 is a flowchart illustrating an implementation of a screen content encoding method according to the present application. The method comprises the following specific processes:
step 300: the control device acquires a target coding unit to be coded in the screen content image.
In the embodiment of the present application, the screen content image is an image generated by an electronic device, and is captured from an image display unit of various devices (e.g., a computer, a mobile terminal, etc.), and may be a frame image in a video or a single image.
The screen content image may be a text document, a slide, a web page, a game screen, or the like, and may be color coded in any of the YUV420, YUV422, or YUV444 formats. "Y" represents luminance (Luma), i.e., the gray-scale value; "U" and "V" represent chrominance (Chroma), which describes the color and saturation of the image and specifies the color of a pixel.
Wherein the target coding unit is a CU in the screen content image. The size of the target coding unit may be 8 × 8, or may be set according to an actual application scenario, and is not limited herein.
In one embodiment, the control device divides the screen content image into multiple Coding Tree Units (CTUs) according to a preset maximum CU size, and each CTU is then recursively divided using a quadtree structure until the preset minimum CU size is reached.
Fig. 4a is a schematic diagram of the quadtree splitting principle. The splitting process is described by two variables: depth (Depth) and a split flag (split_flag). The Largest CU (LCU), denoted CU0, may be 64 × 64 in size with depth 0; CU0 may be split into four 32 × 32 CU1 blocks at depth 1, and so on, until CU3 at depth 3, which is not further divided. For a CUd of size 2N × 2N at depth d, if its split_flag is 0, CUd is not divided; otherwise it is split into four N × N CUd+1 blocks at depth d+1.
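The recursive rule above can be sketched as follows (`split_decision` stands in for the encoder's actual rate-distortion splitting choice; the function and parameter names are assumptions):

```python
def split_cu(x, y, size, depth, split_decision, min_size=8):
    """Recursively partition a CU via the quadtree rule: a CU of size
    2N x 2N at depth d either stays whole (split_flag == 0) or splits
    into four N x N sub-CUs at depth d + 1, down to min_size."""
    if size <= min_size or not split_decision(x, y, size, depth):
        return [(x, y, size, depth)]  # leaf CU, coded as-is
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_cu(x + dx, y + dy, half, depth + 1,
                               split_decision, min_size)
    return leaves
```

For example, splitting a 64 × 64 LCU whenever the block is larger than 16 yields sixteen 16 × 16 leaf CUs at depth 2.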
It should be noted that a reference block is an image block of the same size as the target coding unit but is obtained differently from a CU: each reference block is generated from a reference point in the screen content image.
In the embodiment of the present application, only the target reference block matched with one target coding unit is determined as an example for description, and based on a similar principle, the target reference blocks of other target coding units in the screen content image may be determined, which is not described herein again.
Fig. 4b is a schematic diagram of a CTU. In fig. 4b, the CTU includes multiple CUs of different sizes. The control device takes one CU to be encoded in the CTU as a target coding unit.
Thus, the target coding unit to be currently coded can be acquired.
Step 301: and the control equipment screens out a target reference block matched with a target coding unit to be coded in the screen content image from the reference blocks in the coded area of the screen content image.
Specifically, the reference blocks in the encoded region are all reference blocks in the encoded region in the screen content image.
For example, referring to fig. 4b, the reference blocks of the gray region (i.e., the reference blocks in the first three rows and the reference block before the target coding unit in the fourth row) are all the reference blocks in the coded region.
This is because the conventional division criterion for the effective and ineffective search areas was set on the ideal premise of an unlimited number of encoding threads. In practical applications, however, the number of threads encoding the screen content is limited, and a target reference block matching the target coding unit may also exist in the ineffective search area; therefore, in the embodiment of the present application, the effective search area is expanded.
Fig. 4c is a diagram illustrating multi-threaded screen content encoding in the prior art. The control device encodes the CUs in the first to fourth rows with four encoding threads: the first thread is currently encoding the 9th CU in the first row, the second thread the 8th CU in the second row, the third thread the 7th CU in the third row, and the fourth thread the CU in the 6th coding block in the fourth row. In the conventional technique, to avoid affecting parallel processing of the encoding threads, if the CU processed by the (N+1)-th thread is the (M+1)-th CU, the CU processed by the N-th thread must be the M-th CU. Therefore, when a CU in the 6th coding block of the fourth row is the target coding unit, its effective search area is the gray area in Fig. 4c. Furthermore, in the conventional technique, when determining the target reference block, the encoder sequentially checks whether each reference block in the singly linked list lies in the effective search area, following the direction indicated by the arrow in Fig. 4c. The singly linked list also includes reference blocks in the non-gray area (the ineffective search area), and the deeper the list, the higher the Central Processing Unit (CPU) overhead, which consumes substantial system resources and reduces screen content encoding efficiency.
Considering that, in practical applications, the number of threads for screen content encoding is limited and target reference blocks matching the target coding unit may also exist in the ineffective search area, the embodiment of the present application improves on the conventional IBC technique by expanding the effective search area to all reference blocks in the coded region. Fig. 4d is a schematic diagram of search area division. The conventional IBC technique uses the gray area in Fig. 4d as the effective search area and the black area as the ineffective search area, whereas the present application uses both the gray and black areas as the effective search area.
That is, in the conventional manner, the coded area is divided into an effective search area and an ineffective search area, whereas in the embodiment of the present application, the entire coded area is expanded into the effective search area.
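Under the expanded rule, the per-block position check collapses to a simple coded-region test, sketched below (a raster-scan coding order and an 8 × 8 CU grid are assumptions for illustration):

```python
def in_coded_region(ref_x, ref_y, cur_x, cur_y, cu=8):
    """Expanded search rule sketched from the description: a reference
    block is usable if it lies anywhere in the already-coded region,
    i.e. in a row strictly above the current CU, or to its left in the
    same row."""
    if ref_y + cu <= cur_y:                          # above the current row
        return True
    return ref_y == cur_y and ref_x + cu <= cur_x    # left in same row
```

Compared with the conventional check against an irregular effective-search-area boundary, this test is constant-time per block, which is one way to read the description's claim of reduced position-calculation overhead.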
Step 302: the control device performs screen content encoding based on the target encoding unit and the target reference block.
Referring to Fig. 5, a flowchart of an implementation of the target reference block determination method is shown. Step 301 is described in detail below through a specific embodiment; when step 301 is executed, the following process may be adopted:
step 500: the control device acquires image characteristic information of a target encoding unit.
Specifically, the image feature information is determined according to the pixel value of each pixel point in the target coding unit.
In one embodiment, the image characteristic information may be determined by:
s5001: the control equipment acquires the pixel value of each pixel point in the target coding unit.
The pixel point is the smallest image unit, and one picture is composed of a plurality of pixel points.
For example, if the picture size is 500 × 338, the picture is formed by a 500 × 338 pixel matrix: the width of the picture is 500 pixels, the height of the picture is 338 pixels, and there are 500 × 338 = 169,000 pixels in total.
S5002: the control device determines the sum of the pixel values of the encoding units.
S5003: the control device determines a horizontal pixel gradient average value in the horizontal direction and a vertical pixel gradient average value in the vertical direction of the target coding unit based on the respective pixel values.
Specifically, the control device determines a horizontal pixel gradient average value from each pixel value in the horizontal direction, and determines a vertical pixel gradient average value from each pixel value in the vertical direction.
S5004: the control device obtains image feature information of the target coding unit based on the sum of the pixel values, the horizontal pixel gradient average value, and the vertical pixel gradient average value.
Specifically, the control device takes the combination of the sum of the pixel values, the horizontal pixel gradient average value, and the vertical pixel gradient average value as the image characteristic information of the target coding unit.
In practical applications, the image characteristic information may also be determined in other manners, which is not limited herein.
Thus, image characteristic information of the target coding unit can be obtained.
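As an illustration, steps S5001 to S5004 can be sketched as follows. This is a minimal sketch, assuming the gradients are computed as mean absolute differences of horizontally and vertically adjacent pixels, since the embodiment does not fix a particular gradient operator; the function name and the array-based representation of the coding unit are illustrative only.

```python
import numpy as np

def image_feature_info(cu: np.ndarray):
    """Feature triple for a coding unit given as a 2-D array of pixel
    values: (sum of pixel values, mean horizontal gradient, mean
    vertical gradient)."""
    px = cu.astype(np.int64)
    pixel_sum = int(px.sum())                           # S5002: sum of pixel values
    grad_h = float(np.abs(np.diff(px, axis=1)).mean())  # S5003: horizontal gradient average
    grad_v = float(np.abs(np.diff(px, axis=0)).mean())  # S5003: vertical gradient average
    return pixel_sum, grad_h, grad_v                    # S5004: the combination
```

The triple returned here plays the role of the image characteristic information that is subsequently hashed in step 501.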
Step 501: the control device performs hash processing on the image characteristic information of the target coding unit to obtain an image hash value of the target coding unit.
Specifically, the control device performs hash processing on the image characteristic information of the target encoding unit by using a preset hash algorithm to obtain an image hash value of the target encoding unit.
Wherein the hash algorithm is configured to generate a fixed-length output based on an input. Changing even one character in the input results in a completely different hash value.
Optionally, the hash algorithm may be a perceptual hash algorithm. Perceptual hashing is a generic term for a class of hash algorithms that generate a "fingerprint" string for each image; the fingerprint information of different images is compared to determine image similarity, and the closer the fingerprints, the more similar the images. Perceptual hash algorithms include mean hash (aHash), perceptual hash (pHash), and difference hash (dHash). aHash is faster but less accurate; pHash is the opposite, more accurate but slower; dHash balances the two, with relatively high accuracy and speed. After a 64-bit hash value is obtained, the Hamming distance is used to quantify the similarity of two images: the greater the Hamming distance, the smaller the image similarity, and the smaller the Hamming distance, the greater the image similarity.
Thus, the image hash value of the target encoding unit can be obtained.
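By way of illustration, a difference hash (dHash) of the kind mentioned above, together with the Hamming distance used to compare two 64-bit hashes, might be sketched like this. The 9 × 8 downsampling by index striding is a simplification for a self-contained example; real implementations typically use proper image resizing.

```python
import numpy as np

def dhash64(gray: np.ndarray) -> int:
    """Difference hash: downsample to 8 rows x 9 columns, compare each
    pixel with its right neighbour, and pack the 64 resulting bits."""
    h, w = gray.shape
    rows = (np.arange(8) * h) // 8          # crude nearest-neighbour
    cols = (np.arange(9) * w) // 9          # downsampling by striding
    small = gray[np.ix_(rows, cols)]
    bits = (small[:, 1:] > small[:, :-1]).ravel()
    return int("".join("1" if b else "0" for b in bits), 2)

def hamming(a: int, b: int) -> int:
    """Number of differing bits; smaller means more similar images."""
    return bin(a ^ b).count("1")
```

Two identical blocks yield a Hamming distance of 0, which matches the intuition above that smaller distances indicate greater similarity.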
Step 502: the control device acquires the singly linked list corresponding to the image hash value of the target coding unit.
In one embodiment, the control device determines the singly linked list corresponding to the image hash value of the target coding unit through a hash index table.

The hash index table contains image hash values and the singly linked list set for each of them. Each singly linked list is created according to the reference blocks in the coded area, and the image hash values corresponding to the reference blocks in one singly linked list are the same. The reference blocks in a singly linked list are ordered by their position distance from the target coding unit, from nearest to farthest.
In this way, a plurality of reference blocks similar to the target coding unit can be efficiently filtered out by the image hash value.
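One possible shape for such a hash index table, with each bucket holding a singly linked list of reference blocks kept sorted from nearest to farthest, is sketched below; the class and field names are illustrative assumptions, not part of the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class RefNode:
    position: tuple                     # top-left (x, y) of the reference block
    dist: float                         # position distance to the target CU
    next: Optional["RefNode"] = None

class HashIndexTable:
    """Maps an image hash value to the head of a singly linked list of
    reference blocks sharing that hash, ordered nearest-first."""

    def __init__(self) -> None:
        self.buckets: Dict[int, Optional[RefNode]] = {}

    def insert(self, img_hash: int, position: tuple, dist: float) -> None:
        node = RefNode(position, dist)
        head = self.buckets.get(img_hash)
        if head is None or dist < head.dist:      # becomes the new head
            node.next = head
            self.buckets[img_hash] = node
            return
        cur = head
        while cur.next is not None and cur.next.dist <= dist:
            cur = cur.next
        node.next = cur.next                      # keep nearest-first order
        cur.next = node

    def lookup(self, img_hash: int):
        """Yield reference-block positions from nearest to farthest."""
        cur = self.buckets.get(img_hash)
        while cur is not None:
            yield cur.position
            cur = cur.next
```

Because the buckets are keyed by image hash, a lookup immediately restricts the search to reference blocks whose hash equals that of the target coding unit, and the nearest-first ordering means close blocks are examined before distant ones.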
Step 503: the control device sequentially determines the image similarity between each reference block in the screened singly linked list and the target coding unit, until a reference block whose image similarity meets the preset matching condition is determined to exist.
Specifically, the control device obtains the pixel information of the target coding unit and executes the following steps for each reference block in the singly linked list, until a reference block whose image similarity meets the preset matching condition is determined to exist:

Acquire the pixel information of the reference block; determine the image similarity between the target coding unit and the reference block according to the pixel information of the target coding unit and the pixel information of the reference block; and, if the image similarity is not lower than a first similarity threshold, determine the reference block as one meeting the preset matching condition.
Alternatively, the image similarity may be determined by Sum of Absolute Differences (SAD), or may be determined by other methods according to the actual application scenario, which is not limited herein.
The pixel information is the pixel value of each pixel point. In digital image processing, SAD is a measure of the similarity between image blocks, calculated as the sum of the absolute differences between each pixel in one image block and the corresponding pixel in the other. The smaller the SAD, the higher the image similarity.
In practical applications, the first similarity threshold may be set according to practical application scenarios, for example, the first similarity threshold may be 100%, and is not limited herein.
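The SAD-based matching loop of step 503 might look like the following sketch, where candidates are visited nearest-first and a 100% first similarity threshold corresponds to requiring SAD = 0; the function names and the plain-list representation of the candidate blocks are assumptions for illustration.

```python
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def find_matching_block(target: np.ndarray, candidates, max_sad: int = 0):
    """Walk the candidate reference blocks (already ordered nearest-first)
    and return the first whose SAD against the target does not exceed
    max_sad; max_sad = 0 corresponds to a 100% similarity requirement."""
    for block in candidates:
        if sad(target, block) <= max_sad:
            return block
    return None
```

The early return on the first qualifying block is what shortens the actual search path when a match is found near the head of the list.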
Further, when no reference block reaches the first similarity threshold, the reference block meeting the preset matching condition may instead be determined according to the position distance between the target coding unit and each reference block and/or the corresponding image similarity.
In one embodiment, the reference block(s) corresponding to the maximum value among the image similarities are determined. When that maximum image similarity is greater than a second preset threshold, it is judged whether only one such reference block exists; if so, that reference block is determined to meet the preset matching condition; otherwise, among the determined reference blocks, the one with the smallest position distance to the target coding unit is determined to be the reference block meeting the preset matching condition.

In another embodiment, the reference blocks whose image similarity is greater than a third preset threshold are screened out. If exactly one reference block is obtained, it is determined to meet the preset matching condition; otherwise, among the screened reference blocks, the one with the smallest position distance to the target coding unit is determined to be the reference block meeting the preset matching condition.
In practical applications, the first similarity threshold, the second preset threshold, and the third preset threshold may all be set according to the actual application scenario; for example, they may be 1, 0.9, and 0.8 respectively, and are not limited herein.
The method and the device can be applied to single-thread as well as multi-thread screen content coding scenarios. Further, they apply not only to the IBC mode in I frames but also to the IBC coding mode in inter-coded P or B frames.
In a specific application scenario, in single-thread mode, by adopting the scheme provided by the embodiment of the present application, the QQ265 SCC YUV420 encoder achieves a 1.65% Bjøntegaard delta bit rate (BD-BR) quality improvement, and coding complexity is reduced from 29.5% to 23% compared with the prior art. By adopting the scheme provided by the embodiment of the present application, the QQ265 SCC YUV444 encoder achieves a BD-BR quality improvement of 0.05%, and coding complexity is reduced by 1%. Here, BD-BR represents the bit-rate saving of two methods at the same objective quality and is an index for measuring video rate-distortion (RD) performance.
The following describes the extension of the multi-threaded effective search area in a specific application scenario.
Fig. 6 is a schematic diagram of multi-thread screen content coding according to an embodiment of the present application. For a multi-thread mode with Wavefront Parallel Processing (WPP) greater than 1, taking WPP = 2 (i.e., two threads) as an example: the first thread encodes the CUs in the first row in the direction indicated by the arrow, and all reference blocks in the first row are reference blocks within the encoded region. The second thread encodes the CUs in the second row in parallel; when the (N + 1)th CU is the target coding unit, the encoded region of the target coding unit includes the entire first row and the area before the (N + 1)th CU in the second row. The first thread then encodes the CUs in the third row; when the Nth CU is the target coding unit, the encoded region includes the entire first row, the area before the (N + 1)th CU in the second row, and the area before the Nth CU in the third row. The second thread encodes the CUs in the fourth row; when the (N - 1)th CU is the target coding unit, the encoded region includes the entire first row, the area before the (N + 1)th CU in the second row, the area before the Nth CU in the third row, and the area before the (N - 1)th CU in the fourth row.
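Under a simplified model where each row leads the row below it by exactly one encoded CU (in practice rows nearer the top may already be fully encoded), the per-row extent of the encoded region described above could be computed as follows; the function and its parameters are illustrative assumptions.

```python
def encoded_region_bounds(target_row: int, target_col: int, row_len: int):
    """For the target CU at (target_row, target_col), where target_col is
    the number of CUs already encoded before it in its row, return for
    each row from the top down how many CUs of that row are encoded,
    assuming each row leads the row below by one CU, capped at the row
    length once a row is finished."""
    return [min(target_col + (target_row - row), row_len)
            for row in range(target_row + 1)]
```

For the Fig. 6 example, with the target in the fourth row (row index 3) and N - 2 CUs encoded before it, this yields N + 1, N, N - 1, N - 2 encoded CUs for rows one to four, which matches the description once the first row's count is capped at the row length.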
It can be seen that, even if only the effective search area covering the first row is expanded, screen content encoding efficiency and encoding quality are improved compared with the conventional effective search area.
In the embodiment of the application, the IBC technique is optimized: the effective search area is expanded, which increases the number of reference blocks and improves IBC rate-distortion (RD) quality; there is no longer any need to judge whether each reference block in the singly linked list is located in the effective search area, which reduces the system resources consumed by position calculation; and, further, if a matched target reference block is obtained in the expanded part of the search area, the actual search path length can be shortened, thereby reducing computational complexity.
Based on the same inventive concept, an embodiment of the present application further provides a device for screen content encoding. Because the principle by which the device solves the problem is similar to that of the screen content encoding method, the implementation of the device can refer to the implementation of the method, and repeated details are not repeated.
Fig. 7 is a schematic structural diagram of an apparatus for encoding screen content according to an embodiment of the present application. An apparatus for screen content encoding includes:
a matching unit 701, configured to screen out, from reference blocks in an encoded region of a screen content image, a target reference block that is matched with a target encoding unit to be encoded in the screen content image;
a processing unit 702, configured to perform screen content encoding based on the target encoding unit and the target reference block.
Preferably, the matching unit 701 is configured to:
acquiring image characteristic information of a target coding unit;
carrying out Hash processing on the image characteristic information of the target coding unit to obtain an image Hash value of the target coding unit;
acquiring the singly linked list corresponding to the image hash value of the target coding unit, wherein the singly linked list is created according to the reference blocks in the coded region, and the image hash values corresponding to the reference blocks contained in one singly linked list are the same;

sequentially determining the image similarity between each reference block in the singly linked list and the target coding unit until it is determined that a reference block whose image similarity meets the preset matching condition exists;
and determining the determined reference block as a target reference block matched with the target coding unit.
Preferably, the matching unit 701 is configured to:
acquiring a pixel value of each pixel point in a target coding unit;
determining the sum of pixel values of a target coding unit;
determining a horizontal pixel gradient average value of the target coding unit in the horizontal direction and a vertical pixel gradient average value in the vertical direction according to each pixel value;
and obtaining the image characteristic information of the target coding unit based on the sum of the pixel values, the horizontal pixel gradient average value and the vertical pixel gradient average value.
Preferably, the matching unit 701 is configured to:
acquiring pixel information of a target coding unit;
respectively executing the following steps for each reference block in the singly linked list until it is determined that a reference block whose image similarity meets the preset matching condition exists:
acquiring pixel information of a reference block, determining image similarity between a target coding unit and the reference block according to the pixel information of the target coding unit and the pixel information of the reference block, and determining the reference block as the reference block meeting a preset matching condition if the image similarity is not lower than a first similarity threshold.
Preferably, the matching unit 701 is further configured to:
and when no reference block whose image similarity is not lower than the first similarity threshold exists, determining the reference block meeting the preset matching condition according to the acquired position distance between the target coding unit and each reference block and the corresponding image similarity.
According to the screen content encoding method, device, equipment and medium provided by the embodiments of the application, a target reference block matched with the target coding unit to be coded is screened out from all the reference blocks in the encoded region of the screen content image, and screen content encoding is performed based on the target coding unit and the corresponding target reference block. Therefore, there is no need to judge whether a reference block is located in the effective search area of the target coding unit, which reduces the calculation cost; the effective search area of the target coding unit is expanded, which increases the number of reference blocks and improves screen content coding quality; and, further, if the target reference block is located in what would conventionally be the ineffective search area of the target coding unit, the search path length to the target reference block can be reduced, lowering the computational complexity.
Fig. 8 shows a schematic structural diagram of a control device 8000. Referring to fig. 8, the control device 8000 includes: a processor 8010, a memory 8020, a power supply 8030, a display unit 8040, and an input unit 8050.
The processor 8010 is the control center of the control device 8000. It connects the various components using various interfaces and lines, and performs the various functions of the control device 8000 by running or executing the software programs and/or data stored in the memory 8020, thereby monitoring the control device 8000 as a whole.
In the embodiment of the present application, the processor 8010 executes the method of screen content encoding provided in the embodiment shown in fig. 3 when calling the computer program stored in the memory 8020.
Alternatively, the processor 8010 may comprise one or more processing units; preferably, the processor 8010 may integrate an application processor, which mainly handles the operating system, user interface, applications, and the like, and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 8010. In some embodiments, the processor and the memory may be implemented on a single chip, or, in some embodiments, they may be implemented separately on independent chips.
The memory 8020 may mainly include a program storage area and a data storage area, in which an operating system, various applications, and the like may be stored; the data storage area may store data created according to the use of the control device 8000, and the like. Further, the memory 8020 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The control device 8000 may also include a power supply 8030 (e.g., a battery) to provide power to the various components, which may be logically coupled to the processor 8010 via a power management system, so as to manage charging, discharging, and power consumption via the power management system.
The display unit 8040 may be used to display information input by a user or information provided to the user, as well as various menus of the control device 8000, and in the embodiment of the present application is mainly used to display the display interface of each application in the control device 8000 and objects such as texts and pictures shown in the display interface. The display unit 8040 may include a display panel 8041. The display panel 8041 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The input unit 8050 can be used to receive information such as numbers or characters input by a user. The input unit 8050 may include a touch panel 8051 and other input devices 8052. Among other things, the touch panel 8051, also referred to as a touch screen, can collect touch operations by a user on or near the touch panel 8051 (e.g., operations by a user on or near the touch panel 8051 using any suitable object or accessory such as a finger, a stylus, etc.).
Specifically, the touch panel 8051 can detect a touch operation of a user, detect signals caused by the touch operation, convert the signals into touch point coordinates, send the touch point coordinates to the processor 8010, receive a command sent by the processor 8010, and execute the command. In addition, the touch panel 8051 can be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. Other input devices 8052 can include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, power on/off keys, etc.), a trackball, a mouse, a joystick, and the like.
Of course, the touch panel 8051 can cover the display panel 8041; when the touch panel 8051 detects a touch operation on or near it, the operation is transmitted to the processor 8010 to determine the type of the touch event, and the processor 8010 then provides a corresponding visual output on the display panel 8041 according to the type of the touch event. Although in Fig. 8 the touch panel 8051 and the display panel 8041 are shown as two separate components to implement the input and output functions of the control device 8000, in some embodiments the touch panel 8051 and the display panel 8041 may be integrated to implement the input and output functions of the control device 8000.
The control device 8000 may also include one or more sensors, such as pressure sensors, gravitational acceleration sensors, proximity light sensors, and the like. Of course, the control device 8000 may also include other components such as a camera, as required in a particular application, and these components are not shown in fig. 8 and will not be described in detail since they are not components of importance in the embodiments of the present application.
Those skilled in the art will appreciate that fig. 8 is merely an example of a control device and is not intended to be limiting and may include more or less components than those shown, or some components in combination, or different components.
Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the screen content encoding method in any of the above method embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the above technical solutions, in essence or in the part contributing to the related art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, or optical disk, and which includes several instructions for enabling a control device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for screen content encoding, comprising:
screening out a target reference block matched with a target coding unit to be coded in a screen content image from reference blocks in a coded area of the screen content image;
and carrying out screen content coding based on the target coding unit and the target reference block.
2. The method of claim 1, wherein screening out a target reference block in the screen content image that matches a target coding unit to be coded from among reference blocks in a coded region of the screen content image comprises:
acquiring image characteristic information of the target coding unit;
carrying out Hash processing on the image characteristic information of the target coding unit to obtain an image Hash value of the target coding unit;
acquiring the singly linked list corresponding to the image hash value of the target coding unit, wherein the singly linked list is created according to reference blocks in a coded region, and the image hash values corresponding to the reference blocks contained in one singly linked list are the same;

sequentially determining the image similarity between each reference block in the singly linked list and the target coding unit until it is determined that a reference block whose image similarity meets a preset matching condition exists;
and determining the determined reference block as a target reference block matched with the target coding unit.
3. The method of claim 2, wherein obtaining image feature information of the target coding unit comprises:
acquiring the pixel value of each pixel point in the target coding unit;
determining a sum of pixel values of the target coding unit;
determining a horizontal pixel gradient average value of the target coding unit in the horizontal direction and a vertical pixel gradient average value in the vertical direction according to each pixel value;
and obtaining the image characteristic information of the target coding unit based on the sum of the pixel values, the horizontal pixel gradient average value and the vertical pixel gradient average value.
4. The method of claim 2, wherein sequentially determining the image similarity between each reference block in the singly linked list and the target coding unit until determining that there is a reference block whose image similarity meets a preset matching condition comprises:
acquiring pixel information of the target coding unit;
respectively executing the following steps for each reference block in the singly linked list until it is determined that a reference block whose image similarity meets the preset matching condition exists:
acquiring pixel information of the reference block, determining image similarity between the target coding unit and the reference block according to the pixel information of the target coding unit and the pixel information of the reference block, and determining the reference block as the reference block meeting a preset matching condition if the image similarity is not lower than a first similarity threshold.
5. The method of claim 4, further comprising:
and when the reference block not lower than the first similarity threshold does not exist, determining the reference block meeting the preset matching condition according to the acquired position distance between the target coding unit and each reference block and the corresponding image similarity.
6. An apparatus for screen content encoding, comprising:
the matching unit is used for screening out a target reference block matched with a target coding unit to be coded in the screen content image from the reference blocks in the coded area of the screen content image;
and the processing unit is used for carrying out screen content coding on the basis of the target coding unit and the target reference block.
7. The apparatus of claim 6, wherein the matching unit is to:
acquiring image characteristic information of the target coding unit;
carrying out Hash processing on the image characteristic information of the target coding unit to obtain an image Hash value of the target coding unit;
acquiring the singly linked list corresponding to the image hash value of the target coding unit, wherein the singly linked list is created according to reference blocks in a coded region, and the image hash values corresponding to the reference blocks contained in one singly linked list are the same;

sequentially determining the image similarity between each reference block in the singly linked list and the target coding unit until it is determined that a reference block whose image similarity meets the preset matching condition exists;
and determining the determined reference block as a target reference block matched with the target coding unit.
8. The apparatus of claim 7, wherein the matching unit is to:
acquiring the pixel value of each pixel point in the target coding unit;
determining a sum of pixel values of the target coding unit;
determining a horizontal pixel gradient average value of the target coding unit in the horizontal direction and a vertical pixel gradient average value in the vertical direction according to each pixel value;
and obtaining the image characteristic information of the target coding unit based on the sum of the pixel values, the horizontal pixel gradient average value and the vertical pixel gradient average value.
9. A control apparatus, characterized by comprising:
at least one memory for storing program instructions;
at least one processor for calling program instructions stored in said memory and for executing the steps of the method according to any one of the preceding claims 1 to 5 in accordance with the program instructions obtained.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
CN202010454534.7A 2020-05-26 2020-05-26 Screen content coding method, device, equipment and medium Pending CN111669595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010454534.7A CN111669595A (en) 2020-05-26 2020-05-26 Screen content coding method, device, equipment and medium


Publications (1)

Publication Number Publication Date
CN111669595A true CN111669595A (en) 2020-09-15

Family

ID=72384759

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010454534.7A Pending CN111669595A (en) 2020-05-26 2020-05-26 Screen content coding method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111669595A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112804528A (en) * 2021-02-05 2021-05-14 北京字节跳动网络技术有限公司 Screen content processing method, device and equipment
CN115119046A (en) * 2022-06-02 2022-09-27 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system with reference to pixel set
CN115529460A (en) * 2021-10-29 2022-12-27 深圳小悠娱乐科技有限公司 Method for realizing dynamic mosaic based on content coding

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112804528A (en) * 2021-02-05 2021-05-14 北京字节跳动网络技术有限公司 Screen content processing method, device and equipment
WO2022166727A1 (en) * 2021-02-05 2022-08-11 北京字节跳动网络技术有限公司 Screen content processing method and apparatus, and device
CN115529460A (en) * 2021-10-29 2022-12-27 深圳小悠娱乐科技有限公司 Method for realizing dynamic mosaic based on content coding
CN115119046A (en) * 2022-06-02 2022-09-27 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system with reference to pixel set
CN115119046B (en) * 2022-06-02 2024-04-16 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system for reference pixel set


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination