WO2020237759A1 - System and method for effectively representing an image and encoding an image - Google Patents
System and method for effectively representing an image and encoding an image
- Publication number
- WO2020237759A1 (application PCT/CN2019/092656)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- boundary
- image
- segmentation
- approximation
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/463—Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
Definitions
- the present invention relates to a system and method for effectively representing an image and encoding an image, in particular to the processing of an image depth map.
- Intra coding methods play an important role in hybrid video coding schemes, especially in applications such as effect access, prediction reference, fault tolerance, bit rate control, and low-complexity coding.
- the intra-frame coding method only performs operations with respect to the information included in the current frame, and does not perform operations with respect to the information included in any other frames in the video sequence.
- the intra-frame coding compression algorithm in the prior art is usually based on spatial sample prediction first, and then coding based on discrete cosine transform (DCT).
- DCT discrete cosine transform
- For piecewise-smooth images, such as depth maps, these prior-art methods are not efficient.
- the traditional DCT-based inter-frame coding method needs to use a lot of bits to deal with the depth discontinuity problem in the depth map.
- DCT-based inter-frame coding methods usually produce artifacts in discontinuous areas and reduce coding quality.
- the prior art also proposes intra-frame wedgelet partition ("WP") and contour partition (“CP”) encoding methods of depth maps.
- WP intra-frame wedgelet partition
- CP contour partition
- the basic principle of these methods is to divide the image into regions of interest called "blocks". These blocks are usually 8×8, 16×16, 32×32, or 64×64 pixels each. Then, by expressing the discontinuity as a segment and approximating the pixel value as a constant, for example, using the average value of all pixels belonging to the same region, these blocks are further divided into smooth regions.
- the WP method divides the block into two regions, and the boundary can only be represented by a straight line through the calculation of exhaustive search. Due to the limitation of modeling the boundary as a straight line and the high computational complexity of exhaustive search, the WP method is usually limited to a few images.
- the CP method does not represent the boundary by a straight line, but divides the block into two regions by comparing the pixel value with a certain threshold, and the threshold is usually the average value of all pixels in the block.
- Pixels with a value larger than the threshold are classified into area 1, and pixels with a value smaller than the threshold are classified into area 2, and other compression techniques are used to compress the boundary of the area.
- the CP method will generate many areas and boundaries, reducing the efficiency of the algorithm.
- the present invention provides a new system and method for effectively representing images and encoding images, and particularly relates to the processing of image depth maps.
- the present invention provides a method and system for effectively representing and encoding an image, particularly a method and system for parameter block separation of the depth map of an image, including the following steps and modules: before segmenting the image, initialize the blocks of the image; partition or segment the depth blocks; perform boundary and block approximation on the partitioned or segmented depth blocks; repeat the steps of partitioning or segmenting the depth blocks and performing boundary and block approximation on them until the maximum number of segmentation levels is reached; compress the image after the boundary and block approximation to obtain the compressed image.
- the parameter block separation step and module further include: providing an input representation of the image block; performing parameter separation on the image block through a parameter block separator or blocker; and providing an output representation of the image block, where the output representation is a tree structure.
- the input representation for providing the image block includes the image block and some control parameters; and the control parameter is the maximum level or the retrieval radius.
- the tree structure includes the reconstruction residual and the basic parameters of the reconstruction block; the parameters in the output representation are separation mode, key point list, residual part or segmentation index.
- the present invention does not approximate the pixel value as a constant of the entire partition, but uses a parameter model (such as a polynomial) to model the pixel. This provides greater flexibility in the required compression quality for modeling depending on different types of image blocks.
- the present invention does not simply divide the block into two partitions and model the partition boundary as a straight line; instead, the proposed method can divide the block into multiple partitions and model the partition boundary as a piecewise polynomial, such as a linear spline.
- Fig. 1 shows a schematic diagram of the parameter block separation step of an image according to the present invention.
- Figure 2 is a schematic diagram of segmentation steps according to the present invention.
- Figure 3-1 is a schematic diagram of the boundary and block approximation steps after the first level of block in Figure 2.
- Figure 3-2 is a schematic diagram of the boundary and block approximation steps of the second-level block in Figure 2 after being divided into blocks.
- Figure 4-1 is a flowchart of the method for the boundary and block approximation steps of the invention.
- Figure 4-2 shows a process diagram of the segmented image approximation using the boundary and block approximation steps of the present invention.
- Fig. 5 is a schematic diagram of possible partitions obtained by the boundary and block approximation steps of the linear spline model proposed by the present invention.
- Fig. 6 is a coordinate index (C2I) table used for searching in the boundary and block approximation step of the present invention.
- FIG. 7 is the interior point index to boundary point index (IPI2BPI) table used for searching in the boundary and block approximation step of the present invention.
- IPI2BPI interior point index to boundary point index
- Fig. 8 is a table of boundary pixel to segmentation image (BP2SI) used for searching in the boundary and block approximation step of the present invention.
- BP2SI boundary pixel to segmentation image
- Fig. 9 schematically shows a block diagram of a server for executing the method according to the present invention.
- Fig. 10 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present invention.
- Table 1 compares the memory usage of different lookup tables.
- the image depth map processing method proposed by the present invention, in particular the method for parameter block separation of the depth map of an image, includes the following steps: before segmenting the image, initialize the blocks of the image; partition or segment the depth blocks; perform boundary and block approximation on the partitioned or segmented depth blocks; compress the image after the boundary and block approximation to obtain the compressed image.
- Fig. 1 shows a schematic diagram of the parameter block separation step of an image according to the present invention.
- double data rate expresses the depth map as a series of discontinuities and obtains the smoothest part of the depth map area from it.
- DDR double data rate
- an image is divided into blocks of different sizes, and the blocks are divided into smooth blocks or blocks containing large-depth discontinuities (discontinuous blocks).
- Pixels can contain information such as the intensity of the color signal in the RGB format, and the brightness and chrominance of the YUV format represented separately by the luminance parameter and the chrominance parameter.
- Pixels can undergo image deformation; disparity or depth values can be used to represent the degree of deformation associated with the pixel.
- d(x,y) is the attribute value of the pixel, which can be the color signal intensity, brightness, chroma, parallax, or depth value.
- the attribute values of different pixels can be embodied in the form of the following series:
- d(x, y) is usually an integer with no specific range limit, depending on the bit depth. For example, d(x,y) ∈ [0,255] for 8-bit depth; d(x,y) ∈ [0,1023] for 10-bit depth.
- the data size is bit_depth × M × N bits.
- Image coding aims to reduce the data size of the d(x,y) values, giving what is also known as the compressed data size, without causing a significant reduction in visual quality.
- Block coding is a widely used image coding technique in which the image I is divided into multiple square regions called blocks.
- an encoding technique that optimizes compression performance can be used to compress/encode the attribute value d(x, y) in each rectangular region/block.
- Encoding refers to the process of finding alternative representations of pixel attribute values, which are usually more compact in data size.
- Two standards are usually used to measure the coding performance of a block: 1) reconstruction error, 2) compression ratio (CR).
- the reconstruction error of block B_k is as follows:
- the compression ratio (CR) of block B_k is as follows:
- Fig. 1 is a frame diagram of the parameter block separation step according to the present invention.
- the input representation of the block is converted into a more compact representation.
- an input representation of the image block is provided; the representation includes the image block and some control parameters. As described above, the image block is, for example, an 8×8, 16×16, 32×32, or 64×64 block; the control parameters include but are not limited to the maximum level, the search radius, etc.
- parameter block separation is performed on the image block by a parameter block separator or a block divider.
- an output representation of the image block is provided, where the output representation is a tree structure that contains the reconstruction residual and the basic parameters of the reconstruction block; the parameters in the output representation include, but are not limited to, the separation mode, key point list, residual part, segmentation index, etc.
- the present invention relates to compressing/encoding a block B_k of pixel attribute values, which is also called block coding.
- the pixel attributes can be color intensity, brightness and chroma intensity, deformation, etc.
- the proposed parameter block separation can be written as: f: {B_k, L, R} → G_k (6)
- the output is the tree structure G_k.
- for notational convenience, the subscript k is dropped in the following sections, which all refer to the k-th block.
- the output structure G can be written as: G = (V, E) (7)
- V = {v_{1,1}, v_{2,1}, v_{2,2}, …, v_{l,n}} is the node set of the tree.
- v_{l,n} is the n-th node of the l-th level of the tree.
- the n-th node of the l-th level can be represented as node (n, l).
- Each node contains the following information:
- μ_{n,l} ∈ {0, 1, 2} is the separation mode. If no division is performed, μ_{n,l} is 0. Otherwise, μ_{n,l} is 1 or 2, depending on the mode used to approximate the boundary, which will be described in detail below.
- n_p is the node number of the parent node at level l−1.
- n_{c,1} and n_{c,2} are the node numbers of the two child nodes when the separation mode μ_{n,l} is 1 or 2. When μ_{n,l} is 0, the two items n_{c,1} and n_{c,2} are omitted.
- c is the boundary condition number, which will be described in detail below.
- Ω_{n,l} is the list of key points, which will be described in detail below.
- β_{n,l} holds the estimated parameters of the parametric model used for reconstruction.
- the relationship between each reconstructed element and β_{n,l} can be expressed as the following model:
- g is a mapping function.
- g is taken as the following polynomial:
- β_{n,l} = [β_{n,l,0,0}, β_{n,l,0,1}, β_{n,l,1,0}, β_{n,l,2,0}, β_{n,l,0,2}, β_{n,l,1,1}, …]^T.
- the parameter block separation method of the present invention can be mainly divided into the following steps:
- Segmentation step: a criterion is used to determine whether a node should be divided into two representative segments/partitions. These segments/partitions are called child nodes, and if segmentation is performed on a node, the n-th node of the l-th level will contain two child nodes at level l+1. While the two nodes are created, a counter variable is incremented and the first child node is assigned a node number; the counter variable is then incremented again and the second child node is assigned a node number.
- Boundary and block approximation step: after the segments are obtained, the boundary between the segments of node (n, l) is approximated by one straight line or by two straight lines; this choice is called the separation mode μ_{n,l}.
- FIG 2 is a schematic diagram of the segmentation steps of the present invention.
- the obtained attribute values of the encoded pixels are displayed as a hierarchical tree.
- the image block 200 is a block to be segmented.
- the image block 201 is the first-level block; the image block 202 and the image block 203 are the second-level blocks after segmentation.
- Those of ordinary skill in the art can recursively segment the image block according to the first and second levels of segmentation and the above-mentioned method disclosed in the present invention until reaching the L level of segmentation.
- an image block is divided into two representative segments.
- This step can be done through different image segmentation methods.
- image segmentation methods usually adopt an iterative approach and may be time-consuming. Because the method only deals with small blocks of at most 64×64 pixels, a block of this size contains less than 1% of the total pixels of a video frame or image at today's video resolutions (such as 720p and 1080p). As a result, some simple structures can be used to model image changes. Therefore, a hierarchical threshold technique is used to separate blocks.
- This hierarchical segmentation step is shown in Figure 2. First, the user needs to specify the maximum number L of segmentation levels. For each level, the average value of the pixel attribute values of the parent node is used to perform the split.
- each pixel attribute value will be assigned a partition label
- the present invention proposes to adopt a novel boundary and block approximation method to model the class boundary through a polynomial (such as a linear function or a spline function).
- the partition boundaries and partition labels of pixels can be calculated offline in advance in the form of a lookup table, which enables the proposed algorithm to omit the calculation of group labels that require a lot of calculation time for a large number of images.
- a polynomial or a smooth piecewise function (such as a spline function) can be used to approximate the segmentation boundary so that the difference between the reconstructed block and the original block can be minimized.
- Figure 3-1 is a schematic diagram of the boundary and block approximation method after the first level of block in Figure 2.
- Figure 3-2 is a schematic diagram of the boundary and block approximation method of the second-level block in Figure 2 after being divided into blocks. Both single-line approximation and double-line approximation can be applied to all levels of segmented images.
- for example, consider the boundary and block approximation after the first-level blocking in Fig. 3-1 and after the second-level blocking in Fig. 3-2. Fig. 3-1 shows the first-level block 301 obtained by linear approximation of the image block 300, and the first-level block 302 obtained by linear spline approximation, that is, approximation by two straight lines.
- Figure 3-2 shows the second-level block 306 obtained by linear approximation of the image block 303, and the second-level block 305 obtained by linear spline approximation, that is, approximation by two straight lines.
- the first-level block 302 or the second-level block 305 is composed of an ineffective area 309, a first effective area 307, and a second effective area 305.
- Fig. 3-2(2) also shows an image block 304 with an unconnected area. Because the simple threshold segmentation method cannot guarantee the connectivity of every segment, the node segmentation method of the present invention is not applicable to the image block 304 with non-connected regions.
- FIG. 4-1 is a flowchart of the boundary and block approximation steps of the present invention. If two end points are found accurately in the block, the proposed boundary and block approximation algorithm will be executed. Otherwise, the block will be ignored.
- in step 400, a segmented image is obtained; in step 401, the boundary between the segments is extracted and its end points are found; in step 402, it is judged whether there are two end points; in the case of two end points, the linear spline approximation step 403 is performed; in the case of one end point, the linear approximation step 404 is performed; after the linear approximation step 404, in step 405, the straight line is converted into segments and compared with the original segments; after the linear spline approximation step 403, in step 406, the two straight lines are converted into segments and compared with the original segments; finally, according to the results of steps 405 and 406, the minimum residual and the corresponding block reconstruction parameters are output.
- Figure 4-2 shows the approximate process diagram of the segmented image using the boundary and block approximation steps of the present invention.
- Figure 4-2 takes the segmented image 408 as an example to show the process of performing boundary and block approximation.
- the segmented image 408 is, for example, an 8×8 or 16×16 image block, but is not limited to these sizes.
- the segmented image 408 is linearly approximated to obtain a linear image block 409, where in the linear image block 409, the first end point 413 and the second end point 415 are the end points obtained within the end point search range 417, respectively.
- the first end point 413 and the second end point 415 are linearly connected (as shown in FIG.
- a linear spline approximation is performed on the segmented image 408 to obtain a spline image block 410, wherein the spline image block 410 further includes a turning point 419 between the first end point 414 and the second end point 416. The turning point 419 is located in the potential turning area 418.
- the first end point 413, the second end point 415 and the turning point 419 are respectively linearly connected (as shown in Figure 4-2), and the current approximated block is converted into a segmentation to obtain a segmented image 412 after the boundary and block approximation.
- a set of boundary pixels (BP) is extracted from the search range specified by the whiskers ("whisker") in the red area ("RED") to generate a set of straight lines formed by all combinations of BP.
- BP boundary pixels
- RED red area
- a straight line is drawn to divide the block into two segments, as shown in the segmented image 411.
- the pixel attribute value d(x, y) of each segment is approximated by the average value, as specified in equation (11) above, and the signal-to-noise ratio ("SNR") of the entire block is calculated.
- SNR signal-to-noise ratio
- g(0,l+1) and g(1,l+1) are approximate values of groups 0 and 1 obtained from equation (11), and will be designated as child nodes of an existing node.
- z′(x,y) is the fine partition label of the location, determined by the following equation:
- (x a , y a ) and (x b , y b ) are a pair of optimal boundary pixels obtained by exhaustive search in a series of B boundary pixels.
- the pixel is designated as group 1.
- ⁇ b is the number of unmatched partition labels calculated for the b-th boundary pixel combination or the b-th straight line.
- a lookup table containing two possible cases of the refined label z'(x, y) can be pre-calculated. For example, in the segmented image 411, there are two possible segmentation scenarios for how to assign refined labels, that is, the upper segment in black and the lower segment in white are designated as 1 and 0, and vice versa.
- these pre-calculated z'(x, y) can be taken from the lookup table.
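The exhaustive search over boundary-pixel pairs described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the patent's implementation: the equation defining z′(x, y) is not reproduced in this text, so the side-of-line test (a cross product), the function names `boundary_pixels`, `line_labels` and `best_line`, and the handling of pixels lying exactly on the line are all illustrative choices.

```python
from itertools import combinations

def boundary_pixels(size):
    """All pixels on the periphery of a size x size block, in raster order."""
    edge = (0, size - 1)
    return [(x, y) for y in range(size) for x in range(size)
            if x in edge or y in edge]

def line_labels(size, a, b):
    """Label 1 for pixels strictly on one side of the line a-b, else 0 (assumed rule)."""
    (ax, ay), (bx, by) = a, b
    return [[1 if (bx - ax) * (y - ay) - (by - ay) * (x - ax) > 0 else 0
             for x in range(size)] for y in range(size)]

def best_line(segmented):
    """Return (mismatches, a, b) for the boundary-pixel pair whose straight line
    best reproduces the binary segmented image, trying both label polarities."""
    size = len(segmented)
    best = None
    for a, b in combinations(boundary_pixels(size), 2):
        labels = line_labels(size, a, b)
        mismatches = sum(labels[y][x] != segmented[y][x]
                         for y in range(size) for x in range(size))
        # Swapping the labels 0 <-> 1 is the second possible scenario.
        mismatches = min(mismatches, size * size - mismatches)
        if best is None or mismatches < best[0]:
            best = (mismatches, a, b)
    return best

# An 8x8 segmented image split along its main diagonal.
segmented = [[1 if y > x else 0 for x in range(8)] for y in range(8)]
print(best_line(segmented)[0])  # 0
```

In a lookup-table implementation, `line_labels` would be evaluated once offline for every pair and only the mismatch count would be computed at run time.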
- the linear model is first used to generate two initial guesses of BP. Then, the BP is fixed and an exhaustive search is performed to find the turning point/knot in the latent region, which is the expansion of the segmentation boundary, as shown in the spline image block 410 in Figure 4-2. The junction that gives the peak signal-to-noise ratio is selected, and its definition is similar to equations (14) to (16). Then, the block can be divided into two segments by linear splines, as shown in the segmented image 412 in Figure 4-2.
- the present invention proposes another extended implementation of the proposed method, which extrapolates the two lines to the corresponding terminal pixels at the block boundary. Therefore, four divided regions are obtained because there are two straight lines, and each of them is associated with two divided regions. Combining them gives four possible scenarios, as shown in Figure 5. Since the label is either 0 or 1 for the RED area or the WHITE area, there are a total of 8 possible combinations.
- the same lookup table can be used to approximate the boundaries and blocks of each sub-segment.
- the shape of the child segment may be different from the parent segment.
- a mask can be created to mark pixels belonging to sub-segments.
- for the dissimilarity measurement/peak SNR, only the pixels in the effective area are counted (see Figure 3-1 and Figure 3-2). This allows the same set of tables to be reused for all possible shapes and sizes of sub-segments.
- the shape of the segment can be represented by the segment boundary, which has been calculated and stored in the previous level.
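Restricting the quality measure to the effective area can be illustrated with a mask. The sketch below assumes the standard peak signal-to-noise ratio, 10·log10(peak² / MSE); the text's own SNR equations are not reproduced here, and the names `masked_psnr`, `mask` and `peak` are hypothetical.

```python
import math

def masked_psnr(original, reconstructed, mask, peak=255):
    """PSNR computed only over pixels whose mask entry is truthy
    (the effective area of a sub-segment)."""
    errors = [(o - r) ** 2
              for row_o, row_r, row_m in zip(original, reconstructed, mask)
              for o, r, m in zip(row_o, row_r, row_m) if m]
    if not errors:
        raise ValueError("mask selects no pixels")
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

original = [[10, 20], [30, 40]]
reconstructed = [[10, 20], [30, 140]]
# The badly reconstructed pixel lies outside the effective area, so it is ignored.
print(masked_psnr(original, reconstructed, [[1, 1], [1, 0]]))  # inf
```

Because the mask, not the table, encodes the shape of the sub-segment, the same pre-computed tables can serve child segments of any shape.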
- the present invention can be implemented efficiently. As mentioned earlier, the number of boundary-pixel combinations is limited (for an 8×8 block, for example), so the partition boundaries and partition labels of the pixels can be pre-calculated and stored in lookup tables, which avoids repeatedly recalculating the same values. Examples of the lookup tables of the present invention are summarized as follows:
- C2I Coordinate index
- IP internal pixels
- An illustration of this table for an 8×8 block can be found in Figure 6.
- the pixels on the periphery are boundary points, and those in the interior are interior points.
- the index of boundary pixels (BP) ranges from 0 to 27, and that of internal pixels (IP) from 0 to 35.
- the coordinates of a pixel in a given block can be easily converted to an index through this table. The same concept can be used to generate tables for 16×16, 32×32, and 64×64 blocks.
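A coordinate-to-index table of this kind might be built as sketched below. The index ranges (0 to 27 for boundary pixels, 0 to 35 for internal pixels in an 8×8 block) follow the text, but the raster ordering used here is an assumption, since the ordering shown in Figure 6 is not available in this text; `build_c2i` is a hypothetical name.

```python
def build_c2i(size):
    """Map (x, y) -> ("BP", index) for periphery pixels and ("IP", index)
    for interior pixels, numbering each group in raster order (assumed)."""
    table = {}
    bp = ip = 0
    edge = (0, size - 1)
    for y in range(size):
        for x in range(size):
            if x in edge or y in edge:
                table[(x, y)] = ("BP", bp)
                bp += 1
            else:
                table[(x, y)] = ("IP", ip)
                ip += 1
    return table, bp, ip

table, n_bp, n_ip = build_c2i(8)
print(n_bp, n_ip)  # 28 36
```

Calling `build_c2i(16)`, `build_c2i(32)` or `build_c2i(64)` yields the corresponding tables for larger blocks.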
- IPI2BPI Boundary point index
- Boundary pixel to segmented image (BP2SI) table: as shown in Figure 8, given two BPs, the table returns the partition labels of the pixels separated by the straight line through the two points. The entire set of partition labels is called the segmented image (SI). For a block size of 8×8, the table has 28×28 entries, each containing a binary segmented image of size 8×8.
- the method of the present invention further performs block reconstruction.
- block B k can be reconstructed as follows:
- the method for rebuilding the path mentioned in the recursive step (2) is as follows:
- Table 1 is a summary of the sizes of the different lookup tables mentioned above. Using the existing data types in the C++ language, approximately 66.00MB of memory may be required. However, these tables have symmetry properties that can be exploited to reduce memory consumption. Both IPI2BPI(b) and BP2SI are symmetrical, so only the upper triangular part needs to be stored. In this way, memory consumption can be reduced to 32.50MB.
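Storing only the upper triangular part of a symmetric table amounts to mapping an index pair (i, j) to a position in a packed one-dimensional array. The following sketch shows one common row-major packing; it is an illustration of the idea, not the layout used in the patent, and it does not attempt to reproduce the memory figures of Table 1. The names `tri_size` and `tri_index` are hypothetical.

```python
def tri_size(n):
    """Number of entries in the upper triangle (diagonal included) of an n x n table."""
    return n * (n + 1) // 2

def tri_index(n, i, j):
    """Position of entry (i, j) in a row-major packed upper triangle.
    Symmetry means (i, j) and (j, i) share the same slot."""
    if i > j:
        i, j = j, i
    return i * n - i * (i - 1) // 2 + (j - i)

n = 28  # number of boundary pixels of an 8x8 block
print(tri_size(n), n * n)  # 406 784
```

For the 28 boundary pixels of an 8×8 block, 406 stored entries replace 784, which is roughly the halving the text describes.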
- the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by their combination.
- a microprocessor or a digital signal processor (DSP) can be used in practice to implement the method for improving the video resolution and quality and the video encoder and the decoder of the display terminal according to the embodiments of the present invention.
- DSP digital signal processor
- the present invention can also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
- Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.
- FIG. 9 shows a server, such as an application server, that can implement the present invention.
- the server traditionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium.
- the memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
- the memory 1020 has a storage space 1030 of program code 1031 for executing any method steps in the above-mentioned method.
- the storage space 1030 for program codes may include various program codes 1031 respectively used to implement various steps in the above method. These program codes can be read from or written into one or more computer program products.
- Such computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
- Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 10.
- the storage unit may have storage segments, storage spaces, etc. arranged similarly to the storage 1020 in the server of FIG. 9.
- the program code can be compressed in an appropriate form, for example.
- the storage unit includes computer readable code 1031', that is, code that can be read by, for example, a processor such as 1010, which, when run by a server, causes the server to perform the steps in the method described above.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Analysis (AREA)
Abstract
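The present invention relates to a system and method for effectively representing an image and encoding an image, and in particular to the processing of image depth maps. Before the image is segmented, the blocks of the image are initialized; the depth blocks are partitioned or segmented; boundary and block approximation is performed on the partitioned or segmented depth blocks; the steps of partitioning or segmenting the depth blocks and performing boundary and block approximation on them are repeated until the maximum number of segmentation levels is reached; and the image after boundary and block approximation is compressed to obtain the compressed image. Rather than approximating the pixel values as a constant over an entire partition, the invention models the pixels with a parametric model (for example, a polynomial). This provides greater flexibility in the compression quality achievable when modeling different types of image blocks. It allows the proposed method to approximate complex structures more accurately and thus achieve better compression quality. To further improve the compression ratio, hierarchical clustering is applied for partitioning. This gives the partitions a more compact representation and therefore reduces the compressed data size.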
Description
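The present invention relates to a system and method for effectively representing an image and encoding an image, and in particular to the processing of image depth maps.
As the requirements for image transmission continue to rise, effective compression methods for images and video are needed for their efficient storage and transmission. Intra coding methods play an important role in hybrid video coding schemes, especially in applications such as effect access, prediction reference, fault tolerance, bit rate control, and low-complexity coding. An intra coding method operates only on the information contained in the current frame, and not on information contained in any other frame of the video sequence. Prior-art intra coding compression algorithms are usually based first on spatial sample prediction, followed by coding based on the discrete cosine transform (DCT).
For piecewise-smooth images such as depth maps, these prior-art methods are not efficient. Conventional DCT-based inter-frame coding methods need a considerable number of bits to handle the depth discontinuities in a depth map. At high compression ratios, DCT-based inter-frame coding methods usually produce artifacts in discontinuous regions and reduce coding quality.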
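The prior art also proposes intra-frame wedgelet partition ("WP") and contour partition ("CP") coding methods for depth maps. The basic principle of these methods is to divide the image into regions of interest called "blocks". These blocks are usually 8×8, 16×16, 32×32, or 64×64 pixels each. Then, by representing discontinuities as segments and approximating the pixel values as constants, for example the average of all pixels belonging to the same region, these blocks are further divided into smooth regions.
The drawback of both methods is that pixel values are replaced by a constant within each region, which physically replaces all pixels in a region with the same color intensity or depth, so that all detail in the image is lost. In addition, the WP method divides a block into two regions whose boundary can only be represented by a straight line found through exhaustive search. Because of the restriction of modeling the boundary as a straight line and the high computational complexity of exhaustive search, the WP method is usually limited to a few kinds of images. The CP method does not represent the boundary with a straight line; instead, it divides a block into two regions by comparing pixel values with a threshold, usually the average of all pixels in the block. Pixels with values larger than the threshold are assigned to region 1 and pixels with values smaller than the threshold to region 2, and other compression techniques are used to compress the region boundary. However, if values near the threshold fluctuate slightly, the CP method produces many regions and boundaries, which reduces the efficiency of the algorithm.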
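Summary of the Invention
The present invention provides a new system and method for effectively representing an image and encoding an image, and particularly relates to the processing of image depth maps. The present invention provides a method and system for effectively representing and encoding an image, in particular a method and system for parameter block separation of the depth map of an image, including the following steps and modules: before the image is segmented, the blocks of the image are initialized; the depth blocks are partitioned or segmented; boundary and block approximation is performed on the partitioned or segmented depth blocks; the steps of partitioning or segmenting the depth blocks and performing boundary and block approximation on them are repeated until the maximum number of segmentation levels is reached; and the image after boundary and block approximation is compressed to obtain the compressed image. In the method and system according to one aspect of the invention, the parameter block separation step and module further include: providing an input representation of the image block; performing parameter separation on the image block through a parameter block separator or blocker; and providing an output representation of the image block, where the output representation is a tree structure. In the method and system according to another aspect of the invention, the input representation of the image block includes the image block and some control parameters, and the control parameters are the maximum level or the search radius. The tree structure contains the reconstruction residual and the basic parameters of the reconstruction block; the parameters in the output representation are the separation mode, key point list, residual part, or segmentation index.
Rather than approximating the pixel values as a constant over an entire partition, the present invention models the pixels with a parametric model (for example, a polynomial). This provides greater flexibility in the compression quality achievable when modeling different types of image blocks. Rather than simply dividing a block into two partitions and modeling the partition boundary as a straight line, the proposed method can divide a block into multiple partitions and model the partition boundary as a piecewise polynomial, such as a linear spline.
This allows the proposed method to approximate complex structures more accurately and thus achieve better compression quality. To further improve the compression ratio, hierarchical clustering is applied for partitioning. This gives the partitions a more compact representation and therefore reduces the compressed data size.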
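To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some examples of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the parameter block separation step applied to an image according to the present invention.
Fig. 2 is a schematic diagram of the segmentation step according to the present invention.
Fig. 3-1 is a schematic diagram of the boundary and block approximation step after the first-level blocking in Fig. 2.
Fig. 3-2 is a schematic diagram of the boundary and block approximation step after the second-level blocking in Fig. 2.
Fig. 4-1 is a flowchart of the boundary and block approximation step of the present invention.
Fig. 4-2 shows the approximation process for a segmented image using the boundary and block approximation step of the present invention.
Fig. 5 is a schematic diagram of the possible partition cases obtained by the boundary and block approximation step with the linear spline model proposed by the present invention.
Fig. 6 is the coordinate-to-index (C2I) table used for lookup in the boundary and block approximation step of the present invention.
Fig. 7 is the interior point index to boundary point index (IPI2BPI) table used for lookup in the boundary and block approximation step of the present invention.
Fig. 8 is the boundary pixel to segmented image (BP2SI) table used for lookup in the boundary and block approximation step of the present invention.
Fig. 9 schematically shows a block diagram of a server for executing the method according to the present invention; and
Fig. 10 schematically shows a storage unit for holding or carrying program code implementing the method according to the present invention.
Table 1 compares the memory usage of the different lookup tables.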
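What follows is what is currently considered to be the preferred embodiment or best representative example of the claimed invention. Future and present representations of, or modifications to, the embodiments and preferred embodiments have been carefully considered; any change or modification that makes a substantial change in function, purpose, structure, or result is intended to be covered by the claims of this patent. Preferred embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings.
The image depth map processing method proposed by the present invention, in particular the method for parameter block separation of the depth map of an image, includes the following steps: before the image is segmented, the blocks of the image are initialized; the depth blocks are partitioned or segmented; boundary and block approximation is performed on the partitioned or segmented depth blocks; and the image after boundary and block approximation is compressed to obtain the compressed image.
Fig. 1 is a schematic diagram of the parameter block separation step applied to an image according to the present invention. In view of the discontinuous and smooth characteristics of image depth maps, the double data rate ("DDR") represents the depth map as a series of discontinuities and obtains from it the smoothest parts of the depth map regions. To obtain the DDR for block-based coding, the image is divided into blocks of different sizes, and the blocks are classified as smooth blocks or blocks containing large depth discontinuities (discontinuous blocks).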
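Taking an image I (not shown in Fig. 1) as an example, the pixels i(x,y) of image I have a resolution of M×N, where M is the number of pixel columns (width) and N is the number of pixel rows (height); x = 1, 2, …, M and y = 1, 2, …, N are the x and y coordinates of the pixels. A pixel can contain information such as the color signal intensity in RGB format, or the luminance and chrominance in YUV format, in which the luminance and chrominance parameters are represented separately. For stereoscopic images, pixels can undergo image deformation; disparity or depth values can be used to represent the degree of deformation associated with the pixel. d(x,y) is the attribute value of the pixel, which can be the color signal intensity, luminance, chrominance, disparity, or depth value. The attribute values of the different pixels can be arranged as follows:
D = [d_1, d_2, …, d_M], where d_x = [d(x,1), d(x,2), …, d(x,N)]^T   (1)
where d(x,y) is usually an integer with no specific range limit, depending on the bit depth. For example, d(x,y) ∈ [0,255] for 8-bit depth and d(x,y) ∈ [0,1023] for 10-bit depth. To represent the d(x,y) values of all M×N pixels, the data size for attribute d(x,y) is bit_depth × M × N bits. Image coding aims to reduce the data size of the d(x,y) values, giving what is also known as the compressed data size, without causing a significant reduction in visual quality. Block coding is a widely used image coding technique in which the image I is divided into multiple square regions called blocks.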
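Here, x_k and y_k give the starting pixel position of block k, k = 1, 2, …, K, and S is the block size, usually chosen from S = 8, 16, 32, or 64. These choices yield blocks of 8×8, 16×16, 32×32, and 64×64 pixels respectively. An encoding technique that optimizes compression performance can then be used to compress/encode the attribute values d(x,y) in each rectangular region/block. Encoding refers to the process of finding an alternative representation of the pixel attribute values that is usually more compact in data size. Two criteria are usually used to measure the coding performance of a block: 1) the reconstruction error and 2) the compression ratio (CR). The reconstruction error of block B_k is as follows:
In general, other error measures can also be applied to the method proposed by the present invention. The compression ratio (CR) of block B_k is as follows: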
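Assuming 8 bits per pixel for attribute D, the data rate of D for 1080p video at 30 frames per second is: data rate = (bits per pixel × frames per second × M × N)/10^6 = (8 × 30 × 1920 × 1080)/10^6 ≈ 497 Mbps. Moreover, since a pixel usually contains multiple attributes, the data rate is even higher. For many applications, such a high data rate is undesirable. To reduce the data rate, we propose a parameter block separation method for block coding.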
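The data-rate arithmetic above can be checked with a few lines of Python. The function name `raw_data_rate_mbps` is hypothetical; the formula is the one stated in the text.

```python
def raw_data_rate_mbps(bits_per_pixel, frames_per_second, width, height):
    """Uncompressed data rate of one pixel attribute, in units of 10^6 bits per second."""
    return bits_per_pixel * frames_per_second * width * height / 1e6

rate = raw_data_rate_mbps(8, 30, 1920, 1080)
print(f"{rate:.3f} Mbps")  # 497.664 Mbps
```

The text's figure of 497 Mbps is this value with the fractional part dropped.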
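Fig. 1 is a framework diagram of the parameter block separation step according to the present invention. According to this method, the input representation of a block is converted into a more compact representation. In step 101, an input representation of the image block is provided; the representation includes the image block and some control parameters. As described above, the image block is, for example, an 8×8, 16×16, 32×32, or 64×64 block; the control parameters include but are not limited to the maximum level, the search radius, etc. In step 102, parameter block separation is performed on the image block by a parameter block separator or blocker. In step 103, an output representation of the image block is provided, where the output representation is a tree structure containing the reconstruction residual and the basic parameters of the reconstruction block; the parameters in the output representation include but are not limited to the separation mode, key point list, residual part, segmentation index, etc.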
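More specifically, consider the coding of block B_k. The present invention relates to compressing/encoding a block B_k of pixel attribute values, also called block coding, where the pixel attributes can be color intensity, luminance and chrominance intensity, deformation, etc. The proposed parameter block separation can be written as:
f: {B_k, L, R} → G_k   (6)
where {B_k, L, R} is the input representation, L is the maximum number of segmentation levels, and R is the search radius. The output is the tree structure G_k. For notational convenience, the subscript k is dropped in the following sections, which all refer to the k-th block. The output structure G can be written as: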
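G = (V, E),   (7)
where E is the edge set of the tree, V = {v_{1,1}, v_{2,1}, v_{2,2}, …, v_{l,n}} is the node set of the tree, and v_{l,n} is the n-th node of the l-th level of the tree. For notational convenience, the n-th node of the l-th level can be written as node (n, l). Each node contains the following information: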
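where $\mu_{n,l} \in \{0, 1, 2\}$ is the splitting mode. If no split is performed, $\mu_{n,l} = 0$. Otherwise $\mu_{n,l}$ is 1 or 2, depending on the model used to approximate the boundary, as described in detail below.
$n_p$ is the node number of the parent node at level l−1. $n_{c,1}$ and $n_{c,2}$ are the node numbers of the two child nodes when the splitting mode $\mu_{n,l}$ is 1 or 2. When $\mu_{n,l} = 0$, the entries $n_{c,1}$ and $n_{c,2}$ are omitted.
c is the boundary case number, described in detail below.
$\Omega_{n,l}$ is the key point list, described in detail below.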
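Here g is a mapping function. In particular, g is taken to be the following polynomial:
where $\beta_{n,l} = [\beta_{n,l,0,0}, \beta_{n,l,0,1}, \beta_{n,l,1,0}, \beta_{n,l,2,0}, \beta_{n,l,0,2}, \beta_{n,l,1,1}, \ldots]^T$. In the present invention the polynomial orders are chosen as P = Q = 0, which reduces g to a 0th-order (constant) approximation:

$$g(\beta_{n,l}, x, y) = \beta_{n,l,0,0} = g(n,l), \quad \text{where } \beta_{n,l} = \beta_{n,l,0,0} \tag{11}$$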
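One possible in-memory layout for a tree node is sketched below in Python. The field names (`mode`, `parent`, `children`, `boundary_case`, `key_points`, `beta`) are hypothetical stand-ins for $\mu_{n,l}$, $n_p$, $(n_{c,1}, n_{c,2})$, c, $\Omega_{n,l}$, and $\beta_{n,l}$; the patent text does not prescribe a concrete data structure, so this is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    n: int                                        # node number within its level
    l: int                                        # level in the tree
    mode: int = 0                                 # mu: 0 = no split, 1 = line, 2 = linear spline
    parent: Optional[int] = None                  # n_p, node number of the parent at level l-1
    children: Optional[Tuple[int, int]] = None    # (n_c1, n_c2), omitted when mode == 0
    boundary_case: Optional[int] = None           # c
    key_points: List[int] = field(default_factory=list)  # Omega, indices of line endpoints/knot
    beta: Optional[float] = None                  # 0th-order reconstruction parameter

    def __post_init__(self) -> None:
        if self.mode not in (0, 1, 2):
            raise ValueError("mode must be 0, 1 or 2")
        if self.mode == 0 and self.children is not None:
            raise ValueError("an unsplit node carries no child node numbers")


def g(node: Node, x: int, y: int) -> Optional[float]:
    """0th-order approximation: every pixel of the node takes the constant beta."""
    return node.beta


leaf = Node(n=1, l=2, mode=0, parent=1, beta=42.0)
```

Because P = Q = 0, `g` ignores the pixel coordinates, which mirrors equation (11).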
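The parametric block splitting method of the present invention can be divided into the following main steps:
2. Segmentation step: a decision criterion determines whether a node is to be split into two representative segments/partitions. These segments/partitions are called child nodes; if segmentation is performed on a node, the n-th node at level l has two child nodes at level l+1. While the two child nodes are created, a counter variable is incremented and the first child node is assigned a node number; the counter variable is then incremented again and the second child node is assigned a node number.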
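3. Boundary and block approximation step: once the segments are obtained, the boundary between the segments of node (n,l) is approximated by one straight line or by two straight lines; the choice is recorded as the splitting mode $\mu_{n,l}$. The endpoints of the lines form the key point list $\Omega_{n,l}$, which is stored in the output representation together with the segment index $\kappa_{n,l}$ of the two representative segments, and $\beta_{n,l}$ is obtained by parameter estimation. If the splitting mode $\mu_{n,l} = 0$, $\beta_{n,l}$ is stored as the reconstruction parameter of node (n,l).
4. Recursion step: steps 2 and 3 are repeated until the maximum number of segmentation levels L is reached.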
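Fig. 2 is a schematic diagram of the segmentation step of the present invention. In Fig. 2, the encoded pixel attribute values obtained are shown as a hierarchical tree. Image block 200 is the block to be segmented. Image block 201 is the level-1 block; image blocks 202 and 203 are the level-2 blocks after segmentation. Following the level-1 and level-2 segmentation and the method disclosed above, a person of ordinary skill in the art can segment the image block recursively until level L is reached.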
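Initially, an image block is divided into two representative segments. This step can be carried out with various image segmentation methods. However, image segmentation methods are usually iterative and can be time-consuming. Because only small blocks of at most 64×64 are considered, a block of this size contains less than 1% of the pixels of a video frame or image at today's video resolutions (such as 720p and 1080p). As a result, simple structures suffice to model the variation within a block, and a hierarchical thresholding technique is therefore used to split the block. This hierarchical segmentation step is shown in Fig. 2. First, the user specifies the maximum number of segmentation levels L. At each level, the split is performed using the mean of the pixel attribute values of the parent node.
Next, each pixel attribute value is assigned a partition label.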
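The hierarchical thresholding described above can be sketched as a short recursion in Python with NumPy. The function names `split_by_mean` and `segment`, the strict "greater than the mean" rule, and the stopping test for constant segments are assumptions made for this sketch; the patent's own labeling equation is not reproduced in this text.

```python
import numpy as np


def split_by_mean(block: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Pixels of the masked (parent) segment whose value exceeds the segment mean."""
    mean = block[mask].mean()
    return (block > mean) & mask


def segment(block, mask, level, max_levels, leaves):
    """Recursively split `mask` until `max_levels` is reached; collect leaf masks."""
    upper = split_by_mean(block, mask)
    lower = mask & ~upper
    # Stop at the last level, or when the segment is constant and cannot be split.
    if level >= max_levels or not upper.any() or not lower.any():
        leaves.append(mask)
        return
    segment(block, upper, level + 1, max_levels, leaves)
    segment(block, lower, level + 1, max_levels, leaves)


# Toy 8x8 depth block with three depth plateaus (0, 100 and 200).
block = np.zeros((8, 8))
block[:, 4:] = 100
block[4:, 6:] = 200

leaves = []
segment(block, np.ones_like(block, dtype=bool), level=1, max_levels=3, leaves=leaves)
```

With L = 3 the toy block ends up with three leaf segments, one per plateau; the constant plateau at depth 0 is not split further because thresholding at its own mean leaves one side empty.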
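To further improve the compression ratio, the present invention proposes a novel boundary and block approximation method that models the boundaries between classes with polynomials (for example, linear functions or spline functions). This allows a boundary to be represented by a few knots. In addition, the partition boundaries and the partition labels of the pixels can be precomputed offline in the form of lookup tables, so that the proposed algorithm avoids computing the partition labels at run time, which would otherwise take a large amount of computation time for a large number of images.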
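The boundary and block approximation step is described below. Once the representative segments are obtained, the segmentation boundary can be approximated with a polynomial or a smooth piecewise function (for example, a spline) so that the difference between the reconstructed block and the original block is minimized. Taking computational cost and applicability into account, the proposed method approximates the segmentation boundary with one of the following models:
1) Linear model (mode $\mu_{n,l} = 1$): the segmentation boundary is approximated by a straight line.
2) Linear spline model with three knots (mode $\mu_{n,l} = 2$): the boundary is approximated by two straight lines. The endpoints of the lines and the point where the two lines meet (the circles in Fig. 3-1 and Fig. 3-2) are called knots. Mathematically, this is a spline, that is, a function defined piecewise by polynomials.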
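Fig. 3-1 is a schematic diagram of the boundary and block approximation method after the level-1 split in Fig. 2. Fig. 3-2 is a schematic diagram of the boundary and block approximation method after the level-2 split in Fig. 2. Both the single-line and the two-line approximation can be applied to segmented images at every level, for example to the level-1 split in Fig. 3-1 and the level-2 split in Fig. 3-2. Fig. 3-1 shows, for image block 300, the level-1 block 301 obtained by linear approximation and the level-1 block 302 obtained by linear spline (two-line) approximation. Fig. 3-2 shows, for image block 303, the level-2 block 306 obtained by linear approximation and the level-2 block 305 obtained by linear spline (two-line) approximation. The level-1 block 302 and the level-2 block 305 each consist of an invalid region 309, a first valid region 307, and a second valid region 305. Fig. 3-2(2) also shows an image block 304 with non-connected regions; because the simple thresholding method cannot guarantee that every segment is connected, the node segmentation method of the present invention does not apply to image block 304.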
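Fig. 4-1 is a flowchart of the boundary and block approximation step of the present invention. The proposed boundary and block approximation algorithm is executed only if exactly two endpoints are found in the block; otherwise the block is skipped. In step 400, the segmented image is obtained. In step 401, the boundary between the segments is extracted and its endpoints are found. In step 402, it is determined whether there are two endpoints; with two endpoints, the linear spline approximation step 403 is performed, and with one endpoint, the linear approximation step 404 is performed. After the linear approximation step 404, in step 405 the single straight line is converted into a segmentation and compared with the original segmentation. After the linear spline approximation step 403, in step 406 the two straight lines are converted into a segmentation and compared with the original segmentation. Finally, based on the results of steps 405 and 406, the minimum residual and the corresponding block reconstruction parameters are output.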
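Fig. 4-2 illustrates the approximation of a segmented image by the boundary and block approximation step of the present invention, taking segmented image 408 as an example. Segmented image 408 is, for example, an 8×8 or 16×16 image block, but is not limited to these sizes. Linear approximation of segmented image 408 yields linear image block 409, in which the first endpoint 413 and the second endpoint 415 are the endpoints found within the endpoint search range 417. The first endpoint 413 and the second endpoint 415 are joined by a straight line (as shown in Fig. 4-2), and the linearly approximated block is converted into a segmentation, giving the segmented image 411 after boundary and block approximation. Correspondingly, linear spline approximation of segmented image 408 yields spline image block 410, which additionally contains a turning point 419 between the first endpoint 414 and the second endpoint 416; the turning point 419 lies within the potential turning region 418. The first endpoint 414 and the second endpoint 416 are each joined to the turning point 419 by a straight line (as shown in Fig. 4-2), and the approximated block is converted into a segmentation, giving the segmented image 412 after boundary and block approximation.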
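Linear image block 409 in Fig. 4-2 uses the linear model, that is, mode $\mu_{n,l} = 1$. A set of boundary pixels (BPs) is extracted from the search range indicated by the whiskers in the red ("RED") region, and the set of straight lines formed by all combinations of BPs is generated. For each combination of BPs, a straight line is drawn to divide the block into two segments, as shown in segmented image 411. The pixel attribute values d(x,y) of each segment are then approximated by their mean, as specified in equation (11) above, and the signal-to-noise ratio ("SNR") of the whole block is computed. After the SNR of the segmentation produced by each line has been computed, a brute-force search locates the line giving the peak SNR, and that line is selected as the segmentation boundary. Finally, the algorithm produces an output containing the indices of the two BPs that give the best approximation of the original block, the label order, and the means of the two segments.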
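The brute-force search for the linear model can be sketched as follows. Several simplifications are assumed here and are not taken from the patent: every perimeter pixel is treated as a candidate BP (the search range around the detected endpoints is ignored), PSNR with a peak of 255 stands in for the SNR measure, and the helper names `boundary_pixels`, `line_labels`, `psnr`, and `best_line` are hypothetical.

```python
import itertools
import numpy as np


def boundary_pixels(s):
    """(x, y) coordinates on the perimeter of an s x s block, clockwise from the top-left."""
    top = [(x, 0) for x in range(s)]
    right = [(s - 1, y) for y in range(1, s)]
    bottom = [(x, s - 1) for x in range(s - 2, -1, -1)]
    left = [(0, y) for y in range(s - 2, 0, -1)]
    return top + right + bottom + left


def line_labels(p, q, s):
    """Binary label per pixel: True on one side of the line through p and q."""
    ys, xs = np.mgrid[0:s, 0:s]
    cross = (q[0] - p[0]) * (ys - p[1]) - (q[1] - p[1]) * (xs - p[0])
    return cross > 0


def psnr(block, approx, peak=255.0):
    mse = np.mean((block - approx) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)


def best_line(block):
    """Try every pair of boundary pixels; keep the line with the highest PSNR."""
    s = block.shape[0]
    best = None
    for p, q in itertools.combinations(boundary_pixels(s), 2):
        z = line_labels(p, q, s)
        if z.all() or not z.any():
            continue  # the line does not split the block into two segments
        m1, m0 = block[z].mean(), block[~z].mean()   # 0th-order fit per segment
        score = psnr(block, np.where(z, m1, m0))
        if best is None or score > best[0]:
            best = (score, p, q, m1, m0)
    return best


block = np.full((8, 8), 50.0)
block[:, 4:] = 200.0          # a vertical depth edge between columns 3 and 4
score, p, q, m1, m0 = best_line(block)
```

For this toy block a vertical line through two perimeter pixels reproduces the edge exactly, so the search ends with zero error and the two segment means 50 and 200.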
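Further, consider the B boundary pixels $\{(x_1, y_1), (x_2, y_2), \ldots, (x_B, y_B)\}$ of the n-th node at level l of the k-th block; the subscripts l and k are omitted for notational convenience. The segmentation boundary at level l is obtained by maximizing the SNR, as follows:
where $(x_a, y_a)$ and $(x_b, y_b)$ are the best pair of boundary pixels found by brute-force search over the set of B boundary pixels.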
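Both approximations are carried out by brute-force search, and computing the SNR for every candidate line requires a large amount of computation. To speed up the process further, the SNR in equation (14) is replaced by a dissimilarity measure of lower computational complexity, specifically:
where $\xi_b$ is the number of mismatched partition labels computed for the b-th combination of boundary pixels, that is, for the b-th line. To avoid the time-consuming comparison in equation (16), a lookup table containing the two possible cases of the refined labels z'(x,y) can be precomputed. For example, in segmented image 411 there are two possible ways to assign the refined labels: the upper segment in black and the lower segment in white are labeled 1 and 0 respectively, or vice versa. When the dissimilarity measure in equation (17) is computed, these precomputed z'(x,y) can be fetched from the lookup table.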
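A minimal sketch of a label-mismatch measure is given below. Since the equations themselves are not reproduced in this text, the exact form is an assumption: the sketch counts pixels whose refined label z' differs from the thresholding label z and takes the smaller count over the two label assignments. The function names and the returned case number are hypothetical.

```python
import numpy as np


def mismatch_count(z: np.ndarray, z_refined: np.ndarray) -> int:
    """Number of pixels whose refined label differs from the original label."""
    return int(np.count_nonzero(z != z_refined))


def dissimilarity(z: np.ndarray, z_line: np.ndarray):
    """Smallest mismatch over the two label assignments of a candidate line.

    Returns (mismatch, case), where case 0 keeps z_line as is and case 1 swaps 0 and 1.
    """
    xi = mismatch_count(z, z_line)
    xi_swapped = z.size - xi          # swapping the labels flips every comparison
    return (xi, 0) if xi <= xi_swapped else (xi_swapped, 1)


z = np.zeros((4, 4), dtype=bool)
z[:, 2:] = True                       # labels from thresholding
z_line = ~z                           # candidate line with the opposite label order
z_line[0, 0] = False                  # ... and one pixel on the wrong side
result = dissimilarity(z, z_line)
```

Only integer comparisons are needed, which is what makes this cheaper than recomputing segment means and an SNR for every candidate line.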
Linear spline image block 410 in Fig. 4-2 uses the linear spline model with three knots, that is, mode $\mu_{n,l} = 2$.
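For the approximation based on the linear spline model with three knots, the linear model is first used to generate an initial guess of the two BPs. The BPs are then fixed, and a brute-force search is performed to find the turning point/knot within the potential region, which is a dilation of the segmentation boundary, as shown in spline image block 410 in Fig. 4-2. The knot giving the peak SNR is selected; the SNR is defined analogously to equations (14) to (16). The block can then be divided into two segments by the linear spline, as shown in segmented image 412 in Fig. 4-2. However, because the lookup table only contains boundary pixel pairs associated with straight lines drawn directly between boundary pixels, the proposed dissimilarity measure and lookup table cannot be applied directly. The present invention therefore provides an extended embodiment in which the two lines are extrapolated to the corresponding end pixels on the block boundary. Four segmentation regions are thus obtained, because there are two lines and each of them is associated with two segmentation regions. Combining them gives four possible cases, as shown in Fig. 5. Since the label of the RED region or the WHITE region is either 0 or 1, there are 8 possible combinations in total.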
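Fig. 5 is a schematic diagram of the possible partition cases obtained by the boundary and block approximation step with the proposed linear spline model. If the region in RED is labeled z'(x,y) = 0, the region in WHITE is labeled z'(x,y) = 1, and vice versa. This results in 8 combinations, which are stored in the form of a lookup table called the mode-2 lookup table.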
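The count of eight cases can be illustrated with a small enumeration. The sketch below rests on one reading of the text that Fig. 5 would be needed to confirm: the RED region is taken to be the intersection of one side of each extrapolated line (four choices), and the label of RED is either 0 or 1 (two choices). All function names and the ordering of the cases are hypothetical.

```python
import itertools
import numpy as np


def half_plane(p, q, s):
    """True for pixels strictly on one side of the (extrapolated) line through p and q."""
    ys, xs = np.mgrid[0:s, 0:s]
    return (q[0] - p[0]) * (ys - p[1]) - (q[1] - p[1]) * (xs - p[0]) > 0


def mode2_cases(end_a, knot, end_b, s):
    """Enumerate 4 region choices x 2 label polarities = 8 candidate label maps."""
    h1 = half_plane(end_a, knot, s)
    h2 = half_plane(knot, end_b, s)
    cases = []
    for side1, side2, red_label in itertools.product((False, True), (False, True), (0, 1)):
        red = (h1 == side1) & (h2 == side2)          # RED region for this case
        cases.append(np.where(red, red_label, 1 - red_label))
    return cases


cases = mode2_cases(end_a=(0, 3), knot=(4, 4), end_b=(7, 2), s=8)
```

Consecutive pairs in `cases` differ only in polarity, so `cases[0]` and `cases[1]` are complements of each other.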
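After the two segments are obtained, the same lookup tables can be used to approximate the boundary and block of each sub-segment. However, the shape of a sub-segment may differ from that of its parent segment. To overcome this, a mask is created to mark the pixels that belong to the sub-segment, and only the pixels in the valid region are counted when the dissimilarity measure or peak SNR is computed (see Fig. 3-1 and Fig. 3-2). In this way the same set of tables can be reused for all possible shapes and sizes of sub-segments. The shape of a segment is given by the segment boundaries already computed and stored at the previous levels.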
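Restricting the computation to the valid region amounts to applying a boolean mask before counting or averaging. The sketch below assumes NumPy boolean arrays for the labels and the mask; the function names `masked_mismatch` and `masked_means` are hypothetical.

```python
import numpy as np


def masked_mismatch(z, z_line, valid):
    """Label mismatches counted only inside the valid (sub-segment) region."""
    return int(np.count_nonzero((z != z_line) & valid))


def masked_means(block, z_line, valid):
    """0th-order parameters of the two child segments, using valid pixels only."""
    return block[z_line & valid].mean(), block[~z_line & valid].mean()


block = np.arange(16.0).reshape(4, 4)
valid = np.zeros((4, 4), dtype=bool)
valid[:, :2] = True                   # the sub-segment occupies the two left columns
z = np.zeros((4, 4), dtype=bool)
z[2:, :] = True                       # thresholding labels: bottom half is 1

z_line = z.copy()
z_line[0, 3] = True                   # disagreement outside the valid region is ignored
outside_only = masked_mismatch(z, z_line, valid)
means = masked_means(block, z, valid)
```

Because the mask is applied at evaluation time, the full-block label maps stored in the lookup tables need no per-shape variants.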
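Node creation and assignment of partition labels: as described above, two child nodes at level l+1 are created after segmentation is performed. The first child node is assigned node number $n_{c,1}$ and the second child node is assigned node number $n_{c,2}$. Since there are only two child nodes, the pixels in one of the nodes $n_{c,1}$ and $n_{c,2}$ are labeled z'(x,y) = 0 and the pixels in the other child node are labeled z'(x,y) = 1, or vice versa. For the linear model above, these two possible cases can be stored in the lookup table and labeled c = 0, 1, where c is the boundary case number in equation (11). For the linear spline model above, however, there are eight possible boundary cases, so c = 0, 1, ..., 7. For each boundary case, the region in RED and the region in WHITE (see Fig. 5) are assigned to nodes $n_{c,1}$ and $n_{c,2}$ respectively.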
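The present invention can be implemented efficiently. As noted above, the number of boundary pixel combinations is finite for a given block size (for an 8×8 block, for example), so the partition boundaries and partition labels of the pixels can be precomputed and stored in lookup tables, which removes a large amount of computation otherwise spent recomputing the same values. Examples of the lookup tables of the present invention are summarized below: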
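1) Coordinate-to-index (C2I) table. The C2I table converts the coordinates of BPs and interior pixels (IPs) into indices within their respective sets. An illustration of this table for an 8×8 block is given in Fig. 6, where the perimeter pixels are boundary points and the remaining pixels are interior points. BP indices run from 0 to 27 and IP indices from 0 to 35. Given the coordinates of a pixel in the block, the table converts them directly into an index. Tables for 16×16, 32×32, and 64×64 blocks can be generated in the same way.
2) Interior-point-index to boundary-point-index (IPI2BPI) table. The IPI2BPI table has two sub-tables for different cases. As shown in Fig. 7-1, given a BP and an IP, sub-table (a) returns the index of the other BP on the straight line through the two points. For an 8×8 block this table has size 28×36 and its values range from 0 to 27. As shown in Fig. 7-2, given two IPs, sub-table (b) returns the indices of the two BPs at which the straight line through the two points meets the block boundary. This table has size 36×36, and each cell contains two values, which range from 0 to 27 for an 8×8 block.
3) Boundary-pixel to segmented-image (BP2SI) table. As shown in Fig. 8, given two BPs, this table returns the partition labels of the pixels separated by the straight line through the two points. The complete set of partition labels is called the segmented image (SI). For an 8×8 block, the table has size 28×28 and each cell contains a binary segmented image of size 8×8.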
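A sketch of how the C2I and BP2SI tables might be precomputed for an 8×8 block is shown below. The raster-order indexing, the side-of-line test, and the function names `build_c2i` and `build_bp2si` are assumptions; the actual index layout of Fig. 6 and the label convention of Fig. 8 are not available in this text. If BP combinations are counted as unordered pairs, an 8×8 block has C(28, 2) = 378 of them.

```python
import itertools
from math import comb

import numpy as np


def build_c2i(s):
    """Map (x, y) to ('BP', index) on the perimeter or ('IP', index) in the interior."""
    table, bp, ip = {}, 0, 0
    for y in range(s):
        for x in range(s):
            if x in (0, s - 1) or y in (0, s - 1):
                table[(x, y)] = ("BP", bp)
                bp += 1
            else:
                table[(x, y)] = ("IP", ip)
                ip += 1
    return table


def build_bp2si(s):
    """For every pair of BP indices, store the binary image split by the line through them."""
    c2i = build_c2i(s)
    bps = [xy for xy, (kind, _) in c2i.items() if kind == "BP"]   # in index order
    ys, xs = np.mgrid[0:s, 0:s]
    table = np.zeros((len(bps), len(bps), s, s), dtype=np.uint8)
    for (a, p), (b, q) in itertools.combinations(enumerate(bps), 2):
        side = (q[0] - p[0]) * (ys - p[1]) - (q[1] - p[1]) * (xs - p[0]) > 0
        table[a, b] = side
        table[b, a] = side        # stored symmetrically; label polarity is a separate case
    return table


c2i = build_c2i(8)
bp2si = build_bp2si(8)
n_pairs = comb(28, 2)
```

Lookups at run time then reduce to `bp2si[a, b]`, with the boundary case number deciding which side receives label 1.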
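The method of the present invention further performs block reconstruction. Using the tables in subsection C, block $B_k$ can be reconstructed as follows:
(2) Recursion step: for n = 0, 1, 2, ..., N and l = 0, 1, 2, ..., L, path A or path B in the subsection below is executed according to the mode $\mu_{n,l}$.
(3) Termination step: the process terminates when the maximum segmentation level L and the maximum node number N are reached.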
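The reconstruction paths referred to in recursion step (2) are as follows:
(ii) When $\mu_{n,l} = 1$ (linear model), given the C2I table indices of the two BPs, the block is reconstructed by
a. using the two BPs and the given boundary case number c to reconstruct the segmentation boundary, and thereby the segmented image, via the BP2SI table;
(iii) When $\mu_{n,l} = 2$ (linear spline model with three knots), given two BPs as endpoints and one IP as the turning point, the block is reconstructed by
a. using one BP and the IP to locate the other BP via the IPI2BPI(a) table;
b. using the two BPs and the given boundary case number c to reconstruct the segmented image via the BP2SI table.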
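For the linear model, the final reconstruction step can be sketched as filling each side of a stored segmented image with its 0th-order parameter. The function name `reconstruct_block`, the convention that c = 1 swaps the two labels, and the parameter names `beta_1` and `beta_0` are assumptions made for illustration; the segmented image would normally come from a BP2SI lookup.

```python
import numpy as np


def reconstruct_block(segmented_image, c, beta_1, beta_0):
    """Fill label-1 pixels with beta_1 and label-0 pixels with beta_0.

    segmented_image: binary image for the two BPs (e.g. one BP2SI table cell)
    c: boundary case number for the linear model (0 keeps the labels, 1 swaps them)
    """
    z = segmented_image.astype(bool)
    if c == 1:
        z = ~z
    return np.where(z, beta_1, beta_0)


si = np.zeros((8, 8), dtype=np.uint8)
si[:, 4:] = 1                          # stand-in for a BP2SI table cell
rec = reconstruct_block(si, c=0, beta_1=200.0, beta_0=50.0)
rec_swapped = reconstruct_block(si, c=1, beta_1=200.0, beta_0=50.0)
```

Child nodes that were split further would be reconstructed the same way inside their own valid regions.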
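Memory footprint: Table 1 summarizes the sizes of the lookup tables described above. Using the built-in data types of the C++ language, about 66.00 MB of memory may be required. However, the tables exhibit symmetries that can be exploited to reduce memory consumption: IPI2BPI(b) and BP2SI are both symmetric, so only their upper triangular parts need to be kept. This reduces memory consumption to 32.50 MB.
Table 1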
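Keeping only the upper triangle of a symmetric table is usually done with a packed index. The sketch below shows one common row-major packing; the function names `tri_index` and `tri_size` are hypothetical, and the patent does not state which packing its implementation uses.

```python
def tri_size(n: int) -> int:
    """Number of cells in the upper triangle (diagonal included) of an n x n table."""
    return n * (n + 1) // 2


def tri_index(i: int, j: int, n: int) -> int:
    """Flat position of cell (i, j) in the packed upper triangle of a symmetric table."""
    if i > j:
        i, j = j, i                    # (i, j) and (j, i) share one stored cell
    return i * n - i * (i - 1) // 2 + (j - i)


cells_full = 28 * 28                   # BP2SI cells for an 8x8 block
cells_packed = tri_size(28)
```

For the 28×28 BP2SI table this stores 406 cells instead of 784, roughly half, which is in line with the reduction described above.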
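The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the method for improving video resolution and quality, and of the video encoder and the decoder of the display terminal, according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.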
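For example, Fig. 9 shows a server, such as an application server, that can implement the present invention. The server conventionally comprises a processor 1010 and a computer program product or computer-readable medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps described above. For example, the storage space 1030 for program code may contain individual program codes 1031, each implementing one of the steps of the method above. The program code can be read from or written to one or more computer program products. These computer program products comprise program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such a computer program product is usually a portable or fixed storage unit as described with reference to Fig. 10. The storage unit may have storage segments, storage space, and so on arranged similarly to the memory 1020 in the server of Fig. 9. The program code may, for example, be compressed in a suitable form. Typically, the storage unit contains computer-readable code 1031', that is, code that can be read by a processor such as 1010, which, when run by the server, causes the server to perform the steps of the method described above.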
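Reference herein to "one embodiment", "an embodiment", or "one or more embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Note also that instances of the phrase "in one embodiment" do not necessarily all refer to the same embodiment.
The above description is not intended to limit the meaning or scope of the words used in the following claims, which define the invention. Rather, the description and illustrations are provided to aid understanding of the various embodiments. It is expected that future modifications in structure, function, or result will exist that are not substantial changes, and all such insubstantial changes are intended to be covered by the claims. Thus, while preferred embodiments of the present invention have been illustrated and described, those skilled in the art will understand that many changes and modifications may be made without departing from the claimed invention. In addition, although the term "claimed invention" or "present invention" is sometimes used herein in the singular, it will be understood that there are a plurality of inventions as described and claimed.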
Claims (40)
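- A method for efficiently representing an image and encoding an image, in particular a method of parametric block splitting of the depth map of an image, comprising the following steps: initializing the blocks of the image before the image is segmented; partitioning or segmenting the depth map blocks; performing boundary and block approximation on the partitioned or segmented depth map blocks; repeating the steps of partitioning or segmenting the depth map blocks and performing boundary and block approximation on them until the maximum number of segmentation levels is reached; and compressing the image after boundary and block approximation to obtain a compressed image.
- The method of claim 1, wherein the parametric block splitting step further comprises: providing an input representation of an image block; performing parametric splitting on the image block by means of a parametric block splitter or partitioner; and providing an output representation of the image block, wherein the output representation is a tree structure.
- The method of claim 2, wherein the input representation of the image block comprises the image block and a number of control parameters, the control parameters being the maximum number of levels or the search radius.
- The method of claim 2, wherein the tree structure contains the reconstruction residual and the basic parameters for reconstructing the block, and the parameters in the output representation are the splitting mode, the key point list, the residual part, or the segment index.
- The method of claims 2-4, wherein in the parametric block splitting step block coding is performed on the block $B_k$ of pixel attribute values to be compressed or encoded, wherein the pixel attribute may be color intensity, luminance and chrominance intensity, or warping.
- The method of claim 5, wherein the parametric block splitting is performed according to the following equation: $f: \{B_k, L, R\} \rightarrow G_k$, where $\{B_k, L, R\}$ is the input representation, L is the maximum number of segmentation levels, R is the search radius, and the output is the tree structure $G_k$.
- The method of claim 6, wherein, with the subscript k omitted, the output tree structure G consists of the edge set and the node set of the tree, G = (V, E), where E is the edge set of the tree, $V = \{v_{1,1}, v_{2,1}, v_{2,2}, \ldots, v_{L,N}\}$ is the node set of the tree, and $v_{l,n}$ is the n-th node at level l of the tree; the n-th node (n,l) at level l contains the following information:
- The method of claim 1, wherein a hierarchical thresholding technique is used to partition or segment the depth map blocks; the maximum number of segmentation levels is L; at each level, the split is performed using the mean of the pixel attribute values of the parent node; and each pixel attribute value is assigned a partition label.
- The method of claim 1, wherein the step of performing boundary and block approximation on the partitioned or segmented depth map blocks further comprises: after the partitioned or segmented depth map blocks are obtained, approximating the segmentation boundary with a polynomial or a smooth piecewise function so that the difference between the reconstructed block and the original block is minimized.
- The method of claim 11, wherein the segmentation boundary is approximated with a linear model, that is, by a straight line; a set of boundary pixels (BPs) is extracted from the search range indicated by the whiskers in RED to generate the set of straight lines formed by all combinations of BPs, and for each combination of BPs a straight line is drawn to divide the block into two segments.
- The method of claim 11, wherein the segmentation boundary is approximated with a linear spline model, that is, by two straight lines.
- The method of claim 12 or 13, further comprising, after the straight-line approximation or linear spline approximation step, converting the one or two straight lines into a segmentation, comparing it with the original segmentation, and outputting the minimum residual and the corresponding block reconstruction parameters.
- The method of claim 12 or 13, wherein the approximation is carried out by brute-force search.
- The method of claim 12, further comprising: approximating the pixel attribute values d(x,y) of each segment by their mean and computing the signal-to-noise ratio ("SNR") of the whole block; after computing the SNR of the segmentation produced by each line, performing a brute-force search to locate the line giving the peak SNR and selecting that line as the segmentation boundary; and generating an output containing the indices of the two BPs that give the best approximation of the original block, the label order, and the means of the two segments.
- The method of claim 13, further comprising: using the linear model to generate an initial guess of the two BPs; fixing the BPs and performing a brute-force search to find the turning point/knot within the potential region, which is a dilation of the segmentation boundary; selecting the knot giving the peak SNR; dividing the block into two segments by the linear spline; extrapolating the two lines to the corresponding end pixels on the block boundary; and obtaining four segmentation regions.
- The method of claim 19, wherein the lookup table comprises a coordinate-to-index (C2I) table, an interior-point-index to boundary-point-index (IPI2BPI) table, or a boundary-pixel to segmented-image (BP2SI) table; and block reconstruction is performed using the lookup table.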
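- A system for efficiently representing an image and encoding an image, in particular a system for parametric block splitting of the depth map of an image, comprising: an initialization module that initializes the blocks of the image before the image is segmented; a segmentation module that partitions or segments the depth map blocks; a boundary and block approximation module that performs boundary and block approximation on the partitioned or segmented depth map blocks; a recursion module that repeats the partitioning or segmenting of the depth map blocks and the boundary and block approximation until the maximum number of segmentation levels is reached; and a compression module that compresses the image after boundary and block approximation to obtain a compressed image.
- The system of claim 21, wherein the parametric block splitting system further comprises: an input module that provides an input representation of an image block; a parametric splitting module that performs parametric splitting on the image block by means of a parametric block splitter or partitioner; and an output module that provides an output representation of the image block, wherein the output representation is a tree structure.
- The system of claim 22, wherein the input module comprises the image block and a number of control parameters, the control parameters being the maximum number of levels or the search radius.
- The system of claim 22, wherein the tree structure contains the reconstruction residual and the basic parameters for reconstructing the block, and the parameters in the output representation are the splitting mode, the key point list, the residual part, or the segment index.
- The system of claims 22-24, wherein in the parametric block splitting step block coding is performed on the block $B_k$ of pixel attribute values to be compressed or encoded, wherein the pixel attribute may be color intensity, luminance and chrominance intensity, or warping.
- The system of claim 25, wherein the parametric block splitting is performed according to the following equation: $f: \{B_k, L, R\} \rightarrow G_k$, where $\{B_k, L, R\}$ is the input representation, L is the maximum number of segmentation levels, R is the search radius, and the output is the tree structure $G_k$.
- The system of claim 26, wherein, with the subscript k omitted, the output tree structure G consists of the edge set and the node set of the tree, G = (V, E), where E is the edge set of the tree, $V = \{v_{1,1}, v_{2,1}, v_{2,2}, \ldots, v_{l,n}\}$ is the node set of the tree, and $v_{l,n}$ is the n-th node at level l of the tree; the n-th node (n,l) at level l contains the following information:
- The system of claim 21, further comprising: using a hierarchical thresholding technique to partition or segment the depth map blocks; the maximum number of segmentation levels is L; at each level, the split is performed using the mean of the pixel attribute values of the parent node; and each pixel attribute value is assigned a partition label.
- The system of claim 21, wherein the step of performing boundary and block approximation on the partitioned or segmented depth map blocks further comprises: after the partitioned or segmented depth map blocks are obtained, approximating the segmentation boundary with a polynomial or a smooth piecewise function so that the difference between the reconstructed block and the original block is minimized.
- The system of claim 31, further comprising a linear model module that approximates the segmentation boundary with a linear model, that is, by a straight line; a set of boundary pixels (BPs) is extracted from the search range indicated by the whiskers in RED to generate the set of straight lines formed by all combinations of BPs, and for each combination of BPs a straight line is drawn to divide the block into two segments.
- The system of claim 31, further comprising a linear spline model module that approximates the segmentation boundary with a linear spline model, that is, by two straight lines.
- The system of claim 32 or 33, further comprising a segmentation output module that, after the straight-line approximation or linear spline approximation step, converts the one or two straight lines into a segmentation, compares it with the original segmentation, and outputs the minimum residual and the corresponding block reconstruction parameters.
- The system of claim 32 or 33, further comprising an approximation module that carries out the approximation by brute-force search.
- The system of claim 32, further comprising: a computation module that approximates the pixel attribute values d(x,y) of each segment by their mean and computes the signal-to-noise ratio ("SNR") of the whole block; a brute-force search module that, after the SNR of the segmentation produced by each line has been computed, performs a brute-force search to locate the line giving the peak SNR and selects that line as the segmentation boundary; and a second output module that generates an output containing the indices of the two BPs that give the best approximation of the original block, the label order, and the means of the two segments.
- The system of claim 33, further comprising: a guess module that uses the linear model to generate an initial guess of the two BPs; a boundary dilation module that fixes the BPs and performs a brute-force search to find the turning point/knot within the potential region, which is a dilation of the segmentation boundary; a selection module that selects the knot giving the peak SNR; a segmentation module that divides the block into two segments by the linear spline; and an extrapolation module that extrapolates the two lines to the corresponding end pixels on the block boundary, whereby four segmentation regions are obtained.
- The system of claim 39, wherein the lookup table comprises a coordinate-to-index (C2I) table, an interior-point-index to boundary-point-index (IPI2BPI) table, or a boundary-pixel to segmented-image (BP2SI) table; and block reconstruction is performed using the lookup table.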
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201980039040.8A CN112292860B (zh) | 2019-05-24 | 2019-06-25 | 有效表示图像并对图像进行编码的系统和方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
HK19124279.1 | 2019-05-24 | ||
HK19124279 | 2019-05-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020237759A1 (zh) | 2020-12-03 |
Family
ID=73553663
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/092656 WO2020237759A1 (zh) | System and method for efficiently representing and encoding images | 2019-05-24 | 2019-06-25 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112292860B (zh) |
WO (1) | WO2020237759A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112700549A (zh) * | 2020-12-25 | 2021-04-23 | 北京服装学院 | 一种样衣的模拟方法及装置 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113129395B (zh) * | 2021-05-08 | 2021-09-10 | 深圳市数存科技有限公司 | 一种数据压缩加密系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103347187A (zh) * | 2013-07-23 | 2013-10-09 | 北京师范大学 | 一种基于自适应方向预测离散小波变换的遥感影像压缩方法 |
US20150010049A1 (en) * | 2013-07-05 | 2015-01-08 | Mediatek Singapore Pte. Ltd. | Method of depth intra prediction using depth map modelling |
CN105191317A (zh) * | 2013-03-05 | 2015-12-23 | 高通股份有限公司 | 视图内以及跨越视图的深度查找表的预测性译码 |
CN107592538A (zh) * | 2017-09-06 | 2018-01-16 | 华中科技大学 | 一种降低立体视频深度图编码复杂度的方法 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108141593B (zh) * | 2015-07-31 | 2022-05-03 | 港大科桥有限公司 | 用于针对深度视频的高效帧内编码的基于深度不连续的方法 |
-
2019
- 2019-06-25 WO PCT/CN2019/092656 patent/WO2020237759A1/zh active Application Filing
- 2019-06-25 CN CN201980039040.8A patent/CN112292860B/zh active Active
Non-Patent Citations (1)
Title |
---|
LI, KUN: "The Research of Intra Mode Coding for the Depth Maps", CHINA MASTER’S THESES FULL-TEXT DATABASE, INFORMATION SCIENCE AND TECHNOLOGY, 15 May 2017 (2017-05-15), ISSN: 1674-0246, DOI: 20200204171730A * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112700549A (zh) * | 2020-12-25 | 2021-04-23 | 北京服装学院 | 一种样衣的模拟方法及装置 |
CN112700549B (zh) * | 2020-12-25 | 2024-05-03 | 北京服装学院 | 一种样衣的模拟方法及装置 |
Also Published As
Publication number | Publication date |
---|---|
CN112292860B (zh) | 2023-05-09 |
CN112292860A (zh) | 2021-01-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19930735 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19930735 Country of ref document: EP Kind code of ref document: A1 |