CN116389753A - Data encapsulation method, related device, equipment and medium - Google Patents
- Publication number
- CN116389753A (CN202310254647.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- sub
- encoded
- block
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The application discloses a data encapsulation method and a related device, equipment, and medium. The data encapsulation method comprises: acquiring first encoded data and second encoded data of an image to be encoded; determining, based on a first data amount of the first encoded data, a first payload required to carry the transparent channel image using a preset encapsulation mechanism; determining, based on the first payload and a second data amount of the second encoded data, a first number of data packets required to carry the image to be encoded; determining, based on the first data amount and the first number, a third data amount of the first encoded data allocated to each data packet; and, for each data packet, determining a second payload required to carry the third data amount using the preset encapsulation mechanism, and allocating an unallocated portion of the second encoded data to the remaining payload of the data packet based on the second payload and the payload of the data packet. The scheme facilitates adaptation to network transmission and makes back-end processing more convenient.
Description
The present application is a divisional application of patent application No. 202111651192.9, entitled "compression encoding method and data packaging method and related devices, apparatuses, media", filed by the applicant on December 30, 2021.
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a data encapsulation method, and related devices, apparatuses, and media.
Background
With the popularity of smart mobile devices and the continuous development of network technology, video has become an increasingly widely used medium for conveying information. An alpha channel (hereinafter referred to as a transparent channel) is generally used to represent the transparency of the corresponding pixels in a picture. The transparent channel plays an important role in application scenarios such as image processing software, animation production, and video editing software.
In view of this, how to encapsulate the first encoded data obtained by compression encoding the transparent channel image, so that the encapsulated data adapts to network transmission and is convenient for back-end processing, has become a problem to be solved.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a data encapsulation method and a related device, equipment, and medium that encapsulate, through an encapsulation mechanism, the first encoded data obtained by compression encoding a transparent channel image, which facilitates adaptation to network transmission and makes back-end processing more convenient.
In order to solve the above technical problem, a first aspect of the present application provides a data encapsulation method, including: acquiring first encoded data and second encoded data of an image to be encoded, where the first encoded data are encoded data of a transparent channel image of the image to be encoded, and the second encoded data are encoded data of chromaticity and/or brightness of the image to be encoded; determining, based on a first data amount of the first encoded data, a first payload required to carry the transparent channel image using a preset encapsulation mechanism; determining, based on the first payload and a second data amount of the second encoded data, a first number of data packets required to carry the image to be encoded; determining, based on the first data amount and the first number, a third data amount of the first encoded data allocated to each data packet; and, for each data packet, determining a second payload required to carry the third data amount using the preset encapsulation mechanism, and allocating an unallocated portion of the second encoded data to the remaining payload of the data packet based on the second payload and the payload of the data packet.
In order to solve the above technical problem, a second aspect of the present application provides a data encapsulation device, including a data acquisition module, a payload determination module, a quantity determination module, a first allocation module, and a second allocation module. The data acquisition module is configured to acquire first encoded data and second encoded data of an image to be encoded, where the first encoded data are encoded data of a transparent channel image of the image to be encoded, and the second encoded data are encoded data of chromaticity and/or brightness of the image to be encoded. The payload determination module is configured to determine, based on a first data amount of the first encoded data, a first payload required to carry the transparent channel image using a preset encapsulation mechanism. The quantity determination module is configured to determine, based on the first payload and a second data amount of the second encoded data, a first number of data packets required to carry the image to be encoded. The first allocation module is configured to determine, based on the first data amount and the first number, a third data amount of the first encoded data allocated to each data packet. The second allocation module is configured to determine, for each data packet, a second payload required to carry the third data amount using the preset encapsulation mechanism, and to allocate an unallocated portion of the second encoded data to the remaining payload of the data packet based on the second payload and the payload of the data packet.
In order to solve the above technical problem, a third aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, where the memory stores program instructions, and the processor is configured to execute the program instructions to implement the data encapsulation method of the first aspect.
In order to solve the above technical problem, a fourth aspect of the present application provides a computer readable storage medium storing program instructions executable by a processor, where the program instructions are configured to implement the data encapsulation method of the first aspect.
In the above scheme, first encoded data and second encoded data of the image to be encoded are acquired, where the first encoded data are encoded data of a transparent channel image of the image to be encoded and the second encoded data are encoded data of chromaticity and/or brightness of the image to be encoded. A first payload required to carry the transparent channel image using a preset encapsulation mechanism is determined based on the first data amount of the first encoded data, and a first number of data packets required to carry the image to be encoded is determined based on the first payload and the second data amount of the second encoded data. A third data amount of the first encoded data allocated to each data packet is then determined based on the first data amount and the first number. For each data packet, a second payload required to carry the third data amount using the preset encapsulation mechanism is determined, and an unallocated portion of the second encoded data is allocated to the remaining payload of the data packet based on the second payload and the payload of the data packet. The scheme therefore facilitates adaptation to network transmission and makes back-end processing more convenient.
Drawings
FIG. 1 is a flow chart of an embodiment of a compression encoding method of the present disclosure;
FIG. 2 is a schematic diagram of an application of an embodiment of a transparent channel image;
FIG. 3 is a schematic diagram illustrating the division of one embodiment of a transparent channel image;
FIG. 4 is a schematic diagram of one embodiment of sub-block partitioning with a side length of 1;
FIG. 5 is a schematic diagram of one embodiment of sub-block partitioning with a side length of 2;
FIG. 6 is a schematic diagram of one embodiment of sub-block partitioning with a side length of 3;
FIG. 7 is a flowchart of the step S12 of FIG. 1;
FIG. 8 is a schematic diagram illustrating the division of another embodiment of a transparent channel image;
FIG. 9 is a schematic diagram of one embodiment of pixel expansion;
FIG. 10 is a flow chart of an embodiment of a data encapsulation method of the present application;
FIG. 11 is a schematic diagram of an embodiment of an RTP protocol packet;
FIG. 12 is a schematic diagram of a header extension in one-byte form, in one embodiment;
FIG. 13 is a schematic diagram of an embodiment of a header extension in the form of two-byte;
FIG. 14 is a schematic diagram of an embodiment of an IP packet;
FIG. 15 is a schematic diagram of a framework of an embodiment of a compression encoding apparatus of the present application;
FIG. 16 is a schematic diagram of a framework of an embodiment of a data encapsulation apparatus of the present application;
FIG. 17 is a schematic diagram of a framework of an embodiment of an electronic device of the present application;
FIG. 18 is a schematic diagram of a framework of one embodiment of a computer readable storage medium of the present application.
Detailed Description
The following describes the embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship. Further, "a plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a flow chart illustrating an embodiment of a compression encoding method of the present application.
Specifically, the method may include the steps of:
Step S11: obtaining a transparent channel image based on the transparency of each pixel point of the image to be encoded.
In one implementation scenario, the image to be encoded may be a single image captured by an image capturing device. Illustratively, the image to be encoded may be an image captured by a user with a device such as a cell phone, tablet computer, or camera. Alternatively, the image to be encoded may be at least one frame of image in video data shot by the image capturing device. For example, a cell phone, tablet computer, camera, or similar device may shoot video data, and each frame of image in the video data may be taken as an image to be encoded, so that the transparent channel image of each frame in the video data is compression encoded separately. Of course, at least one image may also be selected from the video data as the image to be encoded, which is not limited herein.
In one implementation scenario, the transparent channel image and the image to be encoded have the same resolution, and the pixel value of a pixel point in the transparent channel image is the transparency of the pixel point at the corresponding position in the image to be encoded. For convenience of description, the resolution of both the image to be encoded and the transparent channel image may be denoted by W×H, where W represents the image width and H represents the image height. The pixel value of the pixel point located in the i-th row and j-th column of the transparent channel image is then the transparency of the pixel point located in the i-th row and j-th column of the image to be encoded. Other cases are similar and are not exemplified here.
In one implementation, the transparency may be represented as a number within a preset range of numbers. Illustratively, the preset value range may be 0 to 1. Of course, the preset value range may be other value ranges, which are not limited herein. Further, the value and transparency may be positively correlated, i.e. the greater the value, the higher the transparency, and vice versa, the smaller the value, the lower the transparency. Taking a preset value range of 0 to 1 as an example, when the transparency is 0, it may represent complete opacity, and when the transparency is 1, it may represent complete transparency. Of course, the value and transparency may also be inversely related, i.e. the smaller the value, the higher the transparency, and conversely the larger the value, the lower the transparency. Taking a preset value range of 0 to 1 as an example, when the transparency is 0, it may represent complete transparency, and when the transparency is 1, it may represent complete opacity. The above examples are only a few possible embodiments of the practical application process and are not thereby limiting the specific manner in which the practical application process may be carried out.
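As a minimal sketch of step S11, assuming the image to be encoded is stored as rows of (R, G, B, A) tuples with 8-bit components (a layout the source does not specify), the transparent channel image can be obtained by taking the fourth component of every pixel. The names `transparent_channel_image` and `normalized` are hypothetical. The sketch does not decide whether larger values mean more or less transparent, since either convention is allowed above.

```python
from typing import List, Tuple

# Assumed pixel layout: (R, G, B, A), each component in 0..255.
Pixel = Tuple[int, int, int, int]


def transparent_channel_image(rgba: List[List[Pixel]]) -> List[List[int]]:
    """Return an H x W image whose pixel (i, j) is the alpha value of rgba[i][j]."""
    return [[px[3] for px in row] for row in rgba]


def normalized(alpha: List[List[int]], max_value: int = 255) -> List[List[float]]:
    """Map stored alpha values into the preset value range 0 to 1."""
    return [[a / max_value for a in row] for row in alpha]


# A 2 x 2 example image: only the top-left pixel has alpha 0.
image = [
    [(10, 20, 30, 0), (10, 20, 30, 255)],
    [(40, 50, 60, 255), (40, 50, 60, 255)],
]
alpha = transparent_channel_image(image)
```

The result has the same resolution as the input, which matches the W×H convention used in the description.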
In one implementation scenario, please refer to fig. 2 in combination. Fig. 2 is a schematic diagram illustrating an application of an embodiment of a transparent channel image. As shown in fig. 2, image C is composed of the foreground of image A and image B, which can be simply expressed as $C = \alpha A + (1 - \alpha) B$, where $0 < \alpha < 1$. Because image A has a transparent channel, the object in image A can be separated from its background and added to a new image (i.e., image B). Like image A, image B, and image C, the transparent channel α is a spatially varying signal component. In digital video, the transparent channel is also time-varying and behaves in the same way as the RGB channels. As illustrated in fig. 2, the transparent channel represents a shape on the one hand and the transparency of the pixels on the other hand. The transparent channel is characterized in that the background and the foreground are separated by sharp edges and are each generally concentrated in one area, rather than being relatively scattered like a luminance component or a chrominance component. The technical scheme of the embodiments disclosed in the present application retains this edge information as far as possible, which reduces the likelihood of artifacts during back-end processing and improves the encoding quality. The specific process is described below.
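The blend in fig. 2 can be illustrated with a short sketch, assuming it is the usual alpha compositing C = αA + (1 − α)B applied per pixel, with α acting as the weight of image A. The function name `composite` and the single-channel float images are hypothetical simplifications.

```python
from typing import List

Image = List[List[float]]


def composite(a: Image, b: Image, alpha: Image) -> Image:
    """Blend per pixel: C = alpha * A + (1 - alpha) * B."""
    return [
        [al * pa + (1.0 - al) * pb for pa, pb, al in zip(row_a, row_b, row_al)]
        for row_a, row_b, row_al in zip(a, b, alpha)
    ]


# One-row example: foreground A, background B, per-pixel weights alpha.
A = [[200.0, 200.0]]
B = [[100.0, 100.0]]
alpha = [[0.5, 0.25]]
C = composite(A, B, alpha)
```

A larger α pulls the result toward image A, and a smaller α pulls it toward image B.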
Step S12: dividing the transparent channel image to obtain a plurality of sub-blocks.
In the embodiment of the disclosure, the pixel values of all pixel points within each sub-block are the same, and the set of sub-blocks covers the transparent channel image. Any two sub-blocks may be the same or different in size, which is not limited herein. In addition, the specific manner of dividing the transparent channel image is described below and is not repeated here.
In one implementation, each sub-block may be rectangular in shape, and the length and width of the rectangle may be different. Alternatively, each sub-block may be rectangular in shape, and the length and width of the rectangle may be the same, in which case the sub-block is square in shape. Alternatively, each sub-block may be rectangular in shape, with portions of the sub-blocks being the same in length and width and portions of the sub-blocks being different in length and width. It should be noted that the above cases are only possible embodiments in practical application, and do not limit the possibility of the sub-blocks taking other shapes.
In one implementation scenario, the sub-blocks may be disjoint from each other, that is, any pixel point in the transparent channel image exists in only one sub-block and not in two or more sub-blocks at the same time. Because all sub-blocks obtained by dividing the transparent channel image are mutually disjoint, data redundancy is reduced as much as possible and the compression degree is improved.
In one implementation scenario, for any sub-block there is at least one neighboring sub-block around it, and the sub-block and its neighboring sub-block have at least one pair of mutually adjacent corner points. For example, in the case that the shape of the sub-block is rectangular, the vertices of the rectangle may be regarded as the corner points of the sub-block, and at least one corner point of the sub-block is adjacent to a corner point of its neighboring sub-block. Note that, for convenience of description, the pixel point in the i-th row and j-th column of the transparent channel image may be simply referred to as pixel point (i, j), and pixel points at other positions may be named in the same way. The pixel points adjacent to pixel point (i, j) may include pixel points (i+1, j), (i-1, j), (i, j+1), and (i, j-1). Because each sub-block has at least one neighboring sub-block and shares at least one pair of mutually adjacent corner points with it, the positions of the sub-blocks are more orderly distributed, which improves the efficiency of subsequent data decoding.
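The two properties above (the sub-blocks cover the image and are mutually disjoint) are equivalent to every pixel belonging to exactly one sub-block. The following sketch checks that property for rectangular sub-blocks; the `(x, y, width, height)` tuple layout and the function name `covers_exactly_once` are assumptions made for illustration only.

```python
from typing import Iterable, Tuple

# Hypothetical rectangle layout: top-left corner (x, y), then width and height.
Block = Tuple[int, int, int, int]


def covers_exactly_once(blocks: Iterable[Block], width: int, height: int) -> bool:
    """Return True if every pixel of a width x height image lies in exactly one block."""
    counts = [[0] * width for _ in range(height)]
    for x, y, w, h in blocks:
        # A block that leaves the image cannot be part of a valid division.
        if x < 0 or y < 0 or x + w > width or y + h > height:
            return False
        for row in range(y, y + h):
            for col in range(x, x + w):
                counts[row][col] += 1
    return all(c == 1 for row in counts for c in row)
```

A count above 1 signals overlapping sub-blocks (redundant data), and a count of 0 signals an uncovered pixel.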
Step S13: encoding based on the attribute information of each sub-block to obtain encoded data of the transparent channel image.
In an embodiment of the present disclosure, the attribute information includes at least one of: sub-block position, sub-block size, and sub-block amplitude, where the sub-block amplitude represents the pixel value of each pixel point in the sub-block. It should be noted that, since the pixel values of the pixel points in a sub-block are the same, the sub-block amplitude can be regarded as the pixel value of any pixel point in the sub-block.
In one implementation scenario, as previously described, the shape of the sub-block may be rectangular, in which case the sub-block location may represent coordinates of the vertices of the rectangle, and the sub-block size may include the length and width of the rectangle. According to the method, the sub-blocks are set to be rectangular, the positions of the sub-blocks represent coordinates of vertexes of the rectangle, and the sizes of the sub-blocks comprise the length and the width of the rectangle, so that difficulty in sub-block division is reduced, and compression coding efficiency is improved.
In a specific implementation scenario, the coordinates represented by the sub-block position in the attribute information of each sub-block are the coordinates of the vertex of the rectangle at a first orientation. That is, in the case where the shape of the sub-block is rectangular, every sub-block is represented by the coordinates of the vertex at the same orientation of its rectangle. Illustratively, the first orientation may be any one of the upper left corner, the lower left corner, the upper right corner, and the lower right corner. For example, the coordinates of the upper left vertex of each sub-block may be selected to represent the sub-block position in the attribute information of that sub-block; other orientations may also be selected and are not illustrated here. Because the sub-block positions are all represented by vertices at the same orientation, the encoding complexity is reduced as much as possible and the compression encoding efficiency is improved.
In a specific implementation scenario, as mentioned above, the length and width of the rectangle may also be the same, in which case the shape of the sub-block is square and, correspondingly, the sub-block size may be the side length of the square. Setting the sub-blocks to be square further reduces the difficulty of sub-block division and further improves the compression encoding efficiency. In addition, because the sub-block shape is square, only the side length of the square needs to be encoded, which further reduces the data amount and further improves the compression degree.
In one implementation scenario, please refer to fig. 3 in combination. Fig. 3 is a schematic diagram illustrating a division of an embodiment of a transparent channel image. As shown in fig. 3, in order to raise the compression degree as much as possible, the shape of each sub-block may be square, and the attribute information of each sub-block may be represented by a quadruple (X, Y, L, A). Taking the sub-block outlined in white at the lower left of fig. 3 as an example: X and Y represent the sub-block position, namely the abscissa and ordinate of the upper left vertex of the sub-block; L represents the sub-block size, namely the side length of the sub-block; and A represents the sub-block amplitude, namely the pixel value of each pixel point in the sub-block. Therefore, when the shape of the sub-block is set to be square, each sub-block can be encoded as a quadruple, which improves the compression degree.
In one implementation scenario, each element in the attribute information of the sub-block may be represented by one byte. For example, when the shape of the sub-block is set to be square, one byte may be used to represent the abscissa X, one byte to represent the ordinate Y, one byte to represent the side length L, and one byte to represent the sub-block amplitude A. That is, when the shape of the sub-block is square, the amount of data encoded per sub-block is four bytes. Cases where the sub-block is set to another shape are similar and are not exemplified here. On this basis, the set of encoded bytes of all sub-blocks can be used as the encoded data of the transparent channel image.
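A minimal sketch of this four-bytes-per-sub-block layout, using Python's standard `struct` module. The byte order X, Y, L, A within each quadruple and the function names `encode_sub_blocks` and `decode_sub_blocks` are assumptions; the source only states that each element may occupy one byte.

```python
import struct
from typing import Iterable, List, Tuple

# Quadruple (X, Y, L, A): top-left abscissa, ordinate, side length, amplitude.
Quad = Tuple[int, int, int, int]


def encode_sub_blocks(quads: Iterable[Quad]) -> bytes:
    """Concatenate one 4-byte record per square sub-block."""
    out = bytearray()
    for x, y, side, amplitude in quads:
        # "4B" packs four unsigned bytes; values outside 0..255 raise struct.error.
        out += struct.pack("4B", x, y, side, amplitude)
    return bytes(out)


def decode_sub_blocks(data: bytes) -> List[Quad]:
    """Split the byte string back into quadruples (length must be a multiple of 4)."""
    return [tuple(record) for record in struct.iter_unpack("4B", data)]


encoded = encode_sub_blocks([(2, 3, 1, 1), (0, 0, 2, 255)])
```

With one byte per element, coordinates and side lengths are limited to the range 0 to 255, so larger images would need wider fields; that extension is not covered by this sketch.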
In one implementation scenario, to illustrate how compression encoding a transparent channel image based on sub-block division improves the compression degree, please refer to fig. 4, fig. 5 and fig. 6 in combination. Fig. 4 is a schematic diagram of an embodiment of sub-block division when the side length is 1, fig. 5 is a schematic diagram of an embodiment of sub-block division when the side length is 2, and fig. 6 is a schematic diagram of an embodiment of sub-block division when the side length is 3. In each case, every pixel point of the transparent channel image occupies 1 byte before encoding, and the quadruple occupies 4 bytes after encoding. As shown in fig. 4, when the side length L is 1, the sub-block (the white block) contains one pixel point of the transparent channel image and can be represented by the quadruple (2, 3, 1, 1); the sub-block occupies 1 byte before encoding and 4 bytes after encoding, so the compression ratio is R = 4/1 = 400%. As shown in fig. 5, when the side length L is 2, the sub-block (the white block) contains four pixel points and can be represented by the quadruple (2, 3, 2, 1); the sub-block occupies 4 bytes before encoding and 4 bytes after encoding, so the compression ratio is R = 4/4 = 100%. As shown in fig. 6, when the side length L is 3, the sub-block (the white block) contains nine pixel points and can be represented by the quadruple (2, 3, 3, 1); the sub-block occupies 9 bytes before encoding and 4 bytes after encoding, so the compression ratio is R = 4/9 ≈ 44.4%.
Accordingly, the relationship between the compression ratio R and the side length L can be expressed as $R = 4/L^2$. That is, the larger the sub-block, the lower the compression ratio and the higher the compression degree. Compression encoding by sub-block division therefore makes full use of the fact that the data of the transparent channel image are concentrated, and improves the compression degree of the transparent channel image as much as possible.
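The relationship R = 4/L² can be checked numerically with a small sketch. It assumes, as in the examples above, 1 byte per pixel before encoding and 4 bytes per square sub-block after encoding; the function name and its default parameters are hypothetical.

```python
def compression_ratio(side: int, bytes_per_block: int = 4, bytes_per_pixel: int = 1) -> float:
    """Encoded size divided by original size for one square sub-block."""
    original_size = side * side * bytes_per_pixel
    return bytes_per_block / original_size


# Ratios for side lengths 1 to 4, as fractions rather than percentages.
ratios = {side: compression_ratio(side) for side in range(1, 5)}
```

The ratio falls below 1 only when L is at least 3, so under these assumptions the encoding saves space only for sub-blocks with a side length of 3 or more.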
In the above scheme, the transparent channel image is obtained based on the transparency of each pixel point of the image to be encoded, and the transparent channel image is divided into a plurality of sub-blocks such that the pixel values of the pixel points within each sub-block are the same and the set of sub-blocks covers the transparent channel image. On this basis, encoding is performed based on the attribute information of each sub-block to obtain the encoded data of the transparent channel image, where the attribute information comprises at least one of the following: sub-block position, sub-block size, and sub-block amplitude. On the one hand, the compression encoding process makes full use of the characteristics that the foreground and the background of a transparent channel are generally separated by sharp edges and are each generally concentrated in one region: the transparent channel image is divided into sub-blocks whose pixel values are uniform within their respective regions, so the edge information is retained as much as possible and the quality of the compression encoding is improved. On the other hand, the compression encoding is achieved by sub-block division alone, without other complex operations, which reduces the operation complexity, improves the efficiency of the compression encoding, and reduces its power consumption.
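To illustrate why the edge information survives, the sketch below rebuilds a transparent channel image from square sub-block quadruples (X, Y, L, A). Decoding is not described in this part of the text, so the function `reconstruct`, the 0-based coordinates, and the top-left origin are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Tuple

# Quadruple (X, Y, L, A): top-left column, top-left row, side length, amplitude.
Quad = Tuple[int, int, int, int]


def reconstruct(quads: Iterable[Quad], width: int, height: int) -> List[List[Optional[int]]]:
    """Paint every sub-block with its amplitude; uncovered pixels stay None."""
    image: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for x, y, side, amplitude in quads:
        for row in range(y, y + side):
            for col in range(x, x + side):
                image[row][col] = amplitude
    return image


# A 3 x 3 image with a sharp vertical edge between columns 1 and 2.
quads = [
    (0, 0, 2, 0),
    (2, 0, 1, 255), (2, 1, 1, 255), (2, 2, 1, 255),
    (0, 2, 1, 7), (1, 2, 1, 7),
]
restored = reconstruct(quads, 3, 3)
```

Because every sub-block holds a single pixel value, the restored image equals the original pixel for pixel, so the edge between the regions is reproduced exactly.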
Referring to fig. 7, fig. 7 is a flowchart of an embodiment of step S12 in fig. 1. Specifically, the method may include the steps of:
Step S121: taking a preset position of the transparent channel image as a reference point.
In one implementation scenario, the preset position is located at a corner point of the transparent channel image at a second orientation. Specifically, the second orientation may be any one of the upper left corner, the lower left corner, the upper right corner, and the lower right corner, that is, any corner point of the transparent channel image may be selected as the reference point for the first division. For example, please refer to fig. 8 in combination. Fig. 8 is a schematic diagram illustrating the division of another embodiment of the transparent channel image. As shown in fig. 8, at the time of the first division, the corner point D1 at the upper left corner of the transparent channel image may be selected as the reference point.
Step S122: expanding over the transparent channel image with the reference point as a starting point until a preset condition is no longer met, to obtain a sub-block.
In the embodiment of the disclosure, the corner points of the sub-blocks are reference points, that is, one of the corner points of the sub-blocks obtained by expansion is the reference point (that is, the starting point). Illustratively, as shown in fig. 8, the sub-blocks shown in the white boxes of fig. 8 may be obtained by a first expansion, and for ease of distinction, the numeral "1" is marked in the white boxes to indicate that the sub-blocks are obtained by the first expansion. The corner point D1 of the upper left corner of the sub-block 1 is the reference point.
In one implementation scenario, the specific manner of expansion may be set according to a predefined shape of the sub-block. For example, in the case where the shape of the sub-block is prescribed in advance to be rectangular, expansion of several pixel widths may be performed in the horizontal direction and the vertical direction, respectively, at a time; alternatively, in the case where the shape of the sub-block is predetermined to be square, the same pixel width may be extended simultaneously in the horizontal direction and the vertical direction each time. Other situations can be similar and are not exemplified here.
In one implementation scenario, as previously described, the preset position is located at a corner of the transparent channel image in the second orientation. Based on this, a first pixel width and a second pixel width may be initialized. With the reference point as the starting point, the first pixel width is extended in a first direction and the second pixel width is extended in a second direction to obtain a candidate extended area, where the first direction is the horizontal direction away from the starting point and the second direction is the vertical direction away from the starting point. If the current extension satisfies the preset condition, the latest candidate extended area is used as the target extended area, the first pixel width is updated based on a first pixel step size, the second pixel width is updated based on a second pixel step size, and the step of extending with the reference point as the starting point and the subsequent steps are re-executed until the preset condition is no longer satisfied. In this manner, pixel expansion is carried out from the reference point in the horizontal and vertical directions away from the starting point, the latest candidate extended area is accepted as the target extended area whenever the current extension satisfies the preset condition, and the pixel widths are then updated by the pixel step sizes before extending again, so that the sub-block division flow can be simplified as much as possible and the compression encoding efficiency improved.
In a specific implementation scenario, taking the shape of a predefined sub-block as a square as an example, the first pixel width and the second pixel width may be initialized to 1, respectively. Other situations can be similar and are not exemplified here. It should be noted that, in order to improve the accuracy of the compression encoding, the first pixel width and the second pixel width may be set to 1 by default.
In a specific implementation scenario, in the case where the second orientation is the upper left corner, the first direction is horizontally rightward and the second direction is vertically downward; or, in the case where the second orientation is the lower left corner, the first direction is horizontally rightward and the second direction is vertically upward; or, in the case where the second orientation is the upper right corner, the first direction is horizontally leftward and the second direction is vertically downward; or, in the case where the second orientation is the lower right corner, the first direction is horizontally leftward and the second direction is vertically upward. In this manner, a corner point of the transparent channel image is selected as the starting point and expansion proceeds horizontally and vertically away from the starting point, which helps improve the expansion efficiency.
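For ease of understanding, the correspondence between the second orientation and the two expansion directions can be sketched as a lookup table. This is only an illustrative sketch: the names `EXPANSION_DIRECTIONS`, `starting_corner`, and `step_away` are hypothetical, and the convention that columns grow rightward and rows grow downward is an assumption.

```python
# Hypothetical sketch: map each corner orientation to (horizontal step, vertical step).
# Assumption: column indices grow to the right and row indices grow downward.
EXPANSION_DIRECTIONS = {
    "upper_left": (+1, +1),   # horizontally rightward, vertically downward
    "lower_left": (+1, -1),   # horizontally rightward, vertically upward
    "upper_right": (-1, +1),  # horizontally leftward, vertically downward
    "lower_right": (-1, -1),  # horizontally leftward, vertically upward
}


def starting_corner(orientation, width, height):
    """Return (col, row) of the image corner named by `orientation`."""
    col = 0 if orientation.endswith("left") else width - 1
    row = 0 if orientation.startswith("upper") else height - 1
    return col, row


def step_away(orientation, width, height, steps):
    """Return the pixel reached after moving `steps` pixels away from the corner
    in both the first (horizontal) and second (vertical) directions."""
    col, row = starting_corner(orientation, width, height)
    dx, dy = EXPANSION_DIRECTIONS[orientation]
    return col + dx * steps, row + dy * steps
```

Whichever corner is chosen, both directions point into the image, so every step away from the starting point stays inside the image until the opposite border is reached.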
In a specific implementation scenario, taking the shape of the pre-defined sub-block as a square as an example, the first pixel step size and the second pixel step size may be 1, respectively, i.e. 1 pixel width is extended towards the first direction and the second direction at a time. Other situations can be similar and are not exemplified here. It should be noted that, in order to improve the accuracy of the compression encoding, the first pixel step size and the second pixel step size may default to 1.
In a specific implementation scenario, if the current expansion does not meet the preset condition during the first expansion, the reference point may be considered as an isolated pixel point, so that the encoding may be performed with reference to fig. 4 and the related description thereof, and further lossless compression encoding may be implemented. Of course, if in practical application, lossy compression can be accepted, considering that a single pixel has little effect on subsequent decoding, the pixel may not be encoded, and in the subsequent decoding process, an algorithm such as interpolation may be adopted to predict the pixel value of the pixel. In addition, if the current extension does not satisfy the preset condition in the i (i is greater than 1) th extension, the target extension area obtained by the i-1 th extension may be used as a sub-block.
In one implementation scenario, the preset conditions are set to include: the pixel values of the extended pixel points are the same as the pixel values of the starting points, the extended pixel points do not belong to the divided sub-blocks, and the extended pixel points belong to the transparent channel image.
In one implementation scenario, please refer to fig. 9 in combination; fig. 9 is a schematic diagram of an embodiment of pixel expansion. As shown in fig. 9, the corner point at the upper left corner of the transparent channel image may be used as the reference point, and the first sub-block is expanded with this reference point as the starting point. Specifically, the first pixel width and the second pixel width are each initialized to 1, and expansion proceeds horizontally rightward and vertically downward. In the 1st expansion, the candidate expansion area is the pixel area of the single pixel point at the upper left corner; because the current expansion meets the preset condition, this pixel area is used as the target expansion area of the 1st expansion, and the first pixel width and the second pixel width are each updated to 2 according to the first pixel step size 1 and the second pixel step size 1. In the 2nd expansion, 2 pixel widths are expanded horizontally rightward and vertically downward, and the candidate expansion area is the pixel area marked by the crossing dash-dot arrows in the upper left region; because the current expansion meets the preset condition, this pixel area is used as the target expansion area of the 2nd expansion, and the two pixel widths are each updated to 3. In the 3rd expansion, 3 pixel widths are expanded in each direction, and the candidate expansion area is the pixel area marked by the crossing dashed arrows in the upper left region; because the current expansion meets the preset condition, this pixel area is used as the target expansion area of the 3rd expansion, and the two pixel widths are each updated to 4. In the 4th expansion, 4 pixel widths are expanded in each direction, and the candidate expansion area is the pixel area marked by the crossing solid arrows in the upper left region; because the current expansion meets the preset condition, this pixel area is used as the target expansion area of the 4th expansion, and the two pixel widths are each updated to 5. In the 5th expansion, 5 pixel widths are expanded in each direction; since the current expansion does not meet the preset condition (i.e., the pixel values of the expanded pixel points are no longer the same as the pixel value of the starting point), the expansion ends, and the latest target expansion area (i.e., the target expansion area of the 4th expansion) is used as the sub-block. Other cases can be deduced by analogy and are not exemplified here.
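The expansion described above can be illustrated with the following minimal sketch, which assumes square sub-blocks, an upper-left starting point, and pixel widths and step sizes of 1. The function name `expand_square`, the list-of-lists image layout, and the `assigned` mask marking already-divided pixels are hypothetical choices made for this example only.

```python
def expand_square(image, assigned, row, col):
    """Return the side length of the largest square grown rightward and downward
    from (row, col) that still satisfies the three preset conditions.
    A return value of 0 means the starting pixel already belongs to a sub-block."""
    height, width = len(image), len(image[0])
    value = image[row][col]          # pixel value of the starting point
    side = 0                         # side of the latest target expansion area
    candidate = 1                    # first/second pixel width, initialised to 1
    while True:
        # Condition: expanded pixel points must belong to the image.
        if row + candidate > height or col + candidate > width:
            break
        # Only the newly added bottom row and right column need checking.
        last_r, last_c = row + candidate - 1, col + candidate - 1
        new_pixels = [(last_r, c) for c in range(col, last_c + 1)]
        new_pixels += [(r, last_c) for r in range(row, last_r)]
        # Conditions: same pixel value as the start, and not in a divided sub-block.
        if any(image[r][c] != value or assigned[r][c] for r, c in new_pixels):
            break
        side = candidate             # accept the candidate as the target area
        candidate += 1               # update both widths by the pixel step size 1
    return side
```

With a 4 x 4 region of identical pixel values at the upper left corner, the fifth candidate fails the pixel-value condition and the function returns 4, matching the walk-through of fig. 9. A return value of 1 corresponds to the isolated-pixel case.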
Step S123: Obtain new reference points based on the remaining corner points of the sub-block, and re-execute the step of expanding on the transparent channel image with the reference point as the starting point and the subsequent steps until every reference point has been used as a starting point.
In the embodiment of the disclosure, the rest corner points are corner points other than the starting point. Still taking the transparent channel image shown in fig. 8 as an example, after the first expansion to obtain the sub-block 1, a new reference point may be obtained based on the remaining corner points of the sub-block 1. The remaining corner points may comprise three corner points other than the corner point D1, and the other cases may be similar, and are not exemplified here. When the pixel extension is performed again, the first pixel width and the second pixel width need to be reinitialized. Illustratively, in the foregoing example, when the first pixel width and the second pixel width are 5 in the last pixel extension, respectively, the first pixel width and the second pixel width are reinitialized to 1 when the sub-block is obtained and the pixel extension is repeated to find a new sub-block. Other situations can be similar and are not exemplified here.
In one implementation scenario, after the sub-block is obtained by expansion, the pixel points in the transparent channel image that are adjacent to the remaining corner points of the sub-block and do not belong to the sub-block may be used as new reference points. In this manner, because the new reference points lie outside the sub-block, any two sub-blocks are disjoint, which further improves the degree of compression.
In one implementation scenario, after obtaining the new reference points, the step of pixel expansion may be continuously performed based on each new reference point, so that a plurality of No. 2 sub-blocks may be obtained, after obtaining a plurality of No. 2 sub-blocks, new reference points may be obtained based on the rest corner points of each No. 2 sub-block, and the step of pixel expansion may be continuously performed based on each new reference point, so that a plurality of No. 3 sub-blocks may be obtained, and so on, and no further examples are given here. As the sub-block division continues, the reference points are fewer and fewer, and the division can be ended when there are no more available reference points, i.e., the transparent channel image division can be considered to be ended at this time.
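The division flow of steps S121 to S123 can be sketched as follows. This is a hedged illustration rather than the definitive procedure: it assumes square sub-blocks grown from the upper left corner, it treats an isolated pixel as a 1 x 1 sub-block, and it interprets "pixels adjacent to the remaining corner points" as the pixel to the right of the upper-right corner, the pixel below the lower-left corner, and the pixel diagonally beyond the lower-right corner. The names `largest_square` and `divide`, and the tuple layout `(row, col, side, amplitude)`, are hypothetical.

```python
from collections import deque


def largest_square(image, assigned, row, col):
    """Grow a square rightward/downward from (row, col) while every new pixel
    has the start's value, is unassigned, and lies inside the image."""
    height, width = len(image), len(image[0])
    side = 0
    while row + side < height and col + side < width:
        edge_r, edge_c = row + side, col + side
        ring = [(edge_r, c) for c in range(col, edge_c + 1)]
        ring += [(r, edge_c) for r in range(row, edge_r)]
        if any(assigned[r][c] or image[r][c] != image[row][col] for r, c in ring):
            break
        side += 1
    return side


def divide(image):
    """Return a list of (row, col, side, amplitude) sub-blocks."""
    height, width = len(image), len(image[0])
    assigned = [[False] * width for _ in range(height)]
    blocks = []
    reference_points = deque([(0, 0)])       # preset position: upper left corner
    while reference_points:                  # stop when no reference point is left
        row, col = reference_points.popleft()
        if not (0 <= row < height and 0 <= col < width) or assigned[row][col]:
            continue                         # outside the image or already divided
        side = largest_square(image, assigned, row, col)
        for r in range(row, row + side):
            for c in range(col, col + side):
                assigned[r][c] = True
        blocks.append((row, col, side, image[row][col]))
        # New reference points next to the three remaining corner points.
        reference_points.extend(
            [(row, col + side), (row + side, col), (row + side, col + side)]
        )
    return blocks
```

Because each expansion rejects pixels that are already assigned, the resulting sub-blocks never overlap, and each sub-block carries a single amplitude.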
According to the scheme, the preset position of the transparent channel image is used as the reference point, the transparent channel image is expanded by taking the reference point as the starting point until the preset condition is not met, the sub-block is obtained, the corner points of the sub-block are used as the reference point, on the basis, the new reference point is obtained based on the rest of the corner points of the sub-block, the step of expanding by taking the reference point as the starting point is repeatedly executed until all the reference points are used as the starting point, and the rest of the corner points are corner points other than the starting point, so that the sub-block division can be orderly and efficiently realized through pixel expansion, and the compression coding efficiency and quality are improved.
Referring to fig. 10, fig. 10 is a flowchart illustrating an embodiment of a data encapsulation method according to the present application.
Specifically, the method may include the steps of:
Step S101: Acquire first encoded data and second encoded data of an image to be encoded.
In an embodiment of the disclosure, the first encoded data is encoded data of a transparent channel image of an image to be encoded, and the second encoded data is encoded data of chromaticity and/or luminance of the image to be encoded.
In an implementation scenario, the first encoded data may be obtained through the steps in any of the foregoing embodiments of the compression encoding method; for the specific encoding process, reference may be made to the foregoing disclosed embodiments, which are not described herein again. Of course, the first encoded data may also be obtained by other compression encoding methods, which is not limited here.
In one implementation scenario, the second encoded data may be encoded by a standard encoding method such as H.264, which is not limited here.
In an implementation scenario, as described above, the image to be encoded may be an image captured by the image capturing device alone, or may be at least one frame of image in video data captured by the image capturing device, which may be specifically described in the foregoing disclosure embodiments, and will not be described herein.
Step S102: Determine, based on a first data amount of the first encoded data, a first load required for carrying the transparent channel image by using a preset encapsulation mechanism.
In one implementation, the preset encapsulation mechanism may include encapsulation by an RTP (Real-time Transport Protocol) header extension. For ease of understanding, please refer to fig. 11 in combination; fig. 11 is a schematic structural diagram of an embodiment of an RTP protocol packet. As shown in fig. 11, each RTP protocol packet always contains the first 12 bytes (i.e., the fixed RTP header). Note that the CSRC identifier (i.e., contributing source identifier) is present only when a mixer is inserted. V represents the version (i.e., Version), occupies 2 bits, and indicates the version of RTP. P represents padding (i.e., Padding) and occupies 1 bit; if it is set, the end of the packet contains one or more padding bytes that are not part of the payload. X represents an extension (i.e., Extension) and occupies 1 bit; if it is set, the fixed RTP header is followed by a header extension, which comes after the CSRC identifiers if any are present. It should be noted that the field X may be set to 1 when a header extension follows and to 0 otherwise. In addition, M represents a marker (i.e., Marker) and occupies 1 bit; in the packet stream, the boundary of each frame may be indicated by the field M. PT represents the payload type (i.e., Payload Type) and occupies 7 bits; it indicates the RTP payload type and specifies the coding standard adopted by the payload. Sequence num represents the sequence number and occupies 16 bits; the sequence number is incremented by 1 each time an RTP protocol packet is sent, and the receiver can use it to detect packet loss and restore the packet order. Timestamp represents the timestamp and occupies 32 bits; it reflects the sampling time of the first byte of the RTP protocol packet, and the receiver can calculate delay and jitter from it.
SSRC identifier represents the synchronization source identifier and occupies 32 bits. It identifies a synchronization source and is chosen randomly, so that two synchronization sources participating in the same video conference do not have the same synchronization source identifier. CSRC identifier represents a contributing source identifier; there may be 0 to 15 of them, each occupying 32 bits, and together they identify the contributing sources of the payload contained in the RTP protocol packet.
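For ease of understanding, the 12-byte fixed header described above can be packed as in the following sketch. The function name `pack_rtp_header` and its parameters are hypothetical. The 4-bit CSRC count that RFC 3550 places after the X bit is included here, although the description above does not discuss it, and the payload type value 96 in the usage example is an arbitrary assumption.

```python
import struct


def pack_rtp_header(payload_type, sequence_number, timestamp, ssrc,
                    marker=False, extension=False, padding=False, csrc_count=0):
    """Pack the 12-byte fixed RTP header (network byte order)."""
    version = 2                                   # V: 2 bits
    byte0 = ((version << 6)
             | (int(padding) << 5)                # P: 1 bit
             | (int(extension) << 4)              # X: 1 bit, 1 if an extension follows
             | (csrc_count & 0x0F))               # CSRC count: 4 bits (RFC 3550)
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)   # M: 1 bit, PT: 7 bits
    return struct.pack(
        "!BBHII",
        byte0,
        byte1,
        sequence_number & 0xFFFF,                 # 16-bit sequence number
        timestamp & 0xFFFFFFFF,                   # 32-bit timestamp
        ssrc & 0xFFFFFFFF,                        # 32-bit synchronization source
    )


# Usage: a packet whose header is followed by a header extension (X = 1).
header = pack_rtp_header(96, 1, 1000, 0x12345678, marker=True, extension=True)
```

Setting `extension=True` corresponds to setting the field X to 1, which is what signals to the receiver that the transparent channel data may be found in the header extension.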
In one implementation, as described above, the preset encapsulation mechanism may include encapsulation by an RTP header extension. Referring to fig. 12 and fig. 13 in combination, fig. 12 is a schematic diagram of an embodiment of a header extension in one-byte form, and fig. 13 is a schematic diagram of an embodiment of a header extension in two-byte form. It should be noted that the main difference between the one-byte (i.e., single-byte) form and the two-byte (i.e., double-byte) form is the length occupied by the field ID and the field L: in the one-byte form, the field ID and the field L together occupy one byte (4 bits each), whereas in the two-byte form, the field ID and the field L together occupy two bytes (one byte each). The first two bytes of the header extension indicate its form: 0xBEDE indicates the one-byte form, and 0x1000 indicates the two-byte form. The following two bytes indicate the length (i.e., the length of what follows, excluding this 4-byte extension header). The header extension is composed of a plurality of extension elements, each of which is divided into an element header and data, where the element header is composed of the field ID, which identifies the type of the extension element, and the field L, which indicates the number of bytes occupied by the data of the extension element (i.e., the bytes occupied excluding the field ID and the field L). Furthermore, as shown in fig. 12 or fig. 13, the header extension must be aligned to 4 bytes, so zero padding is applied when the 4-byte alignment is not satisfied. It should be noted that, in order to improve the data transmission efficiency, the two-byte form of header extension may be used.
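The layout of a two-byte header extension can be sketched as follows. The function name `build_two_byte_extension` is hypothetical, and expressing the length field as a count of 32-bit words follows RFC 3550 and is an assumption relative to the description above, which does not state the unit.

```python
import struct


def build_two_byte_extension(elements):
    """Serialize (element_id, data) pairs as a two-byte-form header extension.

    Each extension element is: 1-byte ID, 1-byte length L, then L data bytes.
    The body is zero-padded to a multiple of 4 bytes.
    """
    body = b""
    for element_id, data in elements:
        if not 1 <= element_id <= 255:
            raise ValueError("element ID must fit in one byte and be non-zero")
        if len(data) > 255:
            raise ValueError("one extension element carries at most 255 bytes")
        body += bytes([element_id, len(data)]) + data
    body += b"\x00" * (-len(body) % 4)            # zero padding to 4-byte alignment
    # 0x1000 marks the two-byte form; length is counted in 32-bit words (RFC 3550).
    return struct.pack("!HH", 0x1000, len(body) // 4) + body


# Usage: a single element carrying 3 bytes needs 2 + 3 = 5 bytes, padded to 8.
extension = build_two_byte_extension([(1, b"abc")])
```

In this example the element occupies 5 bytes, 3 bytes of zero padding follow, and the whole extension, including its 4-byte header, is 12 bytes long.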
In one implementation scenario, RTP protocol packets may be encapsulated in IP packets for transmission. Referring to fig. 14, fig. 14 is a schematic diagram of an embodiment of an IP packet. As shown in fig. 14, an IP packet usually does not exceed 1500 bytes, and 1460 bytes remain after removing the 40 bytes occupied by the IP header, the UDP (User Datagram Protocol) header, and the RTP header. Since the first encoded data of the transparent channel image is to be carried in the RTP header extension, the 4-byte extension header must also be removed, so that a payload of 1456 bytes remains for carrying encoded data. In the case where the RTP protocol packet is encapsulated in a data packet of another protocol, the payload may be calculated by analogy, which is not exemplified here.
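The payload arithmetic above can be written out as a small sketch. The split of the 40 bytes into a 20-byte IP header (IPv4 without options), an 8-byte UDP header, and a 12-byte fixed RTP header is an assumption consistent with the total given in the description; the constant and function names are hypothetical.

```python
MAX_IP_PACKET = 1500      # typical upper bound of an IP packet, in bytes
IP_HEADER = 20            # assumption: IPv4 header without options
UDP_HEADER = 8
RTP_FIXED_HEADER = 12
EXTENSION_HEADER = 4      # 2-byte form identifier + 2-byte length field


def usable_payload(packet_size=MAX_IP_PACKET):
    """Bytes left for extension elements plus the regular RTP payload."""
    return (packet_size - IP_HEADER - UDP_HEADER
            - RTP_FIXED_HEADER - EXTENSION_HEADER)
```

For a packet of another size or protocol, only the constants change; the subtraction itself is the same.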
In one implementation scenario, a second number of extension elements required to carry a data amount to be allocated may be determined based on the data amount to be allocated and the maximum carrying data amount of an extension element; on this basis, the actual load consumed for carrying the data amount to be allocated by using the preset encapsulation mechanism may be determined based on the data amount to be allocated, the second number, and a fourth data amount occupied by the element header in each extension element. It should be noted that here the data amount to be allocated is the first data amount and the actual load is the first load. For convenience of description, the first data amount may be denoted as X bytes, that is, the data amount to be allocated is X bytes. Taking the two-byte form of header extension as an example, each extension element can carry at most 255 bytes, that is, the maximum carrying data amount of an extension element is 255 bytes, so the second number of extension elements required to carry the data amount to be allocated may be determined as X/255 (taken as a whole number of extension elements). Considering that the fourth data amount occupied by the element header of each extension element in the two-byte form is 2 bytes, the load to be consumed may be determined, based on the data amount to be allocated X, the second number X/255, and the fourth data amount (i.e., 2 bytes), as T = X + (X/255) × 2. Of course, the header extension must also be zero-padded to a multiple of 4 bytes. That is, when T is divisible by 4, no zero padding is needed and the actual load Q is T; when T is not divisible by 4, zero padding is needed and the actual load may be expressed as Q = (T/4 + 1) × 4, where T/4 is rounded down to an integer.
For example, when T is 21 bytes, the actual load Q is 24 bytes, i.e., 3 bytes of zero padding are required. Other situations can be deduced by analogy and are not exemplified here. In the above manner, the preset encapsulation mechanism includes encapsulation by an RTP header extension that comprises a plurality of extension elements; the second number of extension elements required to carry the data amount to be allocated is determined based on the data amount to be allocated and the maximum carrying data amount of an extension element, and the actual load consumed is determined based on the data amount to be allocated, the second number, and the fourth data amount occupied by the element header in each extension element. With the data amount to be allocated being the first data amount and the actual load being the first load, the accuracy of determining the first load can be improved.
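The calculation of the actual load can be sketched as follows. The function name `extension_load` and its parameter names are hypothetical, and rounding the number of extension elements up to a whole element is an assumption made so that every byte of the data amount to be allocated has an element to sit in.

```python
def extension_load(data_bytes, max_element_bytes=255, element_header_bytes=2):
    """Return the actual load Q, in bytes, for carrying `data_bytes` in a
    two-byte-form RTP header extension, including 4-byte zero padding."""
    if data_bytes < 0:
        raise ValueError("data amount must be non-negative")
    # Second number: whole extension elements needed (assumption: rounded up).
    elements = -(-data_bytes // max_element_bytes)
    # T: data plus one element header (fourth data amount) per element.
    total = data_bytes + elements * element_header_bytes
    # Q: T itself if divisible by 4, otherwise the next multiple of 4.
    padding = -total % 4
    return total + padding
```

With 19 data bytes, one element is needed, T is 21 bytes and Q is 24 bytes, which reproduces the example of 3 bytes of zero padding given above.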
Step S103: a first number of data packets required to carry the image to be encoded is determined based on the first payload and a second amount of second encoded data.
Specifically, for convenience of description, the first load may be denoted as Q and the second data amount of the second encoded data may be denoted as Y. The consumed load Q + Y of the image to be encoded may be acquired based on the first load Q and the second data amount Y, and on this basis the first number may be determined based on the consumed load Q + Y and the payload of the data packet (e.g., an IP packet). Taking an IP packet as an example, as described above, the payload of the IP packet is 1456 bytes, so it may be determined whether the consumed load Q + Y is not greater than the payload. If so, the encoded data needs only one data packet; otherwise, if the consumed load Q + Y is greater than the payload, the encoded data needs multiple data packets, and the first number may be obtained, for example, by adding 1 to the integer quotient of the consumed load Q + Y divided by the payload. Taking IP packets as an example, the first number N may be denoted as N = (Q + Y)/1456 + 1, where the quotient is rounded down. Other situations can be deduced by analogy and are not exemplified here. In the above manner, the consumed load of the image to be encoded is obtained based on the first load and the second data amount, and the first number is determined based on the consumed load and the payload of the data packet, which helps reduce the complexity of determining the first number.
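The determination of the first number can be sketched as follows; the function name `packet_count` is hypothetical, and the default payload of 1456 bytes is the IP packet example from the description.

```python
def packet_count(first_load, second_data_amount, payload=1456):
    """Return the first number N of data packets for one image to be encoded."""
    consumed = first_load + second_data_amount    # consumed load Q + Y
    if consumed <= payload:
        return 1                                  # a single data packet suffices
    return consumed // payload + 1                # integer quotient plus 1
```

For instance, with Q = 1008 bytes and Y = 2000 bytes the consumed load is 3008 bytes, the integer quotient by 1456 is 2, and three data packets are used.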
Step S104: a third data amount of the first encoded data respectively allocated to each data packet is determined based on the first data amount and the first number.
In one implementation, the first encoded data may be randomly allocated to the individual data packets, and the remaining payload of each data packet is then allocated to the second encoded data. That is, the third data amount allocated to each data packet may be determined randomly.
In one implementation, the ratio between the first data amount and the first number may be used as the third data amount, and the remaining payload of each data packet is then allocated to the second encoded data. That is, the third data amount allocated to each data packet is an equal share. Illustratively, as previously described, the first data amount may be denoted as X and the first number may be denoted as N, so the third data amount may be denoted as X/N. In this manner, the first encoded data is evenly distributed over the payloads of the data packets, so that in the back-end decoding process a part of the image to be encoded can be decoded as soon as each data packet is received, which improves the back-end processing efficiency.
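The equal allocation can be sketched as follows. Because X is not always divisible by N, this sketch hands the remainder bytes to the first few data packets; that tie-breaking rule and the function name `split_evenly` are assumptions of this example, not something stated in the description.

```python
def split_evenly(first_data_amount, first_number):
    """Return the third data amount for each of the N data packets."""
    base, remainder = divmod(first_data_amount, first_number)
    # Assumption: the first `remainder` packets each carry one extra byte.
    return [base + (1 if index < remainder else 0) for index in range(first_number)]
```

The shares always add up to the first data amount, so no byte of the first encoded data is dropped or duplicated.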
Step S105: for each data packet, determining a second payload required to carry a third amount of data using a preset encapsulation scheme, and allocating an unallocated portion of the second encoded data to a remaining payload of the data packet based on the second payload and a payload of the data packet.
In one implementation scenario, the second number of extension elements required to carry the data amount to be allocated may be determined based on the data amount to be allocated and the maximum carrying data amount of an extension element, and the actual load consumed for carrying the data amount to be allocated by using the preset encapsulation mechanism is then determined based on the data amount to be allocated, the second number, and the fourth data amount occupied by the element header in each extension element, where the data amount to be allocated is the third data amount and the actual load is the second load. Illustratively, taking equal allocation as an example, the third data amount is denoted as X/N, and, following the calculation of the first load, the load to be consumed by each RTP protocol packet may be calculated as X/N + ((X/N)/255) × 2. On this basis, whether zero padding is needed can be determined according to whether this load is divisible by 4, so as to obtain the second load actually required for each data packet to carry the third data amount; for details, reference may be made to the description of calculating the first load, which is not repeated here. That is, the first load and the second load can be calculated with similar steps, which further reduces the complexity of data encapsulation and improves the data encapsulation efficiency.
In one implementation, after determining the second load required for each data packet to carry the third data amount, for each data packet, the unallocated portion of the second encoded data is allocated to the remaining payload of the data packet based on the second load and the payload of the data packet. For example, for data packet 1, the remaining payload may be calculated based on the second load and the payload of the data packet (e.g., 1456 bytes for an IP packet), and the second encoded data is packed into the remaining payload by a unit splitting mechanism such as that of H.264 NAL (Network Abstraction Layer) units. After packing, the sequence number (i.e., Sequence num) of each RTP protocol packet is incremented by 1 in turn, and the RTP protocol packets share the same, unique timestamp.
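Steps S102 to S105 can be combined into the following end-to-end sketch, which works on byte counts only and does not perform real NAL unit splitting. All names (`extension_load`, `plan_packets`, the dictionary keys) are hypothetical, the number of extension elements is assumed to be rounded up, and the remainder of the equal split is assumed to go to the first data packets.

```python
def extension_load(data_bytes, max_element_bytes=255, element_header_bytes=2):
    """Actual load for `data_bytes` in a two-byte-form header extension."""
    elements = -(-data_bytes // max_element_bytes)       # assumption: rounded up
    total = data_bytes + elements * element_header_bytes
    return total + (-total % 4)                          # zero padding to 4 bytes


def plan_packets(first_data_amount, second_data_amount, payload=1456):
    """Return (plan, leftover): per-packet byte allocation and any bytes of the
    second encoded data that did not fit."""
    first_load = extension_load(first_data_amount)       # step S102
    consumed = first_load + second_data_amount           # step S103
    count = 1 if consumed <= payload else consumed // payload + 1
    base, remainder = divmod(first_data_amount, count)   # step S104
    shares = [base + (1 if i < remainder else 0) for i in range(count)]
    plan, leftover = [], second_data_amount
    for offset, share in enumerate(shares):              # step S105
        second_load = extension_load(share)
        room = payload - second_load                     # remaining payload
        taken = min(room, leftover)
        leftover -= taken
        plan.append({
            "sequence_offset": offset,   # sequence number increases by 1 per packet
            "first_bytes": share,        # third data amount
            "second_load": second_load,
            "second_bytes": taken,       # share of the second encoded data
        })
    return plan, leftover


# Usage: 1000 bytes of transparent channel data and 2000 bytes of video data.
plan, leftover = plan_packets(1000, 2000)
```

In this usage example the plan contains three data packets, each spending 340 bytes on the header extension, and the second encoded data fills the first two remaining payloads. A non-zero `leftover` would indicate that the per-packet element headers and padding consumed more than the first number anticipated; the sketch only reports this case and does not resolve it.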
According to the above scheme, first encoded data and second encoded data of an image to be encoded are acquired, where the first encoded data is encoded data of a transparent channel image of the image to be encoded and the second encoded data is encoded data of chromaticity and/or luminance of the image to be encoded. A first load required for carrying the transparent channel image by using a preset encapsulation mechanism is determined based on the first data amount of the first encoded data, and a first number of data packets required for carrying the image to be encoded is determined based on the first load and the second data amount of the second encoded data. A third data amount of the first encoded data allocated to each data packet is then determined based on the first data amount and the first number. For each data packet, a second load required for carrying the third data amount by using the preset encapsulation mechanism is determined, and the unallocated portion of the second encoded data is allocated to the remaining payload of the data packet based on the second load and the payload of the data packet.
Referring to fig. 15, fig. 15 is a schematic framework diagram of an embodiment of a compression encoding apparatus 150 according to the present application. The compression encoding apparatus 150 includes a channel acquisition module 151, a sub-block division module 152, and an attribute encoding module 153. The channel acquisition module 151 is configured to obtain a transparent channel image based on the transparency of each pixel point of an image to be encoded. The sub-block division module 152 is configured to divide the transparent channel image to obtain a plurality of sub-blocks, where the pixel values of all pixel points in each sub-block are the same and the set of the plurality of sub-blocks covers the transparent channel image. The attribute encoding module 153 is configured to perform encoding based on attribute information of each sub-block to obtain encoded data of the transparent channel image, where the attribute information includes at least one of: sub-block position, sub-block size, and sub-block amplitude, and the sub-block amplitude represents the pixel value of each pixel point in the sub-block.
According to the scheme, on one hand, in the compression coding process, the characteristics that the foreground and the background of the transparent channel generally have sharp edges and the foreground and the background are generally concentrated in one region respectively can be fully utilized, the transparent channel image is divided into a plurality of sub-blocks with the same pixel value in each region, so that compression coding is realized, edge information can be reserved as far as possible, the quality of the compression coding is improved, on the other hand, the compression coding can be realized only by dividing the sub-blocks, other complex operations are not needed, the operation complexity is reduced, the efficiency of the compression coding can be improved, and the power consumption of the compression coding is reduced.
In some disclosed embodiments, the sub-blocks are rectangles, and the sub-block positions represent coordinates of vertices of the rectangles, and the sub-block sizes include the length and width of the rectangles.
Therefore, the sub-blocks are set to be rectangular, the positions of the sub-blocks represent coordinates of vertexes of the rectangle, and the sizes of the sub-blocks comprise the length and the width of the rectangle, so that difficulty in sub-block division is reduced, and compression coding efficiency is improved.
In some disclosed embodiments, coordinates respectively represented by sub-block positions in the attribute information of each sub-block are coordinates of vertices of a rectangle at a first orientation, and the first orientation is any one of the following: upper left corner, lower left corner, upper right corner, lower right corner.
Therefore, the coordinates respectively represented by the sub-block positions in the attribute information of each sub-block are coordinates of vertexes of the rectangle at a first azimuth, and the first azimuth is any one of the upper left corner, the lower left corner, the upper right corner and the lower right corner, namely, the same azimuth vertexes are adopted to represent the sub-block positions, so that the encoding complexity can be reduced as much as possible, and the compression encoding efficiency is improved.
In some disclosed embodiments, the sub-blocks are square, and the sub-block size is the side length of the square.
Therefore, setting the sub-block to a square, with the sub-block size being the side length of the square, further reduces the difficulty of sub-block division and further improves compression coding efficiency. In addition, because the sub-block is a square, only its side length needs to be encoded, which further reduces the data volume and further improves the degree of compression.
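The difference in encoded size between the rectangular and square variants can be illustrated with a minimal sketch. The field widths (16-bit coordinates and sizes, 8-bit amplitude), the little-endian layout, the upper-left vertex as the first orientation, and the class names `RectSubBlock` and `SquareSubBlock` are all assumptions made for illustration; the embodiments above do not specify a byte layout.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class RectSubBlock:
    x: int          # column of the upper-left vertex (assumed first orientation)
    y: int          # row of the upper-left vertex
    width: int      # rectangle width in pixels
    height: int     # rectangle height in pixels
    amplitude: int  # pixel value shared by every pixel in the sub-block

    def pack(self) -> bytes:
        # Hypothetical layout: four 16-bit fields plus one 8-bit amplitude.
        return struct.pack("<HHHHB", self.x, self.y, self.width,
                           self.height, self.amplitude)


@dataclass(frozen=True)
class SquareSubBlock:
    x: int
    y: int
    side: int       # a square needs only its side length
    amplitude: int

    def pack(self) -> bytes:
        # Hypothetical layout: three 16-bit fields plus one 8-bit amplitude.
        return struct.pack("<HHHB", self.x, self.y, self.side, self.amplitude)


rect_bytes = len(RectSubBlock(0, 0, 4, 4, 255).pack())
square_bytes = len(SquareSubBlock(0, 0, 4, 255).pack())
```

Under these assumed field widths the square record is two bytes shorter per sub-block, which is the saving the paragraph above attributes to encoding only the side length.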
In some disclosed embodiments, the sub-blocks do not intersect each other.
Therefore, each sub-block obtained by dividing the transparent channel image is mutually exclusive, which is beneficial to reducing data redundancy as much as possible and improving the data compression degree.
In some disclosed embodiments, at least one neighboring sub-block exists around each sub-block, and the sub-block and its neighboring sub-block have at least one pair of mutually adjacent corner points.
Therefore, because at least one neighboring sub-block exists around each sub-block and the two share at least one pair of mutually adjacent corner points, the positions of the sub-blocks are distributed in a more orderly way, which improves the efficiency of subsequent data decoding.
In some disclosed embodiments, the sub-block dividing module 152 includes a start determining sub-module for taking a preset position of the transparent channel image as a reference point; the sub-block dividing module 152 includes a pixel expansion sub-module, configured to expand the transparent channel image with a reference point as a starting point until a preset condition is not satisfied, so as to obtain a sub-block; wherein the corner points of the sub-blocks are reference points; the sub-block dividing module 152 includes a reference obtaining sub-module, configured to obtain a new reference point based on the remaining corner points of the sub-block; the sub-block dividing module 152 includes a repeating executing sub-module for re-executing the step of expanding the transparent channel image with the reference point as the starting point and the subsequent steps in combination with the pixel expanding sub-module and the reference acquiring sub-module until all the reference points have been used as the starting points; wherein the remaining corner points are corner points other than the starting point.
Therefore, a preset position of the transparent channel image is taken as a reference point, and the transparent channel image is expanded with the reference point as a starting point until the preset condition is no longer met, yielding a sub-block one of whose corner points is the reference point. On this basis, new reference points are obtained from the remaining corner points of the sub-block (the corner points other than the starting point), and the expansion step and its subsequent steps are repeated until every reference point has served as a starting point. Sub-block division can thus be carried out in an orderly and efficient manner through pixel expansion, improving both the efficiency and the quality of compression coding.
In some disclosed embodiments, the preset position is located at the corner point of the transparent channel image at a second orientation. The pixel expansion sub-module includes an initialization unit for initializing a first pixel width and a second pixel width. The pixel expansion sub-module includes a candidate expansion unit for expanding, with the reference point as the starting point, by the first pixel width in a first direction and by the second pixel width in a second direction to obtain a candidate expansion region, where the first direction is horizontal and away from the starting point and the second direction is vertical and away from the starting point. The pixel expansion sub-module includes a target expansion unit for taking the latest candidate expansion region as the target expansion region when the current expansion meets the preset condition. The pixel expansion sub-module includes a width updating unit for updating the first pixel width based on a first pixel step and updating the second pixel width based on a second pixel step. The pixel expansion sub-module includes a loop unit for re-executing, in combination with the candidate expansion unit, the target expansion unit and the width updating unit, the step of expanding with the reference point as the starting point and its subsequent steps until the preset condition is no longer met, and taking the latest target expansion region as the sub-block.
Therefore, pixel expansion is carried out from the reference point in a first direction that is horizontal and away from the starting point and in a second direction that is vertical and away from the starting point; whenever the current expansion meets the preset condition, the latest candidate expansion region is taken as the target expansion region, and the pixel widths are updated by the pixel steps before expanding again. This simplifies the sub-block division flow as much as possible and improves compression coding efficiency.
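The expansion loop described above can be sketched as follows. This is only an illustration under stated assumptions: the second orientation is the upper left corner (so expansion runs rightward and downward), both initial widths are 1, the image is a list of rows, and the names `grow_block`, `alpha`, `occupied`, `step_w` and `step_h` are hypothetical. The sketch re-checks the whole candidate region on every pass, which gives the same result as checking only the newly added pixels.

```python
def grow_block(alpha, occupied, x0, y0, step_w=1, step_h=1):
    """Grow one sub-block rightward and downward from the reference point."""
    rows, cols = len(alpha), len(alpha[0])
    value = alpha[y0][x0]

    def meets_condition(w, h):
        # The candidate region must stay inside the image, match the start
        # value, and avoid pixels already assigned to another sub-block.
        if x0 + w > cols or y0 + h > rows:
            return False
        return all(alpha[y][x] == value and not occupied[y][x]
                   for y in range(y0, y0 + h)
                   for x in range(x0, x0 + w))

    w, h = 1, 1      # assumed initial first and second pixel widths
    target = None    # latest target expansion region
    while meets_condition(w, h):
        target = (x0, y0, w, h)   # candidate accepted as the new target
        w += step_w               # update widths by the pixel steps
        h += step_h
    return target


alpha = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0,   0,   0, 0],
    [0,   0,   0, 0],
]
occupied = [[False] * 4 for _ in range(4)]
block = grow_block(alpha, occupied, 0, 0)
```

With equal steps the result is always a square; unequal steps would give rectangles with a fixed aspect ratio, which is one possible reading of having two separate pixel steps.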
In some disclosed embodiments, the second orientation is the upper left corner, the first direction is horizontally rightward, and the second direction is vertically downward; or the second orientation is the lower left corner, the first direction is horizontally rightward, and the second direction is vertically upward; or the second orientation is the upper right corner, the first direction is horizontally leftward, and the second direction is vertically downward; or the second orientation is the lower right corner, the first direction is horizontally leftward, and the second direction is vertically upward.
Therefore, by selecting the corner point of the transparent channel image as the starting point and expanding the corner point in the direction away from the starting point horizontally and the direction away from the starting point vertically, the expansion efficiency is improved.
In some disclosed embodiments, the reference acquisition submodule is specifically configured to use, as a new reference point, pixel points that are adjacent to each of the remaining corner points and that do not belong to a sub-block in the transparent channel image.
Therefore, the pixel points which are adjacent to the other corner points and do not belong to the sub-blocks in the transparent channel image are used as new reference points, so that any two sub-blocks are disjoint, and the compression degree is further improved.
In some disclosed embodiments, the preset conditions include: the pixel values of the extended pixel points are the same as the pixel values of the starting points, the extended pixel points do not belong to the divided sub-blocks, and the extended pixel points belong to the transparent channel image.
Therefore, the preset condition is judged from three aspects: the pixel value of each extended pixel point, whether the extended pixel point is already occupied by a divided sub-block, and whether the extended pixel point belongs to the transparent channel image. This improves the accuracy of compression coding.
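Putting the reference points, the three preset conditions and the expansion together, the whole division might look like the sketch below. It assumes square sub-blocks grown from the upper left corner, takes the pixels to the right of, below, and diagonally beyond the remaining corner points as new reference points, and adds a raster-scan fallback for any pixel the corner rule leaves uncovered. The fallback and the names `divide`, `meets_condition` and `grow` are assumptions of this sketch, not part of the described embodiments.

```python
from collections import deque


def divide(alpha):
    """Split a transparent channel image into disjoint uniform squares."""
    rows, cols = len(alpha), len(alpha[0])
    occupied = [[False] * cols for _ in range(rows)]
    blocks = []  # each entry: (x, y, side, amplitude)

    def meets_condition(x0, y0, side):
        # Preset conditions: inside the image, same value as the start
        # pixel, and not part of an already divided sub-block.
        if x0 + side > cols or y0 + side > rows:
            return False
        value = alpha[y0][x0]
        return all(alpha[y][x] == value and not occupied[y][x]
                   for y in range(y0, y0 + side)
                   for x in range(x0, x0 + side))

    def grow(x0, y0):
        side = 1
        while meets_condition(x0, y0, side + 1):
            side += 1
        return side

    queue = deque([(0, 0)])  # preset position: upper left corner (assumed)
    while True:
        while queue:
            x0, y0 = queue.popleft()
            if x0 >= cols or y0 >= rows or occupied[y0][x0]:
                continue  # outside the image or already covered
            side = grow(x0, y0)
            for y in range(y0, y0 + side):
                for x in range(x0, x0 + side):
                    occupied[y][x] = True
            blocks.append((x0, y0, side, alpha[y0][x0]))
            # New reference points next to the three remaining corners.
            queue.extend([(x0 + side, y0), (x0, y0 + side),
                          (x0 + side, y0 + side)])
        # Fallback (assumption): pick up any pixel still uncovered.
        leftover = next(((x, y) for y in range(rows) for x in range(cols)
                         if not occupied[y][x]), None)
        if leftover is None:
            return blocks
        queue.append(leftover)


alpha = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0,   0,   0, 0],
    [0,   0,   0, 0],
]
blocks = divide(alpha)

# Decoding side: repaint every sub-block to rebuild the image.
rebuilt = [[None] * 4 for _ in range(4)]
for bx, by, side, amp in blocks:
    for y in range(by, by + side):
        for x in range(bx, bx + side):
            rebuilt[y][x] = amp
```

Because a pixel is marked as occupied as soon as it joins a sub-block, no two sub-blocks can overlap, and repainting the blocks reproduces the original image exactly.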
Referring to fig. 16, fig. 16 is a schematic framework diagram of an embodiment of a data encapsulation device 160 according to the present application. The data encapsulation device 160 includes a data acquisition module 161, a load determination module 162, a number determination module 163, a first allocation module 164 and a second allocation module 165. The data acquisition module 161 is configured to acquire first encoded data and second encoded data of an image to be encoded, where the first encoded data is encoded data of a transparent channel image of the image to be encoded, and the second encoded data is encoded data of chromaticity and/or brightness of the image to be encoded. The load determination module 162 is configured to determine, based on a first data amount of the first encoded data, a first load required for carrying the transparent channel image using a preset encapsulation mechanism. The number determination module 163 is configured to determine, based on the first load and a second data amount of the second encoded data, a first number of data packets required to carry the image to be encoded. The first allocation module 164 is configured to determine, based on the first data amount and the first number, a third data amount of the first encoded data allocated to each data packet. The second allocation module 165 is configured to determine, for each data packet, a second load required to carry the third data amount using the preset encapsulation mechanism, and to allocate an unallocated portion of the second encoded data to the remaining load of the data packet based on the second load and the payload of the data packet.
According to the above scheme, first encoded data and second encoded data of the image to be encoded are acquired, the first encoded data being encoded data of a transparent channel image of the image to be encoded and the second encoded data being encoded data of chromaticity and/or brightness of the image to be encoded. A first load required for carrying the transparent channel image using a preset encapsulation mechanism is determined based on the first data amount of the first encoded data, and a first number of data packets required to carry the image to be encoded is determined based on the first load and the second data amount of the second encoded data. A third data amount of the first encoded data allocated to each data packet is then determined based on the first data amount and the first number. For each data packet, a second load required to carry the third data amount using the preset encapsulation mechanism is determined, and an unallocated portion of the second encoded data is allocated to the remaining load of the data packet based on the second load and the payload of the data packet.
In some disclosed embodiments, the first encoded data is obtained using the compression encoding apparatus of any of the above disclosed embodiments.
Therefore, the first encoded data is obtained by encoding by using any one of the compression encoding devices, which is beneficial to improving the data encapsulation efficiency and the quality of the data packet finally obtained by encapsulation.
In some disclosed embodiments, the preset encapsulation mechanism includes encapsulation by an RTP header extension, and the RTP header extension includes a number of extension elements. The load determination module 162 or the second allocation module 165 includes an element number determination sub-module and an actual load calculation sub-module. The element number determination sub-module is configured to determine, based on the data amount to be allocated and the maximum data amount an extension element can carry, a second number of extension elements required to carry the data amount to be allocated. The actual load calculation sub-module is configured to determine, based on the data amount to be allocated, the second number and a fourth data amount occupied by the element header in an extension element, the actual load consumed by carrying the data amount to be allocated using the preset encapsulation mechanism. The actual load is the first load when the data amount to be allocated is the first data amount, and the actual load is the second load when the data amount to be allocated is the third data amount.
Therefore, the preset encapsulation mechanism includes encapsulation by an RTP header extension that contains a number of extension elements. The second number of extension elements required to carry the data amount to be allocated is determined from that data amount and the maximum data amount an extension element can carry, and the actual load consumed by carrying the data amount to be allocated using the preset encapsulation mechanism is determined from the data amount to be allocated, the second number and the fourth data amount occupied by the element header in an extension element. This improves the accuracy with which the first load and the second load are determined. Because the first load and the second load are calculated with similar steps, the complexity of data encapsulation is further reduced and data encapsulation efficiency is improved.
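One plausible reading of this calculation is sketched below. The formula (data amount plus one element header per extension element) is an assumption consistent with the quantities named above. The defaults of 255 data bytes and a 2-byte element header correspond to the two-byte-header form of RTP header extensions in RFC 8285; the sketch ignores the fixed extension preamble and 32-bit padding, and the name `extension_load` is hypothetical.

```python
def extension_load(data_amount, max_element_data=255, element_header=2):
    """Bytes consumed when data_amount bytes ride in RTP extension elements."""
    if data_amount == 0:
        return 0
    # Second number: extension elements needed (ceiling division).
    second_number = -(-data_amount // max_element_data)
    # Actual load: the data itself plus one element header per element.
    return data_amount + second_number * element_header


first_load = extension_load(600)   # whole transparent channel stream
second_load = extension_load(150)  # the share carried by one packet
```

The same function serves both cases: called with the first data amount it yields the first load, and called with the third data amount it yields the second load.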
In some disclosed embodiments, the number determination module 163 includes a consumed load calculation sub-module for obtaining a consumed load of an image to be encoded based on the first load and the second data amount; the number determination module 163 includes a first number determination sub-module for determining a first number based on the consumed load and the payload of the data packet.
Therefore, the consumed load is obtained from the first load and the second data amount, and the first number is determined from the consumed load and the payload of the data packet, which reduces the complexity of determining the first number.
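As a rough illustration, the first number could be computed as below. Taking the consumed load as the sum of the first load and the second data amount, and rounding the packet count up, are assumptions of this sketch; `packet_count` and the 1200-byte payload are hypothetical.

```python
def packet_count(first_load, second_data_amount, packet_payload):
    """Number of data packets needed to carry one image to be encoded."""
    # Consumed load: transparent channel load plus chroma/luma data.
    consumed_load = first_load + second_data_amount
    # First number: round up so the last, partly filled packet is counted.
    return (consumed_load + packet_payload - 1) // packet_payload


first_number = packet_count(first_load=606, second_data_amount=3000,
                            packet_payload=1200)
```

Here 3606 bytes of consumed load do not fit in three 1200-byte payloads, so a fourth packet is needed.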
In some disclosed embodiments, the first allocation module 164 is specifically configured to use the ratio between the first data amount and the first number as the third data amount.
Therefore, the ratio between the first data amount and the first number is used as the third data amount, so that the first encoded data can be evenly distributed into the payloads of the data packets, and a part of the image to be encoded can be decoded after each data packet is received in the back-end decoding process, thereby being beneficial to improving the back-end processing efficiency.
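The full allocation can be sketched end to end as follows. Everything beyond the quantities named in the text is an assumption: the load formula, the ceiling divisions, spreading any remainder over the earliest packets, and the names `plan_packets`, `element_load` and `ceil_div`.

```python
def ceil_div(a, b):
    return -(-a // b)


def element_load(n, max_data=255, header=2):
    # Assumed load formula: data plus one element header per element.
    return n + ceil_div(n, max_data) * header if n else 0


def plan_packets(first_amount, second_amount, payload=1200):
    """Return per-packet (alpha bytes, second load, video bytes) and leftover."""
    first_load = element_load(first_amount)
    first_number = ceil_div(first_load + second_amount, payload)
    plan = []
    alpha_left, video_left = first_amount, second_amount
    for i in range(first_number):
        # Third data amount: an even share of the first encoded data.
        third = ceil_div(alpha_left, first_number - i)
        second_load = element_load(third)
        # Remaining load of this packet goes to unallocated second data.
        remaining = max(payload - second_load, 0)
        video = min(video_left, remaining)
        plan.append((third, second_load, video))
        alpha_left -= third
        video_left -= video
    return plan, video_left


plan, leftover = plan_packets(first_amount=600, second_amount=3000)
```

The second return value reports any second encoded data that did not fit. Splitting the first encoded data across packets can need more extension elements in total than the single first-load estimate, so a non-zero leftover is possible for some inputs and a real implementation would have to handle it.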
Referring to fig. 17, fig. 17 is a schematic framework diagram of an embodiment of an electronic device 170 of the present application. The electronic device 170 comprises a memory 171 and a processor 172 coupled to each other, the memory 171 having stored therein program instructions, the processor 172 being adapted to execute the program instructions to implement the steps in any of the above-described compression encoding method embodiments or the steps in any of the above-described data encapsulation method embodiments. In particular, the electronic device 170 may include, but is not limited to, a desktop computer, a notebook computer, a server, a mobile phone, a tablet computer, and the like.
In particular, the processor 172 is configured to control itself and the memory 171 to implement the steps of any of the compression encoding method embodiments described above, or to implement the steps of any of the data encapsulation method embodiments described above. The processor 172 may also be referred to as a CPU (Central Processing Unit). The processor 172 may be an integrated circuit chip having signal processing capabilities. The processor 172 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 172 may be jointly implemented by a plurality of integrated circuit chips.
According to the above scheme, on the one hand, the compression coding process makes full use of two characteristics of the transparent channel: the foreground and the background generally have sharp edges, and each is generally concentrated in one region. The transparent channel image is divided into a plurality of sub-blocks, each having a single pixel value throughout, and compression coding is performed on that basis, so edge information is retained as far as possible and the quality of the compression coding is improved. On the other hand, compression coding requires only sub-block division and no other complex operations, which reduces operational complexity, improves compression coding efficiency, and reduces the power consumption of compression coding.
Referring to fig. 18, fig. 18 is a schematic framework diagram of an embodiment of a computer-readable storage medium 180 of the present application. The computer-readable storage medium 180 stores program instructions 181 capable of being executed by a processor, the program instructions 181 being for implementing the steps in any of the compression encoding method embodiments described above, or implementing the steps in any of the data encapsulation method embodiments described above.
According to the above scheme, on the one hand, the compression coding process makes full use of two characteristics of the transparent channel: the foreground and the background generally have sharp edges, and each is generally concentrated in one region. The transparent channel image is divided into a plurality of sub-blocks, each having a single pixel value throughout, and compression coding is performed on that basis, so edge information is retained as far as possible and the quality of the compression coding is improved. On the other hand, compression coding requires only sub-block division and no other complex operations, which reduces operational complexity, improves compression coding efficiency, and reduces the power consumption of compression coding.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of the various embodiments tends to emphasize the differences between them; for parts that are the same or similar, the embodiments may be referred to one another, and those parts are not repeated herein for the sake of brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
Claims (14)
1. A method of data encapsulation, comprising:
acquiring first encoded data and second encoded data of an image to be encoded; the first encoded data are encoded data of a transparent channel image of the image to be encoded, and the second encoded data are encoded data of chromaticity and/or brightness of the image to be encoded;
determining a first load required for carrying the transparent channel image by using a preset encapsulation mechanism based on a first data amount of the first encoded data;
determining a first number of data packets required to carry the image to be encoded based on the first load and a second data amount of the second encoded data;
determining a third data amount of the first encoded data respectively allocated to each data packet based on the first data amount and the first number;
for each of the data packets, determining a second load required to carry the third data amount using the preset encapsulation mechanism, and allocating an unallocated portion of the second encoded data to a remaining load of the data packet based on the second load and a payload of the data packet.
2. The method according to claim 1, wherein the acquiring the first encoded data of the image to be encoded comprises:
obtaining a transparent channel image based on the transparency of each pixel point of the image to be encoded;
dividing the transparent channel image to obtain a plurality of sub-blocks; wherein the pixel values of all pixel points in the same sub-block are the same, and the set of the sub-blocks covers the transparent channel image;
encoding based on the attribute information of each sub-block to obtain the first encoded data; wherein the attribute information includes at least one of: sub-block position, sub-block size, sub-block amplitude, and the sub-block amplitude represents a pixel value of each of the pixel points in the sub-block.
3. The method of claim 2, wherein the sub-block is a rectangle and the sub-block position represents coordinates of vertices of the rectangle, the sub-block size comprising a length and a width of the rectangle.
4. The method of claim 2, wherein the sub-blocks do not intersect each other.
5. The method according to claim 2, wherein the dividing the transparent channel image to obtain a plurality of sub-blocks includes:
taking a preset position of the transparent channel image as a reference point;
expanding the transparent channel image by taking the reference point as a starting point until a preset condition is not met, so as to obtain the sub-block; wherein the corner point of the sub-block is the reference point;
obtaining new reference points based on the remaining corner points of the sub-block, and re-executing the step of expanding the transparent channel image by taking the reference point as a starting point and the subsequent steps until all the reference points have been taken as starting points;
wherein the remaining corner points are corner points other than the starting point.
6. The method of claim 5, wherein the preset position is located at a corner point of the transparent channel image at a second orientation; and the expanding the transparent channel image by taking the reference point as a starting point until a preset condition is not met, so as to obtain the sub-block, comprises:
initializing a first pixel width and a second pixel width;
taking the reference point as the starting point, expanding by the first pixel width in a first direction and by the second pixel width in a second direction to obtain a candidate expansion area; the first direction is a direction which is horizontal and far away from the starting point, and the second direction is a direction which is vertical and far away from the starting point;
based on the current expansion meeting the preset condition, taking the latest candidate expansion area as a target expansion area;
updating the first pixel width based on a first pixel step length, updating the second pixel width based on a second pixel step length, and re-executing the step taking the reference point as the starting point and the subsequent steps until the preset condition is not met, wherein the latest target expansion area is taken as the sub-block.
7. The method of claim 5, wherein the obtaining a new reference point based on the remaining corner points of the sub-block comprises:
taking, as the new reference points, the pixel points that are respectively adjacent to the remaining corner points and do not belong to any sub-block in the transparent channel image.
8. The method of claim 5, wherein the preset conditions include: the pixel values of the extended pixel points are the same as the pixel values of the starting points, the extended pixel points do not belong to the divided sub-blocks, and the extended pixel points belong to the transparent channel image.
9. The method of claim 1, wherein the preset encapsulation mechanism comprises encapsulation by an RTP header extension, and wherein the RTP header extension comprises a plurality of extension elements; the determining of the first load or the second load comprises:
determining a second number of extension elements required for carrying the data amount to be allocated based on the data amount to be allocated and the maximum carrying data amount of the extension elements;
determining an actual load consumed by carrying the data amount to be allocated by using the preset encapsulation mechanism based on the data amount to be allocated, the second number and a fourth data amount occupied by an element header in the extension element;
wherein the actual load is the first load when the data amount to be allocated is the first data amount, and the actual load is the second load when the data amount to be allocated is the third data amount.
10. The method of claim 1, wherein the determining a first number of data packets required to carry the image to be encoded based on the first load and a second data amount of the second encoded data comprises:
acquiring a consumed load of the image to be encoded based on the first load and the second data amount;
the first number is determined based on the consumed load and a payload of the data packet.
11. The method of claim 1, wherein the determining a third data amount of the first encoded data respectively allocated to each data packet based on the first data amount and the first number comprises:
and taking the ratio between the first data quantity and the first quantity as the third data quantity.
12. A data encapsulation apparatus, comprising:
a data acquisition module, configured to acquire first encoded data and second encoded data of an image to be encoded; the first encoded data are encoded data of a transparent channel image of the image to be encoded, and the second encoded data are encoded data of chromaticity and/or brightness of the image to be encoded;
a load determination module, configured to determine, based on a first data amount of the first encoded data, a first load required for carrying the transparent channel image by using a preset encapsulation mechanism;
a number determination module, configured to determine, based on the first load and a second data amount of the second encoded data, a first number of data packets required to carry the image to be encoded;
a first allocation module, configured to determine a third data amount of the first encoded data respectively allocated to each of the data packets based on the first data amount and the first number;
and a second allocation module, configured to determine, for each data packet, a second load required to carry the third data amount by using the preset encapsulation mechanism, and to allocate an unallocated portion of the second encoded data to a remaining load of the data packet based on the second load and a payload of the data packet.
13. An electronic device comprising a memory and a processor coupled to each other, the memory having stored therein program instructions, the processor being configured to execute the program instructions to implement the data encapsulation method of any one of claims 1 to 11.
14. A computer readable storage medium, characterized in that program instructions executable by a processor for implementing the data encapsulation method according to any one of claims 1 to 11 are stored.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310254647.6A CN116389753A (en) | 2021-12-30 | 2021-12-30 | Data encapsulation method, related device, equipment and medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111651192.9A CN114390285B (en) | 2021-12-30 | 2021-12-30 | Compression coding method, data packaging method, related device, equipment and medium |
CN202310254647.6A CN116389753A (en) | 2021-12-30 | 2021-12-30 | Data encapsulation method, related device, equipment and medium |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111651192.9A Division CN114390285B (en) | 2021-12-30 | 2021-12-30 | Compression coding method, data packaging method, related device, equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116389753A true CN116389753A (en) | 2023-07-04 |
Family
ID=81199401
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310254647.6A Pending CN116389753A (en) | 2021-12-30 | 2021-12-30 | Data encapsulation method, related device, equipment and medium |
CN202111651192.9A Active CN114390285B (en) | 2021-12-30 | 2021-12-30 | Compression coding method, data packaging method, related device, equipment and medium |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111651192.9A Active CN114390285B (en) | 2021-12-30 | 2021-12-30 | Compression coding method, data packaging method, related device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN116389753A (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2575436B (en) * | 2018-06-29 | 2022-03-09 | Imagination Tech Ltd | Guaranteed data compression |
CN111369581B (en) * | 2020-02-18 | 2023-08-08 | Oppo广东移动通信有限公司 | Image processing method, device, equipment and storage medium |
US11481929B2 (en) * | 2020-03-16 | 2022-10-25 | Meta Platforms Technologies, Llc | System and method for compressing and decompressing images using block-based compression format |
CN113645469B (en) * | 2020-05-11 | 2022-06-24 | 腾讯科技(深圳)有限公司 | Image processing method and device, intelligent terminal and computer readable storage medium |
2021
- 2021-12-30: CN application CN202310254647.6A, publication CN116389753A, status Pending
- 2021-12-30: CN application CN202111651192.9A, publication CN114390285B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN114390285B (en) | 2023-04-04 |
CN114390285A (en) | 2022-04-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113615207A (en) | | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device, and point cloud data receiving method |
WO2019166688A1 (en) | | An apparatus, a method and a computer program for volumetric video |
US20210201540A1 (en) | | Point cloud encoding method and encoder |
EP3732653B1 (en) | | An apparatus, a method and a computer program for volumetric video |
JP5852226B2 (en) | | Devices and methods for warping and hole filling during view synthesis |
JP7438993B2 (en) | | Method and apparatus for encoding/decoding point cloud geometry |
US20210377554A1 (en) | | Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding on image |
CN106412595B (en) | | Method and apparatus for encoding high dynamic range frames and applied low dynamic range frames |
TWI487366B (en) | | Bitstream syntax for graphics-mode compression in wireless hd 1.1 |
US11321878B2 (en) | | Decoded tile hash SEI message for V3C/V-PCC |
US20110222600A1 (en) | | Systems and Methods for Improved Data Transmission |
CN108184101B (en) | | Apparatus and method for processing video |
CN110049379B (en) | | Video delay detection method and system |
TWI505717B (en) | | Joint scalar embedded graphics coding for color images |
WO2023221764A1 (en) | | Video encoding method, video decoding method, and related apparatus |
CN114390285B (en) | | Compression coding method, data packaging method, related device, equipment and medium |
US20240244259A1 (en) | | Methods and apparatuses for encoding/decoding a volumetric video, methods and apparatus for reconstructing a computer generated hologram |
US20230186522A1 (en) | | 3d scene transmission with alpha layers |
KR20220054283A (en) | | Methods for transmitting and rendering a 3D scene, a method for generating patches, and corresponding devices and computer programs |
CN113450293A (en) | | Video information processing method, device and system, electronic equipment and storage medium |
EP3804334A1 (en) | | An apparatus, a method and a computer program for volumetric video |
WO2019162564A1 (en) | | An apparatus, a method and a computer program for volumetric video |
CN115546328B (en) | | Picture mapping method, compression method, decoding method and electronic equipment |
KR102721289B1 (en) | | Error concealment in discrete rendering using shading atlases |
WO2020057338A1 (en) | | Point cloud coding method and encoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||