CN112822492B - Method, apparatus, device, system and medium for error-resilient video coding reassembly - Google Patents


Info

Publication number
CN112822492B
Authority
CN
China
Prior art keywords
coding
video
unit
dependency
coding units
Prior art date
Legal status
Active
Application number
CN202011622272.7A
Other languages
Chinese (zh)
Other versions
CN112822492A (en
Inventor
刘云淮
黄永贵
苏玥琦
谷晟
冯哲
Current Assignee
Peking University
Original Assignee
Peking University
Priority date
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN202011622272.7A priority Critical patent/CN112822492B/en
Publication of CN112822492A publication Critical patent/CN112822492A/en
Application granted granted Critical
Publication of CN112822492B publication Critical patent/CN112822492B/en

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/513 — Processing of motion vectors (predictive coding involving temporal prediction, motion estimation or motion compensation)
    • H04N19/124 — Quantisation (adaptive coding)
    • H04N19/13 — Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/61 — Transform coding in combination with predictive coding
    • H04N19/625 — Transform coding using discrete cosine transform [DCT]

Abstract

The application provides a method, apparatus, device, system and medium for error-resilient video coding reassembly. The method comprises: obtaining an HEVC-coded video bitstream of at least one group of pictures from a video source; parsing the bitstream to separate the individual coding units, and retrieving the prediction mode of each coding unit and the motion vector information of each prediction unit inside it so as to build a dependency graph over the coding units; traversing the dependency graph, partitioning it in order into several regions according to a user-specified rule, and grouping the coding units by region; and encapsulating each group into a network packet and transmitting it over the network. Reordering in this way limits error propagation when packets are lost in network transmission, so the affected area of the video image shrinks. Meanwhile, the decoding end can invert the operation to restore each coding unit to its original position, so the stream remains decodable by any video decoder conforming to the relevant internet video coding standard.

Description

Method, apparatus, device, system and medium for error-resilient video coding reassembly
Technical Field
The present application relates to the field of digital video coding, and in particular to a method, apparatus, device, system and medium for error-resilient video coding reassembly.
Background
The HEVC coding algorithm heavily exploits the correlation between different parts of the video picture, so the slices obtained after coding carry strong dependencies. A series of consecutive video frames that depend on one another during encoding constitutes a Group of Pictures (GOP). Each video frame consists of one or more slices, and each slice contains several coding units. Before a given coding unit can be decoded, its predecessor coding units must have been received and decoded correctly, i.e. the slices those units belong to must have arrived intact; only then can the video picture at the corresponding position be restored. Otherwise, the video decoding algorithm automatically fills the missing information with substitute data, so the restored picture differs from the original and shows visible defects.
In network transmission, and in wireless networks in particular, transmission capacity is unstable and a certain bit error rate is unavoidable, so the information of some coding units will inevitably be corrupted or lost. Because the smallest unit of loss is a packet, and one packet carries the information of many coding units, different packing schemes yield hugely different proportions of decodable coding units even when the proportion of correctly delivered packets is the same.
Existing video codec systems generally transmit the bitstream in its original order, so the set of coding units carried by each packet is fixed and dependencies are ignored. As a result, many packets that arrive correctly are discarded because their predecessor coding units, carried in corrupted packets, cannot be decoded, and video quality degrades sharply whenever data is lost in the network.
Disclosure of Invention
In view of the above shortcomings of the prior art, an object of the present application is to provide an error-resilient video coding reassembly method, apparatus, device, system and medium that addresses at least one of the above problems.
To achieve the above and other related objects, the present application provides an error-resilient video coding reassembly method applied at the encoding end. The method comprises: obtaining an HEVC (High Efficiency Video Coding) video bitstream of at least one group of pictures from a video source; parsing the bitstream to separate the individual coding units, and retrieving the prediction mode of each coding unit and the motion vector information of each prediction unit inside it so as to build a dependency graph over the coding units; traversing the dependency graph, partitioning it in order into several regions according to a user-specified rule, and grouping the coding units by region; and encapsulating each group into a network packet and transmitting it over the network.
In an embodiment of the present application, the motion vector information comprises the motion vector information of the inter prediction mode, and the dependency graph is represented as a directed acyclic graph.
In an embodiment of the present application, the prediction mode determines the decoding dependency range of the corresponding prediction unit: the intra prediction mode depends on already-coded coding units adjacent to the current area within the same slice of the same frame, while the inter prediction mode depends on the coding units in the region of the reference frame pointed to by the motion vector. The dependency range of a coding unit is then the union of the dependency ranges of its prediction units.
In an embodiment of the application, traversing the dependency graph and partitioning it in order into several regions according to a user-specified rule comprises: keeping the total packed size of the coding units in each region within the maximum allowed size of a network packet; and choosing the partition so as to cut as few dependencies between coding units as possible, i.e. coding units with close dependency relations are placed in the same region, and otherwise in different regions.
In an embodiment of the present application, the user-specified rule comprises: performing a depth-first traversal of the dependency graph starting from the coding units that do not depend on any other coding unit; and dividing the regions according to the traversal order, so that coding units visited consecutively fall into the same region.
In an embodiment of the present application, metadata is added before each coding unit in each group so that, on reassembly, the units can be restored to the format of the original video bitstream; the metadata includes a frame number and a unit number.
To achieve the above and other related objects, the present application provides an error-resilient video coding reassembly apparatus applied at the encoding end, the apparatus comprising: an acquisition module for obtaining an HEVC-coded video bitstream of at least one group of pictures from a video source; a parsing module for parsing the bitstream to separate the individual coding units and retrieving the prediction mode of each coding unit and the motion vector information of each prediction unit inside it, so as to build the dependency graph over the coding units; a grouping module for traversing the dependency graph, partitioning it in order into several regions according to a user-specified rule, and grouping the coding units by region; and a transmission module for encapsulating each group into a network packet and transmitting it over the network.
To achieve the above and other related objects, the present application provides an encoding-side device, comprising a memory, a processor and a communicator; the memory stores a computer program; the processor runs the computer program to implement the method described above; and the communicator is communicatively connected to the decoding-side device.
To achieve the above and other related objects, the present application provides an error-resilient video coding reassembly system, comprising the encoding-side device as described above and a decoding-side device; the decoding-side device is configured to receive the packets sent by the encoding-side device, reorder the coding units based on the metadata, and restore the original bitstream format for decoding and playback.
To achieve the above and other related objects, the present application provides a computer readable storage medium having stored thereon computer instructions which, when executed, perform the method as described above.
In summary, the present application provides a method, apparatus, device, system and medium for error-resilient video coding reassembly. Working from the Coding Unit (CU) structure of the HEVC bitstream, all coding units are extracted from the stream, their dependency relationships are analysed, and their order and packing are adjusted according to those dependencies and a user-specified rule. This limits error propagation when packets are lost in network transmission and so shrinks the affected area of the video image. Meanwhile, the decoding end can invert the operation to restore each coding unit to its original position, so the stream remains decodable by any video decoder conforming to the relevant internet video coding standard.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of the error-resilient video coding reassembly method of the present application.
Fig. 2 is a schematic diagram of the partitioning and scan order of coding units in HEVC.
Fig. 3 is a schematic diagram of the dependent regions of intra-frame and inter-frame coding.
Fig. 4 is a scene diagram of the three forms of the dependent area.
Fig. 5 is a schematic diagram of the dependency-relationship directed acyclic graph.
Fig. 6 is a scenario diagram of the process applied at the encoding end and the corresponding decoding end.
Fig. 7 is a block diagram of the error-resilient video coding reassembly apparatus of the present application.
Fig. 8 is a schematic structural diagram of the encoding-side device.
Fig. 9 is a schematic structural diagram of the error-resilient video coding reassembly system.
Detailed Description
The following describes the embodiments of the present application by way of specific examples; other advantages and effects of the present application will be readily apparent to those skilled in the art from this disclosure. The present application may also be implemented or applied through other, different embodiments, and the details herein may be modified or changed in various respects without departing from the spirit of the present application. It should be noted that the features of the following embodiments and examples may be combined with one another in the absence of conflict.
It should be noted that the drawings provided with the following embodiments are only schematic illustrations of the basic idea of the present application. They show only the components related to the present application rather than the actual number, shape and size of components in an implementation; in practice the type, quantity and proportion of components may vary, and the layout may be more complex.
Throughout the specification, when a part is referred to as being "connected" to another part, this includes not only a case of being "directly connected" but also a case of being "indirectly connected" with another element interposed therebetween. In addition, when a certain part is referred to as "including" a certain component, unless otherwise stated, other components are not excluded, but it means that other components may be included.
The terms first, second, third, etc. are used herein to describe various elements, components, regions, layers and/or sections, but are not limited thereto. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the scope of the present application.
Also, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions or operations is inherently mutually exclusive in some way.
In recent years, video content in the mobile internet has been increasing. Because the data volume of the original video picture is extremely large, the original video picture can be stored on a computer storage medium or transmitted in a network only through video compression coding, and the decoding end restores the original video picture through corresponding decoding operation.
Existing video encoding and transmission methods generally comprise the steps of:
1. dividing the video picture to be encoded into pixel blocks of a particular size, such as 4x4, 8x8, 16x16, 32x32, or others;
2. encoding the pixel blocks obtained in the previous step with an encoding algorithm conforming to an internet video coding standard to obtain the corresponding coding units. Coding units have different definitions and names in different coding algorithms; in HEVC video coding they are the Coding Tree Unit (CTU) and the Coding Unit (CU). Video coding algorithms generally comprise the following steps:
a) intra-frame prediction coding and inter-frame prediction coding;
b) discrete cosine transform;
c) quantizing;
d) entropy coding;
3. assembling the coding units obtained in the previous step into slices (Slice) according to the time order between frames and the scan order within a frame;
4. further encapsulating the slices obtained in the previous step so they can be transmitted over a network or stored on a computer storage medium. In HEVC this layer is called the Network Abstraction Layer (NAL), and the encapsulated data unit is called a Network Abstraction Layer Unit (NALU).
In network transmission, and in wireless networks in particular, transmission capacity is unstable and a certain bit error rate is unavoidable, so the information of some coding units will inevitably be corrupted or lost. Because the smallest unit of loss is a packet, and one packet carries the information of many coding units, different packing schemes yield hugely different proportions of decodable coding units even when the proportion of correctly delivered packets is the same.
Existing video codec systems generally transmit the bitstream in its original order, so the set of coding units carried by each packet is fixed and dependencies are ignored. As a result, many packets that arrive correctly are discarded because their predecessor coding units, carried in corrupted packets, cannot be decoded, and video quality degrades sharply whenever data is lost in the network.
In view of the problems in the prior art, the present application provides an error-resilient video coding reassembly method, apparatus, device, system and medium. Working from the Coding Unit (CU) structure of the HEVC bitstream, it extracts all coding units from the stream, analyses their dependency relationships, and adjusts their order and packing according to those dependencies and a user-specified rule, limiting error propagation when packets are lost in network transmission and so shrinking the affected area of the video image. Meanwhile, the decoding end can invert the operation to restore each coding unit to its original position, so the stream remains decodable by any video decoder conforming to the relevant internet video coding standard.
Fig. 1 is a schematic flow chart of an error-resilient video coding reassembly method according to an embodiment of the present application. The method is mainly applied to an encoding end. As shown, the method comprises:
step S101: the method comprises the steps of obtaining an HEVC coded video code stream of at least one image group from a video source.
Preferably, the video source comprises: a bitstream output in real time by a video encoder conforming to an internet video coding standard, a bitstream generated in advance and stored on a computer storage medium, or a bitstream provided by a third party over a network.
Preferably, the size of one group of pictures can be set to 24 frames; the coded bitstream is first placed into a buffer, and the buffered stream is formed into a group of pictures before further processing.
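The buffering step above can be sketched as follows. This is a minimal illustration under the 24-frame GOP size suggested in the text, not the patent's implementation; `GopBuffer` and `push` are hypothetical names.

```python
from collections import deque

GOP_SIZE = 24  # group-of-pictures size suggested in the text


class GopBuffer:
    """Accumulates encoded frames until a full group of pictures is ready."""

    def __init__(self, gop_size=GOP_SIZE):
        self.gop_size = gop_size
        self.frames = deque()

    def push(self, encoded_frame):
        """Add one encoded frame; return a complete GOP when one is ready,
        otherwise None.  The buffer empties itself for the next GOP."""
        self.frames.append(encoded_frame)
        if len(self.frames) == self.gop_size:
            gop = list(self.frames)
            self.frames.clear()
            return gop
        return None
```

In use, the encoder feeds frames in one at a time and processes each returned GOP as a whole before packetisation.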
It should be noted that HEVC is a recent internet standard for video coding with multiple block structures. The largest block structure is called the Coding Tree Unit (CTU); its size is fixed by the encoder configuration and remains unchanged throughout the bitstream, e.g. 64x64, 32x32 or 16x16. Each CTU is further quadtree-split, zero or more times into quarters, into Coding Units (CUs).
Step S102: analyzing the video code stream to separate a plurality of coding units, and searching the prediction mode of each coding unit and the motion vector information of each prediction unit in the coding unit to obtain a dependency relationship graph among the coding units.
Specifically, the obtained bitstream is parsed to separate the coding units of each image frame, and the prediction mode of each coding unit and the inter-prediction motion vector information of each prediction unit inside it are retrieved, thereby determining the dependency relationships between coding units and constructing the dependency graph. Preferably, the dependency graph is represented as a directed acyclic graph.
A video stream must contain a small number of key frames that can be decoded using only their own information, called I-frames; the other frames are non-key frames, called P-frames. Decoding a P-frame depends on the preceding I- or P-frames back to the most recent I-frame, all of which must have been decoded correctly. An I-frame is an intra-coded frame: it compresses the transmitted data by removing spatial redundancy within the frame as far as possible. A P-frame is a forward-predictive-coded frame: it compresses the transmitted data by removing the temporal redundancy relative to earlier coded frames in the sequence, and is also called a predictive frame.
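The frame-level dependency rule just described — a P-frame may reference preceding frames back to, and including, the most recent I-frame — can be illustrated with a small helper. The function name `reference_candidates` is hypothetical, and single-reference P-frame semantics are assumed for simplicity.

```python
def reference_candidates(frame_types, idx):
    """For the P-frame at position idx in a GOP (frame_types is a list of
    'I'/'P'), list the indices of frames it may reference: the preceding
    frames back to (and including) the most recent I-frame."""
    assert frame_types[idx] == 'P'
    out = []
    for j in range(idx - 1, -1, -1):
        out.append(j)
        if frame_types[j] == 'I':
            break  # nothing before the last I-frame is a valid reference
    return out
```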
In this embodiment, the present application mainly uses intra-frame predictive coding for I-frames and inter-frame predictive coding for P-frames. In brief, both exploit the similarity within a frame or between adjacent frames to reduce the amount of data by encoding only the difference information.
It should be noted that, because adjacent frames in a video stream are highly correlated, inter prediction plays a crucial role in compression. In this technique each frame is divided into multiple Coding Units (CUs); for each CU the most similar region is found in an adjacent frame (the reference frame), and the position of that region together with the difference between the two serves as the encoding of the CU, greatly reducing the amount of coded data.
Preferably, video coding is performed by dividing the captured picture into pixel blocks of a certain size, such as 4x4, 8x8, 16x16, 32x32, or others. Such a pixel block is called a Macroblock in H.264, and a Coding Tree Unit (CTU) or Coding Unit (CU) in HEVC.
Preferably, the Coding Unit (CU) sizes need not all be equal; for example, the minimum division may be 8x8. CUs within the same frame are coded sequentially in raster scan order: as shown in Fig. 2, units 1-13 are scanned left to right, top to bottom. Encoding is performed per CU using an encoding algorithm conforming to an internet video coding standard to obtain the coding information of each CU.
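The scan order of Fig. 2 can be generated as follows. This sketch assumes fixed-size CUs on a regular grid; real HEVC relaxes this via the CTU quadtree split, so the generator is illustrative only.

```python
def raster_scan_order(frame_width, frame_height, cu_size):
    """Yield the (x, y) top-left corners of coding units in left-to-right,
    top-to-bottom raster scan order, as in Fig. 2."""
    for y in range(0, frame_height, cu_size):
        for x in range(0, frame_width, cu_size):
            yield (x, y)
```

For a 32x16 frame with 8x8 CUs this yields 8 units, starting at (0, 0) and ending at (24, 8).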
Preferably, the dependency information is extracted from the neighbouring-region coding units found in the reference frames preceding the current frame that correspond to each coding unit of the current frame; this yields, per coding unit, the dependency information used for coding unit reorganisation. The dependency information may be used for the reorganisation of blocks. Specifically, it includes: the reference frame number used for inter prediction, the motion vector, and the predecessor coding units each coding unit depends on.
In brief, a Coding Unit (CU) determines which prediction mode is used for the image of its region: an inter-coded region is encoded by inter prediction, an intra-coded region by intra prediction. A CU may further be divided into 1-4 Prediction Units (PUs), each with its own prediction parameters: intra prediction selects one of 35 modes, while inter prediction selects 1-2 reference frames (corresponding to P-frames and B-frames) and the corresponding motion vectors.
The HEVC standard provides 35 intra prediction modes: a DC mode, a Planar mode, and 33 angular modes. In the HEVC test model HM10.0, intra mode selection can be summarised as follows: the 35 candidate modes are first screened with the SATD criterion (using the Hadamard transform), keeping the 3 (for 64x64, 32x32 and 16x16 PUs) or 8 (for 8x8 and 4x4 PUs) candidates with the smallest RD cost. The Most Probable Modes (MPMs) of the current PU are then added to this reduced candidate set, and the optimal prediction mode is selected by full RD cost.
In one embodiment of the present application, the prediction mode determines the decoding dependency range of the corresponding Prediction Unit (PU). Intra prediction depends on already-coded Coding Units (CUs) adjacent to the current area in the same frame; when one frame contains multiple slices (Slice), the intra dependency range is confined to the slice. Inter prediction depends on the CUs in the region of the reference frame pointed to by the motion vector. The dependency range of a CU is then the union of the dependency ranges of its PUs.
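For the inter case, the set of reference-frame CUs a PU depends on is just the set of CU grid cells overlapped by the motion-compensated rectangle. A minimal sketch follows; the function names are illustrative, integer-pixel motion is assumed, and sub-pixel interpolation and reference picture lists are ignored.

```python
def cus_overlapping(x, y, w, h, cu_size, frame_w, frame_h):
    """Return the set of CU grid indices (cx, cy) overlapped by the
    rectangle with top-left (x, y) and size (w, h), clipped to the frame."""
    x0 = max(0, x) // cu_size
    y0 = max(0, y) // cu_size
    x1 = min(frame_w - 1, x + w - 1) // cu_size
    y1 = min(frame_h - 1, y + h - 1) // cu_size
    return {(cx, cy) for cy in range(y0, y1 + 1) for cx in range(x0, x1 + 1)}


def inter_pu_dependency(pu_x, pu_y, pu_w, pu_h, mv_x, mv_y,
                        cu_size, frame_w, frame_h):
    """CUs of the reference frame covered by the motion-vector-shifted PU."""
    return cus_overlapping(pu_x + mv_x, pu_y + mv_y, pu_w, pu_h,
                           cu_size, frame_w, frame_h)
```

An 8x8 PU shifted by a (4, 4) motion vector straddles four 8x8 CUs, matching the first form of Fig. 4; a zero vector on a grid-aligned PU touches a single CU, matching the third form.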
For example, as shown in fig. 3, the frame containing Slice 1 is an I-frame, corresponding to the intra prediction mode; the range that the current region P (i.e. the prediction unit) depends on comprises the adjacent already-coded units in the same frame crossed by the dotted box marking the dependent region (5 coding units in the figure). Slice 2 and Slice 3 belong to different frames: the frame of Slice 3 is a P-frame, corresponding to the inter prediction mode, with the frame of Slice 2 as its reference frame; the range that the current region P on Slice 3 depends on is the set of coding units on Slice 2 crossed by the dotted box that the motion vector points to (4 coding units in the figure).
Preferably, the dependent area generally takes one of three forms, as shown in fig. 4: in the first form, the dependency range of the prediction area P (the current area described above) covers 4 coding units s1-s4; in the second form it covers 2 coding units s1-s2; in the third form it covers only 1 coding unit, s3.
In the present application, the dependency graph is represented as a directed acyclic graph; fig. 5 shows a schematic diagram of this graph in the present embodiment. As shown, the Coding Units (CUs) that depend on no other coding unit serve as the starting points (the first row): such units lie in I-frames, while each CU of a P-frame depends on CUs of the I-frame or of other P-frames.
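The directed acyclic graph of Fig. 5 can be built from per-CU predecessor lists; CUs with no predecessors (those in I-frames) become the roots. A sketch with hypothetical CU identifiers, not the patent's data layout:

```python
from collections import defaultdict


def build_dependency_graph(coding_units):
    """coding_units: dict mapping a CU id to the list of predecessor CU ids
    it depends on (empty for CUs in I-frames).  Returns the forward
    adjacency (predecessor -> list of dependents) and the list of roots."""
    dependents = defaultdict(list)
    roots = []
    for cu, preds in coding_units.items():
        if not preds:
            roots.append(cu)  # I-frame CU: a starting point of the DAG
        for p in preds:
            dependents[p].append(cu)
    return dependents, roots
```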
It should be noted that inter prediction encodes the inter-frame difference information but rarely records the dependency relationships between frames. Precisely because these dependencies between coding units are ignored, in existing systems that pack and send frames sequentially in time order the loss of any packet makes a large number of subsequent coding units undecodable, so the proportion of decodable blocks drops sharply.
The dependency information provided in the present application may specifically include: the reference frame number and motion vector of inter-frame prediction, from which the predecessor coding units on which each coding unit depends are retrieved. Preferably, by exploiting the dependency relationships among the coding units, the conventional requirement that one data packet contain the codes of coding units from the same image frame can be relaxed, so that associated coding units of different image frames are reassembled into one data packet; at the decoding end, the data packets can then be restored in order of the positions in the original image according to the dependency relationships and decoded, thereby effectively reducing the degree of inter-packet decoding dependency and avoiding the problem that the loss of several data packets leaves a large number of subsequent coding units undecodable.
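The dependency extraction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CU` record and its fields (`frame`, `idx`, `mode`, `ref_frame`, `dep_indices`) are hypothetical stand-ins for the prediction-mode and motion-vector information that a real extractor would parse out of the HEVC bitstream syntax.

```python
from collections import namedtuple

# Hypothetical, simplified coding-unit record. dep_indices holds, for intra
# units, the indices of adjacent coded units in the same frame; for inter
# units, the indices of units covered by the motion-vector region in the
# reference frame.
CU = namedtuple("CU", "frame idx mode ref_frame dep_indices")

def build_dependency_graph(cus):
    """Map each coding unit key (frame, idx) to the units it depends on."""
    present = {(c.frame, c.idx) for c in cus}
    deps = {}
    for c in cus:
        # intra units depend on units in the same frame; inter units depend
        # on units in their reference frame
        src_frame = c.frame if c.mode == "intra" else c.ref_frame
        deps[(c.frame, c.idx)] = [(src_frame, i) for i in c.dep_indices
                                  if (src_frame, i) in present]
    return deps
```

Units with an empty dependency list (typically units of an I-frame) are the starting points of the directed acyclic graph described above.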
Step S103: traversing the dependency graph, dividing it in order into a plurality of regions according to a rule preset by the user, and grouping the coding units according to the regions.
Preferably, the dependency graph is traversed and divided in order into a plurality of regions according to the rule provided by the user. The sum of the packed sizes of the coding units in each region is required not to exceed the maximum allowable size of a network data packet, and the division should cut as few dependency relationships between coding units as possible; that is, coding units with close dependency relationships are placed into the same region wherever possible, and otherwise into different regions.
Preferably, the specific division rule may be specified by the user. A simple example rule is: taking the coding units that do not depend on any other coding unit as starting points, perform a depth-first traversal of the dependency graph, divide regions according to the traversal order, and place coding units visited consecutively during the traversal into the same region.
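As a sketch of this example rule — assuming the dependency graph is given as a map from each unit to the units it depends on, and that each unit's packed size is known — a depth-first partition with a packet-size cap might look like:

```python
def partition_by_dfs(deps, sizes, max_packet):
    """Depth-first traverse the dependency DAG from units with no
    dependencies; consecutively visited units share a region until the
    packed size would exceed the packet limit."""
    # invert the map: successors of each unit (units that depend on it)
    succ = {u: [] for u in deps}
    for u, ds in deps.items():
        for d in ds:
            succ[d].append(u)
    roots = [u for u, ds in deps.items() if not ds]
    regions, current, current_size = [], [], 0
    visited = set()

    def visit(u):
        nonlocal current, current_size
        if u in visited:
            return
        visited.add(u)
        if current and current_size + sizes[u] > max_packet:
            regions.append(current)       # flush the full region
            current, current_size = [], 0
        current.append(u)
        current_size += sizes[u]
        for v in succ[u]:
            visit(v)

    for r in roots:
        visit(r)
    if current:
        regions.append(current)
    return regions
```

Recursion depth and tie-breaking order are left naive here; a production version would iterate explicitly and weight edges by dependency closeness.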
In the present application, the coding units in the previously acquired video code stream are grouped; that is, coding units previously assigned to the same region are also placed into the same group in this step. Coding units previously dropped are preferably also processed in this step.
In an embodiment of the present application, after grouping, metadata is added in front of each coding unit in each group so that the groups can be restored to the format of the original video code stream upon reassembly; the metadata includes: the frame number and the unit number, as shown in table 1 below.
Table 1 Data structure after adding metadata
(The table is reproduced only as an image in the original document; per the text above, it shows each coding unit in a group prefixed with its metadata fields, i.e. the frame number and the unit number.)
The added integer values are all encoded using Exp-Golomb codes, as in the HEVC standard.
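Unsigned Exp-Golomb coding (the ue(v) descriptor in HEVC) writes the value v+1 in binary, preceded by one fewer zero bits than that binary representation is long. A minimal sketch over bit strings:

```python
def ue_encode(v):
    """Unsigned Exp-Golomb code ue(v): (len-1) zero bits, then v+1 in binary."""
    bits = bin(v + 1)[2:]              # binary of v+1, without the '0b' prefix
    return "0" * (len(bits) - 1) + bits

def ue_decode(bitstring):
    """Decode one ue(v) code from the front; return (value, remaining bits)."""
    zeros = 0
    while bitstring[zeros] == "0":     # count the leading-zero prefix
        zeros += 1
    value = int(bitstring[zeros:2 * zeros + 1], 2) - 1
    return value, bitstring[2 * zeros + 1:]
```

So 0 encodes as `1`, 1 as `010`, and 3 as `00100`; small values (the common case for frame and unit numbers) cost the fewest bits.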
Step S104: encapsulating each group, as a unit, into a network data packet and transmitting it through the network.
Preferably, the above groups are assembled into UDP network data packets and transmitted through the network, preferably to the corresponding decoding end. The operations of S101 to S104 are repeated for subsequent video code streams.
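A minimal sketch of the encapsulation step — assuming each group is a list of byte strings, each a coding unit already prefixed with its metadata; 65507 bytes is the maximum IPv4 UDP payload, and the address and port below are placeholders:

```python
import socket

MAX_UDP_PAYLOAD = 65507  # maximum IPv4 UDP payload size

def packetize(groups):
    """Build one UDP payload per group of metadata-prefixed coding units."""
    payloads = []
    for group in groups:
        payload = b"".join(group)
        if len(payload) > MAX_UDP_PAYLOAD:
            raise ValueError("group exceeds maximum UDP payload size")
        payloads.append(payload)
    return payloads

def send_to_decoder(payloads, addr):
    """Send each payload as a single UDP datagram to the decoding end."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for p in payloads:
            sock.sendto(p, addr)
```

Usage would be along the lines of `send_to_decoder(packetize(groups), ("192.0.2.1", 5004))`. Because the region-division step already bounds each group's packed size, the size check here should never trigger in normal operation.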
Fig. 6 shows a schematic view of the method applied, in an embodiment, to a scenario with an encoding end and a corresponding decoding end. As shown in the figure, the main flow at the transmitting end is as follows:
First, a video picture is captured by an external device and encoded as a standard HEVC video code stream; the group-of-pictures size is set to 24 frames, and the encoded video code stream is first placed into a buffer and delivered, one group of pictures at a time, to the subsequent units for processing.
Then, a dependency-information extraction program reads and parses the prediction mode of each coding unit of the video code stream and the intra-frame prediction mode or inter-frame prediction motion vector information of each prediction unit inside it, and constructs the dependency graph among the coding units.
Next, the video code stream and the dependency graph are acquired; a depth-first traversal of the dependency graph is performed, taking the coding units that do not depend on any other coding unit as starting points; regions are divided according to the traversal order, with coding units visited consecutively placed into the same region; the coding units are then grouped according to the division result, and the metadata necessary for restoring the original video code stream format upon reassembly is added.
Finally, each group is encapsulated, as a unit, into a network data packet and transmitted through the network.
A corresponding feasible decoding end may perform the following operations:
First, UDP data packets transmitted by the encoding end are received from the IP network and placed into a buffer; processing proceeds to the next step once all data packets of the current group of pictures have been received. The criterion is: when a data packet from a new group of pictures arrives, the current group of pictures is judged to have been fully received.
Then, the data packets of one group of pictures are processed at a time: the coding units therein are sorted according to the added metadata and restored to their original order, the added metadata is removed, and the original video code stream format is restored.
Finally, a playback unit, which internally comprises an HEVC video decoder and a video-playing user interface, acquires the restored video code stream, decodes it, and plays it.
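The sorting-and-stripping step at the decoding end can be sketched as follows, assuming the metadata of each received coding unit has already been parsed into a (frame number, unit number, payload) tuple:

```python
def restore_bitstream(units):
    """Restore the original scan order of one group of pictures and drop
    the added metadata, leaving the original code-stream bytes."""
    ordered = sorted(units, key=lambda u: (u[0], u[1]))
    return b"".join(payload for _, _, payload in ordered)
```

Units whose packets were lost are simply absent from the input; sorting the survivors by (frame number, unit number) still places every received unit at its correct relative position.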
To sum up, the method of the present application mainly separates and reassembles the coding units in the video code stream before transmission, and has the following advantages:
1) By changing the way coding units in the video code stream are packed, cross-packet dependency and error diffusion among coding units are reduced, so that the damage to video quality caused by data-packet corruption and loss in network transmission can be reduced. Under the same packet-loss rate: objectively, the average Peak Signal-to-Noise Ratio (PSNR) of the video pictures restored at the decoding end is higher in most cases; subjectively, the defective areas in the video pictures are smaller, with large-area strip defects converted into scattered small block defects.
2) Since the method resists errors solely through the reordering of coding units, only a small amount of metadata is added and no redundancy is added to the actual video code stream data, thus saving network transmission capacity.
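The objective PSNR metric cited in advantage 1) is computed from the mean squared error between the reference and reconstructed pictures; for 8-bit video the peak value is 255. A minimal sketch:

```python
import math

def psnr(ref, rec, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel
    sequences; returns infinity for identical pictures."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means the reconstruction is closer to the reference, which is why a higher average PSNR at the same packet-loss rate indicates better error resilience.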
Fig. 7 is a block diagram of an apparatus for error resilient video coding reassembly according to an embodiment of the present application. As shown, applied to the encoding side, the apparatus 700 includes:
an obtaining module 701, configured to obtain an HEVC-encoded video stream of at least one group of pictures from a video source;
an analyzing module 702, configured to analyze the video code stream to separate a plurality of coding units, and retrieve the prediction mode of each coding unit and the motion vector information of each prediction unit inside the coding unit to obtain a dependency graph between the coding units;
the grouping module 703 is configured to traverse the dependency graph and sequentially divide the dependency graph into a plurality of regions according to a user preset rule, so as to group each coding unit;
and a transmission module 704, configured to encapsulate each packet into a network data packet and transmit the network data packet through the network.
It should be noted that, regarding the information interaction, execution process, and other contents among the modules/units of the above apparatus, since they are based on the same concept as the method embodiment applied to the encoding end in fig. 1, the technical effects they bring are the same as those of that method embodiment; for specific contents, reference may be made to the description in the foregoing method embodiment of the present application, which is not repeated here.
It should be noted that the division of the modules in the above apparatus is merely a logical division; in actual implementation, the modules may be wholly or partially integrated into one physical entity, or may be physically separated. These units may all be implemented in the form of software invoked by a processing element, or all in hardware; alternatively, some modules may be implemented as software invoked by a processing element while others are implemented in hardware.
For example, the grouping module 703 may be a processing element separately set up, or may be implemented by being integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and the processing element of the apparatus calls and executes the functions of the grouping module 703. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), among others. For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU), or another processor capable of invoking program code. For another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 8 is a schematic structural diagram of an encoding-side device according to an embodiment of the present application. As shown, the encoding end device 800 includes: a memory 801, a processor 802, and a communicator 803; the memory 801 is used for storing computer programs; the processor 802 runs a computer program to implement the method described in fig. 1; the communicator 803 is configured to be communicatively coupled to a decoding-side device.
In some embodiments, the number of the memories 801 in the encoding-side device 800 may be one or more, the number of the processors 802 may be one or more, the number of the communicators 803 may be one or more, and fig. 8 is taken as an example.
In an embodiment of the present application, the processor 802 in the encoding-side device 800 loads one or more instructions corresponding to the process of an application program into the memory 801 according to the steps described in fig. 1, and the processor 802 runs the application program stored in the memory 801, thereby implementing the method described in fig. 1.
The Memory 801 may include a Random Access Memory (RAM), or may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The memory 801 stores an operating system and operating instructions, executable modules or data structures, or a subset thereof, or an expanded set thereof, wherein the operating instructions may include various operating instructions for implementing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 802 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The communicator 803 is used for implementing communication connections between the device and other devices (e.g., a client, a read-write library, and a read-only library). The communicator 803 may include one or more sets of modules for different communication modes, such as a CAN communication module communicatively coupled to a CAN bus. The communication connection may use one or more wired/wireless communication means and combinations thereof, over any one or more of the Internet, CAN, an intranet, a Wide Area Network (WAN), a Local Area Network (LAN), a wireless network, a Digital Subscriber Line (DSL) network, a frame relay network, an Asynchronous Transfer Mode (ATM) network, a Virtual Private Network (VPN), and/or any other suitable communication network — for example, any one or combination of WIFI, Bluetooth, NFC, GPRS, GSM, and Ethernet.
Fig. 9 is a schematic structural diagram of an error-resilient video coding reassembly system according to an embodiment of the present application. As shown, the system comprises an encoding-side device 910 as described in fig. 8 and a decoding-side device 920;
the decoding-side device 920 comprises a data-packet receiving module and is configured to receive the data packets sent by the encoding-side device, sort the coding units based on the metadata, and restore the data packets to the format of the original video code stream for decoding and playing.
For example, a corresponding feasible decoding-side device 920 may be configured to:
first, receive UDP data packets transmitted by the encoding end from the IP network and place them into a buffer, proceeding to the next step once all data packets of the current group of pictures have been received — the criterion being that when a data packet from a new group of pictures arrives, the current group of pictures is judged to have been fully received;
then, process the data packets of one group of pictures at a time, sorting the coding units therein according to the added metadata, restoring them to their original order, removing the added metadata, and restoring the original video code stream format;
finally, through a playback unit that internally comprises an HEVC video decoder and a video-playing user interface, acquire the restored video code stream, decode it, and play it.
In an embodiment of the present application, a computer-readable storage medium is provided, on which computer instructions applied to an encoding end are stored, and the computer instructions are executed to perform the method applied to the encoding end as described in fig. 1.
The present application may be embodied as a system, a method, and/or a computer program product, at any possible level of integration of technical detail. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable programs described herein may be downloaded from a computer-readable storage medium to a variety of computing/processing devices, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, microcode, firmware instructions, state-setting data, integrated circuit configuration data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry — such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA) — can execute the computer-readable program instructions by utilizing state information of the instructions to personalize the circuitry, thereby implementing aspects of the present application.
To sum up, the present application provides a method, an apparatus, a device, a system, and a medium for error-resilient video coding reassembly. Directed at the Coding Unit (CU) structure in an HEVC video code stream, all coding units are extracted from the code stream and their dependency relationships are analyzed; their order and packing mode are then adjusted according to the dependency relationships and a user-specified rule, so as to reduce error diffusion when packet loss occurs in network transmission, thereby reducing the size of the affected region of the video image. Meanwhile, the decoding end can apply the reverse operations to restore each coding unit to its original position, so that the stream can be correctly decoded by a video decoder conforming to the video coding standard.
The application effectively overcomes various defects in the prior art and has high industrial utilization value.
The above embodiments are merely illustrative of the principles and utilities of the present application and are not intended to limit the application. Any person skilled in the art can modify or change the above-described embodiments without departing from the spirit and scope of the present application. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical concepts disclosed in the present application shall be covered by the claims of the present application.

Claims (10)

1. An error-resistant video coding recombination method is applied to a coding end, and the method comprises the following steps:
the method comprises the steps of obtaining an HEVC (high efficiency video coding) video code stream of at least one image group from a video source;
analyzing the video code stream to separate a plurality of coding units, and retrieving the prediction mode of each coding unit and the motion vector information of inter-frame prediction of each prediction unit in the coding units to obtain a dependency relationship graph among the coding units; wherein the prediction mode determines a decoding dependency range of a corresponding prediction unit;
traversing the dependency graph, orderly dividing the dependency graph into a plurality of regions according to a preset rule of a user, and grouping each coding unit according to the regions;
and encapsulating the packets into network data packets by taking each packet as a unit and transmitting the network data packets through the network.
2. The method of claim 1, wherein the motion vector information comprises motion vector information of the inter-frame prediction mode, and the dependency graph is represented by a directed acyclic graph.
3. The method of claim 2, wherein the prediction mode determining the decoding dependency range of the corresponding prediction unit comprises: the intra-frame prediction mode depends on the coded coding units adjacent to the current region in the same image frame slice, and the inter-frame prediction mode depends on the coding units in the region pointed to by the motion vector in the reference frame; the dependency range of a coding unit is then the sum of the dependency ranges of its corresponding prediction units.
4. The method according to claim 1, wherein traversing the dependency graph and dividing the dependency graph into a plurality of regions according to a user preset rule in order comprises:
the sum of the packed sizes of the coding units in each region does not exceed the maximum allowable range of the network data packet;
the division serves to separate the dependency relationships between the coding units; wherein coding units with close dependency relationships are divided into the same region, and otherwise into different regions.
5. The method of claim 1, wherein the user preset rule comprises:
depth-first traversal is performed on the dependency graph by taking a coding unit which does not depend on other coding units as a starting point;
and dividing the regions according to the traversal sequence, and dividing coding units accessed adjacently during traversal into the same region.
6. The method of claim 1, wherein after grouping, the metadata added in front of each coding unit in each group is used for restoring the format of the original video code stream upon reassembly; the metadata comprises: a frame number and a unit number.
7. An error-resilient video coding recombination device, applied to a coding end, the device comprising:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring a video code stream of HEVC coding of at least one image group from a video source;
the analysis module is used for analyzing the video code stream to separate a plurality of coding units and retrieving the prediction mode of each coding unit and the inter-frame prediction motion vector information of each prediction unit in the coding units so as to obtain a dependency relationship graph among the coding units; wherein the prediction mode determines a decoding dependency range of a corresponding prediction unit;
the grouping module is used for traversing the dependency graph, orderly dividing the dependency graph into a plurality of regions according to a preset rule of a user and grouping each coding unit according to the regions;
and the transmission module is used for encapsulating each group into a network data packet and transmitting the network data packet through a network.
8. An encoding end device, characterized in that the device comprises: a memory, a processor, and a communicator; the memory is used for storing a computer program; the processor runs a computer program to implement the method of any one of claims 1 to 6; the communicator is used for being in communication connection with the decoding end device.
9. An error resilient video coding re-assembly system, the system comprising: the encoding side device and the decoding side device according to claim 8;
and the decoding-side device is used for receiving the data packets sent by the encoding-side device, sorting the coding units based on the metadata, and restoring the data packets to the format of the original video code stream for decoding and playing.
10. A computer-readable storage medium having stored thereon computer instructions which, when executed, perform the method of any one of claims 1 to 6.
CN202011622272.7A 2020-12-30 2020-12-30 Method, device, equipment, system and medium for recombining error code resistant video coding Active CN112822492B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011622272.7A CN112822492B (en) 2020-12-30 2020-12-30 Method, device, equipment, system and medium for recombining error code resistant video coding

Publications (2)

Publication Number Publication Date
CN112822492A CN112822492A (en) 2021-05-18
CN112822492B true CN112822492B (en) 2022-01-07

Family

ID=75854583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011622272.7A Active CN112822492B (en) 2020-12-30 2020-12-30 Method, device, equipment, system and medium for recombining error code resistant video coding

Country Status (1)

Country Link
CN (1) CN112822492B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115348456B (en) * 2022-08-11 2023-06-06 上海久尺网络科技有限公司 Video image processing method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6680976B1 (en) * 1997-07-28 2004-01-20 The Board Of Trustees Of The University Of Illinois Robust, reliable compression and packetization scheme for transmitting video
CN103475878A (en) * 2013-09-06 2013-12-25 同观科技(深圳)有限公司 Video coding method and encoder
CN105407357A (en) * 2015-12-04 2016-03-16 上海交通大学 Neyman-Pearson criterion based SKIP mode rapidly selecting method
CN109120934A (en) * 2018-09-25 2019-01-01 杭州电子科技大学 A kind of frame level quantization parameter calculation method suitable for HEVC Video coding
CN111083484A (en) * 2018-10-22 2020-04-28 北京字节跳动网络技术有限公司 Sub-block based prediction


Also Published As

Publication number Publication date
CN112822492A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
JP6267778B2 (en) apparatus
JP6271646B2 (en) Grouping tiles for video coding
US20120230397A1 (en) Method and device for encoding image data, and method and device for decoding image data
JP2017508415A (en) Image encoding / decoding method and apparatus
KR102160958B1 (en) Video-encoding method, video-decoding method, and apparatus implementing same
KR101353165B1 (en) Method and arrangement for jointly encoding a plurality of video streams
CN115134602A (en) Low latency two-pass video coding
CN112565815B (en) File packaging method, file transmission method, file decoding method and related equipment
CN112822492B (en) Method, device, equipment, system and medium for recombining error code resistant video coding
CN112822488B (en) Video encoding and decoding system, method, device, terminal and medium based on block recombination
CN112822549B (en) Video stream decoding method, system, terminal and medium based on fragmentation recombination
CN114902670B (en) Method and apparatus for signaling sub-image division information
WO2024078066A1 (en) Video decoding method and apparatus, video encoding method and apparatus, storage medium, and device
CN112822516B (en) Image group transmission method, device, equipment and system based on data block recombination
CN111212288B (en) Video data encoding and decoding method and device, computer equipment and storage medium
CN114846789A (en) Decoder for indicating image segmentation information of a slice and corresponding method
JP2022526770A (en) Conversion unit classification method for video coding
CN111416975A (en) Prediction mode determination method and device
CN112822514B (en) Video stream packet transmission method, system, terminal and medium based on dependency relationship
RU2806278C2 (en) Device and method of video encoding
CN112788344B (en) Video decoding method, device, system, medium and terminal based on coding unit recombination

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant