CN112203090A - Video encoding and decoding method and device, electronic equipment and medium - Google Patents
- Publication number
- CN112203090A (application number CN202011367747.2A)
- Authority
- CN
- China
- Prior art keywords
- target
- reference frame
- matching block
- data
- decoding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/423—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
- H04N19/426—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods
Abstract
The application discloses a video encoding and decoding method and apparatus, an electronic device, and a medium. In the application, at least one target reference frame and at least one corresponding target matching block are obtained; a first number of the target reference frame and a second number of the corresponding at least one target matching block are encoded into a target video code stream; and data in the target reference frame other than the target matching block are cleared. By applying this technical scheme, the codec retains only the region data actually used for inter-frame prediction in the determined reference frame or frames and discards the rest, thereby avoiding the problem in the related art that limited codec storage capacity prevents enough reference pictures from being cached.
Description
Technical Field
The present application relates to data processing technologies, and in particular, to a method, an apparatus, an electronic device, and a medium for video encoding and decoding.
Background
In the related art, consecutive video images contain a large amount of redundant information in both time and space, so reducing this redundancy allows video image information to be stored and transmitted more efficiently. Motion compensation is a widely applied redundancy-elimination technique, used in various video coding and decoding standards including MPEG-2, MPEG-4, H.264, H.265/HEVC, and H.266/VVC.
Coded images are divided into three types: I-frames, P-frames, and B-frames. A picture used to predict a coded picture is called a reference frame. An I-frame is an intra-coded frame: it references no other frame and is compressed independently using the spatial correlation within the video image. A P-frame is a forward-predicted frame: taking an I-frame as the reference frame, the predicted value and motion vector of each region of the P-frame are found in the I-frame, and the prediction residual is transmitted together with the motion vector. A B-frame is a bidirectionally interpolated frame, using a preceding I- or P-frame and a following P-frame as reference frames.
Motion-compensated prediction requires both the video encoder and the decoder to store a predetermined number of reference frames for encoding and decoding. In both, however, the number of reference frames is limited by high-speed memory capacity. Yet in many application scenarios the video exhibits long-term temporal correlation, and the best reference frame may be far from the current frame to be encoded; buffering all possible reference frames over that interval may exceed the storage capacity available to the encoder and decoder.
Disclosure of Invention
The embodiments of the present application provide a video encoding and decoding method and apparatus, an electronic device, and a medium.
According to an aspect of the embodiments of the present application, there is provided a method for video encoding and decoding, which is applied to an encoding end, and includes:
acquiring at least one target reference frame and at least one corresponding target matching block;
coding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream;
and clearing data except the target matching block in the target reference frame.
Optionally, in another embodiment based on the foregoing method of the present application, the acquiring at least one target reference frame and at least one corresponding target matching block includes:
sequentially acquiring candidate matching blocks of at least one candidate reference frame according to a time sequence;
predicting between the current region to be coded and a candidate matching block of the same size in the candidate reference frame, and determining a first motion vector value between the current region to be coded and the candidate matching block;
calculating a difference value between the current region to be coded and the candidate matching block to obtain a candidate residual block;
and determining at least one target matching block corresponding to the target reference frame based on the candidate residual block.
Optionally, in another embodiment based on the above method of the present application, the determining at least one target matching block corresponding to the target reference frame based on the candidate residual block includes:
and determining, from the candidate residual blocks, at least one target matching block corresponding to the target reference frame based on a preset sum-of-absolute-differences criterion, a preset mean-square-error criterion, or a preset normalized cross-correlation function.
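As an illustrative sketch only (the application names but does not specify these criteria), the sum-of-absolute-differences, mean-square-error, and normalized cross-correlation measures used to pick the best matching block can be written as follows; the function names and the NumPy dependency are assumptions, not part of this application:

```python
import numpy as np

def sad(block, candidate):
    """Sum of absolute differences: lower means a better match."""
    return int(np.abs(block.astype(np.int32) - candidate.astype(np.int32)).sum())

def mse(block, candidate):
    """Mean squared error: lower means a better match."""
    d = block.astype(np.float64) - candidate.astype(np.float64)
    return float((d * d).mean())

def ncc(block, candidate):
    """Normalized cross-correlation: closer to 1 means a better match."""
    a = block.astype(np.float64) - block.mean()
    b = candidate.astype(np.float64) - candidate.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

An encoder would evaluate one of these measures for every candidate block in the search window and keep the candidate with the best score.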
Optionally, in another embodiment based on the foregoing method of the present application, the encoding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream includes:
and according to the area of the target matching block in the target reference frame, encoding the indication information of the area containing the target matching block and the first number of the target reference frame into the target video code stream.
Optionally, in another embodiment based on the foregoing method of the present application, the clearing data in the target reference frame except for the target matching block includes:
reserving data where the target matching block is located in the target reference frame; or, alternatively,
and reserving data of the partial region which contains the target matching block and is segmented in a preset mode in the target reference frame.
According to an aspect of the embodiments of the present application, there is provided a method for video encoding and decoding, which is applied to a decoding end, and includes:
decoding received target video stream data to obtain a target reference frame;
intercepting data of a corresponding target matching block in the target reference frame;
and decoding other frames in the target video stream by using the target matching block data.
Optionally, in another embodiment based on the foregoing method of the present application, the decoding the received target video stream data to obtain the target reference frame includes:
receiving the target video stream data to obtain video data information and coding control information;
determining the target reference frame and corresponding target matching block coding information based on the coding control information;
and extracting the target reference frame and corresponding target matching block information.
Optionally, in another embodiment based on the foregoing method of the present application, the decoding, by using the target matching block data, other frames in the target video stream includes:
determining at least one reference region from the target reference frame buffer according to the target reference frame and the motion vector value;
and combining the motion compensation value with the reference area data to obtain the reconstructed decoding data of other frames.
According to another aspect of the embodiments of the present application, there is provided an apparatus for video encoding and decoding, including:
an acquisition module configured to acquire at least one target reference frame and a corresponding at least one target matching block;
an encoding module configured to encode a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream;
a clearing module configured to clear data in the target reference frame other than the target matching block.
According to another aspect of the embodiments of the present application, there is provided an apparatus for video encoding and decoding, including:
a generating module configured to decode received target video stream data to obtain a target reference frame;
an intercepting module configured to intercept data of a corresponding target matching block in the target reference frame;
a decoding module configured to decode other frames in the target video stream using the target matching block data.
According to another aspect of the embodiments of the present application, there is provided an electronic device including:
a memory for storing executable instructions; and
a processor for communicating with the memory to execute the executable instructions so as to perform the operations of any of the above-described video encoding and decoding methods.
According to a further aspect of the embodiments of the present application, there is provided a computer-readable storage medium for storing computer-readable instructions, which when executed, perform the operations of any of the video coding and decoding methods described above.
In the application, at least one target reference frame and at least one corresponding target matching block are obtained; a first number of the target reference frame and a second number of the corresponding at least one target matching block are encoded into a target video code stream; and data in the target reference frame other than the target matching block are cleared. By applying this technical scheme, the codec retains only the region data actually used for inter-frame prediction in the determined reference frame or frames and discards the rest, thereby avoiding the problem in the related art that limited codec storage capacity prevents enough reference pictures from being cached.
The technical solution of the present application is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description, serve to explain the principles of the application.
The present application may be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of an embodiment of a video encoding and decoding method of the present application;
FIGS. 2-4 are schematic diagrams of video encoding and decoding according to the present application;
FIG. 5 is a flowchart illustrating a method for video encoding and decoding according to another embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device for video encoding and decoding according to the present application;
fig. 7 is a schematic view of an electronic device according to the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present application unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
It should be noted that all directional indications in the embodiments of the present application (such as up, down, left, right, front, and rear) are used only to explain the relative positional relationship, movement, and the like between components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indication changes accordingly.
In addition, the descriptions of "first", "second", etc. in this application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two or three, unless specifically limited otherwise.
In this application, unless expressly stated or limited otherwise, the terms "connected," "secured," and the like are to be construed broadly; for example, "secured" may be a fixed connection, a removable connection, or an integral part; may be a mechanical or electrical connection; and may be a direct connection, an indirect connection through an intervening medium, or an internal communication between two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
In addition, technical solutions between the various embodiments of the present application may be combined with each other, but it must be based on the realization of the technical solutions by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination of technical solutions should be considered to be absent and not within the protection scope of the present application.
A method for video coding according to an exemplary embodiment of the present application is described below with reference to fig. 1 to 6. It should be noted that the following application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
The application also provides a video coding and decoding method, a video coding and decoding device, a target terminal and a medium.
Fig. 1 schematically shows a flow chart of a method for video encoding and decoding according to an embodiment of the present application. As shown in fig. 1, the method includes:
s101, at least one target reference frame and at least one corresponding target matching block are obtained.
Further, in the present application, an image to be encoded may be divided into a plurality of regions to be encoded (macroblocks, CUs, or the like), and during inter-frame encoding, one or more best images among the candidate reference frames shown in fig. 2 are selected as reference frames. A candidate reference frame is generally an image frame that has been reconstructed after encoding and is temporally close to the current image to be encoded; it may be defined as a reconstructed image within a certain time of the current image to be encoded. Fig. 3 shows 3 encoding blocks, each containing a plurality of pixels. For example, the following steps may be included: traverse all candidate reference frames in temporal order from near to far; during the traversal, predict between the current region to be encoded and a region of the same size in a candidate reference frame (a candidate matching block), the offset between the two being the motion vector; calculate the difference between the candidate matching block and the predicted region to form a residual block. Within a certain search range, the best candidate matching block is determined according to a chosen criterion (sum of absolute differences, mean square error, normalized cross-correlation, etc.); this best candidate is the matching block. For the encoding of a B-frame, at least one reference frame before and one after the B-frame are needed. The determined reference frame and its matching block are recorded.
Further, in fig. 2, the reference frames a and b contain 2 (1 and 2) and 1 matching block, respectively, as reference blocks of the regions to be coded (or coding blocks) 1, 2 and 3, respectively. Obviously, for the coding or decoding of the regions 1, 2 and 3 to be coded, the reference frame a has only 2 matching blocks of significance and the data of the other regions can be ignored, while the reference frame b has only 1 matching block of significance and the data of the other regions can also be ignored.
In addition, other blocks to be encoded may need to use regions outside the above regions as reference blocks, so a further pre-judgment is needed to determine whether those regions are suitable as reference blocks. For example, when the information of a certain region can be replaced by another reference block, that region does not need to be buffered as part of the reference frame, and the portion that actually needs to be cached becomes smaller.
In one embodiment, one entire video frame may be buffered as a complete reference frame, while regions of other reference frames that can be interpolated from the complete reference frame within a certain error threshold need not be retained as matching-block regions, nor included in the matching traversal.
S102, the first number of the target reference frame and the second number of the corresponding at least one target matching block are coded into a target video code stream.
Further, the target reference frame number (ref_frame_num) and one or more matching block numbers (ref_cu_num) can be encoded into the video code stream for decoding at the decoding end. The numbers may be carried in a picture parameter set (PPS), similar to the h.264 standard, in a slice header, or as another syntax element.
In one approach, there may be multiple reference matching blocks, as shown in the following table:
ref_frame_num | ref_cu_num1 | ref_cu_num2 |
in one embodiment, a reference frame may be divided into a limited number of reference regions, such as 4 reference regions of the same size. And according to the position of the matching block, encoding the indication information of the reference area containing the matching block into the video code stream. For example, as shown in fig. 4, a region whose lower left corner occupies area 1/4 may be used as a reference region, and the region includes a plurality of motion prediction matching blocks.
Further, the encoding method of the region may be:
ref_frame_num | ref_block_num1 |
where ref_block_num1 means:
0: all;
1: the upper left corner 1/4 area;
2: the upper right corner 1/4 area;
3: the lower right corner 1/4 area.
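For illustration only, the indication table above (a reference frame number plus a region indicator) might be serialized as in the following sketch; the function names, the 8-bit field width, and the byte layout are hypothetical and are not taken from this application or any standard:

```python
def encode_region_indicator(ref_frame_num, region_code):
    """Pack a reference-frame number and a region indicator into two bytes.

    region_code follows the illustrative table above: 0 = all,
    1 = upper-left 1/4, 2 = upper-right 1/4, 3 = lower-right 1/4.
    """
    if not 0 <= region_code <= 3:
        raise ValueError("region_code must be 0..3")
    if not 0 <= ref_frame_num < 256:
        raise ValueError("ref_frame_num must fit in 8 bits")
    return bytes([ref_frame_num, region_code])

def decode_region_indicator(payload):
    """Inverse of encode_region_indicator: recover (ref_frame_num, region_code)."""
    return payload[0], payload[1]
```

In a real bitstream these fields would instead be entropy-coded syntax elements inside a parameter set or slice header, as the description notes.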
S103, clearing data except the target matching block in the target reference frame.
Alternatively, in the present application, only the data of the matching block may be retained in the reference frame buffer, without retaining the data of the entire reference frame; or only a reference region, segmented in some other manner from the part of the reference frame containing the matching block, may be retained.
For example, the application may store only reference regions formed by such segmented partial regions of reference frames containing matching blocks: the reference frame buffer queue of the encoder or decoder may hold three reference frames, of which only the first is complete, while only 1/4 of each of the other 2 reference frames is buffered.
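A minimal sketch of retaining only one quarter of a reference frame in the buffer, following the quadrant numbering of the illustrative table above (the function name and NumPy dependency are assumptions):

```python
import numpy as np

def prune_reference_frame(frame, region_code):
    """Keep only the indicated quarter of a reference frame.

    Returns the retained region and its (top, left) offset within the
    full frame, so later motion compensation can translate coordinates.
    region_code: 0 = keep all, 1 = upper-left, 2 = upper-right,
    3 = lower-right (mirroring the illustrative table above).
    """
    h, w = frame.shape[:2]
    if region_code == 0:
        return frame, (0, 0)
    hh, hw = h // 2, w // 2
    if region_code == 1:
        return frame[:hh, :hw], (0, 0)
    if region_code == 2:
        return frame[:hh, hw:], (0, hw)
    if region_code == 3:
        return frame[hh:, hw:], (hh, hw)
    raise ValueError("unknown region_code")
```

Keeping the offset alongside the cropped data is one possible way to let decoder-side motion vectors, which address the full frame, still index into the pruned buffer.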
In the application, at least one target reference frame and at least one corresponding target matching block are obtained; a first number of the target reference frame and a second number of the corresponding at least one target matching block are encoded into a target video code stream; and data in the target reference frame other than the target matching block are cleared. By applying this technical scheme, the codec retains only the region data actually used for inter-frame prediction in the determined reference frame or frames and discards the rest, thereby avoiding the problem in the related art that limited codec storage capacity prevents enough reference pictures from being cached.
Optionally, in an embodiment of the present application, in an embodiment of obtaining at least one target reference frame and at least one corresponding target matching block, the following may be obtained:
sequentially acquiring candidate matching blocks of at least one candidate reference frame according to a time sequence;
predicting between the current region to be coded and a candidate matching block of the same size in the candidate reference frame, and determining a first motion vector value between the current region to be coded and the candidate matching block;
calculating a difference value between the current region to be coded and the candidate matching block to obtain a candidate residual block;
and determining at least one target matching block corresponding to the target reference frame based on the candidate residual block.
Optionally, in the present application, at least one target matching block corresponding to a target reference frame may be determined from the candidate residual blocks based on a preset sum-of-absolute-differences criterion, a preset mean-square-error criterion, or a preset normalized cross-correlation function.
Further, one or more candidate reference frames may be selected as the optimal target reference frame or frames of the region to be encoded (macroblock, CU, or the like). The steps may include: traversing all candidate reference frames in temporal order from near to far; during the traversal, predicting between the current region to be coded and a region of the same size in a candidate reference frame (a candidate matching block), the offset between the two being the motion vector.
Furthermore, the present application also calculates the difference between the candidate matching block and the predicted region to form a candidate residual block. Within a certain search range, the best candidate matching block is determined according to a chosen criterion (sum of absolute differences, mean square error, normalized cross-correlation, etc.); this best candidate is the matching block. For the encoding of a B-frame, at least one target reference frame before and one after the B-frame are needed. The determined target reference frame and its matching block are recorded.
Optionally, the video stream data includes multiple frames of images, each representing a still image. In actual video data, various algorithms are used to reduce the data size, distinguishing key frames (I-frames, intra pictures) from non-key frames (P-frames). The I-frame is a key frame: an intra-coded image that compresses the transmitted data by removing as much spatial redundancy as possible, and thus a completely self-contained compressed picture. The P-frame is a forward-predicted frame indicating the difference between this frame and a previous key frame (or P-frame); it can be understood that P-frames are compressed data generated with reference to I-frames.
Further optionally, since the I-frame is the first frame of each GOP (group of pictures, as used in MPEG video compression) and serves as the reference frame of the P-frame, the predicted value and motion vector of an object are found from the I-frame, yielding a picture carried as a prediction residual plus a motion vector. The target video data can therefore be segmented into a number of block video data based on the key I-frames and non-key P-frames.
Optionally, in an implementation manner of the present application, in an implementation manner that a first number of a target reference frame and a second number of at least one corresponding target matching block are encoded into a target video code stream, the first number and the second number may be obtained by:
and according to the area of the target matching block in the target reference frame, encoding the indication information of the area containing the target matching block and the first number of the target reference frame into the target video code stream.
In the application, the reference frame number and one or more matching block numbers can be encoded into the video code stream for decoding at the decoding end. The numbers may be carried in a picture parameter set (PPS), similar to the h.264 standard, in a slice header, or as another syntax element.
In one embodiment, a reference frame may be divided into a limited number of reference regions, such as 4 reference regions of the same size. And according to the position of the matching block, encoding the indication information of the reference area containing the matching block into the video code stream.
In addition, the data of the matching block in the reference frame can be stored, and only the data of the matching block is reserved in the reference frame buffer area, and the data of the whole reference frame is not required to be maintained. Or only the reference region formed by the part of the reference frame containing the matching block which is cut in other ways.
Optionally, in an embodiment of the present application, the data in the target reference frame other than the target matching block may be cleared by:
retaining the data where the target matching block is located in the target reference frame; or
retaining the data of the partial region of the target reference frame that contains the target matching block and is segmented in a preset manner.
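The retention step above can be sketched as follows; the 2-D pixel list and the `keep_matching_block` helper are toy stand-ins for a real reference frame buffer.

```python
# Hypothetical sketch: retain only the data where the target matching block
# is located and clear the rest of the reference frame, modeling the frame
# as a plain 2-D list of pixel values. A real codec would more likely store
# a cropped region plus its offset instead of zeroing in place.
def keep_matching_block(frame, x, y, w, h):
    kept = [[0] * len(frame[0]) for _ in frame]
    for r in range(y, y + h):
        for c in range(x, x + w):
            kept[r][c] = frame[r][c]
    return kept

frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(keep_matching_block(frame, 1, 1, 2, 2))
# only the 2x2 block at (1, 1) survives; everything else is cleared to 0
```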
The application also provides a video coding and decoding method, a video coding and decoding device, a target terminal and a medium.
Fig. 5 schematically shows a flow chart of a method for video encoding and decoding according to an embodiment of the present application. As shown in fig. 5, the method, which is applied to the decoding end, includes:
S201, decoding the received target video stream data to obtain a target reference frame.
Further, the decoding end in the present application may first decode the reconstructed reference frame and obtain the information related to the matching block. Specifically, the decoder may receive a video stream, which includes video data information and encoding control information. It is determined whether the current frame is a reference frame; if so, the information related to the reference frame and the matching block is extracted from the video stream.
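The decoder-side check just described can be sketched as follows; the stream record layout and the `extract_reference_info` helper are invented for illustration and do not reflect a real bitstream syntax.

```python
# Hypothetical sketch of the decoder-side check described above: walk the
# received stream records and, for each frame flagged as a reference frame,
# extract its matching-block information. The dict-based record layout is
# invented purely for illustration.
def extract_reference_info(records):
    info = {}
    for rec in records:
        if rec.get("is_reference"):
            info[rec["frame_no"]] = rec["matching_blocks"]
    return info

stream = [
    {"frame_no": 0, "is_reference": True, "matching_blocks": [3, 7]},
    {"frame_no": 1, "is_reference": False},
]
print(extract_reference_info(stream))  # -> {0: [3, 7]}
```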
S202, intercepting the data of the corresponding target matching block in the target reference frame.
Further, only the data of the matching block may be retained in the reference frame buffer, without maintaining the data of the entire reference frame. Alternatively, only a reference region formed by cutting out, in some other manner, the part of the reference frame that contains the matching block may be retained. This operation is consistent with the corresponding operation at the encoding end. All or part of a plurality of reference frames may be held in the reference frame buffer.
And S203, decoding other frames in the target video stream by using the target matching block data.
Further, the decoder receives the control information to be decoded from the code stream and judges whether inter-frame coding is used. If inter-frame coding is used, one or more reference regions are determined from the reference frame buffer according to the received reference frame and motion vector information. In some embodiments, the reconstructed decoded data is obtained by adding the motion compensation values to the reference region data.
In some embodiments, there are multiple reference frames (for example, for B frames), which require a linear combination over the multiple reference frames, plus the motion compensation values, to reconstruct the decoded data. For example, in some embodiments, the predicted value for a B frame may be (preceding reference frame pixel value + following reference frame pixel value)/2. In some embodiments, deblocking, loop filtering, and similar operations are also required.
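The two reconstruction rules above can be sketched as follows; the flat pixel lists and the helper names are illustrative assumptions, not a codec API.

```python
# Hypothetical sketch of the reconstruction rules above, on flat lists of
# pixel values. P-frame style: reconstructed = reference region data +
# motion compensation (residual) values. B-frame style: predicted value =
# (preceding reference pixel + following reference pixel) / 2.
def reconstruct_p(ref_region, compensation):
    return [r + c for r, c in zip(ref_region, compensation)]

def predict_b(prev_ref, next_ref):
    return [(p + n) / 2 for p, n in zip(prev_ref, next_ref)]

print(reconstruct_p([10, 20, 30], [1, -2, 3]))  # -> [11, 18, 33]
print(predict_b([10, 20], [20, 40]))            # -> [15.0, 30.0]
```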
In the present application, at least one target reference frame and at least one corresponding target matching block can be obtained, the first number of the target reference frame and the second number of the at least one corresponding target matching block are encoded into a target video code stream, and the data in the target reference frame other than the target matching block are cleared. By applying this technical solution, the codec can retain only the region data in the determined one or more reference frames that is actually used for inter-frame prediction, and discard the other data. This avoids the problem in the related art that, due to the limited storage capacity of the codec, not enough reference pictures can be buffered.
Optionally, in an embodiment of the present application, decoding received target video stream data to obtain a target reference frame includes:
receiving target video stream data to obtain video data information and coding control information;
determining a target reference frame and corresponding target matching block coding information based on the coding control information;
and extracting the target reference frame and the corresponding target matching block information.
Further optionally, decoding other frames in the target video stream by using the target matching block data includes:
determining at least one reference region from a target reference frame buffer according to the target reference frame and the motion vector value;
and combining the motion compensation value with the reference area data to obtain the reconstructed decoding data of other frames.
Further, in the present application, the decoder receives the video code stream, and decodes and reconstructs the image frames. Whether the currently decoded and reconstructed image frame is a reference frame of the current frame to be decoded is judged according to the indication in the code stream. The reference region needing motion decoding is then determined according to the matching block information indicated in the code stream (the number of the matching block or the number of the reference frame block).
In addition, only the data of the matching block may be retained in the reference frame buffer, without keeping the data of the whole reference frame. Alternatively, only a reference region formed by cutting out, in some other manner, the part of the reference frame that contains the matching block may be retained. The other frames are then decoded using the matching block data.
In another embodiment of the present application, as shown in fig. 6, the present application further provides an apparatus for video encoding and decoding, which includes an obtaining module 401, an encoding module 402, and a clearing module 403, wherein,
an obtaining module 401 configured to obtain at least one target reference frame and at least one corresponding target matching block;
an encoding module 402 configured to encode a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream;
a clearing module 403 configured to clear data in the target reference frame except for the target matching block.
In the present application, at least one target reference frame and at least one corresponding target matching block can be obtained, the first number of the target reference frame and the second number of the at least one corresponding target matching block are encoded into a target video code stream, and the data in the target reference frame other than the target matching block are cleared. By applying this technical solution, the codec can retain only the region data in the determined one or more reference frames that is actually used for inter-frame prediction, and discard the other data. This avoids the problem in the related art that, due to the limited storage capacity of the codec, not enough reference pictures can be buffered.
Optionally, in another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to sequentially obtain candidate matching blocks of at least one candidate reference frame in a time order;
an obtaining module 401 configured to predict a current region to be coded and a candidate matching block of the same size in the candidate reference frame, and determine a first motion vector value of the current region to be coded and the candidate matching block;
an obtaining module 401 configured to calculate a difference between the candidate matching block and the first motion vector value to obtain a candidate residual block;
an obtaining module 401 configured to determine at least one target matching block corresponding to the target reference frame based on the candidate residual block.
In another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to determine at least one target matching block corresponding to the target reference frame from the candidate residual blocks based on a preset standard absolute error value, a preset mean square error value, and a normalized cross-correlation function.
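The three matching criteria named above (standard absolute error, i.e. sum of absolute differences, mean square error, and normalized cross-correlation) can be sketched as follows; the flat-list block representation and the preset thresholds are simplifying assumptions, not values from this application.

```python
import math

# Hypothetical sketch of the three block-matching criteria named above,
# computed between two equally sized blocks given as flat pixel lists.
def sad(a, b):   # standard absolute error / sum of absolute differences
    return sum(abs(x - y) for x, y in zip(a, b))

def mse(a, b):   # mean square error
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def ncc(a, b):   # normalized cross-correlation
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

block, cand = [1, 2, 3, 4], [1, 2, 4, 4]
print(sad(block, cand), mse(block, cand), round(ncc(block, cand), 4))
# -> 1 0.25 0.9905
```

A candidate would then count as a target matching block when, for example, its SAD and MSE fall below preset thresholds and its NCC exceeds one; the exact thresholds are left to the implementation.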
In another embodiment of the present application, the obtaining module 401 further includes:
the obtaining module 401 is configured to encode, according to the area of the target matching block in the target reference frame, the indication information including the area of the target matching block and the first number of the target reference frame into the target video code stream.
In another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to retain data where the target matching block is located in the target reference frame; or
the obtaining module 401 is configured to retain data of a partial region, which includes the target matching block and is segmented in a preset manner, in the target reference frame.
In another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to decode received target video stream data to obtain a target reference frame;
an obtaining module 401 configured to intercept data of a corresponding target matching block in the target reference frame;
an obtaining module 401 configured to decode other frames in the target video stream by using the target matching block data.
In another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to receive the target video stream data, and obtain video data information and encoding control information;
an obtaining module 401 configured to determine the target reference frame and corresponding target matching block coding information based on the coding control information;
an obtaining module 401 configured to extract the target reference frame and corresponding target matching block information.
In another embodiment of the present application, the obtaining module 401 further includes:
an obtaining module 401 configured to determine at least one reference region from the target reference frame buffer according to the target reference frame and a motion vector value;
an obtaining module 401 configured to combine the motion compensation value with the reference region data to obtain the reconstructed decoded data of the other frames.
According to another aspect of the embodiments of the present application, there is provided an apparatus for video encoding and decoding, applied to a decoding end, including:
a generating module configured to decode the received target video stream data to obtain a target reference frame;
an intercepting module configured to intercept data of a corresponding target matching block in the target reference frame;
a decoding module configured to decode other frames in the target video stream using the target matching block data.
Fig. 7 is a block diagram illustrating a logical structure of an electronic device in accordance with an exemplary embodiment. For example, the electronic device 300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium, such as a memory, including instructions executable by a processor of an electronic device to perform the video encoding and decoding method described above, the method comprising: acquiring at least one target reference frame and at least one corresponding target matching block; coding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream; and clearing data except the target matching block in the target reference frame. Optionally, the instructions may also be executable by a processor of the electronic device to perform other steps involved in the exemplary embodiments described above. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided an application/computer program product including one or more instructions executable by a processor of an electronic device to perform the video encoding and decoding method described above, the method comprising: acquiring at least one target reference frame and at least one corresponding target matching block; coding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream; and clearing data except the target matching block in the target reference frame. Optionally, the instructions may also be executable by a processor of the electronic device to perform other steps involved in the exemplary embodiments described above.
Fig. 7 is an exemplary diagram of the computer device 30. Those skilled in the art will appreciate that Fig. 7 is merely an example of the computer device 30 and does not constitute a limitation of the computer device 30, which may include more or fewer components than those shown, combine certain components, or use different components; for example, the computer device 30 may also include input/output devices, network access devices, buses, etc.
The Processor 302 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The general purpose processor may be a microprocessor or the processor 302 may be any conventional processor or the like, the processor 302 being the control center for the computer device 30 and connecting the various parts of the overall computer device 30 using various interfaces and lines.
The modules integrated by the computer device 30 may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by hardware related to computer readable instructions, which may be stored in a computer readable storage medium, and when the computer readable instructions are executed by a processor, the steps of the method embodiments may be implemented.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (12)
1. A method for video encoding and decoding, applied to an encoding end, comprising the following steps:
acquiring at least one target reference frame and at least one corresponding target matching block;
coding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream;
and clearing data except the target matching block in the target reference frame.
2. The method of claim 1, wherein said obtaining at least one target reference frame and a corresponding at least one target matching block comprises:
sequentially acquiring candidate matching blocks of at least one candidate reference frame according to a time sequence;
predicting a current region to be coded and a candidate matching block with the same size in the candidate reference frame, and determining a first motion vector value of the current region to be coded and the candidate matching block;
calculating a difference value between the candidate matching block and the first motion vector value to obtain a candidate residual block;
and determining at least one target matching block corresponding to the target reference frame based on the candidate residual block.
3. The method of claim 2, wherein said determining at least one target matching block corresponding to the target reference frame based on the candidate residual block comprises:
and determining at least one target matching block corresponding to the target reference frame from the candidate residual blocks based on a preset standard absolute error value, a preset mean square error value and a preset normalized cross-correlation function.
4. The method of claim 1, wherein encoding a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video bitstream comprises:
and according to the area of the target matching block in the target reference frame, encoding the indication information of the area containing the target matching block and the first number of the target reference frame into the target video code stream.
5. The method of claim 1, wherein said removing data in said target reference frame other than said target matching block comprises:
retaining data where the target matching block is located in the target reference frame; or
and reserving data of the partial region which contains the target matching block and is segmented in a preset mode in the target reference frame.
6. A method for video encoding and decoding, applied to a decoding end, comprising the following steps:
decoding received target video stream data to obtain a target reference frame;
intercepting data of a corresponding target matching block in the target reference frame;
and decoding other frames in the target video stream by using the target matching block data.
7. The method of claim 6, wherein said decoding the received target video stream data to obtain the target reference frame comprises:
receiving the target video stream data to obtain video data information and coding control information;
determining the target reference frame and corresponding target matching block coding information based on the coding control information;
and extracting the target reference frame and corresponding target matching block information.
8. The method of claim 6, wherein said decoding other frames in the target video stream using the target match block data comprises:
determining at least one reference region from the target reference frame buffer according to the target reference frame and the motion vector value;
and combining the motion compensation value with the reference area data to obtain the reconstructed decoding data of other frames.
9. An apparatus for video encoding and decoding, applied to an encoding end, comprising:
an acquisition module configured to acquire at least one target reference frame and a corresponding at least one target matching block;
an encoding module configured to encode a first number of the target reference frame and a second number of the corresponding at least one target matching block into a target video code stream;
a clearing module configured to clear data in the target reference frame other than the target matching block.
10. An apparatus for video encoding and decoding, applied to a decoding end, comprising:
the generating module is configured to decode the received target video stream data to obtain a target reference frame;
an intercepting module configured to intercept data of a corresponding target matching block in the target reference frame;
a decoding module configured to decode other frames in the target video stream using the target matching block data.
11. An electronic device, comprising:
a memory for storing executable instructions; and
a processor for communicating with the memory to execute the executable instructions so as to perform the operations of the method of video encoding and decoding as claimed in any one of claims 1-8.
12. A computer-readable storage medium storing computer-readable instructions that, when executed, perform the operations of the method of video coding as claimed in any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011367747.2A CN112203090A (en) | 2020-11-30 | 2020-11-30 | Video encoding and decoding method and device, electronic equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011367747.2A CN112203090A (en) | 2020-11-30 | 2020-11-30 | Video encoding and decoding method and device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112203090A true CN112203090A (en) | 2021-01-08 |
Family
ID=74033616
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011367747.2A Pending CN112203090A (en) | 2020-11-30 | 2020-11-30 | Video encoding and decoding method and device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112203090A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6970510B1 (en) * | 2000-04-25 | 2005-11-29 | Wee Susie J | Method for downstream editing of compressed video |
CN101360237A (en) * | 2008-08-13 | 2009-02-04 | 北京中星微电子有限公司 | Reference frame processing method, video decoding method and apparatus |
CN101742289A (en) * | 2008-11-14 | 2010-06-16 | 北京中星微电子有限公司 | Method, system and device for compressing video code stream |
CN103079072A (en) * | 2013-01-15 | 2013-05-01 | 清华大学 | Inter-frame prediction method, encoding equipment and decoding equipment |
CN106385585A (en) * | 2016-09-14 | 2017-02-08 | 苏睿 | Frame coding and decoding method, device and system |
CN109688407A (en) * | 2017-10-18 | 2019-04-26 | 北京金山云网络技术有限公司 | Reference block selection method, device, electronic equipment and the storage medium of coding unit |
CN110636312A (en) * | 2019-09-27 | 2019-12-31 | 腾讯科技(深圳)有限公司 | Video encoding and decoding method and device and storage medium |
CN111698500A (en) * | 2019-03-11 | 2020-09-22 | 杭州海康威视数字技术股份有限公司 | Encoding and decoding method, device and equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11496732B2 (en) | Video image encoding and decoding method, apparatus, and device | |
WO2019192152A1 (en) | Method and device for obtaining motion vector of video image | |
US20180367811A1 (en) | Method and an apparatus for processing a video signal | |
JP6293788B2 (en) | Device and method for scalable coding of video information based on high efficiency video coding | |
KR101571341B1 (en) | Methods and apparatus for implicit block segmentation in video encoding and decoding | |
KR101999091B1 (en) | Video encoding and decoding with improved error resilience | |
JP2019115060A (en) | Encoder, encoding method, decoder, decoding method and program | |
US6757330B1 (en) | Efficient implementation of half-pixel motion prediction | |
US20140064369A1 (en) | Video encoding and decoding | |
KR101390620B1 (en) | Power efficient motion estimation techniques for video encoding | |
US9473790B2 (en) | Inter-prediction method and video encoding/decoding method using the inter-prediction method | |
CN113852815B (en) | Video encoding method, apparatus and medium using triangle shape prediction unit | |
WO2017005141A1 (en) | Method for encoding and decoding reference image, encoding device, and decoding device | |
US20210120264A1 (en) | Affine motion prediction-based image decoding method and device using affine merge candidate list in image coding system | |
US9210447B2 (en) | Method and apparatus for video error concealment using reference frame selection rules | |
CN107105255B (en) | Method and device for adding label in video file | |
US20210368163A1 (en) | Method for reference picture processing in video coding | |
CN111372088B (en) | Video coding method, video coding device, video coder and storage device | |
US20140169476A1 (en) | Method and Device for Encoding a Sequence of Images and Method and Device for Decoding a Sequence of Image | |
JPH07143494A (en) | Coding method for moving image | |
CN112203090A (en) | Video encoding and decoding method and device, electronic equipment and medium | |
CN109672889B (en) | Method and device for constrained sequence data headers | |
CN114071159B (en) | Inter prediction method, encoder, decoder, and computer-readable storage medium | |
KR100968808B1 (en) | Variable length code decoding system and decoding method thereof | |
JP2003179931A (en) | Device, method and program for encoding moving image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210108 |