CN111669600A - Video coding method, video coding device, video coder and storage device - Google Patents

Video coding method, video coding device, video coder and storage device

Info

Publication number
CN111669600A
CN111669600A (application number CN202010507153.0A)
Authority
CN
China
Prior art keywords
frame
image
image group
pixel
background frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010507153.0A
Other languages
Chinese (zh)
Other versions
CN111669600B (en)
Inventor
江东
方诚
曾飞洋
林聚财
殷俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202010507153.0A
Publication of CN111669600A
Application granted
Publication of CN111669600B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/56: Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/573: Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction

Abstract

The application discloses a video encoding method, a video encoding device, a video encoder and a storage device. The video encoding method includes: acquiring a first image group of a video to be encoded, where the first image group comprises at least one frame of image; encoding the first image group and acquiring a first background frame from the reconstructed frames of the first image group; encoding at least one frame of image after the first image group according to the first background frame; acquiring an updated background frame based on a number of encoded images after the first image group; and encoding at least one frame of the subsequent unencoded images according to the updated background frame. By selecting or generating a background frame and then encoding subsequent images on the basis of that background frame, coding performance is improved.

Description

Video coding method, video coding device, video coder and storage device
Technical Field
The present application relates to the field of video encoding and decoding, and in particular, to a video encoding method, an apparatus, an encoder, and a storage apparatus.
Background
Because video images contain a large amount of data, they must be encoded and decoded whenever they are exchanged. The main function of video encoding is to compress video pixel data (RGB, YUV, etc.) into a video bitstream, thereby reducing the data volume of the video, the network bandwidth needed for transmission, and the required storage space.
A video coding system mainly comprises video acquisition, prediction, transform and quantization, and entropy coding. Prediction consists of an intra-frame prediction part and an inter-frame prediction part, which remove the spatial and temporal redundancy of video images, respectively.
Application scenarios such as video surveillance are mostly static scenes, yet conventional video coding methods typically encode a large amount of redundant background information, so there is room to further improve the compression efficiency of such scene videos.
Disclosure of Invention
The application at least provides a video coding method, a video coding device, a video coder and a storage device.
A first aspect of the present application provides a video encoding method, including: acquiring a first image group of a video to be coded, wherein the first image group comprises at least one frame of image;
coding the first image group, and acquiring a first background frame from a reconstructed frame of the first image group;
encoding at least one frame image after the first image group according to the first background frame;
acquiring an updated background frame based on a number of encoded images after the first image group;
and encoding at least one frame image in the subsequent uncoded images according to the updated background frame.
A second aspect of the present application provides a video encoding apparatus comprising:
an acquisition module, used for acquiring a first image group of a video to be encoded, wherein the first image group comprises at least one frame of image;
a background frame selection module, used for encoding the first image group and acquiring a first background frame from a reconstructed frame of the first image group, and further used for encoding at least one frame of image after the first image group according to the first background frame and acquiring an updated background frame based on a number of encoded images after the first image group;
and the coding module is used for coding at least one frame of image in the subsequent uncoded images according to the updated background frame.
A third aspect of the present application provides an encoder comprising a processor, a memory coupled to the processor, wherein the memory stores program instructions for implementing the method of the first aspect; the processor is configured to execute the program instructions stored by the memory to encode the video to be encoded.
A fourth aspect of the present application provides a storage device storing program instructions capable of implementing the method of the first aspect.
According to the above scheme, the video encoding device acquires a first image group of a video to be encoded, where the first image group comprises at least one frame of image; encodes the first image group and acquires a first background frame from the reconstructed frames of the first image group; encodes at least one frame of image after the first image group according to the first background frame; acquires an updated background frame based on a number of encoded images after the first image group; and encodes at least one frame of the subsequent unencoded images according to the updated background frame. By selecting or generating a background frame and then encoding subsequent images on the basis of that background frame, coding performance is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart illustrating an embodiment of a video encoding method provided in the present application;
FIG. 2 is a flowchart illustrating an embodiment of step S14 in FIG. 1;
FIG. 3 is a schematic diagram illustrating another embodiment of step S14 in FIG. 1;
FIG. 4 is a schematic diagram illustrating a detailed flowchart of another embodiment of step S14 in FIG. 1;
FIG. 5 is a schematic diagram of a detailed flowchart of another embodiment of step S14 in FIG. 1;
FIG. 6 is a block diagram of an embodiment of a frame reference provided herein;
FIG. 7 is a block diagram of an embodiment of a video encoding apparatus provided in the present application;
FIG. 8 is a schematic block diagram of an embodiment of an encoder provided herein;
fig. 9 is a schematic structural diagram of an embodiment of a memory device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and "third" in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any indication of the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. All directional indications (such as up, down, left, right, front, and rear … …) in the embodiments of the present application are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indication is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments. It should be noted that, in the following method examples, the method of the present application is not limited to the flow sequence shown in the drawings if the results are substantially the same.
The following describes embodiments of the present application.
Referring to fig. 1, fig. 1 is a flowchart illustrating a video encoding method according to an embodiment of the present disclosure. As shown in fig. 1, the video encoding method includes the steps of:
S11: A first image group of the video to be encoded is acquired, where the first image group comprises at least one frame of image.
The video encoding device acquires the first image group of the video to be encoded, i.e., the first group of images in video-timestamp order. Taking the IPPP coding structure as an example, the first image group consists of the first I frame and the following n1 P frames (n1 >= 0). In some possible embodiments, the video encoding apparatus may take a preset number of frames, at least one, starting from the first frame of the video to be encoded to form the first image group.
S12: the first image group is encoded, and a first background frame is acquired from the reconstructed frame of the first image group.
Here the video encoding device encodes the pictures in the first image group in a conventional manner; for example, it first intra-codes the first I frame and then inter-codes each following P frame with reference to the previously encoded frame.
The method of acquiring the background frame does not extract a background image from a reconstructed frame; instead, a complete reconstructed frame is used as the background frame. Specifically, the first background frame in this step may be generated by, but not limited to, the following methods:
(1) The first frame image of the first image group is selected directly; for example, the first I frame is intra-coded and its reconstructed frame is used as the first background frame. In this case the first image group contains only the first frame image.
(2) If the first image group contains an I frame followed by n2 P frames (n2 > 0), the weighted average of the reconstructed frames of the first I frame and those n2 P frames is used as the first background frame.
For example, in an embodiment of the present disclosure, the first background frame is obtained by a weighted average of the reconstructed frames of the first I frame and the next two P frames, with a weight of 1 for the I frame, 1 for the first P frame, and 2 for the second P frame. Each pixel of the first background frame is a weighted combination of the co-located pixels in the I frame and the two P frames, where co-located pixels are pixels with the same coordinates in each image. The weighted average is computed as:
Pix_bg(x, y) = (1 * Pix_I(x, y) + 1 * Pix_P1(x, y) + 2 * Pix_P2(x, y)) / 4
where Pix_bg(x, y) is the pixel at position (x, y) in the first background frame, Pix_I(x, y) is the pixel at (x, y) in the I frame, Pix_P1(x, y) is the pixel at (x, y) in the first P frame, and Pix_P2(x, y) is the pixel at (x, y) in the second P frame.
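Purely as an illustration of the arithmetic above, the weighted average could be computed over whole reconstructed frames as in the following sketch (numpy is assumed; the function and variable names are illustrative, not part of the patent):

```python
import numpy as np

def weighted_background(recon_frames, weights):
    """Weighted average of reconstructed frames, e.g. [I, P1, P2] with weights [1, 1, 2]."""
    acc = np.zeros_like(recon_frames[0], dtype=np.float64)
    for frame, weight in zip(recon_frames, weights):
        acc += weight * frame.astype(np.float64)
    bg = acc / float(sum(weights))                     # e.g. (1*I + 1*P1 + 2*P2) / 4
    return np.clip(np.rint(bg), 0, 255).astype(np.uint8)

# Example: first background frame from the reconstructed I frame and the next two P frames.
# first_bg = weighted_background([recon_I, recon_P1, recon_P2], weights=[1, 1, 2])
```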
S13: At least one frame of image after the first image group is encoded according to the first background frame.
S14: an updated background frame is acquired based on a number of encoded images following the first group of images.
In a surveillance scenario, the background of consecutive images captured by a camera at a fixed angle is almost unchanged. However, a few background pixels in the captured video still change under the influence of the surrounding environment, so when encoding subsequent images the video encoding apparatus needs to adjust the background pixels of the first background frame in time.
Alternatively, when the surveillance scene changes completely, for example when the camera is turned to another direction, the background frame needs to be refreshed entirely. The difference between a background frame refresh and a background frame pixel adjustment is that a refresh replaces the whole background frame with a new frame, whereas a pixel adjustment modifies only part of the pixels of the original background frame. Both refresh and pixel adjustment are forms of background frame updating.
An encoded image here refers to the reconstructed image obtained by encoding the current image.
In this regard, the background frame may be processed in the following manner:
A. When the adjustment is based on the motion vectors of pixel blocks, step S14 may consist of the sub-steps of the embodiment shown in fig. 2 or of the embodiment shown in fig. 3. As shown in fig. 2, step S14 includes the following sub-steps:
S21: Each of a number of encoded images after the first image group is divided into pixel blocks.
The several encoded images after the first image group are obtained as follows: the video encoding device takes M (M >= 1) consecutive encoded frames after the first image group as a second image group, where the second image group does not overlap the first image group. Specifically, non-overlapping means that i frames (i >= 0) lie between the first image group and the second image group.
S22: motion vectors for a number of blocks of pixels are obtained.
The video encoding device performs a block-based search over each frame in the second image group and obtains the MV (motion vector) information of each minimum prediction block. The minimum prediction block refers to the smallest prediction unit into which a picture can be divided.
S23: When the motion vectors of the co-located pixel blocks in the several encoded images after the first image group are all smaller than a first preset threshold, the updated background frame is acquired according to the pixel values of those co-located pixel blocks.
When the motion vector of a pixel block is smaller than the first preset threshold TH1, the pixel block is considered to belong to the background.
Specifically, the background frame pixel adjustment operates as follows: when the motion vector of the minimum prediction block at the current spatial position exists in every reconstructed frame of the second image group and is smaller than the first preset threshold TH1 in all of them, the reconstructed pixel values of that minimum prediction block in the last frame of the second image group are copied into the co-located minimum prediction block of the background frame before the update (for example, the first background frame), yielding the updated background frame. Alternatively, under the same condition, the pixel values of the blocks at the current spatial position across all reconstructed frames of the second image group are weighted-averaged and then copied into the co-located minimum prediction block of the background frame before the update.
A co-located block is a pixel block having the same spatial coordinates in different frames.
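A minimal sketch of the pixel-adjustment branch just described, assuming the per-block motion-vector magnitudes of each frame in the second image group are already available (block size, array layout and all names here are assumptions made for illustration):

```python
import numpy as np

def adjust_background_by_mv(bg, recon_frames, mv_mags, th1, block=8, use_average=False):
    """Copy into the background those blocks whose motion stays below TH1 in every frame.

    bg:           H x W background frame before the update.
    recon_frames: reconstructed frames of the second image group.
    mv_mags:      per-frame (H//block) x (W//block) arrays of MV magnitudes per block.
    """
    updated = bg.copy()
    h, w = bg.shape[:2]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            mags = [m[by // block, bx // block] for m in mv_mags]
            if all(mag < th1 for mag in mags):           # block is background in every frame
                if use_average:                           # average-and-copy variant
                    stack = np.stack([f[by:by + block, bx:bx + block] for f in recon_frames])
                    updated[by:by + block, bx:bx + block] = stack.mean(axis=0).astype(bg.dtype)
                else:                                     # copy from the last frame of the group
                    updated[by:by + block, bx:bx + block] = recon_frames[-1][by:by + block, bx:bx + block]
    return updated
```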
In some possible embodiments, when the video to be encoded contains further image groups, such as a third image group following the second image group, the video encoding apparatus acquires a new updated background frame from the reconstructed frames of the third image group once all of them are available, in the manner described in step S23, and so on for any later image groups. The interval between two adjacent image groups may be a fixed uniform interval or may vary, and the numbers of frames in the second image group and in all subsequent image groups may be the same or different.
As shown in fig. 3, step S14 specifically includes the following sub-steps:
S31: Each frame currently being encoded after the first image group is divided into pixel blocks.
S32: motion vectors for a number of blocks of pixels are obtained.
The video encoding device performs a block-based search over the frame currently being encoded after the first image group and obtains the MV information of each minimum prediction block.
S33: When the motion vectors of a preset proportion of the pixel blocks in the frame currently being encoded after the first image group are all greater than or equal to the first preset threshold, the reconstructed frame of the current frame is taken as the updated background frame.
When the motion vector of a pixel block is greater than or equal to the first preset threshold TH1, the pixel block is considered to belong to the foreground.
Specifically, the background frame refresh operates as follows: when, while encoding a current frame, a preset proportion num% of its pixel blocks have motion vectors greater than or equal to the first preset threshold TH1, the background frame in use must be refreshed entirely, where num% denotes the proportion, among all pixel blocks, of blocks whose motion vector is greater than or equal to TH1. Refreshing the background frame means discarding the original background frame and taking the reconstructed frame of the current frame as the updated background frame.
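The refresh decision itself is a simple proportion test; a sketch under the same assumptions as the previous example (the num_percent parameter and the MV-magnitude layout are illustrative):

```python
import numpy as np

def should_refresh_by_mv(mv_mags_current, th1, num_percent):
    """True when at least num_percent of the current frame's blocks have an MV magnitude >= TH1."""
    moving = int(np.count_nonzero(mv_mags_current >= th1))
    return moving / mv_mags_current.size >= num_percent / 100.0

# if should_refresh_by_mv(mv_mags, th1=TH1, num_percent=50):
#     background = current_reconstructed_frame   # discard the old background entirely
```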
B. When the adjustment is based on the pixel values of pixel blocks, step S14 may consist of the sub-steps of the embodiment shown in fig. 4 or of the embodiment shown in fig. 5. As shown in fig. 4, step S14 includes the following sub-steps:
S41: Each of a number of encoded images after the first image group is divided into pixel blocks.
The several encoded images after the first image group are obtained as follows: the video encoding device takes M (M >= 1) consecutive encoded frames after the first image group as a second image group, where the second image group does not overlap the first image group. Specifically, non-overlapping means that i frames (i >= 0) lie between the first image group and the second image group.
The video encoding device divides each frame of the second image group, and the background frame prior to the update, into pixel blocks of size w × h.
S42: The pixel difference between each pixel block of every frame in the several encoded images after the first image group and its co-located pixel block in the background frame prior to the update is calculated.
The video encoding device computes, for each pixel block of every reconstructed frame of the second image group, the pixel difference against the co-located pixel block in the background frame prior to the update; the difference may be measured with SAD, SATD or a similar metric. The smaller the pixel difference, the more similar the pixel block is to its co-located block.
S43: When, for every frame in the several encoded images after the first image group, the pixel differences between the pixel block at a given position and its co-located block in the background frame prior to the update are all smaller than a second preset threshold, the updated background frame is acquired according to the pixel values of the pixel blocks at that position.
When the pixel difference between a pixel block and its co-located block is smaller than the second preset threshold TH2, the pixel block is considered to belong to the background.
Specifically, the background frame pixel adjustment operates as follows: when the pixel differences between the block at the current spatial position in every reconstructed frame of the second image group and its co-located block are all smaller than the second preset threshold TH2, the pixel values of that block in the last frame of the second image group are copied into the co-located block of the background frame prior to the update. Alternatively, under the same condition, the blocks at the current spatial position across all reconstructed frames of the second image group are weighted-averaged and then copied into the co-located block of the background frame prior to the update.
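A corresponding sketch for the pixel-difference branch, with SAD as the metric (SATD or another metric could be substituted; names and block layout are again illustrative):

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def adjust_background_by_sad(bg, recon_frames, th2, block=8, use_average=False):
    """Update background blocks whose SAD against the background stays below TH2 in every frame."""
    updated = bg.copy()
    h, w = bg.shape[:2]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ref = bg[by:by + block, bx:bx + block]
            diffs = [sad(f[by:by + block, bx:bx + block], ref) for f in recon_frames]
            if all(d < th2 for d in diffs):               # block is background in every frame
                if use_average:
                    stack = np.stack([f[by:by + block, bx:bx + block] for f in recon_frames])
                    updated[by:by + block, bx:bx + block] = stack.mean(axis=0).astype(bg.dtype)
                else:
                    updated[by:by + block, bx:bx + block] = recon_frames[-1][by:by + block, bx:bx + block]
    return updated
```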
In some possible embodiments, when the video to be encoded contains further image groups, such as a third image group following the second image group, the video encoding apparatus acquires a new updated background frame from the reconstructed frames of the third image group once all of them are available, in the manner described in step S43, and so on for any later image groups. The interval between two adjacent image groups may be a fixed uniform interval or may vary, and the numbers of frames in the second image group and in all subsequent image groups may be the same or different.
As shown in fig. 5, step S14 specifically includes the following sub-steps:
S51: Each frame currently being encoded after the first image group is divided into pixel blocks.
The video encoding device divides each frame of the several encoded images after the first image group, and the background frame prior to the update, into pixel blocks of size w × h.
S52: The pixel difference between each pixel block of every frame in the several encoded images after the first image group and its co-located pixel block in the background frame prior to the update is calculated.
The video encoding device computes, for each pixel block of every frame in the several encoded images after the first image group, the pixel difference against the co-located pixel block in the background frame prior to the update; the difference may be measured with SAD, SATD or a similar metric. The smaller the pixel difference, the more similar the pixel block is to its co-located block.
S53: When the pixel differences corresponding to a preset proportion of the pixel blocks in the frame currently being encoded after the first image group are all greater than or equal to a second preset threshold, the reconstructed frame of the current frame is taken as the updated background frame.
When the pixel difference between a pixel block and its co-located block is greater than or equal to the second preset threshold TH2, the pixel block is considered to belong to the foreground.
Specifically, the background frame refresh operates as follows: when, while encoding a current frame, a preset proportion num% of its pixel blocks have pixel differences from their co-located blocks that are all greater than or equal to the second preset threshold TH2, the background frame in use must be refreshed entirely, where num% denotes the proportion, among all pixel blocks, of blocks whose pixel difference from the co-located block is greater than or equal to TH2. Refreshing the background frame means discarding the original background frame and taking the reconstructed frame of the current frame as the updated background frame.
Note that when the video encoding apparatus adjusts or refreshes the background frame, it may take the motion vectors and/or the pixel values of the pixel blocks into account; that is, it may use any of the processing manners of fig. 2 to 5 individually or combine them, which is not repeated here.
In some possible embodiments, the video encoding apparatus may also refresh the background frame directly according to the image frame interval or the type of the image frame, without considering the pixel block, as follows:
(a) The background frame may be updated once every N frames, i.e., at a fixed frame interval. Each update takes a weighted average of m consecutive frames and uses the result as the updated background frame; alternatively, one of the m frames is selected directly as the updated background frame, for example the first of the m frames (a small sketch of this schedule follows this list).
(b) The video coding device updates the background frame to the reconstructed frame of the I frame every time an I frame is coded.
It should be noted that the video encoding apparatus may select one or more of the above background frame processing methods for combination, and the details are not repeated herein.
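As a small illustration of manner (a) above, the update amounts to a simple schedule over a buffer of the last m reconstructed frames (the interval N, the window m, and the weighted_background helper from the earlier sketch are illustrative assumptions, not the patent's implementation):

```python
def maybe_refresh_background(frame_index, recon_buffer, background, n_interval, m_window, weights=None):
    """Every n_interval frames, rebuild the background from the last m_window reconstructed frames."""
    if n_interval <= 0 or frame_index % n_interval != 0 or len(recon_buffer) < m_window:
        return background                                # keep the current background frame
    recent = recon_buffer[-m_window:]
    if weights is None:
        return recent[0]                                 # variant: pick one of the m frames directly
    return weighted_background(recent, weights)          # variant: weighted average of the m frames
```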
After the background frame is determined, the frame reference mechanism, i.e., the reference relationship of each frame in an image group, is described below:
Specifically, after a background frame is obtained, an I frame in the video to be encoded directly references the background frame and is encoded with inter-frame coding; a P frame in the video to be encoded references the background frame and/or the previous adjacent encoded frame and is likewise encoded with inter-frame coding.
For example, please refer to fig. 6, which is a block diagram of an embodiment of the frame reference provided in the present application. In this embodiment the background frame is obtained by the weighted-average variant of manner (a) above, with m = 4 and N = 0. The I frame (frame A1) and the P frames (frames B1) in the first image group are encoded in a conventional manner: the video encoding apparatus first intra-codes the first I frame and then inter-codes the following 3 P frames, each with reference to the previously encoded frame. The video encoding apparatus obtains a first background frame (frame C1) by weighted-averaging the encoded I frame and P frames, and encodes at least one frame after the first image group based on the first background frame.
Specifically, the I frame (frame A2) following the first image group directly references the first background frame, and a P frame (frame B2) following the first image group references the first background frame and the immediately preceding reconstructed frame. After at least one frame following the first image group has been encoded, the first background frame is updated to obtain an updated background frame (frame C2), which serves as the reference frame for the next I frame (frame A3) among the unencoded images.
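The reference relationships of fig. 6 reduce to a small selection rule per frame type; the following is a sketch only, with the frame-type labels and return convention chosen for illustration:

```python
def pick_references(frame_type, background, prev_recon, in_first_gop):
    """Return the list of reference frames for the frame about to be encoded.

    frame_type:   'I' or 'P'.
    background:   current (possibly updated) background frame, or None before one exists.
    prev_recon:   previously encoded (reconstructed) frame, or None for the very first frame.
    in_first_gop: True while the first image group is encoded conventionally.
    """
    if in_first_gop or background is None:
        return [] if frame_type == 'I' else [prev_recon]   # conventional coding: intra, then previous frame
    if frame_type == 'I':
        return [background]                                # frames A2/A3: reference the background frame only
    return [background, prev_recon]                        # frame B2: background and/or previous frame
```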
S15: At least one frame of the subsequent unencoded images is encoded according to the updated background frame.
When transmitting the video code stream to the decoding end, the video encoding apparatus may further set a corresponding syntax element based on the video encoding mode of the embodiment of the present disclosure.
Specifically, the video encoding apparatus may set a syntax element flag for the background frame, indicate to the decoding side that a background frame refresh operation is required, and transmit the syntax element to the decoding side.
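How such a syntax element is carried is not specified here; purely as a hypothetical illustration (not the syntax of any standard, and the bit-writer interface is assumed), a one-bit refresh flag per frame header might look like this:

```python
def write_frame_header(bit_writer, is_background_refresh):
    """Hypothetical per-frame flag telling the decoder that a background frame refresh is required."""
    bit_writer.put_bit(1 if is_background_refresh else 0)   # background_refresh_flag (assumed name)

def read_frame_header(bit_reader):
    """Decoder side: mirror the refresh decision signalled by the encoder."""
    return bit_reader.get_bit() == 1
```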
According to the above scheme, the video encoding device acquires a first image group of a video to be encoded, where the first image group comprises at least one frame of image; encodes the first image group and acquires a first background frame from the reconstructed frames of the first image group; encodes at least one frame of image after the first image group according to the first background frame; acquires an updated background frame based on a number of encoded images after the first image group; and encodes at least one frame of the subsequent unencoded images according to the updated background frame. By selecting or generating a background frame and then encoding subsequent images on the basis of that background frame, coding performance is improved.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of a video encoding apparatus according to the present application. As shown in fig. 7, the video encoding apparatus 50 includes:
the obtaining module 51 is configured to obtain a first group of pictures of a video to be encoded, where the first group of pictures includes at least one frame of picture.
A background frame selecting module 52, configured to encode the first image group and obtain a first background frame from the reconstructed frames of the first image group; and further configured to encode at least one frame of image after the first image group according to the first background frame, and to acquire an updated background frame based on a number of encoded images after the first image group.
And an encoding module 53, configured to encode at least one frame of image in the subsequent uncoded images according to the updated background frame.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of an encoder provided in the present application. As shown in fig. 8, the encoder 60 includes a processor 61 and a memory 62 coupled to the processor 61.
The memory 62 stores program instructions for implementing the video encoding method or the encoding method described in any of the above embodiments. Processor 61 is operative to execute program instructions stored in memory 62 to encode video to be encoded.
The processor 61 may also be referred to as a CPU (Central Processing Unit). The processor 61 may be an integrated circuit chip having signal processing capabilities. The processor 61 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a memory device provided in the present application. The storage device of the embodiment of the present application stores program instructions 71 capable of implementing all of the methods described above. The program instructions 71 may be stored in the storage device in the form of a software product and include several instructions that cause a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage device includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, as well as terminal devices such as a computer, a server, a mobile phone, or a tablet.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The above embodiments are merely examples and are not intended to limit the scope of the present disclosure, and all modifications, equivalents, and flow charts using the contents of the specification and drawings of the present disclosure or those directly or indirectly applied to other related technical fields are intended to be included in the scope of the present disclosure.

Claims (13)

1. A video encoding method, the video encoding method comprising:
acquiring a first image group of a video to be coded, wherein the first image group comprises at least one frame of image;
coding the first image group, and acquiring a first background frame from a reconstructed frame of the first image group;
encoding at least one frame image after the first image group according to the first background frame;
acquiring an updated background frame based on a number of encoded images after the first image group;
and encoding at least one frame image in the subsequent uncoded images according to the updated background frame.
2. The video encoding method of claim 1,
the step of obtaining a first background frame from the reconstructed frame of the first image group includes:
and taking a reconstructed frame obtained by encoding a first frame image of the first image group as the first background frame.
3. The video encoding method of claim 1,
the step of obtaining a first background frame from the reconstructed frame of the first image group includes:
and adopting the weighted average result of the reconstructed frames of all the frame images in the first image group as the first background frame.
4. The video encoding method of claim 1,
the step of obtaining an updated background frame based on a number of encoded images following the first group of images comprises:
dividing a plurality of coded images after the first image group into a plurality of pixel blocks respectively;
acquiring motion vectors of the pixel blocks;
and under the condition that the motion vectors of the pixel blocks at the same positions in a plurality of coded images after the first image group are all smaller than a first preset threshold value, acquiring the updating background frame according to the pixel values of the pixel blocks at the same positions in the plurality of coded images after the first image group.
5. The video encoding method of claim 4,
the step of acquiring the update background frame according to the pixel values of the pixel blocks at the same positions in the encoded images after the first image group when the motion vectors of the pixel blocks at the same positions in the encoded images after the first image group are all smaller than a first preset threshold includes:
under the condition that the motion vectors of the pixel blocks at the current spatial domain positions of the plurality of coded images after the first image group are all smaller than the first preset threshold value, copying the pixel block pixel values at the current spatial domain positions of the last frame images of the plurality of coded images after the first image group to the collocated pixel block of the background frame before the updated background frame, or copying the pixel block pixel values at the current spatial domain positions of the plurality of coded images after the first image group to the collocated pixel block of the background frame before the updated background frame after weighted averaging.
6. The video encoding method of claim 1,
the step of obtaining an updated background frame based on a number of encoded images following the first group of images comprises:
dividing each frame of the image which is currently coded after the first image group into a plurality of pixel blocks;
acquiring motion vectors of the pixel blocks;
and taking the reconstructed frame of the current frame image as the update background frame under the condition that the motion vectors of pixel blocks in a preset proportion in the current coded image of each frame after the first image group are all larger than or equal to the first preset threshold value.
7. The video encoding method of claim 1,
the step of obtaining an updated background frame based on a number of encoded images following the first group of images comprises:
dividing a plurality of coded images after the first image group into a plurality of pixel blocks respectively;
calculating the pixel difference value of each pixel block of each frame image in a plurality of coded images after the first image group and the co-located pixel block in the background frame before the background frame is updated, wherein the width and the height of the pixel block are the same as the width and the height of the co-located pixel block;
and under the condition that the pixel difference values between the pixel blocks at the same position in each frame of image in the plurality of coded images after the first image group and the co-located pixel block in the background frame before the updated background frame are all smaller than a second preset threshold value, acquiring the updated background frame according to the pixel values of the pixel blocks at that position in each frame of image in the plurality of coded images after the first image group.
8. The video encoding method of claim 7,
the step of acquiring the updated background frame according to the pixel values of the pixel blocks at the same position of each frame image in the plurality of encoded images after the first image group, under the condition that the pixel difference values between the pixel blocks at the same position of each frame image in the plurality of encoded images after the first image group and the co-located pixel block in the background frame before the updated background frame are all smaller than the second preset threshold value, includes:
under the condition that the pixel block of the current spatial domain position of each frame of image in the plurality of coded images after the first image group and the pixel difference value of the collocated pixel block are both smaller than the second preset threshold, copying the pixel block pixel value of the current spatial domain position of the last frame of image in the plurality of coded images after the first image group to the collocated pixel block of the background frame before updating the background frame, or copying the weighted average of the pixel block pixel value of the current spatial domain position of each frame of image in the plurality of coded images after the first image group to the collocated pixel block of the background frame before updating the background frame.
9. The video encoding method of claim 1,
the step of obtaining an updated background frame based on a number of encoded images following the first group of images comprises:
dividing each frame of the image which is currently coded after the first image group into a plurality of pixel blocks;
calculating the pixel difference value of each pixel block of each frame image in a plurality of coded images after the first image group and the co-located pixel block in the background frame before the background frame is updated, wherein the width and the height of the pixel block are the same as the width and the height of the co-located pixel block;
and under the condition that the pixel difference values corresponding to the pixel blocks with preset proportions in each frame of image in the plurality of coded images after the first image group are all larger than or equal to a second preset threshold value, taking the reconstructed frame of the current frame of image as the updating background frame.
10. The video encoding method of claim 1,
the step of obtaining an updated background frame based on a number of encoded images following the first group of images comprises:
taking the weighted average result of the reconstructed frames of the continuous frames of images after the first image group as the updated background frame;
or, taking a reconstructed frame of any frame of image after the first image group as the update background frame;
or, the first intra-frame key frame after the first image group is used as the update background frame.
11. A video encoding apparatus, comprising:
an acquisition module, used for acquiring a first image group of a video to be encoded, wherein the first image group comprises at least one frame of image;
a background frame selection module, used for encoding the first image group and acquiring a first background frame from a reconstructed frame of the first image group, and further used for encoding at least one frame of image after the first image group according to the first background frame and acquiring an updated background frame based on a plurality of encoded images after the first image group;
and the coding module is used for coding at least one frame of image in the subsequent uncoded images according to the updated background frame.
12. An encoder comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the method of any one of claims 1-10;
the processor is configured to execute the program instructions stored by the memory to encode video to be encoded.
13. A storage device storing program instructions executable by a processor to perform the method of any one of claims 1 to 10.
CN202010507153.0A 2020-06-05 2020-06-05 Video coding method, device, coder and storage device Active CN111669600B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010507153.0A CN111669600B (en) 2020-06-05 2020-06-05 Video coding method, device, coder and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010507153.0A CN111669600B (en) 2020-06-05 2020-06-05 Video coding method, device, coder and storage device

Publications (2)

Publication Number Publication Date
CN111669600A (en) 2020-09-15
CN111669600B (en) 2024-03-29

Family

ID=72386846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010507153.0A Active CN111669600B (en) 2020-06-05 2020-06-05 Video coding method, device, coder and storage device

Country Status (1)

Country Link
CN (1) CN111669600B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6249613B1 (en) * 1997-03-31 2001-06-19 Sharp Laboratories Of America, Inc. Mosaic generation and sprite-based coding with automatic foreground and background separation
CN101127912A (en) * 2007-09-14 2008-02-20 浙江大学 Video coding method for dynamic background frames
CN101465955A (en) * 2009-01-05 2009-06-24 北京中星微电子有限公司 Method and apparatus for updating background
KR20110023468A (en) * 2009-08-31 2011-03-08 주식회사 이미지넥스트 Apparatus and method for detecting and tracking object based on adaptive background
US20160212444A1 (en) * 2015-01-16 2016-07-21 Hangzhou Hikvision Digital Technology Co., Ltd. Systems, Devices and Methods for Video Encoding and Decoding
CN105847871A (en) * 2015-01-16 2016-08-10 杭州海康威视数字技术股份有限公司 Video encoding/decoding method and device thereof
US20160269734A1 (en) * 2015-03-10 2016-09-15 Hangzhou Hikvision Digital Technology Co., Ltd. Systems and Methods for Hybrid Video Encoding
CN106851302A (en) * 2016-12-22 2017-06-13 国网浙江省电力公司杭州供电公司 A kind of Moving Objects from Surveillance Video detection method based on intraframe coding compression domain
CN110062235A (en) * 2019-04-08 2019-07-26 上海大学 Background frames generate and update method, system, device and medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6249613B1 (en) * 1997-03-31 2001-06-19 Sharp Laboratories Of America, Inc. Mosaic generation and sprite-based coding with automatic foreground and background separation
CN101127912A (en) * 2007-09-14 2008-02-20 浙江大学 Video coding method for dynamic background frames
CN101465955A (en) * 2009-01-05 2009-06-24 北京中星微电子有限公司 Method and apparatus for updating background
US20100277586A1 (en) * 2009-01-05 2010-11-04 Vimicro Corporation Method and apparatus for updating background
KR20110023468A (en) * 2009-08-31 2011-03-08 주식회사 이미지넥스트 Apparatus and method for detecting and tracking object based on adaptive background
US20160212444A1 (en) * 2015-01-16 2016-07-21 Hangzhou Hikvision Digital Technology Co., Ltd. Systems, Devices and Methods for Video Encoding and Decoding
CN105847793A (en) * 2015-01-16 2016-08-10 杭州海康威视数字技术股份有限公司 Video coding method and device and video decoding method and device
CN105847871A (en) * 2015-01-16 2016-08-10 杭州海康威视数字技术股份有限公司 Video encoding/decoding method and device thereof
US20160269734A1 (en) * 2015-03-10 2016-09-15 Hangzhou Hikvision Digital Technology Co., Ltd. Systems and Methods for Hybrid Video Encoding
CN106851302A (en) * 2016-12-22 2017-06-13 国网浙江省电力公司杭州供电公司 A kind of Moving Objects from Surveillance Video detection method based on intraframe coding compression domain
CN110062235A (en) * 2019-04-08 2019-07-26 上海大学 Background frames generate and update method, system, device and medium

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
LULU ZHOU: "An adaptive background-frame based video coding method", "2014 SIXTH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP)" *
胡国庆: "基于HEVC的监控视频动态背景模型" [HEVC-based dynamic background model for surveillance video], 《软件导刊》 [Software Guide], no. 07, 27 July 2016 (2016-07-27) *
蒋刚毅; 费跃; 邵枫; 彭宗举; 郁梅: "面向编码和绘制的多视点图像颜色校正" [Color correction of multi-view images for coding and rendering], 光子学报 [Acta Photonica Sinica], no. 09 *
赵占杰; 林小竹; 张金燕: "基于背景重建的运动目标检测算法" [Moving object detection algorithm based on background reconstruction], 北京石油化工学院学报 [Journal of Beijing Institute of Petrochemical Technology], no. 02 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114245145A (en) * 2021-12-18 2022-03-25 杭州视洞科技有限公司 Monitoring equipment video compression method based on background frame
CN117710893A (en) * 2023-12-25 2024-03-15 上海盛煌智能科技有限公司 Multidimensional digital image intelligent campus digitizing system
CN117710893B (en) * 2023-12-25 2024-05-10 上海盛煌智能科技有限公司 Multidimensional digital image intelligent campus digitizing system

Also Published As

Publication number Publication date
CN111669600B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
US11375229B2 (en) Method, device, and computer program for optimizing transmission of motion vector related information when transmitting a video stream from an encoder to a decoder
US11196989B2 (en) Video encoding method, device and storage medium using resolution information
CN111837397B (en) Error-cancelling code stream indication in view-dependent video coding based on sub-picture code streams
RU2307478C2 (en) Method for compensating global movement for video images
TWI381739B (en) Video encoding method and decoding method, apparatuses therefor, programs therefor, and storage media for storing the programs
US20180131960A1 (en) Video coding method, video decoding method, video coding apparatus, and video decoding apparatus
US9414086B2 (en) Partial frame utilization in video codecs
US9560379B2 (en) Inter-prediction method and video encoding/decoding method using the inter-prediction method
KR101377528B1 (en) Motion Vector Coding and Decoding Method and Apparatus
KR20110123651A (en) Apparatus and method for image coding and decoding using skip coding
US20200021850A1 (en) Video data decoding method, decoding apparatus, encoding method, and encoding apparatus
KR20160087208A (en) Method and apparatus for encoding/decoding video
JP2007036888A (en) Coding method
CN111447442B (en) Intra-frame encoding method, intra-frame decoding method, device, and medium
CN111669600B (en) Video coding method, device, coder and storage device
CN112218087B (en) Image encoding and decoding method, encoding and decoding device, encoder and decoder
CN110753231A (en) Method and apparatus for a multi-channel video processing system
CN111586415B (en) Video coding method, video coding device, video coder and storage device
CN109672889B (en) Method and device for constrained sequence data headers
US9451285B2 (en) Inter-prediction method and video encoding/decoding method using the inter-prediction method
WO2020181540A1 (en) Video processing method and device, encoding apparatus, and decoding apparatus
CN117640940A (en) Video encoding method, video decoding method, computer device, and storage medium
CN115514975A (en) Encoding and decoding method and device
Lei et al. Direct migration motion estimation and mode decision to decoder for a low-complexity decoder Wyner–Ziv video coding
KR20190027405A (en) Method and apparatus for omnidirectional security video coding in using padding technique

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant