CN111669600B

CN111669600B - Video coding method, device, coder and storage device

Info

Publication number: CN111669600B
Application number: CN202010507153.0A
Authority: CN
Inventors: 江东; 方诚; 曾飞洋; 林聚财; 殷俊
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2024-03-29
Anticipated expiration: 2040-06-05
Also published as: CN111669600A

Abstract

The application discloses a video coding method, a video coding device, an encoder and a storage device. The video encoding method includes: acquiring a first image group of a video to be encoded, wherein the first image group comprises at least one frame of image; encoding the first image group, and acquiring a first background frame from the reconstructed frame of the first image group; encoding at least one frame of image after the first image group according to the first background frame; acquiring an updated background frame based on a number of encoded images following the first image group; and encoding at least one frame of image in the subsequent uncoded image according to the updated background frame. By the method, the background frame is selected or generated, and then the subsequent image is encoded on the basis of the background frame, so that the encoding performance is improved.

Description

Video coding method, device, coder and storage device

Technical Field

The present invention relates to the field of video encoding and decoding, and in particular, to a video encoding method, apparatus, encoder and storage device.

Background

Because the data volume of the video image is relatively large, when the video image interaction is carried out, the video image is required to be encoded and decoded, and the main function of video coding is to compress video pixel data (RGB, YUV and the like) into a video code stream, thereby reducing the data volume of the video and realizing the purposes of reducing network bandwidth and reducing storage space in the transmission process.

The video coding system is mainly divided into a video acquisition part, a prediction part, a transformation quantization part and an entropy coding part, wherein the prediction part is divided into an intra-frame prediction part and an inter-frame prediction part, and the two parts are respectively used for removing redundancy of video images in space and time.

For application scenes such as video monitoring, most of the scenes are stationary, and a large amount of background redundant information is often required to be encoded in the conventional video encoding method, so that the compression efficiency of the scene videos such as the monitoring video is still further improved.

Disclosure of Invention

The application provides at least one video coding method, device, coder and storage device.

A first aspect of the present application provides a video encoding method, including: acquiring a first image group of a video to be encoded, wherein the first image group comprises at least one frame of image;

encoding the first image group, and acquiring a first background frame from the reconstructed frame of the first image group;

encoding at least one frame of image after the first image group according to the first background frame;

acquiring an updated background frame based on a number of encoded images following the first image group;

and encoding at least one frame of image in the subsequent uncoded image according to the updated background frame.

A second aspect of the present application provides a video encoding apparatus, comprising:

the acquisition module is used for acquiring a first image group of the video to be encoded, wherein the first image group comprises at least one frame of image;

the background frame selecting module is used for encoding the first image group and acquiring a first background frame from the reconstructed frame of the first image group; the method is also used for encoding at least one frame of image after the first image group according to the first background frame, and acquiring updated background frames based on a plurality of encoded images after the first image group;

and the encoding module is used for encoding at least one frame of image in the subsequent uncoded image according to the updated background frame.

A third aspect of the present application provides an encoder comprising a processor, a memory coupled to the processor, wherein the memory stores program instructions for implementing the method of the first aspect described above; the processor is configured to execute the program instructions stored by the memory to encode video to be encoded.

A fourth aspect of the present application provides a storage device storing program instructions capable of implementing the method of the first aspect.

According to the scheme, the video coding device acquires a first image group of a video to be coded, wherein the first image group comprises at least one frame of image; encoding the first image group, and acquiring a first background frame from the reconstructed frame of the first image group; encoding at least one frame of image after the first image group according to the first background frame; acquiring an updated background frame based on a number of encoded images following the first image group; and encoding at least one frame of image in the subsequent uncoded image according to the updated background frame. By the method, the background frame is selected or generated, and then the subsequent image is encoded on the basis of the background frame, so that the encoding performance is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.

FIG. 1 is a flowchart illustrating an embodiment of a video encoding method provided herein;

FIG. 2 is a schematic flow chart of an embodiment of step S14 in FIG. 1;

FIG. 3 is a schematic flow chart of another embodiment of step S14 in FIG. 1;

FIG. 4 is a schematic flow chart of a further embodiment of step S14 in FIG. 1;

FIG. 5 is a flowchart showing a further embodiment of step S14 in FIG. 1

FIG. 6 is a schematic diagram of a frame of one embodiment of a frame reference provided herein;

fig. 7 is a schematic structural diagram of an embodiment of a video encoding device provided in the present application;

FIG. 8 is a schematic diagram of an embodiment of an encoder provided herein;

fig. 9 is a schematic structural diagram of an embodiment of a memory device provided in the present application.

Detailed Description

The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

The terms "first," "second," "third," and the like in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. All directional indications (such as up, down, left, right, front, back … …) in the embodiments of the present application are merely used to explain the relative positional relationship, movement, etc. between the components in a particular gesture (as shown in the drawings), and if the particular gesture changes, the directional indication changes accordingly. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments. It should be noted that, for the following method embodiments, if there are substantially the same results, the method of the present application is not limited to the illustrated flow sequence.

The following describes various embodiments of the present application.

Referring to fig. 1, fig. 1 is a flowchart of an embodiment of a video encoding method provided in the present application. As shown in fig. 1, the video encoding method includes the steps of:

s11: a first group of images of a video to be encoded is acquired, wherein the first group of images includes at least one frame of image.

The video encoding device acquires a first image group, namely a first image group, of the video to be encoded according to the video time stamp. Taking the IPPP coding structure as an example, the first group of pictures is the group of pictures consisting of the first I frame and n1 (n1 > =0) P frames later. In some possible embodiments, the video encoding apparatus may acquire images of a preset number of frames from a start frame of the video to be encoded to form the first image group, where the preset number of frames is at least one frame.

S12: the first group of images is encoded and a first background frame is acquired from the reconstructed frame of the first group of images.

Wherein the video encoding device encodes the pictures in the first group of pictures in a conventional manner, e.g. the video encoding device first intra-codes the first I-frame and then inter-codes the following P-frame with reference to the previous encoded frame.

The method for acquiring the background frame in the embodiment of the disclosure does not need to scratch the background image from a frame reconstructed frame, but takes a complete reconstructed frame as the background frame. Specifically, the generation of the first background frame in this step may be selected from, but not limited to, the following methods:

(1) And directly selecting a first frame image of the first image group, such as a first I frame, for intra-frame coding, and taking the coded reconstructed frame as a first background frame, wherein the first image group only comprises the first frame image.

(2) The weighted average of all the image reconstruction frames in the first image group is adopted as the first background frame, for example, when the first image group contains one I frame and n 2P frames at the back, the weighted average result of the reconstruction frames of the first I frame and n2 (n 2 > 0) P frames at the back, namely, the reconstruction frame obtained by the weighted average is adopted as the first background frame.

For example, in the embodiment of the present disclosure, a weighted average operation is performed on a reconstructed frame of a first I frame and 2P frames to obtain a first background frame, where the weight of the first I frame is set to 1, the weight of the first P frame is set to 1, and the weight of the second P frame is set to 2, so that each pixel in the first background frame is derived from the weights of the I frame and the co-located pixels in the two P frames, where the co-located pixels are pixels with the same coordinates in the image. The specific calculation formula of the weighted average is as follows:

Pix _bg (x,y)＝(1*Pix _I (x,y)+1*Pix _P1 (x,y)+2*Pix _P2 (x,y))/4

wherein, pix _bg (x, y) represents a pixel at the (x, y) position in the first background frame, pix _I (x, y) represents the pixel at the (x, y) position in the I frame, pix _P1 (x, y) represents the pixel at the (x, y) position in the first P frame, pix _P2 (x, y) represents the pixel at the (x, y) position in the second P frame.

S13: at least one frame of image following the first group of images is encoded according to the first background frame.

S14: an updated background frame is acquired based on a number of encoded images following the first image group.

In the video scene to be monitored, the background of continuous images shot by the monitoring device at a fixed angle is almost unchanged, but due to the influence of the surrounding environment, few pixels of the background part of the actually shot monitoring video still change, so that the video coding device needs to adjust the pixels of the image of the background part in the first background frame in time when coding the later image.

Alternatively, when the monitoring video is completely changed, for example, the monitoring device is turned to another direction, the background frame needs to be completely updated. The difference between the background frame refresh and the background frame pixel adjustment is that the background frame refresh is to completely replace the entire background frame with a new frame, and the background frame pixel adjustment is to adjust a part of pixels in the original background frame. The background frame refreshing and the background frame pixel adjusting are both updating modes of the background frame.

Wherein the encoded image refers to a reconstructed image after the current image has been encoded.

In this regard, the background frames may be processed in the following manner:

A. when the motion vector of the pixel block is taken as a consideration for adjustment, the step S14 may specifically be composed of a sub-step in the flow chart of the embodiment shown in fig. 2 or a sub-step in the flow chart of the embodiment shown in fig. 3. As shown in fig. 2, step S14 specifically includes the following sub-steps:

s21: the encoded images following the first group of images are divided into blocks of pixels, respectively.

Wherein the operation of acquiring a number of encoded images after the first group of images is: the video encoding device uses M frames (M > =1) of continuous encoded images after the first image group as a second image group, and the second image group is not overlapped with the first image group. The case where the second image group and the first image group do not overlap is specifically: an i-frame image is arranged between the first image group and the second image group, wherein i > =0.

S22: motion vectors for a number of pixel blocks are acquired.

The video encoding device searches each frame of image in the second image group according to the pixel blocks to obtain MV (motion vector) information of each minimum prediction block. The minimum prediction block refers to a prediction unit having a minimum size into which an image can be divided.

S23: and under the condition that the motion vectors of the pixel blocks at the same position in the plurality of coded images after the first image group are smaller than a first preset threshold value, acquiring an updated background frame according to the pixel values of the pixel blocks at the same position in the plurality of coded images after the first image group.

When the motion vector of a certain pixel block is smaller than the first preset threshold TH1, the pixel block is represented to belong to the background part.

Further, the specific operation of background frame pixel adjustment is: when motion vectors of the minimum predicted block of the current spatial position of each frame image in the reconstructed image of the second image group exist and are smaller than the first preset threshold value TH1, the reconstructed pixel value of the current minimum predicted block of the last frame in the reconstructed image of the second image group is copied to a background frame before updating the background frame, for example, the same-position minimum predicted block in the first background frame, so as to obtain the updated background frame. Or when the motion vector of the minimum prediction block of the current spatial position of each frame of image in the reconstructed image of the second image group exists and is smaller than the first preset threshold value TH1, the pixel block of the current spatial position of each frame of image in the reconstructed image of the second image group is copied into the co-located minimum prediction block in the background frame before the background frame is updated after the pixel value weighted average is carried out, so that the updated background frame is obtained.

Note that the above-mentioned parity block represents a pixel block having the same spatial domain coordinates in different frames.

In some possible embodiments, when the video to be encoded further includes more image groups such as a third image group subsequent to the second image group, the video encoding apparatus acquires all the reconstructed frames of the third image group, and further acquires the updated background frame according to the reconstructed frames of the third image group. The method for obtaining the updated background frame is the same as that in step S23. The same applies to the way to obtain updated background frames when more image groups follow. The interval between two adjacent image groups can be uniform fixed interval or different interval. The number of image frames in the second image group and in all subsequent image groups may be the same or different.

As shown in fig. 3, step S14 specifically includes the following sub-steps:

s31: each frame following the first group of images is divided into blocks of pixels, the image currently being encoded.

S32: motion vectors for a number of pixel blocks are acquired.

The video coding device searches the current coded image of each frame after the first image group according to pixel blocks to acquire MV information of each minimum prediction block.

S33: and taking the reconstructed frame of the current frame image as an updated background frame under the condition that the motion vector of a pixel block with a preset proportion is larger than or equal to a first preset threshold value in the current coded image of each frame after the first image group.

When the motion vector of a certain pixel block is greater than or equal to a first preset threshold value TH1, the pixel block is represented to belong to a foreground part.

Further, the specific operation of refreshing the background frame is as follows: when the motion vector of the pixel block with the num% representing the motion vector greater than or equal to the first preset threshold value TH1 occupies all the pixel blocks, the background frame before the background frame is updated needs to be thoroughly updated. And thoroughly updating the background frame before updating the background frame, namely removing the background frame before the original updated background frame, and taking the reconstructed frame coded by the current frame as the updated background frame.

B. When the pixel values of the pixel block are taken as the adjustment consideration, the step S14 may specifically be composed of a sub-step in the flow chart of the embodiment shown in fig. 4 or a sub-step in the flow chart of the embodiment shown in fig. 5. As shown in fig. 4, step S14 specifically includes the following sub-steps:

s41: the encoded images following the first group of images are divided into blocks of pixels, respectively.

The video coding device divides each frame image in the second image group and the background frame before updating the background frame into a plurality of pixel blocks with width and height of w x h.

S42: and calculating pixel difference values of each pixel block of each frame image in the plurality of coded images after the first image group and the co-located pixel blocks in the background frame before updating the background frame.

The video encoding device calculates pixel difference values of each pixel block of each frame image in the reconstructed image of the second image group and the co-located pixel blocks in the background frame before updating the background frame, and the pixel difference values can be measured by SAD/SATD and other calculation modes. The smaller the pixel difference value, the more similar the pixel block and the co-located pixel block are.

S43: and under the condition that the pixel difference values of the pixel blocks at the same position of each frame of image in the plurality of encoded images behind the first image group and the pixel difference values of the pixel blocks at the same position of each frame of image in the plurality of encoded images behind the first image group are smaller than a second preset threshold value, acquiring an updated background frame according to the pixel values of the pixel blocks at the same position of each frame of image in the plurality of encoded images behind the first image group.

When the pixel difference value between a certain pixel block and a co-located pixel block is smaller than the second preset threshold TH2, it is indicated that the pixel block belongs to the background portion.

Further, the specific operation of background frame pixel adjustment is: when the pixel difference value between the pixel block at the current spatial position and the co-located pixel block in each frame of the reconstructed image of the second image group is smaller than the second preset threshold value TH2, copying the pixel value of the pixel block at the current spatial position of the last frame of the reconstructed image of the second image group to the co-located pixel block of the background frame before updating the background frame. Or when the pixel difference value between the pixel block at the current spatial position in each frame of image in the reconstructed image of the second image group and the pixel block at the co-located pixel block is smaller than the second preset threshold value TH2, the pixel value of the pixel block at the current spatial position in each frame of image in the reconstructed image of the second image group is weighted and averaged, and then copied to the co-located pixel block in the background frame before updating the background frame.

In some possible embodiments, when the video to be encoded further includes more image groups such as a third image group subsequent to the second image group, the video encoding apparatus acquires all the reconstructed frames of the third image group, and further acquires the updated background frame according to the reconstructed frames of the third image group. The background frame is acquired in the same manner as in step S43. The way the background frame is acquired is the same as more image groups follow. The interval between two adjacent image groups can be uniform fixed interval or different interval. The number of image frames in the second image group and in all subsequent image groups may be the same or different.

As shown in fig. 5, step S14 specifically includes the following sub-steps:

s51: each frame following the first group of images is divided into blocks of pixels, the image currently being encoded.

The video encoding device divides each frame image in a plurality of encoded images after the first image group and a background frame before updating the background frame into a plurality of pixel blocks with width and height of w.h.

S52: and calculating pixel difference values of each pixel block of each frame image in the plurality of coded images after the first image group and the co-located pixel blocks in the background frame before updating the background frame.

The video encoding device calculates pixel difference values of each pixel block of each frame image in a plurality of encoded images after the first image group and the pixel difference values of the co-located pixel blocks in the background frame before updating the background frame, and the pixel difference values can be measured through SAD/SATD and other calculation modes. The smaller the pixel difference value, the more similar the pixel block and the co-located pixel block are.

S53: and taking the reconstructed frame of the current frame image as an updated background frame under the condition that pixel difference values corresponding to pixel blocks with preset proportions in each frame image in a plurality of coded images behind the first image group are larger than or equal to a second preset threshold value.

When the pixel difference value between a certain pixel block and a co-located pixel block is greater than or equal to a second preset threshold value TH2, it is indicated that the pixel block belongs to the foreground portion.

Further, the specific operation of refreshing the background frame is as follows: when the current frame is encoded, a preset proportion exists in the current frame, namely, when the pixel difference value of a num% pixel block and a co-located pixel block is larger than or equal to a second preset threshold value TH2, the background frame before the background frame is updated is required to be thoroughly updated, wherein num% represents the proportion of the pixel block with the pixel difference value larger than or equal to the second preset threshold value TH2 to all the pixel blocks. And thoroughly updating the background frame before updating the background frame, namely removing the background frame before the original second updated background frame, and taking the reconstructed frame coded by the current frame as the updated background frame.

It should be noted that, when the video encoding device performs pixel adjustment or refreshing on the background frame, the motion vector and/or the pixel value of the pixel block may be taken into consideration, that is, the video encoding device may select the processing modes in fig. 2 to 5, or may combine the processing modes in fig. 2 to 5, which are not described herein again.

In some possible embodiments, the video encoding device may also directly refresh the background frame according to the image frame interval or the type of the image frame without considering the factors of the pixel blocks, specifically as follows:

(a) The background frames can be updated once at fixed inter-frame intervals N frames, each update is weighted and averaged by using continuous m frame images, and the weighted and averaged result is used as the updated background frame; or directly selecting one frame in the m frame images as an updated background frame, for example, taking the first frame image in the m frame images as the updated background frame.

(b) Each time the video encoding device encodes an I frame, the background frame is updated to be a reconstructed frame of the I frame.

It should be noted that the video encoding apparatus may select one or several modes from the above background frame processing modes for combination, which is not described herein.

After determining the background frame, the frame reference mechanism is described below, that is, the reference relationship of each frame image in the image group is determined:

specifically, after a background frame is obtained, an I frame in a video to be encoded directly references the background frame and is encoded in an inter-frame encoding mode; the P-frames in the video to be encoded are encoded by inter-frame coding with reference to the background frame and/or a previously adjacent encoded frame.

For example, referring to fig. 6, fig. 6 is a schematic diagram of a frame reference embodiment provided in the present application. This embodiment acquires the background frame in the weighted average manner in the above manner (a), where m=4, n=0, and the I frame (frame A1) and the P frame (frame B1) in the first image group are encoded in a conventional encoding manner, for example, the video encoding apparatus first intra-encodes the first I frame, and then inter-encodes the following 3P frames with reference to the previous encoded frame. The video encoding device obtains a first background frame (frame C1) after weighted-averaging the encoded I-frame and P-frame, and encodes at least one frame image after the first image group based on the first background frame.

Specifically, the I frame (frame A2) following the first group of pictures directly references the first background frame, and the P frame (frame B2) following the first group of pictures references the first background frame and the previously adjacent one of the reconstructed frames. After at least one frame of image after the first image group is encoded, the first background frame needs to be updated to obtain an updated background frame (frame C2), wherein the updated background frame is used as a reference frame of an I frame (frame A3) in the uncoded image.

S15: and encoding at least one frame of image in the subsequent uncoded image according to the updated background frame.

When the video code stream is transmitted to the decoding end, the video encoding device may set a corresponding syntax element based on the video encoding mode according to the embodiments of the present disclosure.

Specifically, the video encoding device may set a syntax element identifier for the background frame, indicate to the decoding end that the background frame refresh operation is required, and transmit the syntax element to the decoding end.

Referring to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of a video encoding device provided in the present application. As shown in fig. 7, the video encoding apparatus 50 includes:

the obtaining module 51 is configured to obtain a first image group of the video to be encoded, where the first image group includes at least one frame of image.

A background frame selection module 52, configured to encode the first image group, and obtain a first background frame from the reconstructed frame of the first image group; and the method is also used for encoding at least one frame of image after the first image group according to the first background frame, and acquiring updated background frames based on a plurality of encoded images after the first image group.

And the encoding module 53 is configured to encode at least one frame of image in the subsequent uncoded image according to the updated background frame.

Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of an encoder provided in the present application. As shown in fig. 8, the encoder 60 includes a processor 61 and a memory 62 coupled to the processor 61.

The memory 62 stores program instructions for implementing the video encoding method or methods described in any of the embodiments above. The processor 61 is configured to execute program instructions stored in the memory 62 to encode video to be encoded.

The processor 61 may also be referred to as a CPU (Central Processing Unit ). The processor 61 may be an integrated circuit chip with signal processing capabilities. Processor 61 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a memory device provided in the present application. The storage device of the embodiments of the present application stores program instructions 71 capable of implementing all the methods described above, where the program instructions 71 may be stored in the storage device as a software product, and include several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage device includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program codes, or a terminal device such as a computer, a server, a mobile phone, a tablet, or the like.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units. The foregoing is only the embodiments of the present application, and not the patent scope of the present application is limited by the foregoing description, but all equivalent structures or equivalent processes using the contents of the present application and the accompanying drawings, or directly or indirectly applied to other related technical fields, which are included in the patent protection scope of the present application.

Claims

1. A video encoding method, the video encoding method comprising:

acquiring a first image group of a video to be encoded, wherein the first image group comprises at least one frame of image;

coding at least one frame of image in the subsequent uncoded image according to the updated background frame;

the step of acquiring a first background frame from the reconstructed frames of the first image group includes:

adopting a weighted average result of reconstructed frames of all frame images in the first image group as the first background frame;

or taking the reconstructed frame after the first frame image of the first image group is coded as the first background frame;

the step of updating the background frame based on a number of encoded image acquisitions following the first image group comprises:

dividing an image currently being encoded of each frame following the first image group into a plurality of pixel blocks; searching the current coded image of each frame behind the first image group according to the pixel blocks to obtain the motion vectors of the pixel blocks; updating a background frame under the condition that the motion vector of a pixel block with a preset proportion is larger than or equal to a first preset threshold value in the current coded image of each frame behind the first image group, and taking a reconstructed frame of the current frame image as the updated background frame;

or dividing a plurality of coded images after the first image group into a plurality of pixel blocks respectively; obtaining motion vectors of the pixel blocks; under the condition that the motion vectors of the pixel blocks at the same positions in the plurality of coded images after the first image group are smaller than a first preset threshold value, acquiring the updated background frame according to the pixel values of the pixel blocks at the same positions in the plurality of coded images after the first image group;

or dividing a plurality of coded images after the first image group into a plurality of pixel blocks respectively; calculating pixel difference values of each pixel block of each frame image in a plurality of coded images after the first image group and a co-located pixel block in a background frame before the updated background frame, wherein the width and the height of the pixel block are the same as those of the co-located pixel block; acquiring the updated background frame according to the pixel values of the pixel blocks at the same position of each frame of image in a plurality of encoded images after the first image group under the condition that the pixel difference values of the pixel blocks at the same position of each frame of image in a plurality of encoded images after the first image group and the pixel difference values of the pixel blocks at the same position are smaller than a second preset threshold value;

or dividing the image currently being encoded of each frame after the first image group into a plurality of pixel blocks; calculating pixel difference values of each pixel block of each frame image in a plurality of coded images after the first image group and a co-located pixel block in a background frame before the updated background frame, wherein the width and the height of the pixel block are the same as those of the co-located pixel block; and taking the reconstructed frame of the current frame image as the updated background frame under the condition that pixel difference values corresponding to pixel blocks with preset proportion in each frame image in a plurality of coded images behind the first image group are larger than or equal to a second preset threshold value.

2. The video coding method of claim 1, wherein,

the step of obtaining the updated background frame according to the pixel values of the pixel blocks at the same position in the plurality of encoded images after the first image group under the condition that the motion vectors of the pixel blocks at the same position in the plurality of encoded images after the first image group are smaller than a first preset threshold value comprises the following steps:

and copying pixel values of pixel blocks of the current spatial positions of the plurality of coded images after the first image group to a co-located pixel block of a background frame before the updating background frame or copying pixel values of pixel blocks of the current spatial positions of the plurality of coded images after the first image group to a co-located pixel block of the background frame before the updating background frame after weighted average under the condition that the motion vectors of the pixel blocks of the current spatial positions of the plurality of coded images after the first image group are smaller than the first preset threshold value.

3. The video coding method of claim 1, wherein,

the step of obtaining the updated background frame according to the pixel value of the pixel block at the same position of each frame of image in the plurality of encoded images after the first image group when the pixel difference value between the pixel block at the same position of each frame of image in the plurality of encoded images after the first image group and the pixel difference value of the co-located pixel block are smaller than a second preset threshold value comprises the following steps:

and copying pixel values of pixel blocks of the current spatial position of the last frame image of the plurality of encoded images after the first image group to a parity pixel block of a background frame before the updating background frame or copying pixel values of pixel blocks of the current spatial position of each frame image of the plurality of encoded images after the first image group to a parity pixel block of the background frame before the updating background frame after weighted average under the condition that pixel difference values of the pixel blocks of the current spatial position of each frame image and the parity pixel blocks of each frame image are smaller than the second preset threshold value.

4. A video encoding apparatus, comprising:

or dividing the image currently being encoded of each frame after the first image group into a plurality of pixel blocks; calculating pixel difference values of each pixel block of each frame image in a plurality of coded images after the first image group and a co-located pixel block in a background frame before the updated background frame, wherein the width and the height of the pixel block are the same as those of the co-located pixel block; taking a reconstructed frame of the current frame image as the updated background frame under the condition that pixel difference values corresponding to pixel blocks with preset proportions in each frame image in a plurality of coded images behind the first image group are larger than or equal to a second preset threshold value;

5. An encoder comprising a processor, a memory coupled to the processor, wherein,

the memory stores program instructions for implementing the method of any one of claims 1-3;

the processor is configured to execute the program instructions stored by the memory to encode video to be encoded.

6. A storage device storing program instructions executable by a processor to implement the method of any one of claims 1-3.