CN110636373B - Image processing method and device and electronic equipment


Info

Publication number
CN110636373B
CN110636373B (application CN201910995121.7A)
Authority
CN
China
Prior art keywords
frames
characteristic information
video frame
filling
current video
Prior art date
Legal status
Active
Application number
CN201910995121.7A
Other languages
Chinese (zh)
Other versions
CN110636373A
Inventor
孙彪
刘挺
田兴业
张伟
朱鹏飞
Current Assignee
Xiamen Meitu Technology Co Ltd
Original Assignee
Xiamen Meitu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Meitu Technology Co Ltd
Priority to CN201910995121.7A
Publication of CN110636373A
Application granted
Publication of CN110636373B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream
    • H04N21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process
    • H04N21/83 - Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835 - Generation of protective data, e.g. certificates
    • H04N21/8358 - Generation of protective data, e.g. certificates involving watermark

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Image Processing (AREA)

Abstract

The application provides an image processing method and apparatus, and an electronic device, relating to the technical field of image processing. The image processing method comprises the following steps: for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame; smearing the watermark region according to the position information to form a smearing region; and filling the smearing region according to the previous N video frames in the video to be processed. This method alleviates the problem of insufficient continuity of the filling content between video frames.

Description

Image processing method and device and electronic equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, and an electronic device.
Background
A video typically contains watermarks, for example the logo of the copyright owner or the subtitles of a movie. In some cases, a watermark in the video may affect the use of the video. Therefore, in the prior art, the watermark in the video is generally removed as required.
However, the inventor has found that the existing watermark removal technology generally first detects the watermark region in each video frame and then interpolates from the edge pixels of each watermark region to obtain the filling content and remove the watermark. Because each frame is filled independently, the video from which the watermark has been removed suffers from insufficient continuity of the filling content between video frames.
Disclosure of Invention
In view of the above, an object of the present application is to provide an image processing method, an image processing apparatus and an electronic device, so as to solve the problems in the prior art.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
an image processing method comprising:
for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame;
smearing the watermark region according to the position information to form a smearing region;
and filling the smearing region according to the first N frames of video frames in the video to be processed.
In a preferred option of the embodiment of the present application, the step of performing filling processing on the smear region according to the first N frames of video frames in the video to be processed includes:
respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information;
obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
and filling the smearing region according to the characteristic information of the first N frames of video frames after filling.
In a preferred option of the embodiment of the present application, the step of respectively obtaining feature information of the current video frame and the previous N frames of video frames in the video to be processed before performing the filling processing includes:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
In a preferred option of the embodiment of the present application, the step of performing filling processing on the smear region according to the feature information of the previous N frames of video frames after performing filling processing includes:
and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
In a preferred option of the embodiment of the present application, the step of calculating the feature information of the current video frame after the padding processing according to the feature information of the current video frame and the feature information of the previous N video frames after the padding processing includes:
acquiring mask information of a watermark area existing in the current video frame;
and calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
In a preferred selection of the embodiment of the present application, the calculation formula includes:
F′_t = (1 - m) * F′_{t-1} + m * F_t
where F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
An embodiment of the present application further provides an image processing apparatus, including:
the information acquisition module is used for acquiring the position information of a watermark area existing in a current video frame in a video to be processed;
the smearing processing module is used for smearing the watermark area according to the position information to form a smearing area;
and the filling processing module is used for filling the smearing region according to the first N frames of video frames in the video to be processed.
In a preferred option of the embodiment of the present application, the filling processing module includes:
the characteristic information acquisition submodule is used for respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the characteristic information;
the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
and the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing.
In a preferred option of the embodiment of the present application, the feature information obtaining sub-module is specifically configured to:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
An embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the processor is configured to execute an executable computer program stored in the memory to implement the above-mentioned image processing method.
According to the image processing method and apparatus and the electronic device, the previous N video frames in the video to be processed are used to fill the smearing region formed by smearing the watermark region in the current video frame, yielding an image of the current video frame with the watermark removed. This solves the problem in the prior art that the continuity of the filling content between video frames is insufficient because each video frame independently interpolates from the pixels at the edge of its watermark region to obtain the filling content.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present application.
Fig. 3 is a schematic flowchart of step S130 according to an embodiment of the present application.
Fig. 4 is a block flow diagram of an image processing apparatus according to an embodiment of the present application.
Icon: 10-an electronic device; 12-a memory; 14-a processor; 100-an image processing apparatus; 110-an information acquisition module; 120-a smearing processing module; 130-fill process module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, it should further be noted that, unless expressly stated or limited otherwise, the terms "disposed", "mounted", "connected", and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; as a mechanical or an electrical connection; or as a direct connection, an indirect connection through an intervening medium, or an internal communication between two elements. The specific meanings of the above terms in the present application can be understood by those of ordinary skill in the art on a case-by-case basis.
As shown in fig. 1, an embodiment of the present application provides an electronic device 10. The specific type of the electronic device 10 is not limited and may be set according to the actual application requirements. For example, the electronic device 10 may include, but is not limited to, a computer, a tablet, a cell phone, and the like.
The electronic device 10 may include, among other things, a memory 12, a processor 14, and an image processing apparatus 100. In detail, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to enable data transmission or interaction; for example, they may be electrically connected to each other via one or more communication buses or signal lines. The image processing apparatus 100 includes at least one software functional module that can be stored in the memory 12 in the form of software or firmware. The processor 14 is configured to execute the executable computer programs stored in the memory 12, such as the software functional modules and computer programs included in the image processing apparatus 100, so as to implement the image processing method.
The memory 12 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 14 may be an integrated circuit chip having signal processing capabilities. The processor 14 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), a System on Chip (SoC), and the like.
It will be appreciated that the configuration shown in FIG. 1 is merely illustrative and that the electronic device 10 may include more or fewer components than shown in FIG. 1 or may have a different configuration than shown in FIG. 1.
With reference to fig. 2, an embodiment of the present application further provides an image processing method applicable to the electronic device 10. The method steps defined by the flow related to the image processing method can be implemented by the electronic device 10, and the specific flow shown in fig. 2 will be described in detail below.
Step S110, for a current video frame in a video to be processed, obtaining position information of a watermark region existing in the current video frame.
In this embodiment, the video to be processed may be obtained by inputting or uploading a video by a user. After obtaining the video to be processed, the electronic device 10 may respectively process each frame of video frame in the video to be processed, for example, obtain position information of a watermark region existing in the frame of video frame.
And step S120, smearing the watermark region according to the position information to form a smearing region.
In this embodiment, after the position information of the watermark region is obtained in step S110, the position of the watermark region may be determined based on the position information, and then the position is subjected to smearing processing, so that the watermark of the watermark region may be removed, thereby forming a smeared region.
And step S130, filling the smearing region according to the first N frames of video frames in the video to be processed.
In this embodiment, after the smear region is formed in step S120, the smear region may be filled according to the first N frames of the video frames in the video to be processed, so as to eliminate the smear region, so that no blank region exists in the obtained video frames.
By the above method, after the watermark region existing in the current video frame is smeared to form the smearing region, the smearing region can be filled by combining the previous N video frames to obtain an image of the current video frame with the watermark removed. This avoids the problem in the prior art that each video frame independently interpolates from the pixels at the edge of its watermark region to fill the smearing region, so that the continuity of the filling content between video frames is insufficient.
Optionally, the manner of obtaining the location information in step S110 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, step S110 may include the following sub-step: acquiring information generated by a user framing the watermark region, so as to obtain the position information of the watermark region.
For another example, in another alternative example, step S110 may include the following sub-steps: and detecting the position information of the watermark area through a preset detection model.
The detection model may include a first detection convolution layer group, a second detection convolution layer group, a third detection convolution layer group, and a fourth detection convolution layer group.
In detail, the first detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the current video frame to generate a first detection convolution feature map. The second detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the first detection convolution feature map to generate a second detection convolution feature map. The third detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the second detection convolution feature map to generate a third detection convolution feature map. The fourth detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the third detection convolution feature map to generate the minimum bounding rectangle of the watermark region in the current video frame, thereby obtaining the position information of the watermark region.
It should be noted that the first detection convolution layer group may include a convolution layer and a LeakyReLU activation function layer. The second detection convolution layer group may include a convolution layer 121 and a convolution layer 122; the convolution layer 121 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 2, and the convolution layer 122 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 1. The third detection convolution layer group may include a convolution layer 131, a convolution layer 132, and a convolution layer 133; the convolution layer 131 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 2, and the convolution layers 132 and 133 may each include a convolution layer and a LeakyReLU activation function layer with the stride set to 1. The fourth detection convolution layer group may include a convolution layer 141, a convolution layer 142, a convolution layer 143, a convolution layer 144, and a convolution layer 145; the convolution layers 141, 142, 143, and 144 may each include a convolution layer and a LeakyReLU activation function layer with the stride set to 1, with the dilation coefficients set to 2, 4, 8, and 16, respectively. The convolution layer 145 regresses the bounding box of the watermark region, and each position regresses 6 parameters: the probability of having a watermark, the probability of not having a watermark, the abscissa of the upper-left corner of the minimum bounding rectangle of the watermark region, the ordinate of that corner, the width of the rectangle, and the height of the rectangle.
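To make the layer grouping concrete, the following PyTorch sketch assembles a detection network of this shape. It is an illustration rather than the patented implementation: the channel widths (32/64/128) and the LeakyReLU slope are assumptions, since the text specifies only the kernel size, strides, and dilation coefficients.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, stride=1, dilation=1):
    # 3 x 3 convolution followed by a LeakyReLU activation function layer.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride,
                  padding=dilation, dilation=dilation),
        nn.LeakyReLU(0.2, inplace=True),
    )

class WatermarkDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.group1 = conv_block(3, 32)                              # stride 1
        self.group2 = nn.Sequential(conv_block(32, 64, stride=2),   # layer 121
                                    conv_block(64, 64))             # layer 122
        self.group3 = nn.Sequential(conv_block(64, 128, stride=2),  # layer 131
                                    conv_block(128, 128),           # layer 132
                                    conv_block(128, 128))           # layer 133
        self.group4 = nn.Sequential(conv_block(128, 128, dilation=2),   # layer 141
                                    conv_block(128, 128, dilation=4),   # layer 142
                                    conv_block(128, 128, dilation=8),   # layer 143
                                    conv_block(128, 128, dilation=16),  # layer 144
                                    nn.Conv2d(128, 6, 3, padding=1))    # layer 145

    def forward(self, frame):  # frame: (B, 3, H, W)
        x = self.group3(self.group2(self.group1(frame)))
        # (B, 6, H/4, W/4): watermark / no-watermark scores plus (x, y, w, h) of the box.
        return self.group4(x)
```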
Optionally, the way of smearing the watermark region in step S120 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, step S120 may include the following sub-step: eliminating all pixels included in the watermark region according to the position information of the watermark region, so as to obtain a smearing region that no longer contains any pixel information.
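A minimal NumPy sketch of this smearing step is given below, assuming the position information is the (x, y, w, h) minimum bounding rectangle produced by the detection model and that the mask is 0 inside the watermark region and 1 outside, a convention the text does not fix.

```python
import numpy as np

def smear_watermark(frame: np.ndarray, box):
    """Zero out all pixels inside the watermark bounding box.

    frame: (H, W, 3) image; box: (x, y, w, h) of the minimum bounding
    rectangle. Returns the smeared frame and a mask that is 0 inside
    the smearing region and 1 elsewhere (assumed convention)."""
    x, y, w, h = box
    mask = np.ones(frame.shape[:2], dtype=frame.dtype)
    mask[y:y + h, x:x + w] = 0
    smeared = frame * mask[..., None]  # pixel information removed in the region
    return smeared, mask
```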
In step S130, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, in order to improve the efficiency of image processing and avoid an excessive amount of computation, N may be one, that is, the smearing region may be filled according to the immediately preceding video frame in the video to be processed.
For another example, in another alternative example, in order to improve the accuracy of the filling processing and the continuity of the filling content between video frames, N may be two, that is, the smearing region may be filled according to the two preceding video frames in the video to be processed.
Optionally, the filling processing manner of the smearing region through step S130 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, in conjunction with fig. 3, step S130 may include step S131, step S132, and step S133.
Step S131, respectively obtaining the feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating the optical flow between the current video frame and the previous N frames of video frames according to the feature information.
Step S132, obtaining the feature information of the previous N frames of video frames after the filling processing according to the optical flow and the feature information of the previous N frames of video frames before the filling processing.
And step S133, filling the smearing region according to the characteristic information of the previous N frames of video frames after filling.
In this embodiment, based on step S131, step S132 and step S133, the smearing region can be effectively filled, and an image of the current video frame after removing the watermark is obtained.
In step S131, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if the specific number of the first N frames of the video frame is one, step S131 may specifically be: respectively acquiring the characteristic information of the current video frame and the previous frame of video frame in the video to be processed before filling processing, and calculating the optical flow between the current video frame and the previous frame of video frame according to the characteristic information.
It should be noted that optical flow represents the instantaneous velocity, on the imaging plane, of the pixels of a spatially moving object. It is a method that uses the change of pixels in an image sequence over time and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby compute the motion information of objects between adjacent frames. In general, optical flow is caused by the movement of foreground objects in the scene, the motion of the camera, or both.
The specific way of calculating the optical flow is not limited, and may be set according to the actual application requirements.
For example, in an alternative example, the feature information of the current video frame and of the previous video frame before the filling processing may be fed into FlowNet2 to calculate the optical flow between the two frames.
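For illustration, the sketch below computes dense optical flow between two frames with OpenCV's Farneback method as a stand-in; the embodiment itself uses FlowNet2 on the feature information, so this is only an approximation of the step.

```python
import cv2
import numpy as np

def dense_flow(prev_frame: np.ndarray, cur_frame: np.ndarray) -> np.ndarray:
    """Per-pixel optical flow from the previous frame to the current one.

    Farneback dense flow on grayscale frames is used here only as a
    readily available stand-in for FlowNet2."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    cur_gray = cv2.cvtColor(cur_frame, cv2.COLOR_BGR2GRAY)
    # Returns an (H, W, 2) array of (dx, dy) displacements.
    return cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
```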
Based on different manners of obtaining the feature information of the current video frame and the previous N frames of video frames in the video to be processed before performing the padding processing, step S131 may include different sub-steps.
For example, in an alternative example, step S131 may include the following sub-steps: and respectively acquiring the characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model.
Wherein the first neural network model may include a first set of convolution layers, a second set of convolution layers, a third set of convolution layers, and a fourth set of convolution layers.
In detail, the first convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the current video frame to generate a first convolution feature map. The second convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the first convolution feature map to generate a second convolution feature map. The third convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the second convolution feature map to generate a third convolution feature map. The fourth convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the third convolution feature map to obtain the feature information of the current video frame.
It should be noted that the first convolution layer group may include one gate convolution layer. The second convolution layer group may include a convolution layer 221 and a convolution layer 222; the convolution layer 221 may include a gate convolution layer with the stride set to 2, and the convolution layer 222 may include a gate convolution layer with the stride set to 1. The third convolution layer group may include a convolution layer 231, a convolution layer 232, and a convolution layer 233; the convolution layer 231 may include a gate convolution layer with the stride set to 2, and the convolution layers 232 and 233 may each include a gate convolution layer with the stride set to 1. The fourth convolution layer group may include a convolution layer 241, a convolution layer 242, a convolution layer 243, a convolution layer 244, and a convolution layer 245, each of which may include a gate convolution layer with the stride set to 1, with the dilation coefficients set to 2, 4, 8, 16, and 1, respectively.
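A minimal PyTorch sketch of such a gate-convolution encoder follows. The gating formulation (tanh features modulated by a sigmoid gate) and the channel widths are assumptions; the text specifies only the layer grouping, kernel size, strides, and dilation coefficients.

```python
import torch
import torch.nn as nn

class GateConv(nn.Module):
    """Gate convolution: one 3 x 3 convolution produces features and a
    parallel 3 x 3 convolution produces a sigmoid gate that rescales them."""
    def __init__(self, in_ch, out_ch, stride=1, dilation=1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, 3, stride=stride,
                                 padding=dilation, dilation=dilation)
        self.gate = nn.Conv2d(in_ch, out_ch, 3, stride=stride,
                              padding=dilation, dilation=dilation)

    def forward(self, x):
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

class FeatureEncoder(nn.Module):
    """First neural network model: four gate-convolution layer groups."""
    def __init__(self):
        super().__init__()
        self.group1 = GateConv(3, 32)
        self.group2 = nn.Sequential(GateConv(32, 64, stride=2), GateConv(64, 64))
        self.group3 = nn.Sequential(GateConv(64, 128, stride=2),
                                    GateConv(128, 128), GateConv(128, 128))
        self.group4 = nn.Sequential(GateConv(128, 128, dilation=2),
                                    GateConv(128, 128, dilation=4),
                                    GateConv(128, 128, dilation=8),
                                    GateConv(128, 128, dilation=16),
                                    GateConv(128, 128))  # dilation 1

    def forward(self, frame):
        return self.group4(self.group3(self.group2(self.group1(frame))))
```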
In step S132, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if N is one, step S132 may specifically be: obtaining the feature information of the previous video frame after the filling processing according to the optical flow and the feature information of the previous video frame before the filling processing.
In detail, the feature information of the previous video frame after the filling processing is the feature information corresponding to that video frame after its watermark has been removed.
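One common way to realize this step is to warp the previous frame's feature maps into the current frame's coordinate system using the optical flow. The grid_sample-based sketch below is such a backward warp, offered as an assumption about how the flow is applied rather than as the exact procedure of the embodiment.

```python
import torch
import torch.nn.functional as F

def warp_features(prev_feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp previous-frame feature maps into the current frame.

    prev_feat: (B, C, H, W); flow: (B, 2, H, W) displacements in pixels,
    assumed to map each current-frame position to its source location in
    the previous frame (backward warping) at feature-map resolution."""
    b, _, h, w = prev_feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(prev_feat.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow                                 # sample positions
    # Normalize to [-1, 1] as required by grid_sample.
    coords[:, 0] = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords[:, 1] = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = coords.permute(0, 2, 3, 1)                                 # (B, H, W, 2)
    return F.grid_sample(prev_feat, grid, align_corners=True)
```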
In step S133, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if the specific number of the first N frames of the video frame is one, step S133 may specifically be: and filling the smearing region according to the characteristic information of the previous video frame after filling.
Based on the different filling processing modes of the smear region according to the feature information of the previous N frames of video frames after the filling processing, step S133 may include different sub-steps.
For example, in an alternative example, step S133 may include the following sub-steps: and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
Optionally, the specific manner of obtaining the feature information of the current video frame after the filling processing is performed through the calculation is not limited, and may be set according to the actual application requirements.
For example, in an alternative example, the step of calculating the feature information of the current video frame after the padding process may include the following sub-steps:
firstly, acquiring mask information of a watermark area existing in the current video frame; and secondly, calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
In an alternative example, the calculation formula may include:
F′_t = (1 - m) * F′_{t-1} + m * F_t
where F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
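In code, the formula is a single masked blend of the two feature maps. A minimal sketch, assuming the mask m is 1 outside the watermark region and 0 inside it, and that it has been resized to the feature-map resolution:

```python
import torch

def fuse_features(cur_feat: torch.Tensor,
                  prev_filled_feat: torch.Tensor,
                  mask: torch.Tensor) -> torch.Tensor:
    """Direct implementation of F'_t = (1 - m) * F'_{t-1} + m * F_t.

    With m = 1 on valid pixels, content outside the watermark region
    comes from the current frame, while the watermark region is filled
    from the warped previous-frame features."""
    return (1.0 - mask) * prev_filled_feat + mask * cur_feat
```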
Further, after calculating the feature information of the current video frame after the padding process, step S133 may further include the following sub-steps: and processing the characteristic information of the current video frame after the filling processing through a preset second neural network model to obtain the current video frame after the filling processing.
Wherein the second neural network model may include a first set of deconvolution layers, a second set of deconvolution layers, and a third set of deconvolution layers.
In detail, the first deconvolution layer group performs a deconvolution operation on the feature information of the current video frame after the filling processing to generate a first deconvolution feature map. The second deconvolution layer group performs a deconvolution operation on the first deconvolution feature map to generate a second deconvolution feature map. The third deconvolution layer group performs a deconvolution operation on the second deconvolution feature map to obtain the current video frame after the filling processing.
It should be noted that the first deconvolution layer group may include a convolution layer 311 and a convolution layer 312; the convolution layer 311 may include a bilinear interpolation upsampling layer, a convolution layer, and a LeakyReLU activation function layer, and the convolution layer 312 may include a convolution layer and a LeakyReLU activation function layer. The second deconvolution layer group may include a convolution layer 321 and a convolution layer 322; the convolution layer 321 may include a bilinear interpolation upsampling layer, a convolution layer, and a LeakyReLU activation function layer, and the convolution layer 322 may include a convolution layer and a LeakyReLU activation function layer. The third deconvolution layer group may include a convolution layer.
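A PyTorch sketch of such a decoder is given below. As with the encoder sketch above, the channel widths are assumptions; each upsampling group uses bilinear interpolation followed by a 3 × 3 convolution and LeakyReLU, and the final group is a plain convolution layer.

```python
import torch.nn as nn

def up_block(in_ch, out_ch):
    # Bilinear interpolation upsampling, then 3 x 3 convolution + LeakyReLU.
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
    )

class FeatureDecoder(nn.Module):
    """Second neural network model: three deconvolution (upsampling) layer
    groups that turn the fused features back into an RGB frame."""
    def __init__(self):
        super().__init__()
        self.group1 = nn.Sequential(up_block(128, 64),          # layer 311
                                    nn.Conv2d(64, 64, 3, padding=1),
                                    nn.LeakyReLU(0.2, inplace=True))  # layer 312
        self.group2 = nn.Sequential(up_block(64, 32),           # layer 321
                                    nn.Conv2d(32, 32, 3, padding=1),
                                    nn.LeakyReLU(0.2, inplace=True))  # layer 322
        self.group3 = nn.Conv2d(32, 3, 3, padding=1)            # plain convolution

    def forward(self, fused_feat):
        return self.group3(self.group2(self.group1(fused_feat)))
```

Two upsampling groups restore the factor-of-four downsampling introduced by the strided layers of the encoder, so the output matches the input frame resolution.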
With reference to fig. 4, an embodiment of the present application further provides an image processing apparatus 100, which can be applied to the electronic device 10. The image processing apparatus 100 may include an information acquisition module 110, a smearing processing module 120, and a filling processing module 130, among others.
The information obtaining module 110 is configured to obtain, for a current video frame in a video to be processed, location information of a watermark region existing in the current video frame. In an alternative example, the information obtaining module 110 may be configured to perform step S110 shown in fig. 2, and reference may be made to the foregoing detailed description of step S110 regarding the relevant content of the information obtaining module 110.
The smearing processing module 120 is configured to smear the watermark region according to the location information to form a smeared region. In an alternative example, the smearing processing module 120 may be configured to execute step S120 shown in fig. 2, and reference may be made to the foregoing detailed description of step S120 for relevant contents of the smearing processing module 120.
The filling processing module 130 is configured to perform filling processing on the smearing region according to the first N frames of video frames in the video to be processed. In an alternative example, the filling processing module 130 may be configured to execute step S130 shown in fig. 2, and reference may be made to the foregoing detailed description of step S130 for relevant contents of the filling processing module 130.
Further, the filling processing module 130 may include a feature information obtaining sub-module, a feature information calculating sub-module, and a filling processing sub-module.
The feature information acquisition submodule is used for respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information. In an alternative example, the characteristic information obtaining sub-module may be configured to perform step S131 shown in fig. 3, and reference may be made to the foregoing detailed description of step S131 for relevant content of the characteristic information obtaining sub-module.
And the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing. In an alternative example, the characteristic information calculating submodule may be configured to perform step S132 shown in fig. 3, and the detailed description of step S132 may be referred to for relevant contents of the characteristic information calculating submodule.
And the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing. In an alternative example, the filling processing sub-module may be configured to perform step S133 shown in fig. 3, and reference may be made to the foregoing detailed description of step S133 for relevant contents of the filling processing sub-module.
Further, the feature information obtaining sub-module may be specifically configured to: and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
In summary, according to the image processing method and apparatus and the electronic device 10 provided in the embodiments of the present application, the previous N video frames in the video to be processed are used to fill the smearing region formed by smearing the watermark region existing in the current video frame, so as to obtain an image of the current video frame with the watermark removed. This solves the problem in the prior art that the continuity of the filling content between video frames is insufficient because each video frame independently interpolates from the pixels at the edge of its watermark region to obtain the filling content.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (8)

1. An image processing method, comprising:
for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame;
smearing the watermark region according to the position information to form a smearing region;
filling the smearing region according to the first N frames of video frames in the video to be processed;
the step of filling the smearing region according to the first N frames of video frames in the video to be processed comprises the following steps:
respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information;
obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
filling the smearing region according to the characteristic information of the first N frames of video frames after filling;
the step of filling the smearing region according to the characteristic information of the first N frames of video frames after filling comprises the following steps:
and calculating to obtain the target characteristic information of the current video frame according to a preset calculation formula, wherein the calculation formula comprises:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information of the watermark region existing in the current video frame;
and processing the target characteristic information after the current video frame is filled through a preset second neural network model to obtain the current video frame after the filling.
2. The image processing method according to claim 1, wherein the step of respectively obtaining the feature information of the current video frame and the first N frames of video frames in the video to be processed before performing the padding process comprises:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
3. The image processing method according to claim 1, wherein the step of performing the filling process on the smear region according to the feature information of the previous N frames of video frames after performing the filling process comprises:
and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
4. The image processing method according to claim 3, wherein the step of calculating the feature information of the current video frame after the padding processing according to the feature information of the current video frame and the feature information of the previous N frames of video frames after the padding processing comprises:
acquiring mask information of a watermark area existing in the current video frame;
and calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
5. The image processing method according to claim 4, wherein the calculation formula includes:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
6. An image processing apparatus characterized by comprising:
the information acquisition module is used for acquiring the position information of a watermark area existing in a current video frame in a video to be processed;
the smearing processing module is used for smearing the watermark area according to the position information to form a smearing area;
the filling processing module is used for filling the smearing region according to the first N frames of video frames in the video to be processed;
the filling processing module includes:
the characteristic information acquisition submodule is used for respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the characteristic information;
the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing;
the filling processing submodule is used for filling the smearing region according to the characteristic information of the previous N frames of video frames after filling processing, and comprises the following steps:
and calculating to obtain the target characteristic information of the current video frame according to a preset calculation formula, wherein the calculation formula comprises:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information of the watermark region existing in the current video frame;
and processing the target characteristic information after the current video frame is filled through a preset second neural network model to obtain the current video frame after the filling.
7. The image processing apparatus according to claim 6, wherein the feature information acquisition sub-module is specifically configured to:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
8. An electronic device comprising a memory and a processor for executing an executable computer program stored in the memory to implement the image processing method of any one of claims 1 to 5.
Application CN201910995121.7A, filed 2019-10-18 (priority date 2019-10-18): Image processing method and device and electronic equipment. Granted as CN110636373B; legal status Active.

Priority Applications (1)

CN201910995121.7A, priority and filing date 2019-10-18: Image processing method and device and electronic equipment

Publications (2)

CN110636373A, published 2019-12-31
CN110636373B, granted 2022-02-01

Family

Family ID: 68976735

Family Applications (1)

CN201910995121.7A (granted as CN110636373B): Image processing method and device and electronic equipment. Status: Active.

Country Status (1)

CN: CN110636373B

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055B (en) * 2020-10-15 2021-09-10 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7548659B2 (en) * 2005-05-13 2009-06-16 Microsoft Corporation Video enhancement
US20170024843A1 (en) * 2015-07-24 2017-01-26 Le Holdings (Beijing) Co., Ltd. Method and device for removing video watermarks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108470326A (en) * 2018-03-27 2018-08-31 北京小米移动软件有限公司 Image completion method and device
CN108985192A (en) * 2018-06-29 2018-12-11 东南大学 A kind of video smoke recognition methods based on multitask depth convolutional neural networks
CN109214999A (en) * 2018-09-21 2019-01-15 传线网络科技(上海)有限公司 A kind of removing method and device of video caption
CN109472260A (en) * 2018-10-31 2019-03-15 成都索贝数码科技股份有限公司 A method of logo and subtitle in the removal image based on deep neural network
CN109615593A (en) * 2018-11-29 2019-04-12 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110278439A (en) * 2019-06-28 2019-09-24 北京云摄美网络科技有限公司 De-watermarked algorithm based on inter-prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"[Paper series] Optical flow / LK optical flow / FlowNet / FlowNet2"; Min220; CSDN, https://blog.csdn.net/jucilan2220/article/details/84331626; 2018-11-21; full text *

Also Published As

CN110636373A, published 2019-12-31

Similar Documents

Publication Publication Date Title
CN108833785B (en) Fusion method and device of multi-view images, computer equipment and storage medium
US9111389B2 (en) Image generation apparatus and image generation method
US9076234B2 (en) Super-resolution method and apparatus for video image
US9959600B2 (en) Motion image compensation method and device, display device
EP2209087B1 (en) Apparatus and method of obtaining high-resolution image
US11151704B2 (en) Apparatus and methods for artifact detection and removal using frame interpolation techniques
EP3076364A1 (en) Image filtering based on image gradients
EP1843294A1 (en) Motion vector calculation method, hand-movement correction device using the method, imaging device, and motion picture generation device
KR20200044108A (en) Method and apparatus for estimating monocular image depth, device, program and storage medium
CN109992226A (en) Image display method and device and spliced display screen
JP2021507388A (en) Instance segmentation methods and devices, electronics, programs and media
Parsania et al. A review: Image interpolation techniques for image scaling
US20120280996A1 (en) Method and system for rendering three dimensional views of a scene
US9197891B1 (en) Systems and methods for periodic structure handling for motion compensation
EP4322109A1 (en) Green screen matting method and apparatus, and electronic device
US20150206289A1 (en) Joint Video Deblurring and Stabilization
CN111882578A (en) Foreground image acquisition method, foreground image acquisition device and electronic equipment
CN112308797A (en) Corner detection method and device, electronic equipment and readable storage medium
KR101834512B1 (en) Super-resolution image restoration apparatus and method based on consecutive-frame
CN110636373B (en) Image processing method and device and electronic equipment
CN111598088A (en) Target detection method and device, computer equipment and readable storage medium
US20120019677A1 (en) Image stabilization in a digital camera
CN114049488A (en) Multi-dimensional information fusion remote weak and small target detection method and terminal
EP3070670A2 (en) Using frequency decomposition for better color consistency in a synthesized region
CN112954454B (en) Video frame generation method and device

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant