CN110636373B - Image processing method and device and electronic equipment


Info

Publication number
CN110636373B
CN110636373B (application CN201910995121.7A)
Authority
CN
China
Prior art keywords
frames
characteristic information
video frame
filling
current video
Prior art date
Legal status
Active
Application number
CN201910995121.7A
Other languages
Chinese (zh)
Other versions
CN110636373A
Inventor
孙彪
刘挺
田兴业
张伟
朱鹏飞
Current Assignee
Xiamen Meitu Technology Co Ltd
Original Assignee
Xiamen Meitu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Meitu Technology Co Ltd
Priority to CN201910995121.7A
Publication of CN110636373A
Application granted
Publication of CN110636373B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream
    • H04N21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process
    • H04N21/83 - Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835 - Generation of protective data, e.g. certificates
    • H04N21/8358 - Generation of protective data, e.g. certificates involving watermark

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Image Processing (AREA)

Abstract

The application provides an image processing method and apparatus, and an electronic device, relating to the technical field of image processing. The image processing method comprises the following steps: for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame; smearing the watermark region according to the position information to form a smearing region; and filling the smearing region according to the previous N video frames in the video to be processed. This method alleviates the problem of insufficient continuity of the filling content between video frames.

Description

Image processing method and device and electronic equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, and an electronic device.
Background
A video typically contains watermarks, for example the logo of the copyright owner or the subtitles of a movie. In some cases, a watermark in the video may affect the use of the video. Therefore, in the prior art, the watermark in the video is generally removed as required.
However, the inventor has found that the existing watermark removal technology generally first detects the watermark region in each video frame and then interpolates from the edge pixels of each watermark region to obtain the filling content and remove the watermark. Because each frame is filled independently, the video from which the watermark has been removed suffers from insufficient continuity of the filling content between video frames.
Disclosure of Invention
In view of the above, an object of the present application is to provide an image processing method, an image processing apparatus and an electronic device, so as to solve the problems in the prior art.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
an image processing method comprising:
for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame;
smearing the watermark region according to the position information to form a smearing region;
and filling the smearing region according to the first N frames of video frames in the video to be processed.
In a preferred option of the embodiment of the present application, the step of performing filling processing on the smear region according to the first N frames of video frames in the video to be processed includes:
respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information;
obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
and filling the smearing region according to the characteristic information of the first N frames of video frames after filling.
In a preferred option of the embodiment of the present application, the step of respectively obtaining feature information of the current video frame and the previous N frames of video frames in the video to be processed before performing the filling processing includes:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
In a preferred option of the embodiment of the present application, the step of performing filling processing on the smear region according to the feature information of the previous N frames of video frames after performing filling processing includes:
and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
In a preferred option of the embodiment of the present application, the step of calculating the feature information of the current video frame after the padding processing according to the feature information of the current video frame and the feature information of the previous N video frames after the padding processing includes:
acquiring mask information of a watermark area existing in the current video frame;
and calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
In a preferred selection of the embodiment of the present application, the calculation formula includes:
F′_t = (1 - m) * F′_{t-1} + m * F_t
where F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
An embodiment of the present application further provides an image processing apparatus, including:
the information acquisition module is used for acquiring the position information of a watermark area existing in a current video frame in a video to be processed;
the smearing processing module is used for smearing the watermark area according to the position information to form a smearing area;
and the filling processing module is used for filling the smearing region according to the first N frames of video frames in the video to be processed.
In a preferred option of the embodiment of the present application, the filling processing module includes:
the characteristic information acquisition submodule is used for respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the characteristic information;
the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
and the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing.
In a preferred option of the embodiment of the present application, the feature information obtaining sub-module is specifically configured to:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
An embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the processor is configured to execute an executable computer program stored in the memory to implement the above-mentioned image processing method.
According to the image processing method and apparatus and the electronic device, the previous N video frames in the video to be processed are used to fill the smearing region formed by smearing the watermark region in the current video frame, yielding an image of the current video frame with the watermark removed. This solves the problem in the prior art that the continuity of the filling content between video frames is insufficient because each video frame independently interpolates from the pixels at the edge of its watermark region to obtain the filling content.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present application.
Fig. 3 is a schematic flowchart of step S130 according to an embodiment of the present application.
Fig. 4 is a block flow diagram of an image processing apparatus according to an embodiment of the present application.
Icon: 10-an electronic device; 12-a memory; 14-a processor; 100-an image processing apparatus; 110-an information acquisition module; 120-a smearing processing module; 130-fill process module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, it should further be noted that, unless expressly stated or limited otherwise, the terms "disposed", "mounted", "connected", and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; as a mechanical or an electrical connection; or as a direct connection, an indirect connection through an intervening medium, or an internal communication between two elements. The specific meanings of the above terms in the present application can be understood by those of ordinary skill in the art on a case-by-case basis.
As shown in fig. 1, an embodiment of the present application provides an electronic device 10. The specific type of the electronic device 10 is not limited and may be set according to the actual application requirements. For example, the electronic device 10 may include, but is not limited to, a computer, a tablet, a cell phone, and the like.
The electronic device 10 may include, among other things, a memory 12, a processor 14, and an image processing apparatus 100. In detail, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to enable data transmission or interaction; for example, they may be electrically connected to each other via one or more communication buses or signal lines. The image processing apparatus 100 includes at least one software functional module that can be stored in the memory 12 in the form of software or firmware. The processor 14 is configured to execute the executable computer programs stored in the memory 12, such as the software functional modules and computer programs included in the image processing apparatus 100, so as to implement the image processing method.
The memory 12 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 14 may be an integrated circuit chip having signal processing capabilities. The processor 14 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), a System on Chip (SoC), and the like.
It will be appreciated that the configuration shown in FIG. 1 is merely illustrative and that the electronic device 10 may include more or fewer components than shown in FIG. 1 or may have a different configuration than shown in FIG. 1.
With reference to fig. 2, an embodiment of the present application further provides an image processing method applicable to the electronic device 10. The method steps defined by the flow related to the image processing method can be implemented by the electronic device 10, and the specific flow shown in fig. 2 will be described in detail below.
Step S110, for a current video frame in a video to be processed, obtaining position information of a watermark region existing in the current video frame.
In this embodiment, the video to be processed may be obtained by inputting or uploading a video by a user. After obtaining the video to be processed, the electronic device 10 may respectively process each frame of video frame in the video to be processed, for example, obtain position information of a watermark region existing in the frame of video frame.
And step S120, smearing the watermark region according to the position information to form a smearing region.
In this embodiment, after the position information of the watermark region is obtained in step S110, the position of the watermark region may be determined based on the position information, and then the position is subjected to smearing processing, so that the watermark of the watermark region may be removed, thereby forming a smeared region.
And step S130, filling the smearing region according to the first N frames of video frames in the video to be processed.
In this embodiment, after the smear region is formed in step S120, the smear region may be filled according to the first N frames of the video frames in the video to be processed, so as to eliminate the smear region, so that no blank region exists in the obtained video frames.
By the above method, after the watermark region existing in the current video frame is smeared to form the smearing region, the smearing region can be filled by combining the previous N video frames to obtain an image of the current video frame with the watermark removed. This avoids the problem in the prior art that each video frame independently interpolates from the pixels at the edge of its watermark region to fill the smearing region, so that the continuity of the filling content between video frames is insufficient.
Optionally, the manner of obtaining the location information in step S110 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, step S110 may include the following sub-step: acquiring information generated by a user framing the watermark region, so as to obtain the position information of the watermark region.
For another example, in another alternative example, step S110 may include the following sub-steps: and detecting the position information of the watermark area through a preset detection model.
The detection model may include a first detection convolution layer group, a second detection convolution layer group, a third detection convolution layer group, and a fourth detection convolution layer group.
In detail, the first detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the current video frame to generate a first detection convolution feature map. The second detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the first detection convolution feature map to generate a second detection convolution feature map. The third detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the second detection convolution feature map to generate a third detection convolution feature map. The fourth detection convolution layer group performs a convolution operation with a 3 × 3 convolution kernel on the third detection convolution feature map to generate the minimum bounding rectangle of the watermark region in the current video frame, thereby obtaining the position information of the watermark region.
It should be noted that the first detection convolution layer group may include a convolution layer and a LeakyReLU activation function layer. The second detection convolution layer group may include a convolution layer 121 and a convolution layer 122; the convolution layer 121 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 2, and the convolution layer 122 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 1. The third detection convolution layer group may include a convolution layer 131, a convolution layer 132, and a convolution layer 133; the convolution layer 131 may include a convolution layer and a LeakyReLU activation function layer with the stride set to 2, and the convolution layers 132 and 133 may each include a convolution layer and a LeakyReLU activation function layer with the stride set to 1. The fourth detection convolution layer group may include a convolution layer 141, a convolution layer 142, a convolution layer 143, a convolution layer 144, and a convolution layer 145; the convolution layers 141, 142, 143, and 144 may each include a convolution layer and a LeakyReLU activation function layer with the stride set to 1, with the dilation coefficients set to 2, 4, 8, and 16, respectively. The convolution layer 145 regresses the bounding box of the watermark region, and each position regresses 6 parameters: the probability of having a watermark, the probability of not having a watermark, the abscissa of the upper-left corner of the minimum bounding rectangle of the watermark region, the ordinate of that corner, the width of the rectangle, and the height of the rectangle.
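To make the layer grouping concrete, the following PyTorch sketch assembles a detection network of this shape. It is an illustration rather than the patented implementation: the channel widths (32/64/128) and the LeakyReLU slope are assumptions, since the text specifies only the kernel size, strides, and dilation coefficients.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, stride=1, dilation=1):
    # 3 x 3 convolution followed by a LeakyReLU activation function layer.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride,
                  padding=dilation, dilation=dilation),
        nn.LeakyReLU(0.2, inplace=True),
    )

class WatermarkDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.group1 = conv_block(3, 32)                              # stride 1
        self.group2 = nn.Sequential(conv_block(32, 64, stride=2),   # layer 121
                                    conv_block(64, 64))             # layer 122
        self.group3 = nn.Sequential(conv_block(64, 128, stride=2),  # layer 131
                                    conv_block(128, 128),           # layer 132
                                    conv_block(128, 128))           # layer 133
        self.group4 = nn.Sequential(conv_block(128, 128, dilation=2),   # layer 141
                                    conv_block(128, 128, dilation=4),   # layer 142
                                    conv_block(128, 128, dilation=8),   # layer 143
                                    conv_block(128, 128, dilation=16),  # layer 144
                                    nn.Conv2d(128, 6, 3, padding=1))    # layer 145

    def forward(self, frame):  # frame: (B, 3, H, W)
        x = self.group3(self.group2(self.group1(frame)))
        # (B, 6, H/4, W/4): watermark / no-watermark scores plus (x, y, w, h) of the box.
        return self.group4(x)
```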
Optionally, the way of smearing the watermark region in step S120 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, step S120 may include the following sub-step: eliminating all pixels included in the watermark region according to the position information of the watermark region, so as to obtain a smearing region that no longer contains any pixel information.
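A minimal NumPy sketch of this smearing step is given below, assuming the position information is the (x, y, w, h) minimum bounding rectangle produced by the detection model and that the mask is 0 inside the watermark region and 1 outside, a convention the text does not fix.

```python
import numpy as np

def smear_watermark(frame: np.ndarray, box):
    """Zero out all pixels inside the watermark bounding box.

    frame: (H, W, 3) image; box: (x, y, w, h) of the minimum bounding
    rectangle. Returns the smeared frame and a mask that is 0 inside
    the smearing region and 1 elsewhere (assumed convention)."""
    x, y, w, h = box
    mask = np.ones(frame.shape[:2], dtype=frame.dtype)
    mask[y:y + h, x:x + w] = 0
    smeared = frame * mask[..., None]  # pixel information removed in the region
    return smeared, mask
```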
In step S130, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, in order to improve the efficiency of image processing and avoid an excessive amount of computation, N may be one, that is, the smearing region may be filled according to the immediately preceding video frame in the video to be processed.
For another example, in another alternative example, in order to improve the accuracy of the filling processing and the continuity of the filling content between video frames, N may be two, that is, the smearing region may be filled according to the two preceding video frames in the video to be processed.
Optionally, the filling processing manner of the smearing region through step S130 is not limited, and may be selected according to the actual application requirement.
For example, in an alternative example, in conjunction with fig. 3, step S130 may include step S131, step S132, and step S133.
Step S131, respectively obtaining the feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating the optical flow between the current video frame and the previous N frames of video frames according to the feature information.
Step S132, obtaining the feature information of the previous N frames of video frames after the filling processing according to the optical flow and the feature information of the previous N frames of video frames before the filling processing.
And step S133, filling the smearing region according to the characteristic information of the previous N frames of video frames after filling.
In this embodiment, based on step S131, step S132 and step S133, the smearing region can be effectively filled, and an image of the current video frame after removing the watermark is obtained.
In step S131, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if the specific number of the first N frames of the video frame is one, step S131 may specifically be: respectively acquiring the characteristic information of the current video frame and the previous frame of video frame in the video to be processed before filling processing, and calculating the optical flow between the current video frame and the previous frame of video frame according to the characteristic information.
It should be noted that optical flow represents the instantaneous velocity, on the imaging plane, of the pixels of a spatially moving object. It is a method that uses the change of pixels in an image sequence over time and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby compute the motion information of objects between adjacent frames. In general, optical flow is caused by the movement of foreground objects in the scene, the motion of the camera, or both.
The specific way of calculating the optical flow is not limited, and may be set according to the actual application requirements.
For example, in an alternative example, the feature information of the current video frame and of the previous video frame before the filling processing may be fed into FlowNet2 to calculate the optical flow between the two frames.
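For illustration, the sketch below computes dense optical flow between two frames with OpenCV's Farneback method as a stand-in; the embodiment itself uses FlowNet2 on the feature information, so this is only an approximation of the step.

```python
import cv2
import numpy as np

def dense_flow(prev_frame: np.ndarray, cur_frame: np.ndarray) -> np.ndarray:
    """Per-pixel optical flow from the previous frame to the current one.

    Farneback dense flow on grayscale frames is used here only as a
    readily available stand-in for FlowNet2."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    cur_gray = cv2.cvtColor(cur_frame, cv2.COLOR_BGR2GRAY)
    # Returns an (H, W, 2) array of (dx, dy) displacements.
    return cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
```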
Based on different manners of obtaining the feature information of the current video frame and the previous N frames of video frames in the video to be processed before performing the padding processing, step S131 may include different sub-steps.
For example, in an alternative example, step S131 may include the following sub-steps: and respectively acquiring the characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model.
Wherein the first neural network model may include a first set of convolution layers, a second set of convolution layers, a third set of convolution layers, and a fourth set of convolution layers.
In detail, the first convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the current video frame to generate a first convolution feature map. The second convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the first convolution feature map to generate a second convolution feature map. The third convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the second convolution feature map to generate a third convolution feature map. The fourth convolution layer group performs a gate convolution operation with a 3 × 3 convolution kernel on the third convolution feature map to obtain the feature information of the current video frame.
It should be noted that the first convolution layer group may include one gate convolution layer. The second convolution layer group may include a convolution layer 221 and a convolution layer 222; the convolution layer 221 may include a gate convolution layer with the stride set to 2, and the convolution layer 222 may include a gate convolution layer with the stride set to 1. The third convolution layer group may include a convolution layer 231, a convolution layer 232, and a convolution layer 233; the convolution layer 231 may include a gate convolution layer with the stride set to 2, and the convolution layers 232 and 233 may each include a gate convolution layer with the stride set to 1. The fourth convolution layer group may include a convolution layer 241, a convolution layer 242, a convolution layer 243, a convolution layer 244, and a convolution layer 245, each of which may include a gate convolution layer with the stride set to 1, with the dilation coefficients set to 2, 4, 8, 16, and 1, respectively.
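A minimal PyTorch sketch of such a gate-convolution encoder follows. The gating formulation (tanh features modulated by a sigmoid gate) and the channel widths are assumptions; the text specifies only the layer grouping, kernel size, strides, and dilation coefficients.

```python
import torch
import torch.nn as nn

class GateConv(nn.Module):
    """Gate convolution: one 3 x 3 convolution produces features and a
    parallel 3 x 3 convolution produces a sigmoid gate that rescales them."""
    def __init__(self, in_ch, out_ch, stride=1, dilation=1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, 3, stride=stride,
                                 padding=dilation, dilation=dilation)
        self.gate = nn.Conv2d(in_ch, out_ch, 3, stride=stride,
                              padding=dilation, dilation=dilation)

    def forward(self, x):
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

class FeatureEncoder(nn.Module):
    """First neural network model: four gate-convolution layer groups."""
    def __init__(self):
        super().__init__()
        self.group1 = GateConv(3, 32)
        self.group2 = nn.Sequential(GateConv(32, 64, stride=2), GateConv(64, 64))
        self.group3 = nn.Sequential(GateConv(64, 128, stride=2),
                                    GateConv(128, 128), GateConv(128, 128))
        self.group4 = nn.Sequential(GateConv(128, 128, dilation=2),
                                    GateConv(128, 128, dilation=4),
                                    GateConv(128, 128, dilation=8),
                                    GateConv(128, 128, dilation=16),
                                    GateConv(128, 128))  # dilation 1

    def forward(self, frame):
        return self.group4(self.group3(self.group2(self.group1(frame))))
```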
In step S132, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if N is one, step S132 may specifically be: obtaining the feature information of the previous video frame after the filling processing according to the optical flow and the feature information of the previous video frame before the filling processing.
In detail, the feature information of the previous video frame after the filling processing is the feature information corresponding to that video frame after its watermark has been removed.
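One common way to realize this step is to warp the previous frame's feature maps into the current frame's coordinate system using the optical flow. The grid_sample-based sketch below is such a backward warp, offered as an assumption about how the flow is applied rather than as the exact procedure of the embodiment.

```python
import torch
import torch.nn.functional as F

def warp_features(prev_feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp previous-frame feature maps into the current frame.

    prev_feat: (B, C, H, W); flow: (B, 2, H, W) displacements in pixels,
    assumed to map each current-frame position to its source location in
    the previous frame (backward warping) at feature-map resolution."""
    b, _, h, w = prev_feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(prev_feat.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow                                 # sample positions
    # Normalize to [-1, 1] as required by grid_sample.
    coords[:, 0] = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords[:, 1] = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = coords.permute(0, 2, 3, 1)                                 # (B, H, W, 2)
    return F.grid_sample(prev_feat, grid, align_corners=True)
```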
In step S133, it should be noted that, the specific number of the first N frames of video frames is not limited, and may be set according to the actual application requirement.
For example, in an alternative example, if the specific number of the first N frames of the video frame is one, step S133 may specifically be: and filling the smearing region according to the characteristic information of the previous video frame after filling.
Based on the different filling processing modes of the smear region according to the feature information of the previous N frames of video frames after the filling processing, step S133 may include different sub-steps.
For example, in an alternative example, step S133 may include the following sub-steps: and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
Optionally, the specific manner of obtaining the feature information of the current video frame after the filling processing is performed through the calculation is not limited, and may be set according to the actual application requirements.
For example, in an alternative example, the step of calculating the feature information of the current video frame after the padding process may include the following sub-steps:
firstly, acquiring mask information of a watermark area existing in the current video frame; and secondly, calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
In an alternative example, the calculation formula may include:
F′_t = (1 - m) * F′_{t-1} + m * F_t
where F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
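In code, the formula is a single masked blend of the two feature maps. A minimal sketch, assuming the mask m is 1 outside the watermark region and 0 inside it, and that it has been resized to the feature-map resolution:

```python
import torch

def fuse_features(cur_feat: torch.Tensor,
                  prev_filled_feat: torch.Tensor,
                  mask: torch.Tensor) -> torch.Tensor:
    """Direct implementation of F'_t = (1 - m) * F'_{t-1} + m * F_t.

    With m = 1 on valid pixels, content outside the watermark region
    comes from the current frame, while the watermark region is filled
    from the warped previous-frame features."""
    return (1.0 - mask) * prev_filled_feat + mask * cur_feat
```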
Further, after calculating the feature information of the current video frame after the padding process, step S133 may further include the following sub-steps: and processing the characteristic information of the current video frame after the filling processing through a preset second neural network model to obtain the current video frame after the filling processing.
Wherein the second neural network model may include a first set of deconvolution layers, a second set of deconvolution layers, and a third set of deconvolution layers.
In detail, the first deconvolution layer group performs a deconvolution operation on the feature information of the current video frame after the filling processing to generate a first deconvolution feature map. The second deconvolution layer group performs a deconvolution operation on the first deconvolution feature map to generate a second deconvolution feature map. The third deconvolution layer group performs a deconvolution operation on the second deconvolution feature map to obtain the current video frame after the filling processing.
It should be noted that the first deconvolution layer group may include a convolution layer 311 and a convolution layer 312; the convolution layer 311 may include a bilinear interpolation upsampling layer, a convolution layer, and a LeakyReLU activation function layer, and the convolution layer 312 may include a convolution layer and a LeakyReLU activation function layer. The second deconvolution layer group may include a convolution layer 321 and a convolution layer 322; the convolution layer 321 may include a bilinear interpolation upsampling layer, a convolution layer, and a LeakyReLU activation function layer, and the convolution layer 322 may include a convolution layer and a LeakyReLU activation function layer. The third deconvolution layer group may include a convolution layer.
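A PyTorch sketch of such a decoder is given below. As with the encoder sketch above, the channel widths are assumptions; each upsampling group uses bilinear interpolation followed by a 3 × 3 convolution and LeakyReLU, and the final group is a plain convolution layer.

```python
import torch.nn as nn

def up_block(in_ch, out_ch):
    # Bilinear interpolation upsampling, then 3 x 3 convolution + LeakyReLU.
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
    )

class FeatureDecoder(nn.Module):
    """Second neural network model: three deconvolution (upsampling) layer
    groups that turn the fused features back into an RGB frame."""
    def __init__(self):
        super().__init__()
        self.group1 = nn.Sequential(up_block(128, 64),          # layer 311
                                    nn.Conv2d(64, 64, 3, padding=1),
                                    nn.LeakyReLU(0.2, inplace=True))  # layer 312
        self.group2 = nn.Sequential(up_block(64, 32),           # layer 321
                                    nn.Conv2d(32, 32, 3, padding=1),
                                    nn.LeakyReLU(0.2, inplace=True))  # layer 322
        self.group3 = nn.Conv2d(32, 3, 3, padding=1)            # plain convolution

    def forward(self, fused_feat):
        return self.group3(self.group2(self.group1(fused_feat)))
```

Two upsampling groups restore the factor-of-four downsampling introduced by the strided layers of the encoder, so the output matches the input frame resolution.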
With reference to fig. 4, an embodiment of the present application further provides an image processing apparatus 100, which can be applied to the electronic device 10. The image processing apparatus 100 may include an information acquisition module 110, a smearing processing module 120, and a filling processing module 130, among others.
The information obtaining module 110 is configured to obtain, for a current video frame in a video to be processed, location information of a watermark region existing in the current video frame. In an alternative example, the information obtaining module 110 may be configured to perform step S110 shown in fig. 2, and reference may be made to the foregoing detailed description of step S110 regarding the relevant content of the information obtaining module 110.
The smearing processing module 120 is configured to smear the watermark region according to the location information to form a smeared region. In an alternative example, the smearing processing module 120 may be configured to execute step S120 shown in fig. 2, and reference may be made to the foregoing detailed description of step S120 for relevant contents of the smearing processing module 120.
The filling processing module 130 is configured to perform filling processing on the smearing region according to the first N frames of video frames in the video to be processed. In an alternative example, the filling processing module 130 may be configured to execute step S130 shown in fig. 2, and reference may be made to the foregoing detailed description of step S130 for relevant contents of the filling processing module 130.
Further, the filling processing module 130 may include a feature information obtaining sub-module, a feature information calculating sub-module, and a filling processing sub-module.
The feature information acquisition submodule is used for respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information. In an alternative example, the characteristic information obtaining sub-module may be configured to perform step S131 shown in fig. 3, and reference may be made to the foregoing detailed description of step S131 for relevant content of the characteristic information obtaining sub-module.
And the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing. In an alternative example, the characteristic information calculating submodule may be configured to perform step S132 shown in fig. 3, and the detailed description of step S132 may be referred to for relevant contents of the characteristic information calculating submodule.
And the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing. In an alternative example, the filling processing sub-module may be configured to perform step S133 shown in fig. 3, and reference may be made to the foregoing detailed description of step S133 for relevant contents of the filling processing sub-module.
Further, the feature information obtaining sub-module may be specifically configured to: and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
In summary, according to the image processing method and apparatus and the electronic device 10 provided in the embodiments of the present application, the previous N video frames in the video to be processed are used to fill the smearing region formed by smearing the watermark region existing in the current video frame, so as to obtain an image of the current video frame with the watermark removed. This solves the problem in the prior art that the continuity of the filling content between video frames is insufficient because each video frame independently interpolates from the pixels at the edge of its watermark region to obtain the filling content.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (8)

1. An image processing method, comprising:
for a current video frame in a video to be processed, acquiring position information of a watermark region existing in the current video frame;
smearing the watermark region according to the position information to form a smearing region;
filling the smearing region according to the first N frames of video frames in the video to be processed;
the step of filling the smearing region according to the first N frames of video frames in the video to be processed comprises the following steps:
respectively acquiring feature information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing, and calculating optical flow between the current video frame and the previous N frames of video frames according to the feature information;
obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
filling the smearing region according to the characteristic information of the first N frames of video frames after filling;
the step of filling the smearing region according to the characteristic information of the first N frames of video frames after filling comprises the following steps:
and calculating to obtain the target characteristic information of the current video frame according to a preset calculation formula, wherein the calculation formula comprises:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information of the watermark region existing in the current video frame;
and processing the target characteristic information after the current video frame is filled through a preset second neural network model to obtain the current video frame after the filling.
2. The image processing method according to claim 1, wherein the step of respectively obtaining the feature information of the current video frame and the first N frames of video frames in the video to be processed before performing the padding process comprises:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
3. The image processing method according to claim 1, wherein the step of performing the filling process on the smear region according to the feature information of the previous N frames of video frames after performing the filling process comprises:
and calculating to obtain the characteristic information of the current video frame after the filling processing according to the characteristic information of the current video frame and the characteristic information of the previous N frames after the filling processing.
4. The image processing method according to claim 3, wherein the step of calculating the feature information of the current video frame after the padding processing according to the feature information of the current video frame and the feature information of the previous N frames of video frames after the padding processing comprises:
acquiring mask information of a watermark area existing in the current video frame;
and calculating according to the mask information, the characteristic information of the current video frame and the characteristic information of the previous N frames of video frames after filling processing according to a preset calculation formula to obtain target characteristic information of the current video frame.
5. The image processing method according to claim 4, wherein the calculation formula includes:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information.
6. An image processing apparatus characterized by comprising:
the information acquisition module is used for acquiring the position information of a watermark area existing in a current video frame in a video to be processed;
the smearing processing module is used for smearing the watermark area according to the position information to form a smearing area;
the filling processing module is used for filling the smearing region according to the first N frames of video frames in the video to be processed;
the filling processing module includes:
the characteristic information acquisition submodule is used for respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing is carried out, and calculating optical flow between the current video frame and the previous N frames of video frames according to the characteristic information;
the characteristic information calculation submodule is used for obtaining the characteristic information of the previous N frames of video frames after the filling processing according to the optical flow and the characteristic information of the previous N frames of video frames before the filling processing;
the filling processing submodule is used for filling the smearing area according to the characteristic information of the previous N frames of video frames after filling processing;
the filling processing submodule is used for filling the smearing region according to the characteristic information of the previous N frames of video frames after filling processing, and comprises the following steps:
and calculating to obtain the target characteristic information of the current video frame according to a preset calculation formula, wherein the calculation formula comprises:
F′_t = (1 - m) * F′_{t-1} + m * F_t
wherein F′_t represents the target characteristic information of the current video frame, F′_{t-1} represents the characteristic information of the previous N video frames after the filling processing, F_t represents the characteristic information of the current video frame, and m represents the mask information of the watermark region existing in the current video frame;
and processing the target characteristic information after the current video frame is filled through a preset second neural network model to obtain the current video frame after the filling.
7. The image processing apparatus according to claim 6, wherein the feature information acquisition sub-module is specifically configured to:
and respectively acquiring characteristic information of the current video frame and the previous N frames of video frames in the video to be processed before filling processing through a preset first neural network model, wherein the first neural network model sequentially comprises a first convolution layer group, a second convolution layer group, a third convolution layer group and a fourth convolution layer group according to the direction from input to output.
8. An electronic device comprising a memory and a processor for executing an executable computer program stored in the memory to implement the image processing method of any one of claims 1 to 5.
Application CN201910995121.7A, filed 2019-10-18 (priority date 2019-10-18): Image processing method and device and electronic equipment. Granted as CN110636373B; legal status Active.

Priority Applications (1)

CN201910995121.7A, priority and filing date 2019-10-18: Image processing method and device and electronic equipment

Publications (2)

CN110636373A, published 2019-12-31
CN110636373B, granted 2022-02-01

Family

Family ID: 68976735

Family Applications (1)

CN201910995121.7A (granted as CN110636373B): Image processing method and device and electronic equipment. Status: Active.

Country Status (1)

CN: CN110636373B

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055B (en) * 2020-10-15 2021-09-10 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7548659B2 (en) * 2005-05-13 2009-06-16 Microsoft Corporation Video enhancement
US20170024843A1 (en) * 2015-07-24 2017-01-26 Le Holdings (Beijing) Co., Ltd. Method and device for removing video watermarks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108470326A (en) * 2018-03-27 2018-08-31 北京小米移动软件有限公司 Image completion method and device
CN108985192A (en) * 2018-06-29 2018-12-11 东南大学 A kind of video smoke recognition methods based on multitask depth convolutional neural networks
CN109214999A (en) * 2018-09-21 2019-01-15 传线网络科技(上海)有限公司 A kind of removing method and device of video caption
CN109472260A (en) * 2018-10-31 2019-03-15 成都索贝数码科技股份有限公司 A method of logo and subtitle in the removal image based on deep neural network
CN109615593A (en) * 2018-11-29 2019-04-12 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110278439A (en) * 2019-06-28 2019-09-24 北京云摄美网络科技有限公司 De-watermarked algorithm based on inter-prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"[Paper series] Optical flow / LK optical flow / FlowNet / FlowNet2"; Min220; CSDN, https://blog.csdn.net/jucilan2220/article/details/84331626; 2018-11-21; full text *

Also Published As

CN110636373A, published 2019-12-31

Similar Documents

Publication Publication Date Title
CN108833785B (en) Fusion method and device of multi-view images, computer equipment and storage medium
US9111389B2 (en) Image generation apparatus and image generation method
US9076234B2 (en) Super-resolution method and apparatus for video image
US9959600B2 (en) Motion image compensation method and device, display device
EP2209087B1 (en) Apparatus and method of obtaining high-resolution image
US11151704B2 (en) Apparatus and methods for artifact detection and removal using frame interpolation techniques
EP3076364A1 (en) Image filtering based on image gradients
EP1843294A1 (en) Motion vector calculation method, hand-movement correction device using the method, imaging device, and motion picture generation device
KR20200044108A (en) Method and apparatus for estimating monocular image depth, device, program and storage medium
CN109992226A (en) Image display method and device and spliced display screen
JP2021507388A (en) Instance segmentation methods and devices, electronics, programs and media
Parsania et al. A review: Image interpolation techniques for image scaling
US20120280996A1 (en) Method and system for rendering three dimensional views of a scene
US9197891B1 (en) Systems and methods for periodic structure handling for motion compensation
EP4322109A1 (en) Green screen matting method and apparatus, and electronic device
US20150206289A1 (en) Joint Video Deblurring and Stabilization
CN111882578A (en) Foreground image acquisition method, foreground image acquisition device and electronic equipment
CN112308797A (en) Corner detection method and device, electronic equipment and readable storage medium
KR101834512B1 (en) Super-resolution image restoration apparatus and method based on consecutive-frame
CN110636373B (en) Image processing method and device and electronic equipment
CN111598088A (en) Target detection method and device, computer equipment and readable storage medium
US20120019677A1 (en) Image stabilization in a digital camera
CN114049488A (en) Multi-dimensional information fusion remote weak and small target detection method and terminal
EP3070670A2 (en) Using frequency decomposition for better color consistency in a synthesized region
CN112954454B (en) Video frame generation method and device

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant