CN113643173A - Watermark removing method, watermark removing device, terminal equipment and readable storage medium

Publication number: CN113643173A
Authority: CN (China)
Prior art keywords: watermark, file, target area, neural network
Legal status: Pending
Application number: CN202110955089.7A
Other languages: Chinese (zh)
Inventor: 李�浩
Current Assignee: Guangdong Genius Technology Co Ltd
Original Assignee: Guangdong Imoo Electronic Technology Co Ltd
Application filed by Guangdong Imoo Electronic Technology Co Ltd
Priority: CN202110955089.7A (2021-08-19); PCT/CN2021/119725
Publication: CN113643173A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00: General purpose image data processing
    • G06T 1/0021: Image watermarking
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods


Abstract

The present application relates to the field of image processing technologies, and in particular to a watermark removing method, apparatus, terminal device, and readable storage medium. The method comprises the following steps: acquiring a file containing a watermark; determining the position information of the watermark in the file; acquiring a target area in the file according to the position information, the target area being the area of the file that contains the watermark; and performing watermark removal processing on the target area to obtain the watermark-removed file. That is, the application determines the position information of the watermark in the file, acquires the target area according to that position information, and performs watermark removal only on the target area rather than on the whole file, which increases the speed of watermark removal and reduces the amount of computation it requires.

Description

Watermark removing method, watermark removing device, terminal equipment and readable storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to a watermark removing method, an apparatus, a terminal device, and a readable storage medium.
Background
Methods for removing watermarks from books currently on the market mainly include: removing the watermark manually with post-processing image-restoration software, purchasing paid software that removes specific watermarks in batches, or removing the watermark with a conventional image-processing scheme.
However, when an existing book watermark removal method removes a watermark, it must process every pixel of the whole watermarked image, so watermark removal is slow and computationally expensive.
Disclosure of Invention
The watermark removing method, watermark removing device, terminal equipment, and readable storage medium provided by the embodiments of the present application can increase the speed of watermark removal and reduce the amount of computation it requires.
In a first aspect, an embodiment of the present application provides a watermark removing method, where the method includes:
acquiring a file containing a watermark;
determining the position information of the watermark in the file;
acquiring a target area in the file according to the position information, wherein the target area is an area containing the watermark in the file;
and carrying out watermark removal processing on the target area to obtain the watermark-removed file.
In a possible implementation manner of the first aspect, the determining location information of the watermark in the file includes:
and inputting the file into a first neural network model for processing to obtain the position information of the watermark output by the first neural network model in the file.
Wherein the first neural network model is a YOLO v3 model.
Wherein, the obtaining the target area in the file according to the position information includes:
and cutting the file according to the position information to obtain the target area.
Wherein the performing watermark removal processing on the target area to obtain the watermark-removed file comprises:
inputting the target area into a second neural network model for processing to obtain a first area output by the second neural network model, wherein the first area is an area of the target area after the watermark is removed;
and fusing the first area and a second area in the file to obtain the watermark-removed file, wherein the second area is an area except the target area in the file.
Wherein the processing the target region by the second neural network model comprises:
acquiring a third area corresponding to the target area, wherein the third area is a background area corresponding to the target area and does not contain the watermark;
determining location information of the watermark in the target area;
determining a fourth area corresponding to the watermark in the third area according to the position information of the watermark in the target area;
replacing the watermark in the target area with the fourth area.
Wherein the determining the position information of the watermark in the target area comprises:
performing mask processing on the target area to obtain mask information of the target area, and determining position information of the watermark in the target area according to the mask information.
In a second aspect, an embodiment of the present application provides a watermark removing apparatus, where the apparatus includes:
the first acquisition module is used for acquiring a file containing a watermark;
a determining module, configured to determine location information of the watermark in the file;
a second obtaining module, configured to obtain a target area in the file according to the location information, where the target area is an area that includes the watermark in the file;
and the processing module is used for carrying out watermark removal processing on the target area to obtain the watermark-removed file.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the watermark removing method according to the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the watermark removing method according to the first aspect.
Compared with the prior art, the embodiments of the present application have the following advantages: a file containing a watermark is acquired; the position information of the watermark in the file is determined; a target area, i.e., the area of the file containing the watermark, is acquired according to the position information; and watermark removal processing is performed on the target area to obtain the watermark-removed file. That is, the application determines the position information of the watermark in the file, acquires the target area according to that position information, and performs watermark removal only on the target area rather than on the whole file, which increases the speed of watermark removal and reduces the amount of computation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a network architecture of a watermark removal method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a watermark removing method according to an embodiment of the present application;
FIG. 3a is an exemplary diagram of a network structure of the YOLO v3 model provided in an embodiment of the present application;
FIG. 3b is a schematic flowchart illustrating a method for training a target detection model according to an embodiment of the present application;
FIG. 3c is a diagram illustrating an example of position information of a rectangular frame according to an embodiment of the present application;
fig. 4 is a flowchart illustrating a method for performing watermark removal processing on a target area according to an embodiment of the present application;
FIG. 5a is an exemplary diagram of a network structure of a second neural network model provided by an embodiment of the present application;
FIG. 5b is a diagram illustrating an example network structure of an encoder provided in one embodiment of the present application;
fig. 5c is a diagram of an example of a network structure of a decoder according to an embodiment of the present application;
FIG. 6 is a schematic flow chart illustrating a training method for a second neural network model according to an embodiment of the present application;
fig. 7 is an exemplary diagram of obtaining a first region after removing a watermark according to an embodiment of the present application;
FIG. 8 is a flowchart illustrating a method for obtaining a first watermarked region by applying a second neural network model according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a watermark removing apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation rather than limitation, specific details such as particular system structures and techniques are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In some instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail. Specific technical details may also be referenced across embodiments: a specific system not described in one embodiment may be described in another.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Reference throughout this specification to "one embodiment of the present application" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in other embodiments," "an embodiment of the present application," "other embodiments of the present application," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather mean "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
When an existing book watermark removal method removes a watermark, it must process every pixel of the whole watermarked image, so watermark removal is slow and computationally expensive.
To overcome these drawbacks, the inventive concept of the present application is as follows:
determine the position information of the watermark in the file and obtain the target area of the file according to that position information; the whole file then does not need to be processed, and watermark removal only needs to be performed on the target area to obtain the watermark-removed file, which increases the speed of watermark removal and reduces the amount of computation.
In order to explain the technical means of the present application, the following description will be given by way of specific examples.
Referring to fig. 1, fig. 1 is a schematic diagram of a network architecture of a watermark removing method according to an embodiment of the present application. For convenience of explanation, only portions related to the present application are shown. The network architecture includes: a terminal device 100 and a server 200.
In the network architecture, the terminal device 100 may include, but is not limited to, a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), and the like. The terminal device 100 may be used to deploy a first neural network model and a second neural network model.
In the network architecture, the server 200 is essentially an electronic device with computing capability; it is deployed in the cloud and can likewise be used to deploy the first neural network model and the second neural network model. The server 200 mainly provides services to the terminal device 100.
The terminal device 100 accesses the internet through a known network access method and establishes a data communication link with the cloud server 200, so as to perform operations such as training the first neural network model and the second neural network model and removing the watermark from a file containing a watermark region.
Referring to fig. 2, fig. 2 is a flowchart illustrating a watermark removing method according to an embodiment of the present application. As an implementation, the main execution body of the method in fig. 2 may be the terminal device 100 in fig. 1, and as another implementation, the main execution body of the method in fig. 2 may also be the server 200 in fig. 1, as shown in fig. 2, the method includes: s201 to S204.
S201, obtaining a file containing the watermark.
Specifically, the file containing the watermark in the embodiment of the present application may be an image containing a watermark, a Portable Document Format (PDF) document containing a watermark, a web page containing a watermark, and so on; the embodiment of the present application does not limit the type of the file containing the watermark.
In the embodiment of the application, when the watermark removing operation needs to be performed on a group of files containing watermarks, the group of files containing the watermarks is input into the terminal device or the server, and the terminal device or the server can obtain the group of files containing the watermarks.
In other embodiments of the present application, when watermark removal operation needs to be performed on multiple groups of files containing watermarks, the multiple groups of files containing watermarks are input to a terminal device or a server, and the terminal device or the server can obtain the multiple groups of files containing watermarks. That is, the terminal device or the server in the embodiment of the present application can perform batch processing on multiple groups of files containing watermarks, thereby increasing the speed of removing the watermarks.
S202, determining the position information of the watermark in the file.
Specifically, in the embodiment of the present application, after a file containing a watermark is acquired, the position information of the watermark in the file is determined through the first neural network model.
In this embodiment of the application, the first neural network model is a target detection model, for example Faster R-CNN (a faster region-based convolutional neural network), SSD (a single-shot deep neural network detector), or YOLO (You Only Look Once, a real-time object detection model). The YOLO family includes the YOLO v1, YOLO v2, and YOLO v3 models. The embodiments of the present application are illustrated with the YOLO v3 model. Referring to fig. 3a, fig. 3a is an exemplary diagram of the network structure of the YOLO v3 model according to an embodiment of the present application.
The backbone network of the YOLO v3 model is the Darknet-53 network, which comprises 52 convolutional layers, 1 average pooling layer, 1 fully connected layer, and 1 activation function layer (softmax).
The 52 convolutional layers consist of: 1 convolutional layer with 32 filters, 5 down-sampling layers, and 5 groups of repeated residual units (resblock_body). The 5 residual units use the skip connections of a Residual Network (ResNet); each unit contains 1 standalone convolutional layer plus a group of repeatedly executed convolutional layers, repeated 1, 2, 8, 8, and 4 times respectively, and in each repeated execution a 1x1 convolution is performed first, followed by a 3x3 convolution.
The convolutional layer count is therefore:
52 = 1 + 5 + (1 × 2) + (2 × 2) + (8 × 2) + (8 × 2) + (4 × 2).
The role of the Darknet-53 network in the YOLO v3 model is to extract features of different sizes from the input file. Illustratively, the input file is an image file of size 416 × 416 × 3, where 416 × 416 is the resolution of the image and 3 is the number of channels. The image file is processed by the 5 down-sampling layers and the 5 groups of repeated residual units: 2× down-sampling (2^1) yields a feature image of size 208 × 208 × 64, 4× down-sampling (2^2) yields a 104 × 104 × 128 feature image, 8× down-sampling (2^3) yields a 52 × 52 × 256 feature image, 16× down-sampling (2^4) yields a 26 × 26 × 512 feature image, and 32× down-sampling (2^5) yields a 13 × 13 × 1024 feature image.
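As an illustration of the scale pyramid described above, the following minimal Python sketch computes these feature-image shapes; the helper name and defaults are assumptions for illustration, not part of the patent:

    def darknet53_feature_shapes(height=416, width=416,
                                 channels=(64, 128, 256, 512, 1024)):
        # Each down-sampling stage halves the spatial size and sets a new depth.
        shapes = []
        for level, depth in enumerate(channels, start=1):
            factor = 2 ** level  # 2, 4, 8, 16, 32
            shapes.append((height // factor, width // factor, depth))
        return shapes

    # [(208, 208, 64), (104, 104, 128), (52, 52, 256), (26, 26, 512), (13, 13, 1024)]
    print(darknet53_feature_shapes())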
The network structure of the YOLO v3 model further comprises 3 prediction layers, which are connected to the last 3 residual units of the Darknet-53 network through several convolutional layers, up-sampling layers, and tensor concatenation layers.
The reason for providing 3 prediction layers in the YOLO v3 model is to detect features of the input file at multiple sizes. For example, the input file is detected 3 times by the 3 prediction layers, at 32× down-sampling, 16× down-sampling, and 8× down-sampling respectively, so that features of different sizes are detected and the detection results are output.
The reason for providing the up-sampling layers in the YOLO v3 model is to enlarge the features obtained by down-sampling so that they are more expressive. Illustratively, a 13 × 13 feature obtained after down-sampling becomes 26 × 26 after enlargement by the up-sampling layer.
In the embodiment of the application, the tensor concatenation layer is provided in the YOLO v3 model to concatenate the feature image output by the Darknet-53 network with the feature image obtained through up-sampling.
Referring to fig. 3b, fig. 3b is a schematic flowchart illustrating a method for training a target detection model according to an embodiment of the present disclosure. As shown in fig. 3b, the method comprises: S301 to S303.
S301, acquiring a plurality of groups of sample files.
Specifically, a sample file is a file containing a watermark. In some embodiments, a batch of files needing watermark removal may be collected in advance; for example, 1000 to 3000 (e.g., 1500) images needing watermark removal may be collected. The embodiment of the present application does not limit the number of files needing watermark removal.
Acquiring the multiple groups of sample files then means acquiring these 1000 to 3000 pre-collected images needing watermark removal.
S302, labeling the watermark in each sample file to obtain the target area information of each sample file.
Specifically, the target area is the area of a sample file that contains the watermark. After the multiple groups of sample files are obtained, each sample file needs to be labeled. For example, in the embodiment of the present application, the annotation tool Labelme may be used to label the watermarks in the 1500 images (for example, with a labeling box), and the labeling results are stored in json format, giving 1500 json files that contain the position information of the watermark labeling boxes. In the embodiment of the present application, the information contained in these 1500 json files is referred to as the target area information of the sample files.
In the embodiment of the present application, the form of the labeling box includes: polygonal, rectangular, circular, etc., and the labeled frame is exemplified as a rectangular frame in the embodiments of the present application.
In the embodiment of the present application, the position information of the rectangular frame is the position information of the watermark in the sample file, and the position information of the watermark in the sample file can be represented by coordinates of four vertices of the rectangular frame. For example, please refer to fig. 3c, fig. 3c is an exemplary diagram of position information of a rectangular frame according to an embodiment of the present application. In fig. 3c, w represents a rectangular box. H denotes a sample file. The coordinate system is established by taking the upper left corner of the image as the origin of coordinates, the image width direction as the positive direction of the x axis, and the image height direction as the positive direction of the y axis, and in the coordinate system, the coordinates of the four vertices are expressed as (x _ top _ left, y _ top _ left), (x _ top _ right, y _ top _ right), (x _ bottom _ left, y _ bottom _ left), and (x _ bottom _ right, y _ bottom _ right).
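As an illustration of reading such a labeling result, here is a minimal Python sketch; it assumes Labelme's standard json layout ("shapes" entries with a "points" list), and the helper name and file path are hypothetical:

    import json

    def load_watermark_box(json_path):
        # Read a Labelme annotation and return the bounding box of the first
        # labeled shape as (x_min, y_min, x_max, y_max).
        with open(json_path, encoding="utf-8") as f:
            annotation = json.load(f)
        points = annotation["shapes"][0]["points"]  # list of [x, y] vertices
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        return min(xs), min(ys), max(xs), max(ys)

    # Example: box = load_watermark_box("sample_0001.json")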
In the embodiment of the present application, the sample files acquired in S301 and the target area information acquired in S302 are used together as a data set; a certain proportion of the data is randomly selected from the data set as the training set of the first neural network model, and the remaining data is used as the validation set of the first neural network model. Illustratively, 70% to 90% (e.g., 80%) of the data is randomly selected as the training set, and the remaining 30% to 10% (e.g., 20%) is used as the validation set.
S303, inputting the training set and the validation set into the first neural network model for training, and saving the trained parameters.
Specifically, the sample files in the training set and their target area information are input into the YOLO v3 model. The Darknet-53 network in the YOLO v3 model generates feature images of different sizes from the sample files and their target area information. When detecting on a feature image, the feature image is divided into S × S grid cells (for example, a feature image of size 16 × 16 is divided into 16 × 16 grid cells); when a target area in the feature image falls into a grid cell, that grid cell detects the target area.
In the embodiment of the present application, 3 bounding boxes are set for each grid cell. When the YOLO v3 model detects a target area, each bounding box is compared with the pre-labeled rectangular frame of the target area to compute the intersection over union (IoU), and only the bounding box with the largest IoU is used to predict the target area.
The IoU is a standard for measuring the accuracy of detecting a corresponding object in a particular data set; it measures the overlap between the ground truth and the prediction, and the larger the IoU, the greater the overlap.
The IoU is calculated as:
IoU = Area(B_p ∩ B_gt) / Area(B_p ∪ B_gt)
where B_p is a predicted bounding box and B_gt is the labeled rectangular frame.
In the embodiment of the application, the IoU of each bounding box with the target area can be calculated using this formula.
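As an illustration, a minimal Python sketch of this IoU computation; the (x_min, y_min, x_max, y_max) box format is an assumption for illustration:

    def iou(box_a, box_b):
        # Intersection over union of two axis-aligned boxes
        # given as (x_min, y_min, x_max, y_max).
        ix_min = max(box_a[0], box_b[0])
        iy_min = max(box_a[1], box_b[1])
        ix_max = min(box_a[2], box_b[2])
        iy_max = min(box_a[3], box_b[3])
        inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0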
The YOLO v3 model in the embodiment of the present application detects on image features of multiple sizes. The predicted output has the two spatial dimensions of the extracted feature image (e.g., 13 × 13) plus one depth dimension of size B × (5 + C), where B is the number of bounding boxes predicted per grid cell, C is the number of classes, and 5 covers the 4 coordinates of a bounding box plus the confidence of a target region.
In the embodiment of the application, the sample files and the target area information in the training set and validation set are input into the model for training. When the number of training epochs reaches 100, or the accuracy on the validation set reaches a certain threshold (e.g., 90%), model training is considered complete. The optimal model weight parameters are saved and used to extract the position information of the watermark in a sample file.
In the embodiment of the application, the file is input into the first neural network model for processing, and the position information of the watermark output by the first neural network model in the file is obtained.
Specifically, a file to be de-watermarked is input into the trained YOLO v3 model for processing; the model identifies the target area containing the watermark in the file and outputs the information of the target area, which includes the position information of the watermark's labeling frame, i.e., the vertex coordinates of the labeling frame.
And S203, acquiring a target area in the file according to the position information.
Specifically, the file is cut according to the position information to obtain the target area.
In some embodiments, the target area of the file may be obtained by cropping the file according to the coordinates of the 4 vertices of the rectangular frame obtained in S202.
In some embodiments, after the file is clipped according to the position information and the target area is obtained, the mask processing is performed on the target area to obtain mask information of the target area.
In the mask information, the pixel value of the watermark region is not 0, and the pixel values of the other regions are 0.
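A minimal Python sketch of the cropping and masking steps using OpenCV and NumPy; the function names are hypothetical, and the thresholding rule assumes a watermark that is darker than the page background:

    import cv2
    import numpy as np

    def crop_target_area(image, box):
        # Crop the watermark-containing target area given (x_min, y_min, x_max, y_max).
        x_min, y_min, x_max, y_max = [int(v) for v in box]
        return image[y_min:y_max, x_min:x_max]

    def watermark_mask(target_area, threshold=225):
        # Rough mask: watermark pixels keep a nonzero value, other pixels become 0.
        gray = cv2.cvtColor(target_area, cv2.COLOR_BGR2GRAY)
        return np.where(gray < threshold, gray, 0).astype(np.uint8)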
And S204, carrying out watermark removal processing on the target area to obtain the watermark-removed file.
Specifically, please refer to fig. 4, where fig. 4 is a flowchart illustrating a method for performing watermark removal processing on a target area according to an embodiment of the present application. As an implementation, the main body of execution of the method in fig. 4 may be the terminal device 100 in fig. 1, and as another implementation, the main body of execution of the method in fig. 4 may also be the server 200 in fig. 1. As shown in fig. 4, the method includes: s401 to S402.
S401, inputting the target region into a second neural network model for processing to obtain a first region output by the second neural network model.
Specifically, in this embodiment of the application, the first area is an area where the watermark is removed from the target area.
In the embodiment of the present application, the second neural network model is a neural network model having a codec structure. Referring to fig. 5a, fig. 5a is a diagram illustrating a network structure of a second neural network model according to an embodiment of the present application.
The second neural network model comprises a contraction path composed of several encoders, an expansion path composed of several decoders, and a replacement module. In the embodiment of the present application there are 6 encoders, denoted A, B, C, D, E, and F for convenience of description, and 5 decoders, denoted a, b, c, d, and e; the replacement module is denoted G.
The contraction path mainly performs down-sampling through its successive encoders to progressively extract features, and the expansion path mainly performs up-sampling through its successive decoders to progressively restore feature images of higher and higher resolution. Context information is lost during the progressive down-sampling, so, to compensate for the lost features and ensure the quality of the restored image, each decoder takes as input the concatenation of the up-sampled features from its previous stage with the features of its same-level encoder. The picture restored by the last decoder can then be brought closer to the colors of the original image by a post-processing step.
In the embodiment of the present application, please refer to fig. 5b for a network structure of each encoder, and fig. 5b is an exemplary diagram of a network structure of an encoder according to an embodiment of the present application.
Each encoder contains a convolutional layer, an activation function layer (ReLU), a batch normalization layer (BatchNorm), and a max pooling layer. In each encoder, the number of channels of the feature image doubles relative to the input, increasing layer by layer, while the spatial size halves, decreasing layer by layer.
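A minimal PyTorch sketch of one such encoder stage, matching the layer list above; the class name and channel arguments are assumptions:

    import torch.nn as nn

    class EncoderBlock(nn.Module):
        # Conv -> ReLU -> BatchNorm -> MaxPool: doubles channels, halves spatial size.
        def __init__(self, in_channels, out_channels):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(out_channels),
                nn.MaxPool2d(kernel_size=2),  # halves the spatial size
            )

        def forward(self, x):
            return self.block(x)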
In the embodiment of the present application, please refer to fig. 5c for a network structure of each decoder, and fig. 5c is an exemplary diagram of a network structure of a decoder according to an embodiment of the present application.
Each decoder comprises a tensor concatenation layer, a transposed convolutional layer, a convolutional layer, an activation function layer (ReLU), and a batch normalization layer (BatchNorm). The tensor concatenation layer connects the decoder to its same-level encoder, and the transposed convolutional layer enlarges the feature image. In each decoder, the number of channels of the feature image halves, decreasing layer by layer, while the spatial size doubles, increasing layer by layer, becoming 2 times the spatial size of the input feature image.
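A matching decoder-stage sketch; it up-samples first and then concatenates with the same-level encoder's features, which is one common arrangement (the ordering, class name, and channel arguments are assumptions):

    import torch
    import torch.nn as nn

    class DecoderBlock(nn.Module):
        # Transposed conv -> concat skip features -> Conv -> ReLU -> BatchNorm:
        # halves channels, doubles spatial size.
        def __init__(self, in_channels, skip_channels, out_channels):
            super().__init__()
            self.up = nn.ConvTranspose2d(in_channels, out_channels,
                                         kernel_size=2, stride=2)  # doubles spatial size
            self.refine = nn.Sequential(
                nn.Conv2d(out_channels + skip_channels, out_channels,
                          kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(out_channels),
            )

        def forward(self, x, skip):
            x = self.up(x)
            x = torch.cat([x, skip], dim=1)  # tensor concatenation with the peer encoder
            return self.refine(x)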
For the second neural network model, residual connections are present in the network structure of each encoder and decoder; as shown in fig. 5b and 5c, repeat x3 means the residual connection is repeated 3 times. This has the advantage of enlarging the receptive field while also helping to improve the quality of the restored image.
In the embodiment of the present application, the replacement module G is connected to the decoder and comprises a 1 × 1 convolutional layer and an activation function layer (sigmoid). The replacement module processes the feature images output by the encoder and the decoder.
Referring to fig. 6, fig. 6 is a schematic flowchart illustrating a training method of a second neural network model according to an embodiment of the present application. As shown in fig. 6, the method includes: s601 to S604.
And S601, obtaining a training sample.
Specifically, when the watermark in a file is removed, the whole file does not need to be processed; to increase the speed of watermark removal, the position information of the watermark in the file is determined from the output of the first neural network model, the target area is acquired according to that position information, and only the target area is given watermark removal processing in the second neural network model. Therefore, in the embodiment of the present application, the multiple groups of sample files from S301 may be processed by the first neural network model, and the resulting target regions used as the training samples of the second neural network model.
S602, inputting the training samples into the encoders in the contraction path of the second neural network model, which progressively down-sample them to extract feature images of the training samples at multiple sizes.
For example, referring to fig. 5a, each target region in the training sample enters the contraction path as a feature of size 128 × 128 × 3 (width × height × channels). Encoder A down-samples it to extract features, converting it into a 64 × 64 × 32 feature image that is passed to encoder B; the subsequent encoders likewise output 32 × 32 × 64, 16 × 16 × 128, 8 × 8 × 256, and finally 4 × 4 × 512 feature images in turn, and the multi-size feature images obtained along the way are passed to the expansion path.
S603, inputting the multi-size feature images into the decoders in the expansion path of the second neural network model, which progressively up-sample and restore them into feature images of higher and higher resolution.
Specifically, each decoder performs restoration with reference to the feature image obtained by its corresponding encoder.
Referring to fig. 5a, except for the middle encoder F, the encoders and decoders on the two sides of the second neural network model form a symmetric structure; each decoder receives two inputs, one being the up-sampled image features of the previous decoder stage and the other the image features of the encoder at the symmetric level, concatenated together.
In the embodiment of the present application, the 4 × 4 × 512 feature image is restored to an 8 × 8 × 256 feature image by decoder a, then to 16 × 16 × 128 by decoder b, to 32 × 32 × 64 by decoder c, to 64 × 64 × 32 by decoder d, and finally to 128 × 128 × 3 by decoder e, the same size as the feature image at the encoder input.
S604, processing the output of the decoder and the output of the last stage of the encoder to obtain the first area after the watermark is removed.
Specifically, in the embodiment of the present application, the feature image output by the decoder and the feature image output by the encoder are both passed to the replacement module G. In the replacement module, these feature images are processed with the sigmoid function and regularization, following the structure of the second neural network model, so as to obtain the first region from which the watermark has been removed.
Referring to fig. 7, fig. 7 is an exemplary diagram of obtaining the first region after removing the watermark according to an embodiment of the present application. In fig. 7, the target region C_r is processed by the decoder, and the output feature maps include a third region corresponding to the target region (the reconstructed background, denoted C_bg below), a feature image of the target region (its mask, denoted M below), and the like.
In some embodiments, the third area is a background area corresponding to the target area, and the third area does not include a watermark. In some embodiments, the feature image of the target region is a mask image in which the pixel values of the watermark region are not 0 and the pixel values of the other regions are 0.
In the embodiment of the present application, the position information of the watermark in the target area is determined in the replacement module G.
Specifically, mask processing is performed on the target area to obtain the mask information of the target area, and the position information of the watermark in the target area is determined according to the mask information.
In some embodiments, a mask image is obtained by performing mask processing on the target area, and the position information of the watermark in the target area can be determined according to the mask of the watermark area in the mask image.
In the embodiment of the application, a fourth region corresponding to the watermark in the third region is determined according to the position information of the watermark in the target region.
Specifically, exclusive-or processing is performed between the third region and the watermark mask of the target region, which determines the fourth region C_w corresponding to the watermark within the third region. The specific formula is:
C_w = C_bg ⊙ M
where ⊙ denotes element-wise multiplication with the binary watermark mask M.
in the embodiment of the present application, the fourth area is used to replace the watermark in the target area.
Specifically, the watermark in the target region can be replaced with the fourth region using the following formula, which yields the first region with the watermark removed:
C_first = C_r ⊙ (1 − M) + C_w
where C_first denotes the first region.
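A minimal NumPy sketch of this mask-based replacement; the array names follow the notation above and are assumptions:

    import numpy as np

    def replace_watermark(target_area, background, mask):
        # Replace watermark pixels of the target area with the reconstructed
        # background. mask: 1.0 where the watermark is, 0.0 elsewhere.
        mask3 = mask[..., None].astype(np.float32)  # broadcast over color channels
        fourth_region = background * mask3          # C_w = C_bg * M
        first_region = target_area * (1.0 - mask3) + fourth_region
        return first_region.astype(target_area.dtype)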
In the embodiment of the application, when the number of training iterations of the second neural network model reaches a certain number (e.g., 100), training of the neural network model is complete, and the saved model weight parameters are used to obtain the first region after watermark removal.
In some embodiments, in order to supervise the training of the second neural network model, the third region is used as the target value and the first region after watermark removal as the predicted value, and a total loss function formed by linearly superimposing several loss functions supervises the sample training process, so that the predicted value approaches the target value as closely as possible as the number of iterations grows.
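The patent does not name the individual loss terms, so the following PyTorch sketch uses L1 and MSE terms purely as placeholders for the linearly superimposed total loss; the weights are assumptions:

    import torch.nn.functional as F

    def total_loss(predicted, target, weights=(1.0, 0.5)):
        # Linear superposition of several loss terms between the predicted
        # first region and the target (third region).
        l1 = F.l1_loss(predicted, target)
        mse = F.mse_loss(predicted, target)
        return weights[0] * l1 + weights[1] * mse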
Referring to fig. 8, fig. 8 is a flowchart illustrating a method for obtaining a first region after removing a watermark by applying a second neural network model according to an embodiment of the present application. As an implementation, the main body of execution of the method in fig. 8 may be the terminal device 100 in fig. 1, and as another implementation, the main body of execution of the method in fig. 8 may also be the server 200 in fig. 1. As shown in fig. 8, the method includes: s801 to S804.
And S801, acquiring a third area corresponding to the target area.
In the embodiment of the application, the target region is input into the trained second neural network model, whose parameters have already been optimized during training; the feature image output by the decoder is the third region corresponding to the target region.
And S802, determining the position information of the watermark in the target area.
In the embodiment of the present application, the method for determining the location information of the watermark in the target area is the same as the method for determining the location information of the watermark in the target area in S604, and details are not repeated here.
And S803, determining a fourth area corresponding to the watermark in the third area according to the position information of the watermark in the target area.
In the embodiment of the present application, a method for determining the fourth region is the same as the method for determining the fourth region in S604, and details are not repeated here.
And S804, replacing the watermark in the target area by using the fourth area.
In the embodiment of the application, the fourth area is used for replacing the watermark in the target area so as to obtain the first area after the watermark is removed.
The method for replacing the watermark in the target area by the fourth area is the same as the method for replacing the watermark in the target area by the fourth area in S604, and is not described herein again.
S402, fusing the first area and the second area in the file to obtain the file with the watermark removed.
Specifically, the first area is an area of the target area from which the watermark is removed. The second area is an area of the file other than the target area.
In the embodiment of the application, the information of the target area includes position information of a marking frame of the watermark, that is, vertex coordinate information of the marking frame, and the first area and the second area are fused according to the vertex coordinate information of the marking frame to obtain the file with the watermark removed.
In the embodiments of the present application, the first region and the second region may be fused using weighted averaging, wavelet transform, fuzzy neural networks, pyramid decomposition, and the like; the embodiment of the present application does not limit the method used to fuse the first region and the second region.
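A minimal sketch of the simplest fusion, pasting the de-watermarked first region back into the file image at the labeled coordinates (the function name and box format are assumptions):

    def fuse_regions(file_image, first_region, box):
        # Write the de-watermarked first region back into a copy of the file
        # image. box: (x_min, y_min, x_max, y_max) from the labeling frame vertices.
        x_min, y_min, x_max, y_max = [int(v) for v in box]
        result = file_image.copy()
        result[y_min:y_max, x_min:x_max] = first_region
        return result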
In summary, in the embodiment of the present application, a file containing a watermark is acquired; the position information of the watermark in the file is determined using the first neural network model; the target area, i.e., the area of the file containing the watermark, is acquired according to the position information; and watermark removal processing is performed on the target area using the second neural network model to obtain the watermark-removed file. That is, the application determines the position information of the watermark in the file, acquires the target area according to that position information, and performs watermark removal only on the target area rather than on the whole file, which increases the speed of watermark removal.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a watermark removing apparatus according to an embodiment of the present application, and as an implementation, the apparatus may be applied to the terminal device 100 of fig. 1, and as another implementation, the apparatus may also be applied to the server 200 of fig. 1. The device includes:
a first obtaining module 91, configured to obtain a file containing a watermark.
A determining module 92 for determining the location information of the watermark in the document.
And a second obtaining module 93, configured to obtain a target area in the file according to the location information, where the target area is an area in the file that includes the watermark.
And the processing module 94 is configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
Wherein, the determining module 92 includes:
the first processing unit 921 is configured to input the file to the first neural network model for processing, so as to obtain location information of the watermark output by the first neural network model in the file.
The second obtaining module 93 includes:
and a clipping unit 931, configured to clip the file according to the location information to obtain the target area.
Wherein, the processing module 94 includes:
the second processing unit 941 is configured to input the target region to the second neural network model for processing, so as to obtain a first region output by the second neural network model, where the first region is a region of the target region after removing the watermark.
The fusing unit 942 is configured to fuse the first area with a second area in the file to obtain the watermark-removed file, where the second area is the area of the file other than the target area.
Wherein, the second processing unit 941 includes:
the first obtaining subunit 9411 is configured to obtain a third area corresponding to the target area, where the third area is a background area corresponding to the target area, and the third area does not include a watermark.
A second obtaining sub-unit 9412, configured to determine location information of the watermark in the target area.
A determining subunit 9413, configured to determine, according to the location information of the watermark in the target area, a fourth area corresponding to the watermark in the third area.
A replacing subunit 9414, configured to replace the watermark in the target area with the fourth area.
Wherein, the second obtaining subunit 9412 includes:
a mask processing subunit 9415, configured to perform mask processing on the target area to obtain mask information of the target area, and determine location information of the watermark in the target area according to the mask information.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
As shown in fig. 10, the present embodiment further provides a terminal device 20, which includes a memory 21, a processor 22, and a computer program 23 stored in the memory 21 and executable on the processor 22; when the processor 22 executes the computer program 23, the watermark removing method of the foregoing embodiments is implemented.
The Processor 22 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 21 may be an internal storage unit of the terminal device 20. The memory 21 may also be an external storage device of the terminal device 20, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the terminal device 20. Further, the memory 21 may include both an internal storage unit of the terminal device 20 and an external storage device. The memory 21 is used to store the computer program and other programs and data required by the terminal device 20. The memory 21 may also be used to temporarily store data that has been output or is to be output.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the watermark removing method according to the above embodiments is implemented.
The embodiment of the present application provides a computer program product, which when running on a terminal device, enables the terminal device to implement the watermark removing method of the above embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the method of the embodiments described above can be implemented by a computer program, which can be stored in a computer readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable storage medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer memory, read-only memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunication signals, and software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable storage media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and proprietary practices.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiments of the present application.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of watermark removal, the method comprising:
acquiring a file containing a watermark;
determining the position information of the watermark in the file;
acquiring a target area in the file according to the position information, wherein the target area is an area containing the watermark in the file;
and carrying out watermark removal processing on the target area to obtain the watermark-removed file.
2. The method of claim 1, wherein the determining the location information of the watermark in the file comprises:
and inputting the file into a first neural network model for processing to obtain the position information of the watermark output by the first neural network model in the file.
3. The method of claim 2, wherein the first neural network model is a YOLO v3 model.
4. The method of claim 1, wherein the obtaining the target area in the file according to the position information comprises:
cropping the file according to the position information to obtain the target area.
5. The method according to any one of claims 1 to 4, wherein the performing watermark removal processing on the target area to obtain the watermark-removed file comprises:
inputting the target area into a second neural network model for processing to obtain a first area output by the second neural network model, wherein the first area is the target area after the watermark has been removed;
and fusing the first area with a second area in the file to obtain the watermark-removed file, wherein the second area is the area of the file other than the target area.
6. The method of claim 5, wherein the second neural network model processes the target region, comprising:
acquiring a third area corresponding to the target area, wherein the third area is a background area corresponding to the target area and does not contain the watermark;
determining position information of the watermark in the target area;
determining a fourth area corresponding to the watermark in the third area according to the position information of the watermark in the target area;
replacing the watermark in the target area with the fourth area.
7. The method of claim 6, wherein the determining the position information of the watermark in the target area comprises:
performing mask processing on the target area to obtain mask information of the target area, and determining position information of the watermark in the target area according to the mask information.
8. An apparatus for removing a watermark, the apparatus comprising:
the first acquisition module is used for acquiring a file containing a watermark;
a determining module, configured to determine position information of the watermark in the file;
a second obtaining module, configured to obtain a target area in the file according to the position information, wherein the target area is an area containing the watermark in the file;
and the processing module is used for carrying out watermark removal processing on the target area to obtain the watermark-removed file.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the watermark removal method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the watermark removal method according to any one of claims 1 to 7.
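By way of illustration only, the Python sketch below shows one way the pipeline recited in claims 1 to 5 could be assembled: a detector (claim 3 names YOLO v3) supplies the watermark's bounding box, only that box is cropped and processed, and the restored patch is fused back into the untouched remainder of the file. The function names detect_watermark_box and remove_watermark_patch are hypothetical stand-ins for the first and second neural network models; nothing in the claims prescribes this particular code.

    import cv2  # OpenCV, used here only for image I/O
    import numpy as np

    def detect_watermark_box(image: np.ndarray) -> tuple:
        # Hypothetical stand-in for the first neural network model
        # (claim 3 suggests YOLO v3); returns (x, y, w, h) of the watermark.
        raise NotImplementedError("plug in a trained detector here")

    def remove_watermark_patch(patch: np.ndarray) -> np.ndarray:
        # Hypothetical stand-in for the second neural network model:
        # receives the cropped target area, returns it watermark-free.
        raise NotImplementedError("plug in a trained removal model here")

    def remove_watermark(path_in: str, path_out: str) -> None:
        image = cv2.imread(path_in)                 # acquire the file containing a watermark
        x, y, w, h = detect_watermark_box(image)    # determine the watermark's position information
        target = image[y:y + h, x:x + w]            # crop only the target area (claim 4)
        restored = remove_watermark_patch(target)   # process the target area, not the whole file
        result = image.copy()
        result[y:y + h, x:x + w] = restored         # fuse the first area with the second area (claim 5)
        cv2.imwrite(path_out, result)               # write the watermark-removed file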
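Claim 6 recites the replacement strategy inside the second neural network model: obtain a watermark-free background patch (the third area), locate the watermark inside the target area, and copy the corresponding background pixels (the fourth area) over the watermark. A minimal sketch of that replacement step, assuming the watermark mask has already been produced (see the mask-processing sketch after this one) and that both patches are the same size, follows; the function name and the boolean-mask representation are assumptions, not claim language.

    import numpy as np

    def replace_with_background(target: np.ndarray,
                                background: np.ndarray,
                                mask: np.ndarray) -> np.ndarray:
        # Overwrite watermark pixels of the target area with the pixels of a
        # watermark-free background patch of identical shape. `mask` is a
        # boolean (H, W) array that is True where the watermark lies.
        if target.shape != background.shape:
            raise ValueError("the third area must match the target area in size")
        result = target.copy()
        result[mask] = background[mask]  # the fourth area replaces the watermark
        return result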
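Claim 7 obtains the watermark's position inside the target area by mask processing, without specifying the masking operation itself. Purely as an assumption, one plausible stand-in is a per-pixel intensity difference against the clean background patch, which yields the boolean mask consumed by the replacement sketch above:

    import numpy as np

    def mask_from_difference(target: np.ndarray,
                             background: np.ndarray,
                             threshold: int = 25) -> np.ndarray:
        # Hypothetical mask processing: flag pixels whose colour differs
        # noticeably from the clean background as watermark pixels.
        diff = np.abs(target.astype(np.int16) - background.astype(np.int16))
        return diff.max(axis=-1) > threshold  # boolean (H, W) mask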
CN202110955089.7A 2021-08-19 2021-08-19 Watermark removing method, watermark removing device, terminal equipment and readable storage medium Pending CN113643173A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110955089.7A CN113643173A (en) 2021-08-19 2021-08-19 Watermark removing method, watermark removing device, terminal equipment and readable storage medium
PCT/CN2021/119725 WO2023019682A1 (en) 2021-08-19 2021-09-22 Watermark removal method and apparatus, terminal device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110955089.7A CN113643173A (en) 2021-08-19 2021-08-19 Watermark removing method, watermark removing device, terminal equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN113643173A (en) 2021-11-12

Family

ID=78422893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110955089.7A Pending CN113643173A (en) 2021-08-19 2021-08-19 Watermark removing method, watermark removing device, terminal equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN113643173A (en)
WO (1) WO2023019682A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495110A (en) * 2022-01-28 2022-05-13 北京百度网讯科技有限公司 Image processing method, generator training method, device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342363B (en) * 2023-05-31 2023-07-28 齐鲁工业大学(山东省科学院) Visible watermark removing method based on two-stage deep neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050025337A1 (en) * 2003-07-29 2005-02-03 Wei Lu Techniques and systems for embedding and detecting watermarks in digital data
CN111160335A (en) * 2020-01-02 2020-05-15 腾讯科技(深圳)有限公司 Image watermarking processing method and device based on artificial intelligence and electronic equipment
CN111626912A (en) * 2020-04-09 2020-09-04 智者四海(北京)技术有限公司 Watermark removing method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105898322A (en) * 2015-07-24 2016-08-24 乐视云计算有限公司 Video watermark removing method and device
US10169838B2 (en) * 2016-08-01 2019-01-01 International Business Machines Corporation Multiple source watermarking for surveillance
CN111062854B (en) * 2019-12-26 2023-08-25 Oppo广东移动通信有限公司 Method, device, terminal and storage medium for detecting watermark
CN111798360B (en) * 2020-06-30 2023-08-15 百度在线网络技术(北京)有限公司 Watermark detection method and device, electronic equipment and storage medium
CN111932431B (en) * 2020-07-07 2023-07-18 华中科技大学 Visible watermark removing method based on watermark decomposition model and electronic equipment
CN112419132B (en) * 2020-11-05 2024-06-18 广州海外扛把子网络科技有限公司 Video watermark detection method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2023019682A1 (en) 2023-02-23

Similar Documents

Publication Title
CN111080660B (en) Image segmentation method, device, terminal equipment and storage medium
CN111127468B (en) Road crack detection method and device
CN111914654B (en) Text layout analysis method, device, equipment and medium
CN110544214A (en) Image restoration method and device and electronic equipment
CN110942071A (en) License plate recognition method based on license plate classification and LSTM
CN113643173A (en) Watermark removing method, watermark removing device, terminal equipment and readable storage medium
CN109816659B (en) Image segmentation method, device and system
CN114429637B (en) Document classification method, device, equipment and storage medium
CN112070649A (en) Method and system for removing specific character string watermark
CN108876716B (en) Super-resolution reconstruction method and device
CN112308866A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113298728B (en) Video optimization method and device, terminal equipment and storage medium
CN111709879B (en) Image processing method, image processing device and terminal equipment
CN111932480A (en) Deblurred video recovery method and device, terminal equipment and storage medium
CN113239925A (en) Text detection model training method, text detection method, device and equipment
CN112700460A (en) Image segmentation method and system
CN114418937B (en) Pavement crack detection method and related equipment
CN115270184A (en) Video desensitization method, vehicle video desensitization method and vehicle-mounted processing system
CN116309612B (en) Semiconductor silicon wafer detection method, device and medium based on frequency decoupling supervision
Zheng et al. Joint residual pyramid for joint image super-resolution
CN115375715A (en) Target extraction method and device, electronic equipment and storage medium
CN115019321A (en) Text recognition method, text model training method, text recognition device, text model training equipment and storage medium
CN115187834A (en) Bill identification method and device
CN114596203A (en) Method and apparatus for generating images and for training image generation models
CN114266901A (en) Document contour extraction model construction method, device, equipment and readable storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20230829
Address after: 523860 No. 168 Dongmen Middle Road, Xiaobian Community, Chang'an Town, Dongguan City, Guangdong Province
Applicant after: Guangdong Xiaotiancai Technology Co.,Ltd.
Address before: 523000 east side of the 15th floor, 168 Dongmenzhong Road, Xiaobian Community, Chang'an Town, Dongguan City, Guangdong Province
Applicant before: GUANGDONG AIMENG ELECTRONIC TECHNOLOGY CO.,LTD.