WO2023019682A1 - Watermark removal method and apparatus, terminal device, and readable storage medium - Google Patents

Watermark removal method and apparatus, terminal device, and readable storage medium

Info

Publication number
WO2023019682A1
WO2023019682A1 PCT/CN2021/119725 CN2021119725W WO2023019682A1 WO 2023019682 A1 WO2023019682 A1 WO 2023019682A1 CN 2021119725 W CN2021119725 W CN 2021119725W WO 2023019682 A1 WO2023019682 A1 WO 2023019682A1
Authority
WO
WIPO (PCT)
Prior art keywords
watermark
file
area
target area
present application
Prior art date
Application number
PCT/CN2021/119725
Other languages
English (en)
French (fr)
Inventor
李浩
Original Assignee
广东艾檬电子科技有限公司
Priority date
Filing date
Publication date
Application filed by 广东艾檬电子科技有限公司
Publication of WO2023019682A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/0021 Image watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • the present application belongs to the technical field of image processing, and in particular relates to a watermark removal method, device, terminal equipment and readable storage medium.
  • the determining the position information of the watermark in the file includes:
  • the first neural network model is a YOLO v3 model.
  • said obtaining the target area in the file according to the location information includes:
  • performing the watermark removal processing on the target area to obtain a watermark-removed file includes:
  • the first area is fused with the second area in the file to obtain a watermark-removed file
  • the second area is an area in the file other than the target area.
  • the process of processing the target area with the second neural network model includes:
  • the third area is a background area corresponding to the target area, and the third area does not include the watermark;
  • the watermark in the target area is replaced with the fourth area.
  • the determining the position information of the watermark in the target area includes:
  • the embodiment of the present application provides a watermark removal device characterized in that the device includes:
  • the first obtaining module is used to obtain the file containing the watermark
  • a determining module configured to determine the position information of the watermark in the file
  • a second acquiring module configured to acquire a target area in the file according to the location information, where the target area is an area in the file that contains the watermark;
  • a processing module configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
  • an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the computer program, the watermark removal method as described in the first aspect is implemented.
  • an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the watermark removal method as described in the first aspect is implemented.
  • the present application obtains the file containing the watermark; determines the position information of the watermark in the file; obtains the target area in the file according to the position information, the target area being the area of the file that contains the watermark; and performs watermark removal processing on the target area to obtain the file after the watermark is removed.
  • that is, the present application determines the position information of the watermark in the file and obtains the target area in the file according to the position information; it does not need to process the entire file, and only needs to perform watermark removal processing on the target area to obtain the de-watermarked file, which increases the speed of watermark removal and reduces the computational cost of watermark removal.
  • FIG. 1 is a schematic diagram of a network architecture of a watermark removal method provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a watermark removal method provided by an embodiment of the present application
  • Fig. 3 a is an example diagram of the network structure of the YOLO v3 model provided by an embodiment of the present application
  • Fig. 3b is a schematic flowchart of a method for training a target detection model provided by an embodiment of the present application
  • Fig. 3c is an example diagram of the position information of a rectangular frame provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a method for removing watermarks from a target area provided by an embodiment of the present application
  • Fig. 5a is an example diagram of the network structure of the second neural network model provided by an embodiment of the present application.
  • Fig. 5b is an example diagram of a network structure of an encoder provided by an implementation of the present application.
  • Fig. 5c is an example diagram of a network structure of a decoder provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a training method for a second neural network model provided by an embodiment of the present application.
  • Fig. 7 is an example diagram of obtaining the first region after removing the watermark provided by an embodiment of the present application.
  • Fig. 8 is a schematic flowchart of a method for obtaining a first region after watermark removal by applying a second neural network model provided by an embodiment of the present application;
  • FIG. 9 is a schematic structural diagram of a watermark removal device provided in an embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • references to "an embodiment of this application” or “some embodiments” or the like described in the specification of the application mean that a specific feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the application.
  • the phrases “in other embodiments”, “an embodiment of the present application”, “other embodiments of the present application”, etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but rather means “one or more but not all embodiments” unless specifically emphasized otherwise.
  • the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.
  • FIG. 1 is a schematic diagram of a network architecture of a watermark removal method provided by an embodiment of the present application. For convenience of description, only the parts relevant to the present application are shown.
  • the network architecture includes: a terminal device 100 and a server 200.
  • the terminal device 100 may include, but not limited to, a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (personal digital assistant, PDA), etc.
  • the terminal device 100 can be used to deploy the first neural network model and the second neural network model.
  • the server 200 is essentially an electronic device with computing capability.
  • the server 200 is deployed in the cloud and can also be used to deploy the first neural network model and the second neural network model.
  • the server 200 mainly provides services for the terminal device 100 .
  • the terminal device 100 is connected to the Internet through a known network access method, and establishes a data communication link with the server 200 in the cloud, so as to start the training of the first neural network model and the second neural network model, perform watermark removal processing on files containing watermark areas, and so on.
  • FIG. 2 is a schematic flowchart of a watermark removal method provided by an embodiment of the present application.
  • the execution subject of the method in FIG. 2 may be the terminal device 100 in FIG. 1.
  • the execution subject of the method in FIG. 2 may also be the server 200 in FIG. 1, as shown in FIG. 2
  • the method includes: S201 to S204.
  • the file containing the watermark in the embodiment of the present application may be an image containing the watermark, a Portable Document Format (PDF) file containing the watermark, a web page containing the watermark, etc.; the embodiment of the present application does not limit the type of the file containing the watermark.
  • the location information of the watermark in the file is determined through the first neural network model.
  • the first neural network model is a target detection model
  • the target detection model includes the faster region-based convolutional neural network model (Faster R-CNN), the single-shot deep neural network detection model (Single Shot MultiBox Detector, SSD), the advanced real-time object detection model (Real-Time Object Detection, YOLO), etc.
  • the YOLO model includes the YOLO v1 model, the YOLO v2 model and the YOLO v3 model.
  • the embodiment of this application uses the YOLO v3 model for illustration. Please refer to Fig. 3a, which is an example diagram of the network structure of the YOLO v3 model provided by an embodiment of the present application.
  • the basic network of the YOLO v3 model is the Darknet53 network.
  • the Darknet53 network includes 52 convolutional layers, 1 average pooling layer, 1 fully connected layer and 1 activation function layer (softmax).
  • the 52 convolutional layers include: 1 convolution kernel with 32 filters, 5 down-sampling layers, and 5 groups of repeated residual units (resblock_body). These 5 groups of residual units use the skip-layer connection of the residual network (Residual Neural Network, ResNet); each unit consists of a separate convolutional layer and a group of repeatedly executed convolutional layers, which are repeated 1, 2, 8, 8, and 4 times respectively. In each repeated convolutional layer, a 1x1 convolution operation is performed first, followed by a 3x3 convolution operation. This gives 52 layers in total.
  • the Darknet53 network is set in the YOLO v3 model to obtain features of different sizes from the input file.
  • the input file is an image file
  • the size of the image file is 416*416*3, where 416*416 represents the resolution of the image, and 3 represents the number of channels of the image.
  • the image file is processed by the 5 downsampling layers and the 5 groups of repeated residual units: 2x downsampling (2^1) yields a feature image of size 208*208*64, 4x downsampling (2^2) yields 104*104*128, 8x downsampling (2^3) yields 52*52*256, 16x downsampling (2^4) yields 26*26*512, and 32x downsampling (2^5) yields 13*13*1024.
  • the network structure of the YOLO v3 model also includes 3 prediction layers, which are connected to the last 3 layers of residual units in the Darknet53 network through multiple convolutional layers, multiple upsampling layers, and multiple tensor splicing layers.
  • the purpose of setting three prediction layers in the YOLO v3 model is to detect the multi-scale features of the input file.
  • the input file is detected three times by the three prediction layers in the YOLO v3 model, at 32x downsampling, 16x downsampling, and 8x downsampling respectively, so that features of different sizes of the input file are detected and the detection results are output.
  • the purpose of setting the upsampling layer in the YOLO v3 model is to expand the features obtained through low-power downsampling, so that the feature expression effect obtained through low-power downsampling is better.
  • the feature obtained through the low-power downsampling process is 13*13
  • the feature obtained after the expansion operation of the upsampling layer is 26*26.
  • the tensor splicing layer is set in the YOLO v3 model to splice the feature image output by the Darknet53 network with the feature image obtained after the upsampling process.
  • FIG. 3b is a schematic flowchart of a method for training a target detection model provided by an embodiment of the present application. As shown in Fig. 3b, the method includes: S301 to S303.
  • the sample file is a file containing a watermark.
  • a batch of files requiring watermark removal may be collected in advance, for example, 1000 to 3000 (eg 1500) images requiring watermark removal may be collected in advance.
  • the embodiment of the present application does not limit the number of files to be watermarked.
  • Obtaining multiple sets of sample files means acquiring pre-collected 1000 to 3000 images that need to be watermarked.
  • the target area is the area containing the watermark in the sample file.
  • each sample file needs to be labeled.
  • the embodiment of the present application can use the general annotation tool Labelme to label the watermark in each of the 1500 images (for example, labeling boxes can be used to label watermarks), and store the labeling results in the JSON file format, obtaining 1500 JSON files; each JSON file includes the position information of the watermark's labeling box.
  • the information contained in these 1500 sets of json files is referred to as the target area information of each sample file.
  • the forms of the labeling box include polygon, rectangle, circle, etc.; the embodiment of the present application uses a rectangular box for illustration.
  • the coordinates of the four vertices are expressed as (x_top_left , y_top_left), (x_top_right, y_top_right), (x_bottom_left, y_bottom_left) and (x_bottom_right, y_bottom_right).
  • each sample file in the training set and the target area information of each sample file are input into the YOLO v3 model.
  • the Darknet53 network in the YOLO v3 model generates feature images of different sizes from each sample file and the target area information of each sample file.
  • when detecting feature images of different sizes, the feature map is divided into S*S grid cells (for example, a feature image of size 16*16 is divided into 16*16 grid cells); if the target area in the feature image falls into a grid cell, that grid cell detects the target area.
  • three bounding boxes are set for each grid unit.
  • the bounding boxes of each feature image are compared with the pre-labeled rectangular box of the target area to compute the Intersection over Union (IOU); only the bounding box with the largest IOU is used to predict the target area.
  • IOU is a standard for measuring the accuracy of detecting a given object in a particular data set; it measures the correlation between the ground truth and the prediction, and the larger the value of the IOU, the higher the correlation.
  • the IOU between each bounding box and the target area can be calculated with the IOU formula, i.e., the intersection area divided by the union area.
  • the YOLO v3 model in the embodiment of this application detects image features of multiple sizes.
  • the predicted output features have two dimensions equal to the dimensions of the extracted feature image (such as 13*13), and a third dimension (depth) of B*(5+C), where B represents the number of bounding boxes predicted by each grid cell, C indicates the number of bounding-box categories, and 5 covers the coordinate information of the 4 bounding-box values plus the confidence of a target area.
  • each sample file in the training set and validation set, together with its target area information, is input into the model for training; when the number of training epochs reaches 100, or the validation accuracy reaches a certain threshold (such as 90%), the model is considered trained, and the optimal model weight parameters are saved to extract the position information of the watermark in the sample file.
  • the file is input to the first neural network model for processing, and the position information of the watermark output by the first neural network model in the file is obtained.
  • the file is cropped according to the location information to obtain the target area.
  • the target area of the file can be obtained by clipping the file according to the coordinate area information of the four vertices of the rectangular frame obtained in S202.
  • in the mask information, the pixel values in the watermark area are not 0, and the pixel values in the other areas are 0.
  • the first area is the area after the watermark has been removed from the target area.
  • the second neural network model is a neural network model with an encoding and decoding structure.
  • FIG. 5a is an example diagram of a network structure of a second neural network model provided by an embodiment of the present application.
  • the contraction path mainly performs feature extraction step by step by implementing downsampling in the encoders at each level;
  • the expansion path mainly uses upsampling in the decoders at each level to gradually restore feature images of higher and higher resolution.
  • contextual information is lost during the step-by-step downsampling, so, to compensate for the feature loss, each decoder takes as input the concatenation of the upsampled features of its previous stage with the features of its same-level encoder, which compensates for the contextual information and ensures the quality of the restored image.
  • the picture restored by the last level of decoder can be restored to a color closer to the original picture after a post-processing step.
  • FIG. 5b is an example diagram of the network structure of the encoder provided in an implementation of the present application.
  • Each encoder consists of a convolutional layer, an activation layer (Relu), a batch normalization layer (BatchNorm) and a max pooling layer.
  • the number of channels of the feature image becomes twice the number of original channels and increases layer by layer, but the spatial size decreases layer by layer, becoming 1/2 of the spatial size of the original feature image.
  • FIG. 5c is an example diagram of the network structure of the decoder provided in an embodiment of the present application.
  • each decoder consists of a tensor splicing layer, a transposed convolutional layer, a convolutional layer, an activation function layer (ReLU), and a batch normalization layer (BatchNorm); the tensor splicing layer is used to connect the decoder with the encoder at the same level
  • the transposed convolution layer is used to expand the dimension of the feature image.
  • the channel number of the feature image becomes 1/2 of the original channel number and decreases layer by layer, but the spatial size increases layer by layer, becoming twice the spatial size of the original feature image.
  • the replacement module G is connected to the decoder, and the replacement module includes a 1*1 convolutional layer and an activation function layer (sigmoid).
  • the replacement module processes the feature images output by the encoder and the decoder.
  • FIG. 6 is a schematic flowchart of a training method for a second neural network model provided by an embodiment of the present application. As shown in FIG. 6, the method includes: S601 to S604.
  • when the embodiment of the present application removes the watermark in the file, in order to increase the speed of watermark removal, it does not process the entire file; instead, the position information of the watermark in the file is determined from the output result of the first neural network model.
  • the target area in the file is obtained according to the position information, so only the target area needs watermark removal processing in the second neural network model. Therefore, in the embodiment of the present application, the multiple sets of sample files in S301 may be processed by the first neural network model, and the target regions thus obtained may be used as training samples for the second neural network model.
  • S602. Input the training samples to multiple encoders in the contraction path of the second neural network model, and gradually down-sample to extract multi-size feature images of the training samples.
  • S603. Input the multi-size feature maps into the multiple decoders in the expansion path of the second neural network model, and gradually upsample to restore feature images of higher resolution.
  • each decoder implements restoration with reference to the feature image obtained by its corresponding encoder.
  • the encoders and decoders on both sides have a symmetrical structure; each decoder obtains two inputs, one being the upsampled image features of the decoder at the previous level, and the other being the concatenation with the image features of the encoder at the symmetric level.
  • the feature image of 4*4*512 is restored to a feature image of 8*8*256 by decoder a, then restored to a feature image of 16*16*128 by decoder b, then restored to a feature image of 32*32*64 by decoder c, then restored to 64*64*32 by decoder d, and finally restored to 128*128*3 by decoder e, which is the same as the size of the feature image input to the encoder.
  • the feature image output by the decoder and the feature image output by the encoder are both transmitted to a replacement module G for replacement.
  • in the replacement module, according to the inherent principle of the second neural network model, the sigmoid function and regularization are applied to the feature image output by the decoder and the feature image output by the encoder, so as to obtain the first region after removing the watermark.
  • Fig. 7 is an example diagram of obtaining the first region after removing the watermark provided by an embodiment of the present application.
  • the target region C_r is processed by the decoder, and the output feature map contains a third area corresponding to the target area, a feature image (mask) of the target region, and so on.
  • the third area is a background area corresponding to the target area, and the third area does not contain a watermark.
  • the characteristic image of the target area is a mask image, in the mask image, the pixel values in the watermark area are not 0, and the pixel values in other areas are 0.
  • the location information of the watermark in the target area is determined.
  • the target area is masked to obtain the mask information of the target area, and the position information of the watermark in the target area is determined according to the mask information
  • the target area is masked to obtain a mask image, and the position information of the watermark in the target area can be determined according to the mask of the watermark area in the mask image.
  • the fourth area corresponding to the watermark in the third area is determined.
  • the fourth area corresponding to the watermark in the third area can be determined.
  • the fourth area is used to replace the watermark in the target area.
  • the replacement of the watermark in the target area by the fourth area can be completed with a replacement formula, and the first area after watermark removal can be obtained.
  • when the number of training iterations reaches a certain count, the training of the neural network model is completed, and the saved model weight parameters are used to obtain the first region after the watermark is removed.
  • the third area is used as the target value and the first area after removing the watermark is used as the predicted value; a total loss function obtained by linearly superimposing multiple loss functions supervises the sample training process, so that the predicted value obtained from training approaches the target value as the number of iterations increases.
  • FIG. 8 is a schematic flowchart of a method for obtaining a first region after watermark removal by applying a second neural network model according to an embodiment of the present application.
  • the execution subject of the method in FIG. 8 may be the terminal device 100 in FIG. 1 .
  • the execution subject of the method in FIG. 8 may also be the server 200 in FIG. 1 .
  • the method includes: S801 to S804.
  • the method for determining the position information of the watermark in the target area is the same as the method for determining the position information of the watermark in the target area in S604, and will not be repeated here.
  • the method of using the fourth area to replace the watermark in the target area is the same as the method of using the fourth area to replace the watermark in the target area in S604, and will not be repeated here.
  • the first area is the area after the watermark is removed from the target area.
  • the second area is an area in the file other than the target area.
  • the information of the target area includes the position information of the watermark's labeling box, that is, the vertex coordinate information of the labeling box. According to the vertex coordinate information of the labeling box, the first area and the second area are fused to obtain the watermark-free document.
  • the first area and the second area can be fused using methods such as weighted averaging, wavelet transform, fuzzy neural networks, or pyramid decomposition
  • the embodiment of the present application does not limit the fusion method of the first area and the second area.
  • the embodiment of the present application obtains the file containing the watermark; uses the first neural network model to determine the position information of the watermark in the file; obtains the target area in the file according to the position information, the target area being the area of the file that contains the watermark; and uses the second neural network model to perform watermark removal processing on the target area to obtain the watermark-removed file. That is, the present application determines the position information of the watermark in the file and obtains the target area according to the position information; it does not need to process the entire file, and only needs to perform watermark removal processing on the target area to obtain the de-watermarked file, which increases the speed of watermark removal.
  • FIG. 9 is a schematic structural diagram of a watermark removal device provided in an embodiment of the present application.
  • the device can be applied to the terminal device 100 in FIG. 1.
  • the device can also be applied to In the server 200 of FIG. 1 .
  • the unit includes:
  • the first obtaining module 91 is configured to obtain files containing watermarks.
  • the determination module 92 is configured to determine the location information of the watermark in the file.
  • the second obtaining module 93 is configured to obtain a target area in the file according to the location information, where the target area is the area containing the watermark in the file.
  • the processing module 94 is configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
  • the determining module 92 includes:
  • the first processing unit 921 is configured to input the file into the first neural network model for processing, and obtain the position information of the watermark output by the first neural network model in the file.
  • the second acquisition module 93 includes:
  • the clipping unit 931 is configured to clip the file according to the location information to obtain the target area.
  • processing module 94 includes:
  • the second processing unit 941 is configured to input the target area to the second neural network model for processing to obtain a first area output by the second neural network model, where the first area is the target area after the watermark has been removed.
  • the first acquiring subunit 9411 is configured to acquire a third area corresponding to the target area, the third area is a background area corresponding to the target area, and the third area does not contain a watermark.
  • the second acquiring subunit 9412 is configured to determine the location information of the watermark in the target area.
  • the determining subunit 9413 is configured to determine a fourth area corresponding to the watermark in the third area according to the position information of the watermark in the target area.
  • a replacement subunit 9414 configured to use the fourth area to replace the watermark in the target area.
  • the second acquisition subunit 9412 includes:
  • the mask processing subunit 9415 is configured to perform mask processing on the target area, obtain mask information of the target area, and determine position information of the watermark in the target area according to the mask information.
  • the embodiment of the present application also provides a terminal device 20, including a memory 21, a processor 22, and a computer program 23 stored in the memory 21 and operable on the processor 22; when the processor 22 executes the computer program 23, the watermark removal methods of the above embodiments are implemented.
  • the processor 22 can be a central processing unit (Central Processing Unit, CPU), and can also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the memory 21 may be an internal storage unit of the terminal device 200 .
  • the memory 21 can also be an external storage device of the terminal device 200, such as a plug-in hard disk equipped on the terminal device 200, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash memory card (Flash Card) etc.
  • the memory 21 may also include both an internal storage unit of the terminal device 200 and an external storage device.
  • the memory 21 is used to store computer programs and other programs and data required by the terminal device 200 .
  • the memory 21 can also be used to temporarily store data that has been output or will be output.
  • An embodiment of the present application also provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the watermark removal methods in the foregoing embodiments are implemented.
  • An embodiment of the present application provides a computer program product, which, when the computer program product runs on a terminal device, enables the terminal device to implement the watermark removal methods in the foregoing embodiments when executed.
  • An integrated unit may be stored in a computer-readable storage medium if it is realized in the form of a software function unit and sold or used as an independent product.
  • all or part of the processes in the methods of the above embodiments in the present application can be completed by instructing related hardware through computer programs, and the computer programs can be stored in computer-readable storage media.
  • the computer program includes computer program code
  • the computer program code may be in the form of source code, object code, executable file or some intermediate form.
  • the computer-readable storage medium may at least include: any entity or device capable of carrying the computer program code to a photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
  • in some jurisdictions, according to legislation and patent practice, computer-readable storage media may not include electrical carrier signals and telecommunication signals.
  • a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The present application is applicable to the technical field of image processing, and in particular relates to a watermark removal method and apparatus, a terminal device, and a readable storage medium. The method includes: obtaining a file containing a watermark; determining the position information of the watermark in the file; obtaining a target area in the file according to the position information, the target area being the area of the file that contains the watermark; and performing watermark removal processing on the target area to obtain a watermark-removed file. That is, by determining the position information of the watermark in the file and obtaining the target area in the file according to the position information, the present application does not need to process the entire file; watermark removal processing on the target area alone yields the de-watermarked file, which increases the speed of watermark removal and reduces its computational cost.

Description

Watermark removal method and apparatus, terminal device, and readable storage medium
TECHNICAL FIELD
The present application belongs to the technical field of image processing, and in particular relates to a watermark removal method and apparatus, a terminal device, and a readable storage medium.
BACKGROUND
Current methods for removing watermarks from books mainly include: using post-production image retouching software, purchasing paid software that performs batch removal of specific watermarks, or using traditional image processing schemes to remove the watermark.
However, when watermarks are removed with these existing book watermark removal methods, all pixels of the entire watermarked image must be processed, so watermark removal is slow and computationally expensive.
SUMMARY
The watermark removal method and apparatus, terminal device, and readable storage medium provided by the embodiments of the present application increase the speed of watermark removal and reduce its computational cost.
In a first aspect, an embodiment of the present application provides a watermark removal method, the method including:
obtaining a file containing a watermark;
determining position information of the watermark in the file;
obtaining a target area in the file according to the position information, the target area being the area of the file that contains the watermark;
performing watermark removal processing on the target area to obtain a watermark-removed file.
In a possible implementation of the first aspect, determining the position information of the watermark in the file includes:
inputting the file into a first neural network model for processing, and obtaining the position information of the watermark in the file output by the first neural network model.
The first neural network model is a YOLO v3 model.
Obtaining the target area in the file according to the position information includes:
cropping the file according to the position information to obtain the target area.
Performing watermark removal processing on the target area to obtain a watermark-removed file includes:
inputting the target area into a second neural network model for processing to obtain a first area output by the second neural network model, the first area being the target area after the watermark has been removed;
fusing the first area with a second area in the file to obtain the watermark-removed file, the second area being the area of the file other than the target area.
The process in which the second neural network model processes the target area includes:
obtaining a third area corresponding to the target area, the third area being a background area corresponding to the target area, the third area not containing the watermark;
determining position information of the watermark in the target area;
determining, according to the position information of the watermark in the target area, a fourth area in the third area corresponding to the watermark;
replacing the watermark in the target area with the fourth area.
Determining the position information of the watermark in the target area includes:
performing mask processing on the target area to obtain mask information of the target area, and determining the position information of the watermark in the target area according to the mask information.
In a second aspect, an embodiment of the present application provides a watermark removal apparatus, the apparatus including:
a first obtaining module, configured to obtain a file containing a watermark;
a determining module, configured to determine position information of the watermark in the file;
a second obtaining module, configured to obtain a target area in the file according to the position information, the target area being the area of the file that contains the watermark;
a processing module, configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the computer program, the watermark removal method according to the first aspect is implemented.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the watermark removal method according to the first aspect is implemented.
Compared with the prior art, the beneficial effect of the embodiments of the present application is: the present application obtains a file containing a watermark; determines the position information of the watermark in the file; obtains a target area in the file according to the position information, the target area being the area of the file that contains the watermark; and performs watermark removal processing on the target area to obtain a watermark-removed file. That is, by determining the position information of the watermark in the file and obtaining the target area in the file according to the position information, the entire file does not need to be processed; watermark removal processing on the target area alone yields the de-watermarked file, which increases the speed of watermark removal and reduces its computational cost.
BRIEF DESCRIPTION OF THE DRAWINGS
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a schematic diagram of a network architecture for a watermark removal method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a watermark removal method provided by an embodiment of the present application;
FIG. 3a is an example diagram of the network structure of the YOLO v3 model provided by an embodiment of the present application;
FIG. 3b is a schematic flowchart of a method for training a target detection model provided by an embodiment of the present application;
FIG. 3c is an example diagram of the position information of a rectangular box provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a method for performing watermark removal processing on a target area provided by an embodiment of the present application;
FIG. 5a is an example diagram of the network structure of the second neural network model provided by an embodiment of the present application;
FIG. 5b is an example diagram of the network structure of an encoder provided by an embodiment of the present application;
FIG. 5c is an example diagram of the network structure of a decoder provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of a method for training the second neural network model provided by an embodiment of the present application;
FIG. 7 is an example diagram of obtaining the first region after watermark removal provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of a method for obtaining the first region after watermark removal by applying the second neural network model provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a watermark removal apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
DETAILED DESCRIPTION
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application; the specific technical details of the various embodiments may refer to one another, and for a specific system not described in one embodiment, reference may be made to other embodiments.
It should be understood that, when used in the specification and the appended claims of the present application, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the term "and/or" used in the specification and the appended claims of the present application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
References in the specification of the present application to "an embodiment of the present application", "some embodiments", or the like mean that a specific feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, the phrases "in other embodiments", "an embodiment of the present application", "other embodiments of the present application", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but rather mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having", and their variants all mean "including but not limited to", unless specifically emphasized otherwise.
In addition, in the description of the specification and the appended claims of the present application, the terms "first", "second", and the like are used only to distinguish the descriptions and shall not be understood as indicating or implying relative importance.
When watermarks are removed with existing book watermark removal methods, all pixels of the entire watermarked image must be processed, so watermark removal is slow and computationally expensive.
To overcome the above drawbacks, the inventive concept of the present application is as follows:
By determining the position information of the watermark in the file and obtaining the target area in the file according to the position information, the entire file does not need to be processed; watermark removal processing on the target area alone yields the de-watermarked file, which increases the speed of watermark removal and reduces its computational cost.
To illustrate the technical solution of the present application, specific embodiments are described below.
Please refer to FIG. 1, which is a schematic diagram of a network architecture for a watermark removal method provided by an embodiment of the present application. For convenience of description, only the parts relevant to the present application are shown. The network architecture includes a terminal device 100 and a server 200.
In this network architecture, the terminal device 100 may include, but is not limited to, a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), and so on. The terminal device 100 can be used to deploy the first neural network model and the second neural network model.
In this network architecture, the server 200 is essentially an electronic device with computing capability. The server 200 is deployed in the cloud and can also be used to deploy the first neural network model and the second neural network model. The server 200 mainly provides services for the terminal device 100.
The terminal device 100 accesses the Internet through a known network access method and establishes a data communication link with the server 200 in the cloud, so as to start the training of the first and second neural network models, perform watermark removal processing on files containing watermark areas, and so on.
Please refer to FIG. 2, which is a schematic flowchart of a watermark removal method provided by an embodiment of the present application. In one implementation, the execution subject of the method in FIG. 2 may be the terminal device 100 in FIG. 1; in other implementations, it may also be the server 200 in FIG. 1. As shown in FIG. 2, the method includes S201 to S204.
S201. Obtain a file containing a watermark.
Specifically, the file containing the watermark in the embodiment of the present application may be an image containing a watermark, a Portable Document Format (PDF) file containing a watermark, a web page containing a watermark, and so on; the embodiment of the present application does not limit the type of the file containing the watermark.
In the embodiment of the present application, when one group of watermarked files needs to have its watermarks removed, the group of watermarked files is input to the terminal device or the server, which thereby obtains the group of watermarked files.
In other embodiments of the present application, when multiple groups of watermarked files need to have their watermarks removed, the multiple groups of watermarked files are input to the terminal device or the server, which thereby obtains them. That is, the terminal device or server of the embodiment of the present application can batch-process multiple groups of watermarked files, increasing the speed of watermark removal.
S202. Determine the position information of the watermark in the file.
Specifically, in the embodiment of the present application, after the file containing the watermark is obtained, the position information of the watermark in the file is determined through the first neural network model.
In the embodiment of the present application, the first neural network model is a target detection model. Target detection models include the faster region-based convolutional neural network model (Faster R-CNN), the single-shot deep neural network detection model (Single Shot MultiBox Detector, SSD), the advanced real-time object detection model (Real-Time Object Detection, YOLO), and others. The YOLO family includes the YOLO v1, YOLO v2, and YOLO v3 models. The embodiment of the present application uses the YOLO v3 model for illustration. Please refer to FIG. 3a, which is an example diagram of the network structure of the YOLO v3 model provided by an embodiment of the present application.
The basic network of the YOLO v3 model is the Darknet53 network, which includes 52 convolutional layers, 1 average pooling layer, 1 fully connected layer, and 1 activation function layer (softmax).
The 52 convolutional layers include: 1 convolution kernel with 32 filters, 5 downsampling layers, and 5 groups of repeated residual units (resblock_body). These 5 groups of residual units use the skip-layer connection of the residual network (Residual Neural Network, ResNet); each unit consists of a separate convolutional layer and a group of repeatedly executed convolutional layers, repeated 1, 2, 8, 8, and 4 times respectively. In each repeated convolutional layer, a 1x1 convolution is performed first, followed by a 3x3 convolution. This gives 52 layers in total.
The convolutional layers are counted as: 52 = 1 + 5 + (1*2) + (2*2) + (8*2) + (8*2) + (4*2).
The Darknet53 network is set in the YOLO v3 model to obtain features of different sizes from the input file. Exemplarily, the input file is an image file of size 416*416*3, where 416*416 is the image resolution and 3 is the number of channels. The image file is processed by the 5 downsampling layers and the 5 groups of repeated residual units: 2x downsampling (2^1) yields a 208*208*64 feature image, 4x downsampling (2^2) yields 104*104*128, 8x downsampling (2^3) yields 52*52*256, 16x downsampling (2^4) yields 26*26*512, and 32x downsampling (2^5) yields 13*13*1024.
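As an illustrative aside (not part of the patent text), the downsampling arithmetic above can be checked with a short Python sketch; the input size 416*416*3 and the channel counts 64 through 1024 are taken from the description, while everything else is scaffolding:

```python
# Sketch: verify the multi-scale feature sizes produced by the five
# downsampling stages described above (input 416*416*3).
resolution = 416
channels = [64, 128, 256, 512, 1024]  # channel counts quoted in the text
for stage, ch in enumerate(channels, start=1):
    side = resolution // (2 ** stage)  # 2x, 4x, 8x, 16x, 32x downsampling
    print(f"{2 ** stage}x downsampling: {side}*{side}*{ch}")
# Prints: 208*208*64, 104*104*128, 52*52*256, 26*26*512, 13*13*1024
```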
The network structure of the YOLO v3 model also includes 3 prediction layers, which are connected to the last 3 groups of residual units in the Darknet53 network through multiple convolutional layers, multiple upsampling layers, and multiple tensor splicing layers.
The 3 prediction layers are set in the YOLO v3 model to detect the multi-scale features of the input file. Exemplarily, the 3 prediction layers detect the input file three times, at 32x, 16x, and 8x downsampling respectively, so that features of different sizes of the input file are detected and the detection results are output.
The upsampling layers are set in the YOLO v3 model to expand the features obtained through low-power downsampling, so that those features are expressed better. Exemplarily, a 13*13 feature obtained through low-power downsampling becomes a 26*26 feature after the expansion operation of an upsampling layer.
In the embodiment of the present application, the tensor splicing layers are set in the YOLO v3 model to splice the feature images output by the Darknet53 network with the feature images obtained after upsampling.
Please refer to FIG. 3b, which is a schematic flowchart of a method for training a target detection model provided by an embodiment of the present application. As shown in FIG. 3b, the method includes S301 to S303.
S301. Obtain multiple sets of sample files.
Specifically, a sample file is a file containing a watermark. In some embodiments, a batch of files requiring watermark removal may be collected in advance; exemplarily, 1000 to 3000 (for example, 1500) images requiring watermark removal are collected in advance. The embodiment of the present application does not limit the number of files requiring watermark removal.
Obtaining multiple sets of sample files means obtaining the pre-collected 1000 to 3000 images requiring watermark removal.
S302. Label the watermark in each sample file to obtain the target area information of each sample file.
Specifically, the target area is the area of a sample file that contains the watermark. After the multiple sets of sample files are obtained, each sample file must be labeled. Exemplarily, the embodiment of the present application can use the general annotation tool Labelme to label the watermark in each of the 1500 images (for example, with labeling boxes) and store the labeling results in JSON format, obtaining 1500 JSON files; each JSON file includes the position information of the watermark's labeling box. In the embodiment of the present application, the information contained in these 1500 JSON files is called the target area information of each sample file.
In the embodiment of the present application, the forms of the labeling box include polygon, rectangle, circle, and so on; the embodiment of the present application uses a rectangular box for illustration.
In the embodiment of the present application, the position information of the rectangular box is the position information of the watermark in the sample file, which can be expressed by the coordinates of the four vertices of the rectangular box. Exemplarily, please refer to FIG. 3c, which is an example diagram of the position information of a rectangular box provided by an embodiment of the present application. In FIG. 3c, w denotes the rectangular box and H denotes the sample file. The coordinate system takes the upper-left corner of the image as the origin, the image width direction as the positive x-axis, and the image height direction as the positive y-axis; in this coordinate system, the four vertex coordinates are expressed as (x_top_left, y_top_left), (x_top_right, y_top_right), (x_bottom_left, y_bottom_left), and (x_bottom_right, y_bottom_right).
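As a hedged sketch (not from the patent), the labeling result described above can be read back in Python; the field names 'shapes', 'shape_type', and 'points' follow the common Labelme JSON layout, which the patent does not spell out, so they should be treated as assumptions:

```python
import json

def load_watermark_boxes(json_path):
    """Read one Labelme annotation file and return the four vertex
    coordinates of each rectangular watermark labeling box."""
    with open(json_path, "r", encoding="utf-8") as f:
        annotation = json.load(f)
    boxes = []
    for shape in annotation.get("shapes", []):  # assumed Labelme layout
        if shape.get("shape_type") != "rectangle":
            continue
        (x1, y1), (x2, y2) = shape["points"]  # two opposite corners
        x_left, x_right = sorted((x1, x2))
        y_top, y_bottom = sorted((y1, y2))
        boxes.append({
            "top_left": (x_left, y_top),
            "top_right": (x_right, y_top),
            "bottom_left": (x_left, y_bottom),
            "bottom_right": (x_right, y_bottom),
        })
    return boxes
```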
In the embodiment of the present application, the sample files obtained in S301 and the target area information obtained in S302 form a data set; a certain proportion of the data is randomly selected as the training set of the first neural network model, and the remaining proportion is used as its validation set. Exemplarily, 70% to 90% of the data set, for example 80%, is randomly selected as the training set, and the remaining 30% to 10%, for example 20%, is used as the validation set.
S303. Input the training set and validation set into the first neural network model for training, and save the training parameters.
Specifically, each sample file in the training set and its target area information are input into the YOLO v3 model. The Darknet53 network in the YOLO v3 model generates feature images of different sizes from each sample file and its target area information. When feature images of different sizes are detected, the feature map is divided into S*S grid cells (for example, a 16*16 feature image is divided into 16*16 grid cells); if the target area in the feature image falls into a grid cell, that grid cell detects the target area.
In the embodiment of the present application, 3 bounding boxes are set for each grid cell. When the YOLO v3 model detects the target area, the bounding boxes of each feature image are compared with the pre-labeled rectangular box of the target area to compute the Intersection over Union (IOU); only the bounding box with the largest IOU is used to predict the target area.
The IOU is a standard for measuring the accuracy of detecting a given object in a particular data set; it measures the correlation between the ground truth and the prediction. The larger the IOU, the higher the correlation.
The IOU is calculated as: IOU = Area of Overlap / Area of Union, that is, the intersection area of the predicted bounding box and the labeled rectangular box divided by the area of their union.
In the embodiment of the present application, the IOU between each bounding box and the target area can be calculated with the above IOU formula.
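The IOU computation above translates directly into a short Python function; this is a minimal sketch for axis-aligned boxes, not code from the patent:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes, each given
    as (x_left, y_top, x_right, y_bottom)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h  # area of overlap (0 if disjoint)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

# Of the 3 bounding boxes predicted by a grid cell, the one with the
# largest IOU against the labeled rectangle predicts the target area:
# best_box = max(predicted_boxes, key=lambda b: iou(b, labeled_box))
```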
The YOLO v3 model in the embodiment of the present application performs detection on image features of multiple sizes. The predicted output features have two dimensions equal to the dimensions of the extracted feature image (such as 13*13) and a third dimension (depth) of B*(5+C), where B is the number of bounding boxes predicted by each grid cell, C is the number of bounding-box classes, and 5 covers the 4 bounding-box coordinates plus one target-area confidence.
In the embodiment of the present application, each sample file in the training set and validation set, together with its target area information, is input into the model for training; when the number of training epochs reaches 100, or the validation accuracy reaches a certain threshold (such as 90%), the model is considered trained. The optimal model weight parameters are saved for extracting the position information of the watermark in a sample file.
In the embodiment of the present application, the file is input to the first neural network model for processing, and the position information of the watermark in the file output by the first neural network model is obtained.
Specifically, the file requiring watermark removal is input to the trained YOLO v3 model for processing, which identifies the target area containing the watermark in the file and obtains the information of the target area; the information of the target area includes the position information of the watermark's labeling box, that is, the vertex coordinates of the labeling box.
S203. Obtain the target area in the file according to the position information.
Specifically, the file is cropped according to the position information to obtain the target area.
In some embodiments, the file is cropped according to the coordinates of the four vertices of the rectangular box obtained in S202, yielding the target area of the file.
In some embodiments, after the file is cropped according to the position information to obtain the target area, mask processing is performed on the target area to obtain the mask information of the target area.
In the mask information, the pixel values in the watermark area are not 0, and the pixel values in the other areas are 0.
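A minimal numpy sketch of S203, assuming the file has been loaded as an image array; the threshold rule used here to mark watermark pixels is an assumption for illustration only, since the patent obtains the actual mask through the second neural network model:

```python
import numpy as np

def crop_target_area(image, box):
    """Crop the watermark-containing target area from the file using
    the vertex coordinates output by the detection model (S202)."""
    x_left, y_top = box["top_left"]
    x_right, y_bottom = box["bottom_right"]
    return image[int(y_top):int(y_bottom), int(x_left):int(x_right)]

def mask_target_area(target_area, threshold=10):
    """Build mask information in which watermark pixels keep their
    values and all other pixels become 0 (as described above). The
    brightness threshold separating watermark from background is an
    illustrative assumption, not the patent's method."""
    gray = target_area.mean(axis=-1)   # assume an H*W*3 image array
    mask = gray > threshold            # True on watermark pixels
    return target_area * mask[..., None], mask
```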
S204. Perform watermark removal processing on the target area to obtain a watermark-removed file.
Specifically, please refer to FIG. 4, which is a schematic flowchart of a method for performing watermark removal processing on a target area provided by an embodiment of the present application. In one implementation, the execution subject of the method in FIG. 4 may be the terminal device 100 in FIG. 1; in other implementations, it may also be the server 200 in FIG. 1. As shown in FIG. 4, the method includes S401 to S402.
S401. Input the target area into the second neural network model for processing to obtain the first area output by the second neural network model.
Specifically, in the embodiment of the present application, the first area is the target area after the watermark has been removed.
In the embodiment of the present application, the second neural network model is a neural network model with an encoder-decoder structure. Please refer to FIG. 5a, which is an example diagram of the network structure of the second neural network model provided by an embodiment of the present application.
The second neural network model includes a contraction path composed of multiple encoders, an expansion path composed of multiple decoders, and a replacement module. The embodiment of the present application uses 6 encoders, denoted A, B, C, D, E, and F for convenience of description, and 5 decoders, denoted a, b, c, d, and e; the replacement module is denoted G.
The contraction path performs feature extraction step by step through downsampling in the encoders at each level; the expansion path uses upsampling in the decoders at each level to gradually restore feature images of increasingly higher resolution. Contextual information is lost during the step-by-step downsampling; therefore, to compensate for the feature loss, each decoder takes as input the concatenation of the upsampled features from its previous stage with the features of the encoder at the same level, which compensates for the contextual information and ensures the quality of the restored image. The picture restored by the last-level decoder then passes through a post-processing step to restore colors closer to the original picture.
In the embodiment of the present application, for the network structure of each encoder, please refer to FIG. 5b, which is an example diagram of the network structure of an encoder provided by an embodiment of the present application.
Each encoder contains a convolutional layer, an activation function layer (ReLU), a batch normalization layer (BatchNorm), and a max pooling layer. In each encoder, the number of channels of the feature image doubles and increases layer by layer, while the spatial size decreases layer by layer to 1/2 of the spatial size of the original feature image.
In the embodiment of the present application, for the network structure of each decoder, please refer to FIG. 5c, which is an example diagram of the network structure of a decoder provided by an embodiment of the present application.
Each decoder contains a tensor splicing layer, a transposed convolutional layer, a convolutional layer, an activation function layer (ReLU), and a batch normalization layer (BatchNorm); the tensor splicing layer connects the decoder with the encoder at the same level, and the transposed convolutional layer expands the size of the feature image. In each decoder, the number of channels of the feature image becomes 1/2 of the original number and decreases layer by layer, while the spatial size increases layer by layer to twice the spatial size of the original feature image.
For the second neural network model, the residual connections are embodied in the network structure of every encoder and decoder; as shown in FIG. 5b and FIG. 5c, "repeat x 3" means the residual connection is repeated 3 times. The benefit of this is an enlarged receptive field, which also helps improve the quality of the restored image.
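A minimal PyTorch sketch of one encoder stage and one decoder stage as described above; the layer types (convolution, ReLU, BatchNorm, max pooling, tensor splicing, transposed convolution) come from the text, while kernel sizes and the exact layer ordering are assumptions:

```python
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """Contraction-path stage: conv -> ReLU -> BatchNorm -> max pool.
    Doubles the channel count and halves the spatial size."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch),
        )
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skip = self.features(x)      # kept for the same-level decoder
        return self.pool(skip), skip

class DecoderStage(nn.Module):
    """Expansion-path stage: transposed conv, tensor splicing with the
    same-level encoder features, then conv -> ReLU -> BatchNorm.
    Halves the channel count and doubles the spatial size."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.features = nn.Sequential(
            nn.Conv2d(out_ch * 2, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x, skip):
        x = self.up(x)                      # expand the spatial size
        x = torch.cat([x, skip], dim=1)     # tensor splicing layer
        return self.features(x)
```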
In the embodiment of the present application, the replacement module G is connected to the decoder; the replacement module contains a 1*1 convolutional layer and an activation function layer (sigmoid). The replacement module processes the feature images output by the encoder and the decoder.
Please refer to FIG. 6, which is a schematic flowchart of a method for training the second neural network model provided by an embodiment of the present application. As shown in FIG. 6, the method includes S601 to S604.
S601. Obtain training samples.
Specifically, when the embodiment of the present application removes the watermark in a file, in order to increase the speed of watermark removal, the entire file is not processed; instead, the position information of the watermark in the file is determined from the output of the first neural network model. The target area in the file is obtained according to the position information, so only the target area needs watermark removal processing in the second neural network model. Therefore, in the embodiment of the present application, the multiple sets of sample files in S301 may be processed by the first neural network model, and the resulting target regions may be used as the training samples of the second neural network model.
S602. Input the training samples into the multiple encoders in the contraction path of the second neural network model, and downsample step by step to extract multi-size feature images of the training samples.
Exemplarily, please refer to FIG. 5a. Each target region in the training samples is a 128*128*3 feature (length * width * height, where height can be understood as the number of channels); after encoder A downsamples and extracts features, it becomes a 64*64*32 feature image output to encoder B. Similarly, after encoder C the output is a 32*32*64 feature image; after encoder D it becomes 16*16*128; after encoder E it becomes 8*8*256; and after encoder F it becomes 4*4*512. The obtained multi-size feature images are passed into the expansion path.
S603. Input the multi-size feature maps into the multiple decoders in the expansion path of the second neural network model, and upsample step by step to restore feature images of higher resolution.
Specifically, each decoder performs restoration with reference to the feature image obtained by its corresponding encoder.
Please refer to FIG. 5a. In the second neural network model, except for the middle encoder F, the encoders and decoders on the two sides have a symmetrical structure; each decoder receives two inputs, one being the upsampled image features from the decoder at the previous level, and the other being the concatenation with the image features of the encoder at the symmetric level.
In the embodiment of the present application, the 4*4*512 feature image is restored to an 8*8*256 feature image by decoder a, then to 16*16*128 by decoder b, then to 32*32*64 by decoder c, then to 64*64*32 by decoder d, and finally to 128*128*3 by decoder e, the same size as the feature image input to the encoder.
S604. Process the output of the decoder and the output of the last stage of the encoder to obtain the first region after watermark removal.
Specifically, in the embodiment of the present application, the feature image output by the decoder and the feature image output by the encoder are both transmitted to a replacement module G for replacement. In the replacement module, according to the inherent principle of the second neural network model, the sigmoid function and regularization are applied to the feature image output by the decoder and the feature image output by the encoder, so as to obtain the first region after watermark removal.
Please refer to FIG. 7, which is an example diagram of obtaining the first region after watermark removal provided by an embodiment of the present application. In FIG. 7, the target region C_r is processed by the decoder, and the output feature maps include a third region corresponding to the target region, a feature image (mask) of the target region, and so on; the symbols for these regions appear as formula images in the original publication.
In some embodiments, the third region is the background region corresponding to the target region, and the third region does not contain the watermark. In some embodiments, the feature image of the target region is a mask image; in the mask image, the pixel values in the watermark area are not 0, and the pixel values in the other areas are 0.
In the embodiment of the present application, the position information of the watermark in the target region is determined in the replacement module G.
Specifically, mask processing is performed on the target region to obtain the mask information of the target region, and the position information of the watermark in the target region is determined according to the mask information.
In some embodiments, a mask image is obtained by performing mask processing on the target region, and the position information of the watermark in the target region can be determined from the mask of the watermark area in the mask image.
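A hedged numpy sketch of this step, assuming the mask image is a 2D array in which watermark pixels are non-zero and all other pixels are 0, as stated above:

```python
import numpy as np

def watermark_position_from_mask(mask_image):
    """Recover the watermark's bounding box inside the target region
    from the mask image (non-zero pixels mark the watermark)."""
    ys, xs = np.nonzero(mask_image)
    if ys.size == 0:
        return None  # no watermark pixels found
    return {"x_left": int(xs.min()), "x_right": int(xs.max()),
            "y_top": int(ys.min()), "y_bottom": int(ys.max())}
```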
In the embodiment of the present application, the fourth region in the third region corresponding to the watermark is determined according to the position information of the watermark in the target region.
Specifically, by XOR-ing the third region with the watermark image in the target region, the fourth region in the third region corresponding to the watermark can be determined; the specific formula is given as a formula image in the original publication.
In the embodiment of the present application, the fourth region is used to replace the watermark in the target region.
Specifically, the replacement of the watermark in the target region by the fourth region is completed with a replacement formula, yielding the first region after watermark removal; the formula and the symbol denoting the first region are given as formula images in the original publication.
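Because the patent's XOR and replacement formulas survive only as images, the sketch below shows one plausible reading, in which the mask selects, pixel by pixel, the watermark-free background (the fourth region) inside the third region, while all other pixels keep the target region's values; this selection rule is an assumption, not the published formula:

```python
import numpy as np

def replace_watermark(target_region, third_region, mask):
    """Hedged sketch of the replacement in S604/S804: on watermark
    pixels (mask True) take the background pixels of the third region
    (i.e., the fourth region); elsewhere keep the target region."""
    mask3 = mask[..., None]  # broadcast a 2D mask over H*W*3 images
    return np.where(mask3, third_region, target_region)
```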
In the embodiment of the present application, when the number of training iterations of the second neural network model reaches a certain count (for example, 100), the neural network model has completed training, and the saved model weight parameters are used to obtain the first region after watermark removal.
In some embodiments, in order to supervise the training effect of the second neural network model, the third region is used as the target value, the first region after watermark removal is used as the predicted value, and a total loss function formed by linearly superimposing multiple loss functions supervises the sample training process, so that the predicted value obtained from training approaches the target value as the number of iterations increases.
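The patent states only that the total loss is a linear superposition of multiple loss functions, with the third region as target and the de-watermarked first region as prediction; the particular terms (L1 and MSE) and weights in the sketch below are assumptions:

```python
import torch.nn.functional as F

def total_loss(pred_first_region, target_third_region, w1=1.0, w2=0.1):
    """Linear superposition of two illustrative loss terms supervising
    the second neural network model, per the description above."""
    pixel_term = F.l1_loss(pred_first_region, target_third_region)
    structural_term = F.mse_loss(pred_first_region, target_third_region)
    return w1 * pixel_term + w2 * structural_term
```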
Please refer to FIG. 8, which is a schematic flowchart of a method for obtaining the first region after watermark removal by applying the second neural network model provided by an embodiment of the present application. In one implementation, the execution subject of the method in FIG. 8 may be the terminal device 100 in FIG. 1; in other implementations, it may also be the server 200 in FIG. 1. As shown in FIG. 8, the method includes S801 to S804.
S801. Obtain the third region corresponding to the target region.
In the embodiment of the present application, the target region is input into the trained second neural network model; since the model has been trained and its parameters optimized, the feature image output by its decoder is the third region corresponding to the target region.
S802. Determine the position information of the watermark in the target region.
In the embodiment of the present application, the method for determining the position information of the watermark in the target region is the same as in S604 and will not be repeated here.
S803. Determine, according to the position information of the watermark in the target region, the fourth region in the third region corresponding to the watermark.
In the embodiment of the present application, the method for determining the fourth region is the same as in S604 and will not be repeated here.
S804. Replace the watermark in the target region with the fourth region.
In the embodiment of the present application, the watermark in the target region is replaced with the fourth region, thereby obtaining the first region after watermark removal.
The method of replacing the watermark in the target region with the fourth region is the same as in S604 and will not be repeated here.
S402. Fuse the first region with the second region in the file to obtain the watermark-removed file.
Specifically, the first region is the target region after watermark removal, and the second region is the area of the file other than the target region.
In the embodiment of the present application, the information of the target region includes the position information of the watermark's labeling box, that is, the vertex coordinates of the labeling box; the first region and the second region are fused according to the vertex coordinates of the labeling box to obtain the watermark-removed file.
In the embodiment of the present application, the first region and the second region can be fused by methods such as weighted averaging, wavelet transform, fuzzy neural networks, or pyramid decomposition; the embodiment of the present application does not limit the fusion method of the first region and the second region.
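A minimal sketch of S402 under the simplest fusion choice, a direct paste-back of the first region at the labeling box's vertex coordinates; the weighted-average, wavelet, fuzzy-neural-network, and pyramid-decomposition variants mentioned above are not shown:

```python
import numpy as np

def fuse_regions(original_file, first_region, box):
    """Write the de-watermarked first region back into the file at the
    labeling box's coordinates; the untouched pixels form the second
    region. Direct paste-back is the simplest fusion method."""
    fused = original_file.copy()
    x_left, y_top = (int(v) for v in box["top_left"])
    h, w = first_region.shape[:2]
    fused[y_top:y_top + h, x_left:x_left + w] = first_region
    return fused
```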
In summary, the embodiment of the present application obtains a file containing a watermark; uses the first neural network model to determine the position information of the watermark in the file; obtains the target area in the file according to the position information, the target area being the area of the file that contains the watermark; and uses the second neural network model to perform watermark removal processing on the target area to obtain the watermark-removed file. That is, by determining the position information of the watermark in the file and obtaining the target area according to the position information, the entire file does not need to be processed; watermark removal processing on the target area alone yields the de-watermarked file, which increases the speed of watermark removal.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation of the embodiments of the present application.
Please refer to FIG. 9, which is a schematic structural diagram of a watermark removal apparatus provided by an embodiment of the present application. In one implementation, the apparatus can be applied to the terminal device 100 in FIG. 1; in other implementations, it can also be applied to the server 200 in FIG. 1. The apparatus includes:
The first obtaining module 91, configured to obtain a file containing a watermark.
The determining module 92, configured to determine the position information of the watermark in the file.
The second obtaining module 93, configured to obtain the target area in the file according to the position information, the target area being the area of the file that contains the watermark.
The processing module 94, configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
The determining module 92 includes:
the first processing unit 921, configured to input the file into the first neural network model for processing and obtain the position information of the watermark in the file output by the first neural network model.
The second obtaining module 93 includes:
the cropping unit 931, configured to crop the file according to the position information to obtain the target area.
The processing module 94 includes:
the second processing unit 941, configured to input the target area into the second neural network model for processing and obtain the first area output by the second neural network model, the first area being the target area after watermark removal;
the fusion unit 942, configured to fuse the first area with the second area in the file to obtain the watermark-removed file, the second area being the area of the file other than the target area.
The second processing unit 941 includes:
the first obtaining subunit 9411, configured to obtain the third area corresponding to the target area, the third area being the background area corresponding to the target area and not containing the watermark;
the second obtaining subunit 9412, configured to determine the position information of the watermark in the target area;
the determining subunit 9413, configured to determine, according to the position information of the watermark in the target area, the fourth area in the third area corresponding to the watermark;
the replacement subunit 9414, configured to replace the watermark in the target area with the fourth area.
The second obtaining subunit 9412 includes:
the mask processing subunit 9415, configured to perform mask processing on the target area, obtain the mask information of the target area, and determine the position information of the watermark in the target area according to the mask information.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example; in practical applications, the above functions may be assigned to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
As shown in FIG. 10, an embodiment of the present application also provides a terminal device 20, including a memory 21, a processor 22, and a computer program 23 stored in the memory 21 and operable on the processor 22; when the processor 22 executes the computer program 23, the watermark removal methods of the above embodiments are implemented.
The processor 22 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
The memory 21 may be an internal storage unit of the terminal device 200. The memory 21 may also be an external storage device of the terminal device 200, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 200. Further, the memory 21 may also include both an internal storage unit of the terminal device 200 and an external storage device. The memory 21 is used to store the computer program and other programs and data required by the terminal device 200. The memory 21 may also be used to temporarily store data that has been output or will be output.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the watermark removal methods of the above embodiments are implemented.
An embodiment of the present application provides a computer program product; when the computer program product runs on a terminal device, the terminal device implements the watermark removal methods of the above embodiments when executing it.
If an integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may be completed by instructing the relevant hardware through a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable storage medium may at least include: any entity or apparatus capable of carrying the computer program code to a photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, according to legislation and patent practice, computer-readable storage media may not include electrical carrier signals and telecommunication signals.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed or recorded in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods for each particular application to implement the described functions, but such implementations should not be considered beyond the scope of the present application.
A unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit; that is, it may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present application.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or substitute equivalents for some of the technical features therein; these modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (10)

  1. A watermark removal method, characterized in that the method comprises:
    obtaining a file containing a watermark;
    determining position information of the watermark in the file;
    obtaining a target area in the file according to the position information, the target area being the area of the file that contains the watermark;
    performing watermark removal processing on the target area to obtain a watermark-removed file.
  2. The method according to claim 1, characterized in that determining the position information of the watermark in the file comprises:
    inputting the file into a first neural network model for processing, and obtaining the position information of the watermark in the file output by the first neural network model.
  3. The method according to claim 2, characterized in that the first neural network model is a YOLO v3 model.
  4. The method according to claim 1, characterized in that obtaining the target area in the file according to the position information comprises:
    cropping the file according to the position information to obtain the target area.
  5. The method according to any one of claims 1 to 4, characterized in that performing watermark removal processing on the target area to obtain a watermark-removed file comprises:
    inputting the target area into a second neural network model for processing to obtain a first area output by the second neural network model, the first area being the target area after the watermark has been removed;
    fusing the first area with a second area in the file to obtain the watermark-removed file, the second area being the area of the file other than the target area.
  6. The method according to claim 5, characterized in that the process in which the second neural network model processes the target area comprises:
    obtaining a third area corresponding to the target area, the third area being a background area corresponding to the target area and not containing the watermark;
    determining position information of the watermark in the target area;
    determining, according to the position information of the watermark in the target area, a fourth area in the third area corresponding to the watermark;
    replacing the watermark in the target area with the fourth area.
  7. The method according to claim 6, characterized in that determining the position information of the watermark in the target area comprises:
    performing mask processing on the target area to obtain mask information of the target area, and determining the position information of the watermark in the target area according to the mask information.
  8. A watermark removal apparatus, characterized in that the apparatus comprises:
    a first obtaining module, configured to obtain a file containing a watermark;
    a determining module, configured to determine position information of the watermark in the file;
    a second obtaining module, configured to obtain a target area in the file according to the position information, the target area being the area of the file that contains the watermark;
    a processing module, configured to perform watermark removal processing on the target area to obtain a watermark-removed file.
  9. A terminal device, characterized by comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor, when executing the computer program, implements the watermark removal method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the watermark removal method according to any one of claims 1 to 7 is implemented.
PCT/CN2021/119725 2021-08-19 2021-09-22 Watermark removal method and apparatus, terminal device, and readable storage medium WO2023019682A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110955089.7A 2021-08-19 2021-08-19 Watermark removal method and apparatus, terminal device, and readable storage medium CN113643173A (zh)
CN202110955089.7 2021-08-19

Publications (1)

Publication Number Publication Date
WO2023019682A1 (zh)

Family

ID=78422893

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119725 WO2023019682A1 (zh) 2021-08-19 2021-09-22 Watermark removal method and apparatus, terminal device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN113643173A (zh)
WO (1) WO2023019682A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342363A (zh) * 2023-05-31 2023-06-27 齐鲁工业大学(山东省科学院) Visible watermark removal method based on a two-stage deep neural network

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495110A (zh) * 2022-01-28 2022-05-13 北京百度网讯科技有限公司 Image processing method, generator training method, apparatus, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017016294A1 (zh) * 2015-07-24 2017-02-02 乐视控股(北京)有限公司 Video watermark removal method and apparatus
US20180033113A1 (en) * 2016-08-01 2018-02-01 International Business Machines Corporation Multiple source watermarking for surveillance
CN111062854A (zh) * 2019-12-26 2020-04-24 Oppo广东移动通信有限公司 Watermark detection method, apparatus, terminal, and storage medium
CN111798360A (zh) * 2020-06-30 2020-10-20 百度在线网络技术(北京)有限公司 Watermark detection method, apparatus, electronic device, and storage medium
CN111932431A (zh) * 2020-07-07 2020-11-13 华中科技大学 Visible watermark removal method based on a watermark decomposition model, and electronic device
CN112419132A (zh) * 2020-11-05 2021-02-26 广州华多网络科技有限公司 Video watermark detection method, apparatus, electronic device, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG121783A1 (en) * 2003-07-29 2006-05-26 Sony Corp Techniques and systems for embedding and detecting watermarks in digital data
CN111160335B (zh) * 2020-01-02 2023-07-04 腾讯科技(深圳)有限公司 Artificial-intelligence-based image watermark processing method and apparatus, and electronic device
CN111626912A (zh) * 2020-04-09 2020-09-04 智者四海(北京)技术有限公司 Watermark removal method and apparatus

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017016294A1 (zh) * 2015-07-24 2017-02-02 乐视控股(北京)有限公司 Video watermark removal method and apparatus
US20180033113A1 (en) * 2016-08-01 2018-02-01 International Business Machines Corporation Multiple source watermarking for surveillance
CN111062854A (zh) * 2019-12-26 2020-04-24 Oppo广东移动通信有限公司 Watermark detection method, apparatus, terminal, and storage medium
CN111798360A (zh) * 2020-06-30 2020-10-20 百度在线网络技术(北京)有限公司 Watermark detection method, apparatus, electronic device, and storage medium
CN111932431A (zh) * 2020-07-07 2020-11-13 华中科技大学 Visible watermark removal method based on a watermark decomposition model, and electronic device
CN112419132A (zh) * 2020-11-05 2021-02-26 广州华多网络科技有限公司 Video watermark detection method, apparatus, electronic device, and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342363A (zh) * 2023-05-31 2023-06-27 齐鲁工业大学(山东省科学院) Visible watermark removal method based on a two-stage deep neural network

Also Published As

Publication number Publication date
CN113643173A (zh) 2021-11-12

Similar Documents

Publication Publication Date Title
CN111369545B (zh) Edge defect detection method, apparatus, model, device, and readable storage medium
CN111681273B (zh) Image segmentation method, apparatus, electronic device, and readable storage medium
WO2023019682A1 (zh) Watermark removal method and apparatus, terminal device, and readable storage medium
CN114529459B (zh) Method, system, and medium for enhancing image edges
CN112070649B (zh) Method and system for removing watermarks containing a specific character string
CN114550021B (zh) Surface defect detection method and device based on feature fusion
CN113221869B (zh) Method, apparatus, device, and storage medium for extracting structured information from medical invoices
CN112308866A (zh) Image processing method, apparatus, electronic device, and storage medium
Chen et al. Fast defocus map estimation
CN112183517B (zh) Card edge detection method, device, and storage medium
Thajeel et al. A Novel Approach for Detection of Copy Move Forgery using Completed Robust Local Binary Pattern.
CN111932480A (zh) Deblurred video restoration method, apparatus, terminal device, and storage medium
CN113888431A (zh) Training method and apparatus for an image inpainting model, computer device, and storage medium
CN114758145A (zh) Image desensitization method, apparatus, electronic device, and storage medium
Qin et al. Face inpainting network for large missing regions based on weighted facial similarity
CN113570725A (zh) Clustering-based three-dimensional surface reconstruction method, apparatus, server, and storage medium
Muntarina et al. MultiResEdge: A deep learning-based edge detection approach
CN116309612B (zh) Semiconductor silicon wafer inspection method, apparatus, and medium based on frequency-decoupled supervision
CN110533663B (zh) Image disparity determination method, apparatus, device, and system
Zheng et al. Joint residual pyramid for joint image super-resolution
CN115375715A (zh) Target extraction method, apparatus, electronic device, and storage medium
CN114332890A (зh) Table structure extraction method, apparatus, electronic device, and storage medium
CN113012132A (zh) Image similarity determination method and apparatus, computing device, and storage medium
CN110766644B (zh) Image downsampling method and apparatus
CN111369425A (zh) Image processing method, apparatus, electronic device, and computer-readable medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953918

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE