WO2021129466A1 - Method for detecting watermark, device, terminal and storage medium - Google Patents


Publication number
WO2021129466A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
watermark
target picture
feature
area
Application number
PCT/CN2020/136587
Other languages
English (en)
Chinese (zh)
Inventor
孙莹莹
Original Assignee
Oppo广东移动通信有限公司
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Publication of WO2021129466A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/0021 Image watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular, to a method, device, terminal, and storage medium for detecting watermarks.
  • A digital watermark is protection information embedded in a carrier file by a computer algorithm. Some pictures or videos embed a watermark to indicate the attribution or source of the file.
  • Third-party social platforms can receive and publish pictures or videos uploaded by users. These platforms need to ensure that the uploaded pictures or videos do not contain watermarks, to avoid infringement problems.
  • In some technologies, a third-party social platform determines watermarks by matching against image templates. When a watermark identical to a preset template is detected in a picture, the platform filters the picture and prohibits uploading.
  • the embodiments of the present application provide a method, device, terminal, and storage medium for detecting a watermark.
  • the technical solution is as follows:
  • a method for detecting a watermark, including: extracting a first feature map of a target picture, where the first feature map is used to represent image features of the target picture; inputting the first feature map into a dilated (atrous) convolution layer to obtain a second feature map, where the receptive field of the second feature map is larger than the receptive field of the first feature map; determining, according to the second feature map, whether there is a watermark area in the target picture; and, when there is a watermark area in the target picture, determining the position of the watermark area.
  • an apparatus for detecting a watermark including:
  • a feature extraction module configured to extract a first feature map of a target picture, where the first feature map is used to represent an image feature of the target picture;
  • a feature processing module configured to input the first feature map into the dilated convolution layer to obtain a second feature map, where the receptive field of the second feature map is larger than that of the first feature map;
  • a watermark detection module configured to determine whether there is a watermark area in the target picture according to the second feature map; and
  • an area determination module configured to determine the location of the watermark area when there is a watermark area in the target picture.
  • A terminal includes a processor and a memory; the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the method for detecting watermarks provided in the embodiments of the present application.
  • A computer-readable storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method for detecting watermarks provided in the embodiments of the present application.
  • Fig. 1 is a structural block diagram of a terminal provided by an exemplary embodiment of the present application.
  • Fig. 2 is a flowchart of a method for detecting a watermark provided by an exemplary embodiment of the present application
  • FIG. 3 is a schematic diagram of a watermark detection process based on the embodiment shown in FIG. 2;
  • Fig. 4 is a schematic diagram of another watermark detection process based on the embodiment shown in Fig. 2;
  • Fig. 5 is a flowchart of a method for detecting a watermark provided by another exemplary embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of a neural network for detecting watermarks provided based on the embodiment shown in Fig. 5;
  • FIG. 7 is a schematic structural diagram of a feature extraction layer including multiple convolution blocks according to the embodiment shown in FIG. 5;
  • FIG. 8 is a schematic structural diagram of a watermark detection neural network provided based on the embodiment shown in FIG. 5;
  • Fig. 9 is a flowchart of the training process of the watermark detection algorithm model provided by the embodiment shown in Fig. 2;
  • Fig. 10 is a structural block diagram of a device for detecting a watermark provided by an exemplary embodiment of the present application.
  • watermarking can be used to enhance the protection of target pictures.
  • a watermark detection algorithm based on basic features is usually used for automatic detection.
  • The watermark detection algorithm based on basic features may include at least one of a watermark detection algorithm based on color features, texture features, or shape features. Given the current state of computer hardware, the speed of such algorithms struggles to meet the needs of actual detection work.
  • the target picture can usually be a picture collected by a user through a camera or other equipment.
  • When the camera collects a picture, the scene in the picture shows some displacement and rotational deviation. The template matching methods provided in some technologies therefore locate the watermark inaccurately, so watermark detection cannot be completed.
  • Moreover, a watermark can only be determined by template matching if it exists in the template library; a watermark absent from the templates cannot be detected in the target picture.
  • An embodiment of the present application provides a method for detecting a watermark. The method includes: extracting a first feature map of a target picture, where the first feature map is used to represent the image characteristics of the target picture; inputting the first feature map into a dilated convolution layer to obtain a second feature map, where the receptive field of the second feature map is larger than that of the first feature map; determining, according to the second feature map, whether there is a watermark area in the target picture; and, when there is a watermark area in the target picture, determining the position of the watermark area.
  • Inputting the first feature map into a dilated convolution layer to obtain a second feature map includes: inputting the first feature map into m dilated convolution layers respectively, where any two of the m dilated convolution layers have different dilation rates; processing the first feature map with each dilated convolution layer to obtain an intermediate output map; and combining the m intermediate output maps to obtain the second feature map.
  • The m dilated convolution layers are combined in a cascaded manner into an atrous spatial pyramid pooling (ASPP) structure, and the ASPP structure is used to make the m dilated convolution layers work in parallel.
  • Determining whether there is a watermark area in the target picture according to the second feature map includes: merging the second feature map with a fusion feature map to obtain a result feature map, where the fusion feature map is a feature map in the feature merging layer other than the second feature map; and determining, according to the result feature map, whether there is a watermark area in the target picture.
  • Extracting the first feature map of the target picture includes: inputting the target picture into a feature extraction layer, where the feature extraction layer includes n convolution blocks arranged in series, n being a positive integer; obtaining n feature maps, with each convolution block outputting one feature map; and using the feature map output by the i-th convolution block as the first feature map, where i is a positive integer not greater than n.
  • Determining the position of the watermark area when there is a watermark area in the target picture includes: determining boundary pixels in the target picture; and determining the coordinates of the vertices of the quadrilateral border of the watermark area according to the boundary pixels.
  • The method further includes: cropping the watermark area in the target picture; performing watermark removal on the watermark area according to the image characteristics of the target picture to obtain a processed area; and covering the watermark area with the processed area to obtain a watermark-free image corresponding to the target picture.
  • The method further includes: cropping the watermark area in the target picture to obtain a watermark area image; performing text recognition on the content of the watermark area image to obtain recognized text; and, in response to the recognized text containing a first vocabulary, determining the target picture as a screening result image, where the first vocabulary is a keyword input through the input component.
  • The method further includes: in response to the recognized text containing a second vocabulary, displaying watermark detection prompt information used to indicate that the target picture is protected by copyright; displaying a key input control used to input the copyright key corresponding to the target picture; and, in response to the copyright key being correct, removing the watermark in the target picture.
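The keyword-based screening step above reduces to a membership test on the recognized text. A minimal sketch follows; the function name and example strings are hypothetical illustrations, not taken from the patent:

```python
def filter_by_watermark_text(recognized_text: str, keyword: str) -> bool:
    """Return True when the text recognized inside the watermark area
    contains the input keyword, i.e. the picture qualifies as a
    screening result image (hypothetical helper for the claim above)."""
    return keyword in recognized_text

# A picture whose watermark text contains the entered keyword is selected.
print(filter_by_watermark_text("Zhang San Design", "Design"))  # True
print(filter_by_watermark_text("Zhang San Design", "Press"))   # False
```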
  • the beneficial effects brought about by the technical solutions provided by the embodiments of the present application may include:
  • The watermark detection method provided by the embodiments of the application extracts the first feature map of the target picture, obtains through the dilated convolution layer a second feature map whose receptive field is larger than that of the first feature map, and then determines according to the second feature map whether there is a watermark area in the target picture. When there is a watermark area, the embodiment determines its position in the target picture. Because this application balances the resolution and the receptive field of the feature map when extracting high-level semantic information, a larger receptive field is obtained for a target picture of the same resolution, which improves the speed and accuracy of locating the watermark.
  • VGG16: one of the VGGNet (Visual Geometry Group Network) family of models.
  • VGGNet is a convolutional neural network used for classification; the VGG16 variant has a network depth of 16 layers.
  • CNN (Convolutional Neural Network): a class of feedforward neural networks with a deep structure that includes convolution computations; it is one of the representative algorithms of deep learning.
  • Keras: an open-source software library that can be used for high-performance numerical computation. With its flexible architecture, the terminal can deploy computing tasks to multiple platforms (CPU, GPU, or TPU) and devices (desktop devices, server clusters, mobile devices, edge devices, etc.).
  • the watermark detection method shown in the embodiment of the present application can be applied to a terminal, which has a display screen and has a watermark detection function.
  • Terminals can be electronic equipment such as mobile phones, tablets, laptops, desktop computers, all-in-one computers, servers, workstations, TVs, set-top boxes, smart glasses, smart watches, digital cameras, MP4 players, MP5 players, learning machines, point-reading machines, electronic paper books, electronic dictionaries, or vehicle-mounted terminals.
  • FIG. 1 is a structural block diagram of a terminal provided by an exemplary embodiment of the present application.
  • the terminal includes a processor 120 and a memory 140.
  • The memory 140 stores at least one instruction, and the instruction is loaded and executed by the processor 120 to implement the watermark detection method described in each method embodiment of the present application.
  • the terminal 100 is an electronic device with a function of detecting a watermark.
  • the terminal 100 can extract the first feature map in the target picture.
  • the extraction method can be obtained by convolution.
  • The first feature map is used to represent the image characteristics of the target picture, and the terminal 100 inputs the first feature map into the dilated convolution layer to obtain a second feature map.
  • the receptive field of the second feature map is larger than the receptive field of the first feature map.
  • the terminal determines whether there is a watermark area in the target picture. When there is a watermark area in the target picture, the terminal will determine the location of the watermark area.
  • the processor 120 may include one or more processing cores.
  • The processor 120 uses various interfaces and lines to connect the parts of the terminal 100, and performs the various functions of the terminal 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 140 and by calling data stored in the memory 140.
  • The processor 120 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA).
  • The processor 120 can integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), a modem, and so on.
  • the CPU mainly processes the operating system, user interface, and application programs; the GPU is responsible for the rendering and drawing of the content that the display needs to display; the TPU is mainly used for matrix multiplication and convolution operations; the modem is used for processing wireless communications. It can be understood that the above-mentioned modem may not be integrated into the processor 120, but may be implemented by a chip alone.
  • the memory 140 may include random access memory (RAM), or read-only memory (ROM).
  • the memory 140 includes a non-transitory computer-readable storage medium.
  • the memory 140 may be used to store instructions, programs, codes, code sets or instruction sets.
  • The memory 140 may include a program storage area and a data storage area. The program storage area may store instructions for implementing the operating system, instructions for at least one function (such as a touch function, a sound playback function, or an image playback function), and instructions used to implement the following method embodiments; the data storage area may store the data involved in the following method embodiments.
  • Fig. 2 is a flowchart of a method for detecting a watermark provided by an exemplary embodiment of the present application.
  • the method for detecting the watermark can be applied to the terminal shown in FIG. 1 above.
  • the method of detecting the watermark includes:
  • Step 210 Extract a first feature map of the target picture, where the first feature map is used to represent the image characteristics of the target picture.
  • the terminal can extract the first feature map from the target picture.
  • the target picture may be a picture stored in the terminal.
  • the target picture may be a picture collected by the terminal through the image acquisition component, or a picture received by the terminal from the network.
  • the solutions provided in the embodiments of the present application can be applied to designated applications.
  • the terminal can perform watermark detection on pictures uploaded to social applications, news applications, or public media applications.
  • the terminal may detect whether there is a watermark in the picture, which is not limited in the embodiment of the present application.
  • the terminal can extract the first feature map from the target picture in a specified manner.
  • the terminal can extract a feature map through a specified neural network model, feature extraction layer or feature extraction model.
  • the embodiment of the present application may extract the first feature map based on VGG16.
  • A feature map output by a certain convolution block in VGG16 is used as the first feature map.
  • Step 220: Input the first feature map into the dilated convolution layer to obtain a second feature map.
  • the receptive field of the second feature map is larger than the receptive field of the first feature map.
  • After obtaining the first feature map containing high-level semantic information, the terminal inputs the first feature map into the dilated convolution layer to obtain a second feature map whose receptive field is larger than that of the first feature map.
  • The dilated convolution layer adopted in the embodiment of the present application eases the trade-off between the resolution of a feature map and its receptive field.
  • For a first feature map of the same size, inputting it into the dilated convolution layer yields a larger receptive field than processing it without dilation.
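The receptive-field gain can be made concrete: a k×k convolution with dilation rate r spans k + (k − 1)(r − 1) input positions per axis, so dilation widens the receptive field without adding parameters. A minimal sketch (the formula is the standard one for dilated convolutions; the function name is ours):

```python
def effective_kernel_size(k: int, rate: int) -> int:
    """Effective span of a k x k convolution with dilation rate `rate`:
    the kernel taps cover k + (k - 1) * (rate - 1) input positions."""
    return k + (k - 1) * (rate - 1)

# A 3x3 kernel at rate 1 behaves like an ordinary 3x3 convolution;
# higher rates widen the receptive field with the same 9 weights.
print(effective_kernel_size(3, 1))   # 3
print(effective_kernel_size(3, 6))   # 13
print(effective_kernel_size(3, 12))  # 25
print(effective_kernel_size(3, 18))  # 37
```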
  • Step 230 Determine whether there is a watermark area in the target picture according to the second feature map.
  • the terminal can determine whether there is a watermark area in the target picture according to the second feature map.
  • the second feature map includes boundary pixels, and the terminal can determine whether there is a watermark area in the target picture according to the boundary pixels.
  • Step 240 When there is a watermark area in the target picture, determine the location of the watermark area.
  • the terminal determines the location of the watermark area through frame regression prediction.
  • the terminal can use the boundary pixels in the target picture to predict the coordinates of two adjacent vertices.
  • The terminal can separately determine the coordinates of the two head vertices and the two tail vertices of the watermark frame, thereby determining the coordinates of the four vertices of the quadrilateral frame where the watermark area is located.
  • the area enclosed by the quadrilateral frame is the watermark area.
  • FIG. 3 is a schematic diagram of a watermark detection process based on the embodiment shown in FIG. 2.
  • the target picture 310 contains the words "Zhang San Design" in the watermark 311, and the terminal may input the target picture 310 into the feature extraction model 320 to obtain the first feature map 330.
  • The terminal can input the first feature map 330 into the dilated convolution layer 340 to obtain the second feature map 350. By analyzing the second feature map 350, the terminal obtains the detected target image 360, which includes the watermark area 361.
  • the watermark area 361 is implemented as a rectangular frame.
  • Fig. 4 is a schematic diagram of another watermark detection process based on the embodiment shown in Fig. 2.
  • The target picture 410 contains the watermark 411 "Zhang San Design", and the picture is input into a feature extractor 420.
  • CNN can be used as a feature extractor to extract the feature map of the target picture.
  • The feature maps extracted by the feature extractor 420 go through two processes.
  • The first process performs pixel-level semantic segmentation (pixel classification), whose function is to classify the watermark and the background into two classes; the resulting image 430 includes a black background and a watermark area 431.
  • The second process performs a bounding-box regression operation to obtain a result image 440, on which the area of the content 441 to be processed is marked.
  • Non-maximum suppression (NMS) is performed on the result image 430 and the result image 440 to obtain a detected target image 450, on which a target frame 451 is marked.
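As a rough sketch of the NMS step (a standard greedy formulation, not code from the patent), overlapping candidate boxes are pruned by intersection-over-union, keeping the highest-scoring box in each overlapping group:

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.5):
    """Greedy non-maximum suppression over axis-aligned boxes (x1, y1, x2, y2).
    Returns indices of the kept boxes, highest score first."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection of the top-scoring box with every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]  # drop boxes overlapping too much
    return keep

# Two heavily overlapping candidates and one separate box: NMS keeps the
# higher-scoring candidate of the overlapping pair plus the separate box.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]
```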
  • The method for detecting watermarks provided in this embodiment extracts the first feature map of the target picture, obtains through the dilated convolution layer a second feature map whose receptive field is larger than that of the first feature map, and then determines according to the second feature map whether there is a watermark area in the target picture.
  • When there is a watermark area, the embodiment of the present application determines the watermark area in the target picture. Because this application balances the resolution and the receptive field of the feature map when extracting high-level semantic information, a larger receptive field is obtained for a target picture of the same resolution, which improves the speed and accuracy of locating the watermark.
  • The terminal can also detect the watermark through a neural network, where the dilated convolution layer can be set at different levels in the neural network; please refer to the following embodiment.
  • Fig. 5 is a flowchart of a method for detecting a watermark provided by another exemplary embodiment of the present application.
  • the method for detecting the watermark can be applied to the terminal shown in FIG. 1 above.
  • the method for detecting a watermark includes:
  • Step 511 Input the target picture into the feature extraction layer.
  • the feature extraction layer includes n convolution blocks, where n is a positive integer, and n convolution blocks are arranged in series to form the feature extraction layer.
  • the embodiment of the present application can complete the watermark detection task through the designated neural network.
  • FIG. 6 is a schematic structural diagram of a neural network for detecting watermarks provided based on the embodiment shown in FIG. 5.
  • the watermark detection neural network 600 may include a feature extraction layer 610, a feature merging layer 620, and an output layer 630.
  • the feature extraction layer 610 may be a single layer or multiple layers.
  • the feature merging layer 620 may be a single layer or multiple layers.
  • the feature extraction layer may include n convolution blocks, n convolution blocks are arranged in series, and the feature extraction work of the target picture is completed in turn.
  • FIG. 7 is a schematic structural diagram of a feature extraction layer provided based on the embodiment shown in FIG. 5 when it includes multiple convolution blocks.
  • the feature extraction layer 610 includes four convolution blocks, and the four convolution blocks are a first convolution block 611, a second convolution block 612, a third convolution block 613, and a fourth convolution block 614.
  • each convolutional block includes a pooling layer and several convolutional layers.
  • The first convolution block 611 includes 1 convolution layer and 1 pooling layer.
  • The second convolution block includes 2 convolution layers and 1 pooling layer.
  • The third convolution block includes 2 convolution layers and 1 pooling layer.
  • The fourth convolution block includes 3 convolution layers and 1 pooling layer.
  • The n convolution blocks can be 1, 2, 3, 5, or another positive integer number of convolution blocks; the embodiments of this application do not limit this.
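At shape level, blocks arranged in series each shrink the feature map before handing it to the next block. A sketch under the assumption that every block ends in a stride-2 pooling layer (a common choice in VGG-style networks; the patent does not fix the strides):

```python
def block_output_sizes(h: int, w: int, n_blocks: int):
    """Spatial size of each block's output feature map, assuming every
    block ends in a stride-2 pooling layer (hypothetical configuration)."""
    sizes = []
    for _ in range(n_blocks):
        h, w = h // 2, w // 2  # pooling halves each spatial dimension
        sizes.append((h, w))
    return sizes

# With n = 4 blocks, a 512x512 picture yields feature maps f1..f4
# at 256, 128, 64, and 32 pixels per side.
print(block_output_sizes(512, 512, 4))
```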
  • Step 512 Input the target picture to the feature extraction layer to obtain n feature maps, and each convolution block outputs a feature map.
  • each convolution block in the feature extraction layer can output a complete feature map.
  • When n is equal to 4, each of the four convolution blocks outputs one feature map for one target picture.
  • Step 513 Use the feature map output by the i-th convolution block as the first feature map, where i is a positive integer not greater than n.
  • the i-th convolution block is one of the n convolution blocks.
  • the terminal can use the feature map output by the i-th convolution block as the first feature map.
  • This convolution block is responsible for outputting the first feature map.
  • Step 521: Input the first feature map into m dilated convolution layers respectively, where any two of the m dilated convolution layers have different dilation rates.
  • FIG. 8 is a schematic structural diagram of a watermark detection neural network provided based on the embodiment shown in FIG. 5.
  • The watermark detection neural network is used to process the target picture 6A.
  • the watermark detection neural network includes a feature extraction layer, a feature merging layer, and an output layer 630.
  • the feature extraction layer includes a first convolution block 611, a second convolution block 612, a third convolution block 613, and a fourth convolution block 614.
  • The feature merging layer includes an atrous spatial pyramid pooling (ASPP) structure 621, an image pooling group 622, a first merging convolution layer 623, a first merging layer 624, a second merging layer 625, and a second merging convolution layer 626.
  • the output layer 630 includes three sets of data.
  • The first set of data is a 1-channel score, used to indicate whether a pixel is in the watermark area; the higher the score, the higher the probability that the pixel is in the watermark area. This set of data is represented as (1*1, 1) in the figure.
  • The second set of data is a 2-channel vertex code, used to determine whether a pixel is a boundary pixel and whether the pixel belongs to the head or the tail of the watermark. This set of data is represented as (1*1, 2) in the figure.
  • The third set of data is a 4-channel vertex geometry, giving the two vertex coordinates that each boundary pixel predicts; together, the boundary pixels form the shape of the text box.
  • the feature maps extracted by the feature extraction layer can be merged with the feature maps in the feature merging layer respectively.
  • the feature map f3 output by the second convolution block 612 will be processed by the connection part (the part covered by the diagonal line in the figure) in the second merging layer 625 and merged with the up-sampled feature map of the second merging layer 625.
  • the processing operation of the feature map f2 is similar to the processing operation of the feature map f3.
  • The m dilated convolution layers are combined in a cascaded manner into an atrous spatial pyramid pooling (ASPP) structure, and the ASPP structure is used to make the m dilated convolution layers work in parallel.
  • The first feature map f1 is input into the ASPP structure 621.
  • The ASPP structure 621 includes 4 dilated convolution layers combined in a cascaded manner.
  • From left to right, these are the first dilated convolution layer 621a, the second dilated convolution layer 621b, the third dilated convolution layer 621c, and the fourth dilated convolution layer 621d; from 621a to 621d, the dilation rate of each layer gradually increases.
  • The dilation rate of the first dilated convolution layer 621a is 1, that of the second dilated convolution layer 621b is 6, that of the third dilated convolution layer 621c is 12, and that of the fourth dilated convolution layer 621d is 18.
  • The output of each dilated convolution layer is merged with its input and with the outputs of all preceding dilated convolution layers, so the final merged output obtains more receptive fields at larger scales.
  • The ASPP structure 621 in the figure can generate denser and larger feature pyramids.
  • The ASPP structure 621 of the present application includes 4 parallel operations.
  • The first dilated convolution layer 621a uses a 1*1 convolution, while the second dilated convolution layer 621b, the third dilated convolution layer 621c, and the fourth dilated convolution layer 621d use 3*3 convolutions.
  • The output stride of the feature map is 16.
  • Step 522: Each dilated convolution layer processes the first feature map to obtain an intermediate output map.
  • The ASPP structure 621 outputs 4 intermediate output maps in parallel.
  • Step 523 Combine the m intermediate output maps to obtain a second feature map.
  • The m intermediate output maps obtained in the foregoing steps are combined to obtain the second feature map.
  • the feature map merged by the first merged convolutional layer 623 is the second feature map.
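At shape level, the merge in steps 522 and 523 amounts to running parallel branches over the same input and concatenating their outputs along the channel axis. The sketch below uses identity stand-ins for the four branches; real branches would be the 1*1 convolution and the dilated 3*3 convolutions described above:

```python
import numpy as np

def aspp_merge(feature_map: np.ndarray, branch_fns):
    """Apply each branch to the same input feature map and concatenate
    the intermediate outputs along the channel (last) axis."""
    outputs = [fn(feature_map) for fn in branch_fns]
    return np.concatenate(outputs, axis=-1)

# Four stand-in branches (identity maps) over a 32x32x64 first feature map:
# the merged second feature map keeps the spatial size and stacks channels.
f1 = np.zeros((32, 32, 64))
merged = aspp_merge(f1, [lambda t: t] * 4)
print(merged.shape)  # (32, 32, 256)
```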
  • Step 531 Combine the second feature map and the fusion feature map to obtain a result feature map.
  • the fusion feature map is a feature map other than the second feature map in the feature merging layer.
  • the embodiment of the present application can merge the second feature map and the fusion feature map to obtain the resulting feature map.
  • the second feature map may be merged with the feature map f2 in the first merging layer 624, and merged with the feature map f3 in the second merging layer 625.
  • the fusion feature map includes a feature map f2 and a feature map f3.
  • The result feature map is a feature map in which the specified pixels of the target picture have been labeled and processed by the watermark detection network.
  • Step 532 Determine whether there is a watermark area in the target picture according to the result feature map.
  • the terminal determines whether a watermark area exists in the target picture according to the result feature map: when the number of pixels marked as watermark in the target picture exceeds a specified number, the terminal confirms that a watermark area exists in the target picture; when it does not exceed the specified number, the terminal confirms that no watermark area exists.
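The decision rule in step 532 reduces to counting marked pixels against a threshold. A minimal sketch, assuming the result feature map has already been binarized and assuming an illustrative threshold value of 50 (the patent only says "a specified number"):

```python
import numpy as np

def has_watermark(mask, min_pixels=50):
    """mask: result feature map thresholded to 0/1, where 1 means the pixel
    is marked as watermark. Returns True only when the number of marked
    pixels exceeds the specified number (min_pixels is an assumed value)."""
    return int(np.count_nonzero(mask)) > min_pixels

mask = np.zeros((64, 64), dtype=np.uint8)
mask[10:20, 10:20] = 1                    # 100 marked pixels
print(has_watermark(mask))                # True
print(has_watermark(np.zeros((64, 64))))  # False
```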
  • Step 541 Determine the boundary pixels in the target picture.
  • the target picture detected by the watermark detection neural network has marked boundary pixels; these boundary pixels indicate the boundary of the watermark area.
  • Step 542 Determine the coordinates of the vertices of the quadrilateral border of the watermark area according to the boundary pixels.
  • the terminal determines the coordinates of the vertices of the quadrilateral border of the watermark area from the boundary pixels.
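Steps 541 and 542 can be sketched as follows. The patent does not fix the exact fitting procedure, so this sketch uses a simple axis-aligned approximation: the four vertices are taken as the corners of the bounding box of the marked boundary pixels:

```python
import numpy as np

def quad_vertices(boundary_mask):
    """Estimate the four vertices of the quadrilateral border of the watermark
    area from the marked boundary pixels (axis-aligned simplification)."""
    ys, xs = np.nonzero(boundary_mask)
    top, bottom = int(ys.min()), int(ys.max())
    left, right = int(xs.min()), int(xs.max())
    return [(left, top), (right, top), (right, bottom), (left, bottom)]

m = np.zeros((32, 32), dtype=np.uint8)
m[5, 8:20] = m[15, 8:20] = 1             # top and bottom edges of the border
m[5:16, 8] = m[5:16, 19] = 1             # left and right edges
print(quad_vertices(m))                  # [(8, 5), (19, 5), (19, 15), (8, 15)]
```

For a rotated watermark, a rotated-rectangle or quadrilateral fit over the same boundary pixels would replace the bounding-box step.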
  • once the terminal has determined the coordinates of the vertices of the quadrilateral border of the watermark area, it can clearly identify the watermark area where the watermark is located. On this basis, this solution can also be used to remove watermarks. It should be noted that some watermarks only indicate the source of the target picture, while others indicate the copyright ownership of the target picture. Therefore, in this scenario, after the terminal recognizes the watermark area in the target picture, it needs to perform corresponding processing according to the actual content of the watermark in the watermark area.
  • the terminal can obtain a watermark vocabulary in advance; the object indicated by a second vocabulary contained in the watermark vocabulary is the copyright owner. Therefore, the watermark of a target picture that contains the second vocabulary cannot be removed.
  • the terminal generates a second vocabulary database in advance through a copyright database or a copyright list provided by a standard copyright organization, and stores the second vocabulary database locally.
  • the second vocabulary can also be stored in the cloud; when the terminal needs it, the second vocabulary is temporarily read from the cloud and deleted after the solution shown in this application has been executed.
  • the terminal will intercept the watermark area from the target picture to obtain the watermark area image, and perform text recognition on the content in the watermark area image to obtain the recognized text.
  • the terminal compares the recognized text against the second vocabulary; in response to the recognized text containing the second vocabulary, it determines that the target picture is protected by copyright.
  • the terminal will display the watermark detection prompt information, which can be text information.
  • the watermark detection prompt message may be "This picture is protected by copyright, please contact the copyright owner to obtain the copyright key".
  • the terminal will display a key input control, which can be a control such as a text input box. This control is used to receive the copyright key for the target picture entered by the user.
  • the copyright key may be a key obtained by the user contacting the copyright owner.
  • the terminal will remove the watermark in the target picture.
  • the terminal can determine whether the picture is the picture it needs to find according to the content of the watermark.
  • the specific execution steps may include: intercepting the watermark area in the target picture to obtain a watermark area image; performing text recognition on the content in the watermark area image to obtain recognized text; and, in response to the recognized text containing a first vocabulary, determining the target picture as the screening result image, where the first vocabulary is a keyword input through the input component.
  • the terminal can ask the user to input keywords.
  • the embodiment of this application intercepts the watermark area from the target picture.
  • the terminal determines the target picture as the screening result image. For example, among a total of 10,000 watermarked images, the terminal searches for pictures containing the "Designer A" watermark, and the 200 matching pictures are determined as the screening result pictures. That is, the solutions provided in the embodiments of the present application can filter pictures according to their watermark content.
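The screening flow above can be sketched as a simple keyword match over OCR results. The picture ids and texts here are hypothetical; in practice `recognized_text` would come from running text recognition on each intercepted watermark area image:

```python
def filter_by_watermark(pictures, keyword):
    """pictures: list of (picture_id, recognized_text) pairs, where
    recognized_text is the OCR result from the watermark area image.
    Returns the ids whose watermark text contains the user's keyword
    (the first vocabulary)."""
    return [pid for pid, text in pictures if keyword in text]

pics = [("img_001", "Designer A"),
        ("img_002", "Studio B"),
        ("img_003", "Designer A 2020")]
print(filter_by_watermark(pics, "Designer A"))  # ['img_001', 'img_003']
```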
  • Step 551 intercept the watermark area in the target picture.
  • the terminal can intercept the watermark area in the target picture.
  • the terminal can determine the quadrilateral frame of the watermark area through the foregoing steps, and intercept the image in the quadrilateral frame.
  • Step 552 Perform watermark removal on the watermark area according to the image characteristics of the target picture to obtain the processed area.
  • the terminal can call the trained watermark removal model, and only input the watermark area into the watermark removal model to obtain the processed area.
  • the terminal may also replace the watermark area with a preset pattern, and the preset pattern may be a pattern intercepted from other parts of the target picture.
  • Step 553 Cover the processed area on the watermark area to obtain a watermarkless image corresponding to the target picture.
  • the embodiment of the present application can cover the processed area on the watermark area, and merge the processed area with the target picture to obtain a watermarkless image corresponding to the target picture.
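Steps 551 through 553 can be sketched as follows. For simplicity this assumes the quadrilateral frame is an axis-aligned rectangle and that the de-watermarked patch has already been produced (by the watermark removal model or a preset pattern):

```python
import numpy as np

def cover_region(target, processed, box):
    """Cover the processed (de-watermarked) area back over the watermark
    area of the target picture. box = (top, left) corner of the frame."""
    top, left = box
    h, w = processed.shape[:2]
    result = target.copy()
    result[top:top + h, left:left + w] = processed
    return result

target = np.zeros((8, 8), dtype=np.uint8)          # stand-in target picture
patch = np.full((3, 3), 255, dtype=np.uint8)       # stand-in processed area
out = cover_region(target, patch, (2, 2))
print(out[2:5, 2:5].min())                         # 255 — patch covers the area
print(out[0, 0])                                   # 0   — the rest is unchanged
```

Only the intercepted region is handed to the removal model, which is what reduces the area the model has to process.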
  • the objective function of the neural network involved in the embodiment of the present application may be composed of two parts, namely the classification map loss and the geometric shape loss, and is calculated as L = L_s + λ_g · L_g
  • L_s is the classification map loss
  • λ_g is the weight balancing the two losses, which can be set to 1 here.
  • L_g is the geometric shape loss.
  • the sum of L_s and λ_g · L_g is selected as the loss function.
  • Keras may be used to define a convolutional neural network model in this embodiment of the present application.
  • the hole convolutional layers can be placed at different levels of the neural network, giving the network structure different variants, so that even with more pooling layers the network can still obtain a larger receptive field, thereby improving the ability of the neural network to encode information at more scales when recognizing the watermark, and improving its ability to recognize the watermark.
  • after identifying the area where the watermark is located, the watermark detection method provided in this embodiment can also remove the watermark from the image in that area, reducing the image area that the watermark removal model has to process and improving the efficiency of watermark removal.
  • an embodiment of the present application also provides a method for training a watermark detection algorithm model. Please refer to the following embodiment.
  • FIG. 9 is a flowchart of the training process of the watermark detection algorithm model provided by the embodiment shown in FIG. 2.
  • the terminal can train the neural network before the neural network can be used.
  • the process shown in FIG. 9 is executed before the process shown in FIG. 2 or FIG. 5.
  • the training process of the neural network includes:
  • Step 910 construct a watermark data set.
  • the source of the watermark can come from an individual, an organization, or a company.
  • the style of the watermark can include Chinese, English, logo (ie logo) and so on.
  • the terminal can make an image with a watermark.
  • this embodiment may use the images of the PASCAL VOC 2012 data set as the original watermark-free images, then use image processing tools to attach the q types of collected watermarks to the original images with random sizes, positions and transparency, while recording the position of each watermark, to obtain the watermark data set.
  • q is the number of types of collected watermarks
  • q is a positive integer.
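The data-set construction in step 910 can be sketched with plain NumPy alpha blending. Random scaling is omitted for brevity, and the image sizes, alpha range, and seed are illustrative assumptions, not values from the patent:

```python
import random
import numpy as np

def paste_watermark(image, watermark, rng):
    """Attach a watermark to an original watermark-free image at a random
    position with random transparency, and record its position label."""
    ih, iw = image.shape[:2]
    wh, ww = watermark.shape[:2]
    top = rng.randrange(ih - wh)
    left = rng.randrange(iw - ww)
    alpha = rng.uniform(0.3, 0.8)        # random transparency (assumed range)
    out = image.astype(float)
    region = out[top:top + wh, left:left + ww]
    out[top:top + wh, left:left + ww] = (1 - alpha) * region + alpha * watermark
    location = (top, left, top + wh, left + ww)   # recorded position label
    return out.astype(np.uint8), location

rng = random.Random(0)
img = np.full((64, 64), 128, dtype=np.uint8)      # stand-in VOC image
wm = np.full((8, 16), 255, dtype=np.uint8)        # stand-in watermark
sample, loc = paste_watermark(img, wm, rng)
print(sample.shape, loc)
```

Repeating this over all q watermark types and all original images, while keeping the `location` labels, yields the supervised watermark data set.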
  • Step 920 Perform format adjustment, de-duplication and renaming of the training samples in the watermark data set.
  • Step 930 Perform data enhancement on the training samples.
  • the terminal can perform processing through two aspects: scale conversion and noise addition.
  • the terminal can unify the scale of the training sample into a standard scale.
  • the terminal may perform scale conversion in a manner of scaling.
  • the terminal can add random noise to the training samples.
  • Step 940 Perform normalization processing on the training samples.
  • the terminal can normalize the pixel values of the training samples from the interval [0,255] to [0,1], removing redundant information contained in the training sample data and shortening the training time.
  • Step 950 Divide the watermark data set into a training set and a test set.
  • the terminal may divide 80% of the watermark data set into the training set, and divide 20% of the watermark data set into the test set.
  • the training samples in the test set are completely different from those in the training set, which better simulates the practical application scenario in which the model must recognize watermarks it has never processed.
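The 80/20 split of step 950 can be sketched as follows; the shuffle seed is an assumption for reproducibility, and the disjointness of the two sets is exactly the property the paragraph above relies on:

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Divide the watermark data set: 80% training set, 20% test set,
    with no overlap between the two."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train, test = split_dataset(list(range(100)))
print(len(train), len(test))             # 80 20
print(set(train) & set(test))            # set() — completely disjoint
```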
  • Step 960 Use error back propagation to train the watermark detection algorithm model according to the objective function.
  • during training, the convolutional neural network solves an optimization problem: the objective function is continuously optimized through error back propagation, and the weights in the network are updated iteratively to complete the entire training process.
  • the training set is input into the watermark detection algorithm model and iterated for a preset number of epochs.
  • in the embodiment of the present application, the number of epochs is set to 90.
  • the Adam gradient descent algorithm is used to optimize the objective function.
  • images are uniformly sampled to 512*512 from the image set, the number of pictures sent in each batch is set to 24, and the learning rate is set to be attenuated by stages, which is conducive to rapid convergence of the model.
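The staged learning-rate attenuation mentioned above can be sketched as a step schedule. The base rate, stage boundaries, and decay factor here are assumed values for illustration; the patent only states that the rate is attenuated by stages over the 90 epochs:

```python
def staged_learning_rate(epoch, base_lr=1e-3, boundaries=(30, 60), decay=0.1):
    """Step-wise (staged) learning-rate decay: the rate drops by `decay`
    each time training passes a stage boundary."""
    lr = base_lr
    for b in boundaries:
        if epoch >= b:
            lr *= decay
    return lr

# Rate over the 90-epoch run: 1e-3, then 1e-4 after epoch 30, 1e-5 after 60
print([staged_learning_rate(e) for e in (0, 30, 60, 89)])
```

Such a schedule keeps the Adam updates large early on and small near convergence, which is what helps the model converge quickly and settle.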
  • Step 970 Send the test set to the trained watermark detection algorithm model to verify the accuracy of the model.
  • Step 980 In response to the watermark detection algorithm model with an accuracy higher than the target threshold, the model is determined as a convolutional neural network model for watermark detection.
  • with the training method of the watermark detection algorithm model, a relatively rich and complete watermark data set can be obtained through watermark collection, format adjustment, de-duplication, renaming, and data enhancement, and is then normalized.
  • the trained watermark detection algorithm model can thus adapt to watermarks it has never encountered in actual watermark detection, which improves the training efficiency of the watermark detection algorithm model and the robustness of the trained model.
  • FIG. 10 is a structural block diagram of a watermark detection device provided by an exemplary embodiment of the present application.
  • the device for detecting the watermark can be implemented as all or a part of the terminal through software, hardware or a combination of the two.
  • the device includes:
  • the feature extraction module 1010 is configured to extract a first feature map of a target picture, where the first feature map is used to represent the image feature of the target picture;
  • the feature processing module 1020 is configured to input the first feature map into the hole convolutional layer to obtain a second feature map, and the receptive field of the second feature map is larger than the receptive field of the first feature map;
  • the watermark detection module 1030 is configured to determine whether there is a watermark area in the target picture according to the second feature map;
  • the area determining module 1040 is configured to determine the location of the watermark area when there is a watermark area in the target picture.
  • the feature processing module 1020 is configured to input the first feature map into m hole convolutional layers, where the expansion rates of the m hole convolutional layers are different; each hole convolutional layer processes the first feature map to obtain an intermediate output map; and the m intermediate output maps are combined to obtain the second feature map.
  • the m hole convolutional layers involved in this device are combined in a cascaded manner into a hollow space pyramid pooling structure, and the hollow space pyramid pooling structure is used to make the m hole convolutional layers work in parallel.
  • the watermark detection module 1030 is configured to merge the second feature map and the fusion feature map to obtain the result feature map, where the fusion feature map is a feature map in the feature merging layer other than the second feature map; and to determine, according to the result feature map, whether there is a watermark area in the target picture.
  • the feature extraction module 1010 is configured to input the target picture into a feature extraction layer, where the feature extraction layer includes n convolution blocks, n is a positive integer, and the n convolution blocks are arranged in series to form the feature extraction layer; the target picture is input to the feature extraction layer to obtain n feature maps, with each convolution block outputting one feature map; the feature map output by the i-th convolution block is used as the first feature map, where i is a positive integer not greater than n.
  • the area determining module 1040 is configured to determine boundary pixels in the target picture; and determine the coordinates of the vertices of the quadrilateral border of the watermark area according to the boundary pixels.
  • the device further includes a first interception module, a first removal module, and an image fusion module.
  • the first interception module is configured to intercept the watermark area in the target picture.
  • the first removal module is configured to perform watermark removal on the watermark area according to the image characteristics of the target picture to obtain a processed area.
  • the image fusion module is configured to cover the processed area on the watermark area to obtain a watermarkless image corresponding to the target picture.
  • the device further includes: a second interception module, a text recognition module, and an image determination module;
  • the second interception module is configured to intercept the watermark area in the target picture after the position of the watermark area is determined, to obtain a watermark area image;
  • the text recognition module is configured to perform text recognition on the content in the watermark area image to obtain recognized text;
  • the image determination module is configured to, in response to the recognized text containing a first vocabulary, determine the target picture as a screening result image, where the first vocabulary is a keyword input through an input component.
  • the device further includes an information prompt module, a control display module, and a second removal module.
  • the information prompt module is configured to display watermark detection prompt information in response to the recognized text containing the second vocabulary, and the watermark detection prompt information is used to prompt that the target picture is protected by copyright;
  • the control display module is configured to display a key input control, and the key input control is used to input the copyright key corresponding to the target picture;
  • the second removal module is configured to remove the watermark from the target picture in response to the copyright key being correct.
  • the embodiments of the present application also provide a computer-readable medium that stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the watermark detection method described in each of the above embodiments.
  • when the watermark detection device provided in the above embodiment executes the watermark detection method, only the division into the above functional modules is used as an example for illustration; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the watermark detection device provided in the above-mentioned embodiment and the watermark detection method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.


Abstract

The invention relates to a watermark detection method, device, terminal and storage medium, belonging to the technical field of image processing. The method can extract a first feature map of a target picture, obtain through a hole convolutional layer a second feature map whose receptive field is larger than that of the first feature map, then determine according to the second feature map whether a watermark area exists in the target picture, and, when a watermark area exists, determine the watermark area in the target picture. The method can balance the resolution and the receptive field of the feature map when high-level semantic information is extracted from the target picture, can obtain a large receptive field at the same resolution of the target picture, and improves the speed and accuracy of determining the position of the watermark.
PCT/CN2020/136587 2019-12-26 2020-12-15 Procédé de détection de filigrane, dispositif, terminal et support de stockage WO2021129466A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911365673.6A CN111062854B (zh) 2019-12-26 2019-12-26 检测水印的方法、装置、终端及存储介质
CN201911365673.6 2019-12-26

Publications (1)

Publication Number Publication Date
WO2021129466A1 true WO2021129466A1 (fr) 2021-07-01

Family

ID=70303838

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136587 WO2021129466A1 (fr) 2019-12-26 2020-12-15 Procédé de détection de filigrane, dispositif, terminal et support de stockage

Country Status (2)

Country Link
CN (1) CN111062854B (fr)
WO (1) WO2021129466A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114119329A (zh) * 2021-10-29 2022-03-01 暨南大学 基于双卷积模块的水印去除方法、装置、设备和存储介质
CN116309476A (zh) * 2023-03-15 2023-06-23 深圳爱析科技有限公司 一种crt拆解质量检测的方法、装置和电子设备
WO2023202570A1 (fr) * 2022-04-21 2023-10-26 维沃移动通信有限公司 Procédé de traitement d'image et appareil de traitement d'image, dispositif électronique et support de stockage lisible

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062854B (zh) * 2019-12-26 2023-08-25 Oppo广东移动通信有限公司 检测水印的方法、装置、终端及存储介质
CN112733822B (zh) * 2021-03-31 2021-07-27 上海旻浦科技有限公司 一种端到端文本检测和识别方法
CN113643173A (zh) * 2021-08-19 2021-11-12 广东艾檬电子科技有限公司 水印去除方法、装置、终端设备及可读存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170039669A1 (en) * 2004-09-17 2017-02-09 Digimarc Corporation Hierarchical watermark detector
CN109285105A (zh) * 2018-09-05 2019-01-29 北京字节跳动网络技术有限公司 水印检测方法、装置、计算机设备和存储介质
CN109325534A (zh) * 2018-09-22 2019-02-12 天津大学 一种基于双向多尺度金字塔的语义分割方法
CN110020676A (zh) * 2019-03-18 2019-07-16 华南理工大学 基于多感受野深度特征的文本检测方法、系统、设备及介质
CN110322495A (zh) * 2019-06-27 2019-10-11 电子科技大学 一种基于弱监督深度学习的场景文本分割方法
CN110428357A (zh) * 2019-08-09 2019-11-08 厦门美图之家科技有限公司 图像中水印的检测方法、装置、电子设备及存储介质
CN111062854A (zh) * 2019-12-26 2020-04-24 Oppo广东移动通信有限公司 检测水印的方法、装置、终端及存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG121783A1 (en) * 2003-07-29 2006-05-26 Sony Corp Techniques and systems for embedding and detectingwatermarks in digital data
CN108230269B (zh) * 2017-12-28 2021-02-09 智慧眼科技股份有限公司 基于深度残差网络的去网格方法、装置、设备及存储介质
CN109784181B (zh) * 2018-12-14 2024-03-22 平安科技(深圳)有限公司 图片水印识别方法、装置、设备及计算机可读存储介质

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114119329A (zh) * 2021-10-29 2022-03-01 暨南大学 基于双卷积模块的水印去除方法、装置、设备和存储介质
CN114119329B (zh) * 2021-10-29 2024-07-16 暨南大学 基于双卷积模块的水印去除方法、装置、设备和存储介质
WO2023202570A1 (fr) * 2022-04-21 2023-10-26 维沃移动通信有限公司 Procédé de traitement d'image et appareil de traitement d'image, dispositif électronique et support de stockage lisible
CN116309476A (zh) * 2023-03-15 2023-06-23 深圳爱析科技有限公司 一种crt拆解质量检测的方法、装置和电子设备
CN116309476B (zh) * 2023-03-15 2024-06-11 深圳爱析科技有限公司 一种crt拆解质量检测的方法、装置和电子设备

Also Published As

Publication number Publication date
CN111062854B (zh) 2023-08-25
CN111062854A (zh) 2020-04-24

Similar Documents

Publication Publication Date Title
WO2021129466A1 (fr) Procédé de détection de filigrane, dispositif, terminal et support de stockage
TWI773189B (zh) 基於人工智慧的物體檢測方法、裝置、設備及儲存媒體
CN111753727B (zh) 用于提取结构化信息的方法、装置、设备及可读存储介质
WO2020125495A1 (fr) Procédé, appareil et dispositif de segmentation panoramique
CN111950723B (zh) 神经网络模型训练方法、图像处理方法、装置及终端设备
CN111325271B (zh) 图像分类方法及装置
CN112861575A (zh) 一种行人结构化方法、装置、设备和存储介质
CN110751218B (zh) 图像分类方法、图像分类装置及终端设备
JP2022088304A (ja) ビデオを処理するための方法、装置、電子機器、媒体及びコンピュータプログラム
CN108961267B (zh) 图片处理方法、图片处理装置及终端设备
CN112101386B (zh) 文本检测方法、装置、计算机设备和存储介质
WO2022089170A1 (fr) Procédé et appareil d'identification de zone de sous-titres, et dispositif et support de stockage
AU2018202767A1 (en) Data structure and algorithm for tag less search and svg retrieval
CN111931809A (zh) 数据的处理方法、装置、存储介质及电子设备
CN112989085A (zh) 图像处理方法、装置、计算机设备及存储介质
CN113205047A (zh) 药名识别方法、装置、计算机设备和存储介质
JP2022185143A (ja) テキスト検出方法、テキスト認識方法及び装置
CN114926734A (zh) 基于特征聚合和注意融合的固体废弃物检测装置及方法
CN113870196A (zh) 一种基于锚点切图的图像处理方法、装置、设备和介质
CN117746015A (zh) 小目标检测模型训练方法、小目标检测方法及相关设备
CN110705653A (zh) 图像分类方法、图像分类装置及终端设备
CN114328884B (zh) 一种图文去重方法及装置
KR102444172B1 (ko) 영상 빅 데이터의 지능적 마이닝 방법과 처리 시스템
CN106469437B (zh) 图像处理方法和图像处理装置
CN114419322A (zh) 一种图像实例分割方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20907855

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20907855

Country of ref document: EP

Kind code of ref document: A1