CN112581344A - Image processing method and device, computer equipment and storage medium - Google Patents

Image processing method and device, computer equipment and storage medium Download PDF

Info

Publication number
CN112581344A
CN112581344A (application CN202011314189.3A)
Authority
CN
China
Prior art keywords
image
segmentation
text
network
watermark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011314189.3A
Other languages
Chinese (zh)
Inventor
贾梦晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd filed Critical Ping An Puhui Enterprise Management Co Ltd
Priority claimed from CN202011314189.3A
Publication of CN112581344A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/0021Image watermarking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

The embodiments of this application belong to the technical field of image processing and relate to an image processing method and apparatus for preventing certificate abuse, a computer device, and a storage medium. The application also relates to blockchain technology, in which a user's original certificate image can be stored. In the image processing method, an initial watermark region covering all text segmentation word boxes is computed, and the sub-regions corresponding to the key text word boxes are cut out of that initial region, yielding a target watermark region that intersects the characters of the certificate image without obscuring the key text word boxes. The added watermark therefore preserves the integrity of the original certificate information while preventing its abuse. In addition, because the entire image processing runs on the user terminal, the original certificate image is never transmitted, which effectively prevents information leakage and protects the user's private information.

Description

Image processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus for preventing document abuse, a computer device, and a storage medium.
Background
An identity card is the sole official proof of a citizen's identity, so identity-card photographs must be used with great caution: if an unmarked original image of an identity card is uploaded and later leaked or fraudulently used, the card holder may suffer economic or other losses.
In existing anti-theft approaches, the user uploads the certificate photograph to an image-processing app, adds text by hand, manually adjusts the text's color, transparency, and tilt, and repeats this add-and-adjust cycle several times to produce a photograph with multiple lines of watermarks.
These traditional methods are not intelligent: adding a watermark is tedious and slow, and the watermarked certificate image does not always meet the required standard; it may obscure key information, rendering the image unusable.
Disclosure of Invention
The embodiments of this application aim to provide an image processing method and apparatus, a computer device, and a storage medium for preventing certificate abuse, in order to solve the problems of traditional anti-theft methods: the watermarking operation is tedious, processing is slow, and the watermarked certificate image does not necessarily meet the specification, so it may obscure key information and become unusable.
In order to solve the above technical problem, an embodiment of the present application provides an image processing method for preventing document abuse, which adopts the following technical solutions:
receiving a watermark adding request which is input by a user through a terminal and carries an original certificate image and watermark content information;
inputting the original certificate image into an image segmentation model for image segmentation operation to obtain a text segmentation word box;
performing position calculation operation based on the original certificate image to obtain an initial watermark region intersected with all the text segmentation word boxes;
inputting the text segmentation word box into a word recognition model for word recognition operation to obtain segmentation text information;
screening the text segmentation word boxes based on the segmentation text information to obtain key text word boxes;
intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box;
adding the watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image;
and outputting the target certificate image.
In order to solve the above technical problem, an embodiment of the present application further provides an image processing apparatus for preventing document abuse, which adopts the following technical solution:
the request receiving module is used for receiving a watermark adding request which is input by a user through a user terminal and carries an original certificate image and watermark content information;
the semantic segmentation module is used for inputting the original certificate image into an image segmentation model for image segmentation to obtain a text segmentation word box;
the position calculation module is used for performing position calculation operation based on the original certificate image to obtain an initial watermark area intersected with all the text segmentation word boxes;
the character recognition module is used for inputting the text segmentation word box into a character recognition model for character recognition operation to obtain segmentation text information;
a key text word box obtaining module, configured to perform a screening operation on the text segmentation word box based on the segmentation text information to obtain a key text word box;
the intercepting operation module is used for intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box;
the target certificate acquisition module is used for adding the watermark content information to the original certificate image based on the target watermark area to obtain a target certificate image;
and the target certificate output module is used for outputting the target certificate image.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
The computer device comprises a memory and a processor; the memory stores computer-readable instructions which, when executed by the processor, implement the steps of the image processing method for preventing certificate abuse described above.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
the computer readable storage medium has stored thereon computer readable instructions which, when executed by a processor, implement the steps of the image processing method for preventing abuse of documents as described above.
Compared with the prior art, the image processing method, the image processing device, the computer equipment and the storage medium for preventing the abuse of the certificates, which are provided by the embodiment of the application, have the following beneficial effects:
the embodiment of the application provides an image processing method for preventing certificate abuse, which comprises the steps of receiving a watermark adding request which is input by a user through a terminal and carries an original certificate image and watermark content information; inputting the original certificate image into an image segmentation model for image segmentation operation to obtain a text segmentation word box; performing position calculation operation based on the original certificate image to obtain an initial watermark region intersected with all the text segmentation word boxes; inputting the text segmentation word box into a word recognition model for word recognition operation to obtain segmentation text information; screening the text segmentation word boxes based on the segmentation text information to obtain key text word boxes; intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box; adding the watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image; and outputting the target certificate image. 
By calculating an initial watermark region covering all text segmentation word frames and intercepting a region corresponding to a key text word frame in the initial watermark region, a target watermark region which does not shield the information of the key text word frame and intersects with the characters of the certificate image is obtained, so that the added watermark can ensure the integrity of the original certificate information and avoid being abused; meanwhile, the execution process of the image processing is executed at the user terminal, so that the condition that the information leakage occurs in the process of information transmission of the original certificate image is effectively avoided, and the privacy information of the user is effectively protected.
Drawings
In order to illustrate the solution of the present application more clearly, the drawings needed for describing its embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the application; those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flowchart of an implementation of an image processing method for preventing document abuse according to an embodiment of the present application;
FIG. 2 is a flowchart of an implementation of step S106 in FIG. 1;
fig. 3 is a schematic structural diagram of a U-net network architecture according to an embodiment of the present application;
FIG. 4 is a flowchart of an implementation of building a U-net network architecture according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of an implementation of obtaining an image segmentation model according to an embodiment of the present application;
FIG. 6 is a flowchart of an implementation of step S503 in FIG. 5;
FIG. 7 is a schematic structural diagram of an image processing apparatus for preventing document abuse provided in embodiment II of the present application;
FIG. 8 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
Example one
As shown in fig. 1, an implementation flowchart of an image processing method for preventing document abuse according to an embodiment of the present application is shown, and for convenience of description, only the part relevant to the present application is shown.
In step S101, a watermark adding request carrying an original certificate image and watermark content information, which is input by a user through a user terminal, is received.
In the embodiment of the present application, the user terminal refers to the terminal device that executes the image processing method for preventing certificate abuse provided by this application. It may be a mobile terminal such as a mobile phone, smartphone, notebook computer, digital broadcast receiver, PDA (personal digital assistant), PAD (tablet computer), PMP (portable multimedia player), or navigation device, or a fixed terminal such as a digital TV or desktop computer.
In the embodiment of the application, the original certificate image refers to an electronic image consistent with the certificate information to which the watermark is to be added. The image must be clear and shot with the lens parallel to the identity card, so that its size meets the processing requirements.
In the embodiment of the present application, the watermark content information is mainly used for identifying the usage purpose of the original document image, and by way of example, the format of the watermark content information may be "xx private", specifically, for example: for use by a network credit company, for use by a purchasing company, etc., it should be understood that the examples of the format of the watermark content information are only for convenience of understanding and are not intended to limit the present application.
In this embodiment, the watermark content information may further include signature information, for which a name entered in regular-script font is used to generate a handwritten watermark name. Adding a handwritten signature to the watermark information increases its credibility.
In step S102, the original document image is input to the image segmentation model for image segmentation operation, and a text segmentation word box is obtained.
In the embodiment of the present application, the image segmentation flow is as follows (taking an identity card as the original certificate image):
A batch of identity-card picture data sets, covering both Han and minority-nationality cards, is collected; the identity cards are cropped from the original images, and the corresponding masks are generated with a labeling tool. For example, the name field is labeled as class 1, gender as class 2, ethnicity as class 3, address as class 4, and so on.
Specifically, the segmentation principle includes:
1) downsampling + upsampling: Convolution + Deconvolution / Resize;
2) multi-scale feature fusion: point-wise addition of features / concatenation along the channel dimension;
3) pixel-level segmentation map: predicting the class of every pixel.
The segmentation algorithm adopts DeepLab v3 from the DeepLab family as the identity-card segmentation algorithm, combining several full-field identity-card segmentation data sets to achieve full-field segmentation. The core of the DeepLab family is dilated (atrous) convolution, which in effect inserts holes between the elements of an ordinary convolution kernel; dilated convolutions with different sampling rates effectively capture multi-scale information. The semantically segmented region is taken as the model input, the full-field mask image of the identity card is obtained from the segmentation model, and the rectangle with the largest outline is found according to the label values in the mask to obtain the corresponding key field.
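As an illustrative sketch of the last step, "find the bounding rectangle per label value in the mask" can be implemented over a plain 2-D mask like this (the field labels and the helper name are hypothetical, not from the patent; a real pipeline would operate on the model's mask array):

```python
def label_bounding_boxes(mask):
    """Return {label: (x_min, y_min, x_max, y_max)} for each nonzero
    label value in a 2-D segmentation mask (0 = background)."""
    boxes = {}
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[label]
                boxes[label] = (min(x0, x), min(y0, y),
                                max(x1, x), max(y1, y))
    return boxes

# Toy mask: label 1 could be the "name" field, label 4 the "address" field
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [4, 4, 0, 0],
]
print(label_bounding_boxes(mask))  # {1: (1, 1, 2, 1), 4: (0, 2, 1, 2)}
```

Each box then serves as a text segmentation word box for the later recognition and screening steps.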
In the embodiment of the present application, the text division box refers to the above-described rectangular box representing the maximum outline of the text field portion.
In step S103, a position calculation operation is performed based on the original document image, obtaining an initial watermark region that intersects all text segmentation word boxes.
In the embodiment of the application, the position calculation operation is mainly used for calculating the adding position and the inclination angle of the watermark content, so that the watermark is intersected with the characters of the text segmentation word box, and the character information of the original certificate image cannot be maliciously tampered.
In the embodiment of the present application, the inclination angle is calculated for an identity card as the original document image (length 85.6 mm, width 54 mm). The sine of the angle between the card's diagonal and its short edge is

sin θ = 85.6 / √(85.6² + 54²) ≈ 0.8458

Since arcsin 0.8458 ≈ 57.76° (computed with a calculator's inverse trigonometric sin⁻¹ key), the inclination angle is taken as 57.8°, running downward to the right at that angle from the vertical.
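The same angle can be checked directly from the ID-1 card dimensions; this small sketch (not part of the patent) reproduces the 57.8° figure:

```python
import math

# ID-1 card dimensions (mm), as given in the text
length, width = 85.6, 54.0

# Angle of the card's diagonal measured from the vertical (short) edge;
# atan2(length, width) and asin(length / diagonal) give the same value
theta = math.degrees(math.atan2(length, width))
print(round(theta, 2))  # ≈ 57.76, rounded to 57.8° in the text
```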
In step S104, the text division box is input to the character recognition model for character recognition operation, and the divided text information is obtained.
In the embodiment of the present application, the character recognition model may be implemented with OCR (Optical Character Recognition), which determines character shapes by detecting dark and light patterns and then translates the shapes into computer text using a character recognition method.
In step S105, a filtering operation is performed on the text division word box based on the division text information, so as to obtain a key text word box.
In the embodiment of the application, the key text word box corresponding to key information may be obtained by running character recognition on each text box and comparing whether the recognized content matches a key-information type: if it does, the text box is confirmed as a key text word box; otherwise it is not. Key information refers to the most important content in the original certificate image, for example the name and the identification number.
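A minimal sketch of this screening step, assuming simple pattern rules for the key-information types (the patterns and names below are illustrative; the patent does not fix them):

```python
import re

# Hypothetical key-information patterns: a "name" field prefix and
# the 18-character Chinese ID number (17 digits plus a digit or X)
KEY_PATTERNS = {
    "name": re.compile(r"^姓名"),
    "id_number": re.compile(r"\d{17}[\dXx]$"),
}

def is_key_text(recognized_text):
    """Return True if the OCR result of a word box matches any
    key-information type (so its box must not be covered)."""
    return any(p.search(recognized_text) for p in KEY_PATTERNS.values())

boxes = ["姓名 张三", "汉", "11010519900101123X"]
print([t for t in boxes if is_key_text(t)])  # ['姓名 张三', '11010519900101123X']
```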
In step S106, an interception operation is performed on the initial watermark region based on the key text box, so as to obtain a target watermark region that does not block the key text box.
In the embodiment of the application, since the key information is the most important information content in the original certificate image, when the watermark content intersects with the key information, the key information in the original certificate image may be confused, and therefore, the coverage area of the watermark region needs to be adjusted, so that the watermark is prevented from shielding the key text information.
In this embodiment of the present application, the intercepting operation may adjust the inclination angle of the initial watermark region so that it intersects only the segmentation word boxes other than the key text word boxes; alternatively, it may cut the area coinciding with the key text word boxes out of the initial watermark region, so that the added watermark does not cover the key text content.
In step S107, the watermark content information is added to the original document image based on the target watermark region, resulting in a target document image.
In the embodiment of the present application, the system's year, month, and day can be identified automatically; for example, the text "8/31/2020" is added as the second row of the watermark, with its position starting from the left (x = 0) at two thirds of the height from the top, i.e. at (0, 36), at an angle of 57.8°, distributed from top-left to bottom-right.
In step S108, the target credential image is output.
In the embodiment of the application, an image processing method for preventing certificate abuse is provided, which receives a watermark adding request which is input by a user through a terminal and carries an original certificate image and watermark content information; inputting the original certificate image into an image segmentation model for image segmentation operation to obtain a text segmentation word box; performing position calculation operation based on the original certificate image to obtain an initial watermark region intersected with all the text segmentation word boxes; inputting the text segmentation word box into a word recognition model for word recognition operation to obtain segmentation text information; screening the text segmentation word boxes based on the segmentation text information to obtain key text word boxes; intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box; adding the watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image; and outputting the target certificate image. By calculating an initial watermark region covering all text segmentation word frames and intercepting a region corresponding to a key text word frame in the initial watermark region, a target watermark region which does not shield the information of the key text word frame and intersects with the characters of the certificate image is obtained, so that the added watermark can ensure the integrity of the original certificate information and avoid being abused; meanwhile, the execution process of the image processing is executed at the user terminal, so that the condition that the information leakage occurs in the process of information transmission of the original certificate image is effectively avoided, and the privacy information of the user is effectively protected.
In some optional implementation manners of the first embodiment of the present application, before the step S102, the following steps are further included:
and carrying out graying operation on the original certificate image.
In the embodiment of the present application, in the RGB model, if R = G = B the color is a gray, and the common value of R, G, and B is called the gray value (also called the intensity or brightness value). Each pixel of a grayscale image therefore needs only one byte to store its gray value, with a range of 0 to 255.
Specifically, the graying formula is:

Gray = [ (R^2.2 + (1.5·G)^2.2 + (0.6·B)^2.2) / (1 + 1.5^2.2 + 0.6^2.2) ]^(1/2.2)

Note the power of 2.2 and the 2.2-th root here: the RGB color values cannot simply be added directly, but must first be converted to physical light power by raising them to the power 2.2. RGB values relate to light power not linearly but by a power function; the exponent of this function is called the Gamma value, typically 2.2, and this conversion process is called Gamma correction.
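A minimal sketch of this gamma-corrected graying for a single pixel (the channel weights 1.5 and 0.6 follow the formula above and should be treated as the document's assumption):

```python
def gamma_gray(r, g, b):
    """Gamma-corrected gray value: convert channels to linear light
    power (power 2.2), take the weighted average, convert back."""
    num = r ** 2.2 + (1.5 * g) ** 2.2 + (0.6 * b) ** 2.2
    den = 1 + 1.5 ** 2.2 + 0.6 ** 2.2
    return (num / den) ** (1 / 2.2)

print(round(gamma_gray(255, 255, 255)))  # 255: pure white stays white
print(round(gamma_gray(0, 0, 0)))        # 0: pure black stays black
```

Because the weighted average is taken in linear light space and converted back, any pixel with R = G = B keeps its original value.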
In the embodiment of the application, the deduction speed of the whole identification process is higher by graying the certificate image.
Continuing to refer to fig. 2, a flowchart for implementing step S106 in fig. 1 is shown, and for convenience of illustration, only the portions relevant to the present application are shown.
In some optional implementations of the first embodiment of the present application, the step S106 specifically includes: step S201 and step S202.
In step S201, an occlusion region that coincides with the position information of the key text box is acquired in the initial watermark region.
In the embodiment of the application, a two-dimensional plane coordinate system is established on the original certificate image. Since both the initial watermark region and the key text box are derived from that image, both can be marked in this coordinate system by their planar coordinates, and the part where their coordinate positions overlap is taken as the occlusion region.
In step S202, the occlusion region is intercepted from the initial watermark region to obtain the target watermark region.
In the embodiment of the application, the blocking area in the initial watermark area is intercepted, so that the intercepted initial watermark area does not block the text information of the key text word box after being added to the original certificate image.
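Treating both regions as axis-aligned rectangles (a simplifying assumption; the patent's watermark band is tilted), the occlusion region of steps S201 and S202 can be sketched as a rectangle intersection:

```python
def intersect(a, b):
    """Intersection of two (x0, y0, x1, y1) rectangles, or None."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None

watermark = (0, 30, 100, 50)   # initial watermark band (illustrative)
key_box   = (10, 35, 40, 45)   # key text word box (illustrative)
occlusion = intersect(watermark, key_box)
print(occlusion)  # (10, 35, 40, 45): the part to cut out of the band
```

Cutting `occlusion` out of the watermark band yields the target watermark region that leaves the key text unobstructed.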
In the embodiment of the application, because the segmentation target occupies a large proportion of the image, the segmentation task is relatively simple; adopting the U-net network architecture for the image segmentation model therefore makes the overall segmentation more efficient.
Continuing to refer to fig. 4, a flowchart of an implementation of building a U-net network architecture provided in an embodiment of the present application is shown; for convenience of explanation, only the parts relevant to the present application are shown.
In some optional implementation manners of the first embodiment of the present application, the image segmentation model uses U-net as a network architecture, and before the step S102, the method further includes: step S301, step S302, step S303, step S304, step S305, and step S306.
In the embodiment of the present application, the U-net network architecture is as shown in fig. 3. U-net is simpler than the classical FCN: the first half (the left part of the figure) performs feature extraction, and the second half (the right part) performs upsampling. U-net adopts a different feature-fusion mode, concatenation: features are spliced together along the channel dimension to form thicker features, whereas FCN fuses by point-wise addition of corresponding values, which does not form thicker features.
In the embodiment of the application, U-net is built on the FCN architecture, with the network modified and extended so that it can obtain very accurate segmentation results from few training images. An upsampling stage with many feature channels is added, allowing texture information from the original image to propagate through the high-resolution layers. U-net has no fully connected layer and uses "valid" padding for every convolution, which ensures that each segmentation result is based on complete context features, with nothing missing.
In step S301, a downsampling layer of U-Net is built.
In the present embodiment, the downsampling path consists of several (e.g., 4) convolution modules; each module consists of two 3 × 3 convolution layers, a ReLU, and one 2 × 2 max-pooling layer.
In step S302, an up-sampling layer of U-Net is built.
In the present embodiment, the upsampling layer is composed of a plurality of (e.g., 4) deconvolution modules, each of which is composed of one 2 × 2 deconvolution (Up-Convolution) layer, two 3 × 3 Convolution layers, and one ReLU.
In step S303, the downsampling layer and the upsampling layer are connected based on Skip Connection.
In the embodiment of the application, skip connections, commonly used in residual networks, effectively alleviate gradient explosion and gradient vanishing during the training of deeper networks.
In the embodiment of the application, the feature layer output by each convolution module in the downsampling layer is connected to the corresponding deconvolution module in the upsampling layer, where it is concatenated with the input of that deconvolution module from the previous layer and serves as the input feature.
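The difference between the two fusion modes described above (U-net channel-wise concatenation versus FCN point-wise addition) can be sketched as follows; the feature-map shapes are illustrative assumptions, not taken from the embodiment:

```python
import numpy as np

# Hypothetical feature maps (channels-last) at the same spatial resolution:
# one from a downsampling conv module, one upsampled from the layer below.
enc_feat = np.zeros((32, 32, 64))
dec_feat = np.ones((32, 32, 64))

# U-net style skip connection: concatenate along the channel dimension,
# producing a "thicker" feature map (64 + 64 = 128 channels).
unet_fused = np.concatenate([enc_feat, dec_feat], axis=-1)

# FCN style fusion: element-wise addition, channel count unchanged.
fcn_fused = enc_feat + dec_feat

print(unet_fused.shape)  # (32, 32, 128)
print(fcn_fused.shape)   # (32, 32, 64)
```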
In step S304, a Dropout layer is built in the downsampling layer.
In the embodiment of the application, in order to avoid overfitting during network training, a Dropout layer is added in the downsampling layer.
In step S305, a network output module of U-Net is built.
In the embodiment of the application, the output of the upsampling layer is passed through a 2 × 2 and a 1 × 1 convolution layer to obtain the final output of U-Net. U-Net can therefore perform end-to-end, pixel-level segmentation: the input is an image, and the output is an image of the same size.
In step S306, the network parameters of U-Net are set.
In the embodiment of the application, the network parameter settings of U-Net include the number of convolution and deconvolution modules, the optimizer, the loss function, the activation functions, Dropout, and the like. In this embodiment, the size of the ISAR picture in the data set is 128 × 128, so the downsampling layer of the U-Net network is set to 5 convolution modules, with feature dimensions after convolution sequentially set to 16-32-64-128-256; the corresponding upsampling layer is composed of 5 deconvolution modules with output feature dimensions sequentially set to 128-64-32-16, and the two halves form a symmetric structure. Because the output of the output layer is an image, its activation function is a Sigmoid function, while the activation functions of all other layers are ReLU functions. The optimizer is the Adam optimizer, which combines the advantages of the AdaGrad and RMSProp optimization algorithms, requiring less memory and computing more efficiently. To prevent overfitting during training, the Dropout layer is set to a 50% discard rate, i.e., the Dropout layer randomly disconnects 50% of the input neurons each time the parameters are updated. Finally, a binary cross-entropy function is selected as the loss function of the network.
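A minimal sketch of how the feature-map sizes propagate through the encoder just described (128 × 128 input, five convolution modules with dimensions 16-32-64-128-256). The assumption that padded convolutions preserve the spatial size, with pooling between modules halving it, is made here for illustration and is not stated in the source:

```python
def unet_shapes(input_size=128, encoder_dims=(16, 32, 64, 128, 256)):
    """Trace (height, width, channels) through the encoder.

    Assumes each conv module keeps the spatial size and each 2x2 max pool
    between modules halves it; no pooling after the bottleneck module.
    """
    shapes = []
    size = input_size
    for i, ch in enumerate(encoder_dims):
        shapes.append((size, size, ch))
        if i < len(encoder_dims) - 1:
            size //= 2
    return shapes

encoder = unet_shapes()
# Decoder dimensions 128-64-32-16 mirror the encoder, as stated above.
decoder_dims = tuple(reversed((16, 32, 64, 128)))
print(encoder[-1])   # bottleneck: (8, 8, 256)
print(decoder_dims)  # (128, 64, 32, 16)
```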
Continuing to refer to fig. 5, a flowchart for implementing the image segmentation model provided in the embodiment of the present application is shown, and for convenience of description, only the portion related to the present application is shown.
In some optional implementation manners of the first embodiment of the present application, before the step S102, the method further includes: step S501, step S502, and step S503.
In step S501, a training data set including a plurality of input images, a target object corresponding to each input image, and a rectangular region corresponding to the target object in each input image is acquired.
In the embodiment of the present application, with respect to the image segmentation model in the foregoing embodiment, the embodiment further includes a training method for the image segmentation model. It is worth noting that the image segmentation model may be trained in advance on the acquired training data set and then used directly each time image segmentation is needed, without retraining the model every time a target object is to be segmented.
In an embodiment of the present application, the training data set may include a plurality of input images, a target object in each input image, and a rectangular region of the target object in each input image. The input image may be an image including a target object, for example, a name, a gender, a ethnicity, and the like, which is not limited herein.
In the embodiment of the present application, the number of input images may not be limited. As an optional implementation manner, the number of the input images may be multiple, each input image is labeled with a corresponding target object, and a rectangular region of the target object corresponding to each input image, and the initial model may be trained according to each input image, the target object labeled to each input image, and the rectangular region, respectively, so as to improve the accuracy of the image segmentation model obtained after training.
In step S502, an image segmentation network is obtained, where the image segmentation network includes a first sub-network and a second sub-network, the first sub-network is used for outputting a target object in an image, and the second sub-network is used for outputting a rectangular region corresponding to the target object in the image.
In the embodiment of the present application, when training to obtain an image segmentation model, an image segmentation network may be constructed, where the image segmentation network may include a first sub-network for outputting a target object in an image and a second sub-network for outputting a rectangular region corresponding to the target object in the image.
In the embodiment of the present application, the image segmentation network may be constructed based on the DeepLabv3+ semantic image segmentation model. DeepLabv3+ is a deep learning model for semantic segmentation of images whose objective is to assign a semantic label (such as name, gender, ethnicity, etc.) to each pixel of the input image, thereby segmenting the target object in the image. The ASPP structure of the DeepLabv3+ model typically has one output branch, which is used to output the target object.
In this embodiment of the application, when the image segmentation network is constructed based on the DeepLabv3+ semantic image segmentation model, another output branch may be led out from the output of the ASPP structure of the original DeepLabv3+ network, that is, a second sub-network may be led out after the Encoder network. The second sub-network may be a CNN neural network, and the original output branch serves as the first sub-network, completing the construction of the image segmentation model.
In the embodiment of the application, the Encoder network in the DeepLabv3+ semantic image segmentation model generally has an ASPP atrous convolution structure to extract object information from the image and output it to the Decoder network. Therefore, the second sub-network described above may be derived at the output of the ASPP atrous convolution structure, and this second sub-network may be a CNN neural network that outputs the rectangular region of the target object based on the information output by the ASPP structure.
In the embodiment of the application, when the image segmentation network is constructed based on the DeepLabv3+ semantic image segmentation model, considering that the model's volume and the amount of computation at runtime are large, the backbone network part of the DeepLabv3+ model can be replaced by a MobileNetV2 network when the image segmentation network is applied to mobile terminals such as mobile phones. The MobileNetV2 network is a lightweight CNN network mainly applied on mobile terminals; it comprises a depthwise convolution and a 1 × 1 pointwise convolution, a structure that separates spatial correlation from channel correlation and thus greatly reduces the amount of computation and the number of parameters compared with conventional convolution. Basing the constructed image segmentation network on MobileNetV2 prevents the image segmentation model obtained by subsequent training from lagging when run on a mobile terminal.
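The two-branch structure described above (a shared ASPP feature map feeding a per-pixel segmentation head and a rectangle-regression head) can be sketched with plain NumPy stand-ins for the convolutional heads; all shapes, the class count, and the use of random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared ASPP output: 32x32 spatial, 256 channels (assumed).
aspp_out = rng.standard_normal((32, 32, 256))

def seg_head(feat, n_classes=5):
    # First sub-network: a 1x1 conv acts as a per-pixel linear map
    # from channels to class scores.
    w = rng.standard_normal((feat.shape[-1], n_classes)) * 0.01
    return feat @ w                      # (32, 32, n_classes)

def box_head(feat):
    # Second sub-network: global average pool, then a linear layer
    # regressing the 4 coordinates of the target object's rectangle.
    pooled = feat.mean(axis=(0, 1))      # (256,)
    w = rng.standard_normal((feat.shape[-1], 4)) * 0.01
    return pooled @ w                    # (4,)

mask_scores = seg_head(aspp_out)
box = box_head(aspp_out)
print(mask_scores.shape)  # (32, 32, 5)
print(box.shape)          # (4,)
```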
In step S503, the image segmentation network is trained according to the training data set to obtain an image segmentation model.
In this embodiment, the electronic device may train the image segmentation network using the acquired training data set to obtain an image segmentation model that can output, from an input image, the target object in the image and the rectangular region corresponding to the target object. The electronic device may construct a total loss function and perform iterative training on the image segmentation network according to the total loss function and the training data set, finally obtaining the image segmentation model.
In this embodiment of the application, during iterative training the parameters of the image segmentation network's structure change constantly; once iterative training is complete, the network outputs results with a smaller total loss function value, and with the parameters obtained at this point the network can output, from an input image, the target object and the rectangular region corresponding to the target object.
Continuing to refer to fig. 6, a flowchart for implementing step S503 in fig. 5 is shown, and for convenience of illustration, only the portions relevant to the present application are shown.
In some optional implementation manners of the first embodiment of the present application, step S503 includes: step S601 and step S602.
In step S601, a loss function of the image segmentation network is obtained, where the loss function includes a cross-entropy loss characterizing the first sub-network and a regression loss characterizing the second sub-network.
In the embodiment of the present application, the loss function of the image segmentation network may be as follows:
Total_loss=Segmentation_loss+Detection_loss
in the embodiment of the application, Segmentation _ loss represents the cross entropy loss of a first sub-network, Detection _ loss represents the regression loss of a second sub-network, and Total _ loss represents the Total loss of the whole image Segmentation network.
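A minimal sketch of this total loss, using binary cross-entropy for Segmentation_loss and, as an assumption (the source does not name the concrete regression loss), a mean-squared error for Detection_loss:

```python
import numpy as np

def segmentation_loss(pred, target, eps=1e-7):
    """Binary cross-entropy over the predicted mask (first sub-network)."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def detection_loss(pred_box, target_box):
    """L2 regression loss on rectangle coordinates (second sub-network);
    the concrete regression loss is an assumption here."""
    return float(np.mean((np.asarray(pred_box, float) - np.asarray(target_box, float)) ** 2))

pred_mask = np.array([[0.9, 0.1], [0.8, 0.2]])
true_mask = np.array([[1.0, 0.0], [1.0, 0.0]])
total_loss = segmentation_loss(pred_mask, true_mask) + detection_loss(
    [10, 20, 100, 120], [12, 22, 98, 118])
print(round(total_loss, 4))
```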
In step S602, the image segmentation network is trained by using a back propagation algorithm according to the loss function and the training data set until the image segmentation network converges, so as to obtain an image segmentation model.
In this embodiment of the application, after obtaining the total loss function of the result output by the image segmentation network, the electronic device may perform training in a TensorFlow training framework by combining the total loss function with the training data; the trained image segmentation model can output, from an input image, a mask image of the target object and the rectangular region corresponding to the target object.
In the embodiment of the present application, under the TensorFlow training framework, the model parameters may be trained with a back propagation algorithm, using gradient descent on all parameters to minimize the value of the image segmentation network's loss function on the training data. It can be understood that iterative training makes the final image segmentation model output results (the target object and the rectangular region) for the input images in the training data set that differ as little as possible from the corresponding labels (the annotated target object and rectangular region).
In the embodiment of the application, an Adam optimizer may be used to iteratively train the image segmentation network until it converges, and the converged network is stored to obtain the trained image segmentation model. The Adam optimizer combines the advantages of two optimization algorithms, AdaGrad (adaptive gradient) and RMSProp: the first moment estimate of the gradient (i.e., its mean) and the second moment estimate (i.e., its uncentered variance) are considered together to calculate the update step.
In the embodiment of the present application, the convergence of the image segmentation network (i.e., the termination condition of the iterative training) may include: the number of times of iterative training reaches the target number of times; or the value of the total loss function corresponding to the result output by the image segmentation network meets the set condition.
In the embodiment of the present application, the convergence condition is to make the loss function as small as possible: an initial learning rate of 1e-3 is used, the learning rate decays with the cosine of the step number, and after training for 16 epochs the network is considered to have converged. Here, batch_size may be understood as the batch parameter, whose upper limit is the total number of samples in the training set; epoch refers to the number of times the entire data set is trained using all samples in the training set. Colloquially, the value of epoch is the number of times the whole data set is cycled, and 1 epoch equals training once with all samples in the training set.
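The cosine learning-rate decay mentioned above can be sketched as follows; the exact schedule form, the minimum rate of 0, and the steps-per-epoch value are assumptions, since the source only states the initial rate of 1e-3 and decay over 16 epochs:

```python
import math

def cosine_lr(step, total_steps, init_lr=1e-3, min_lr=0.0):
    """Learning rate decayed with the cosine of the step number."""
    cos = 0.5 * (1 + math.cos(math.pi * step / total_steps))
    return min_lr + (init_lr - min_lr) * cos

total = 16 * 100  # e.g. 16 epochs x 100 steps per epoch (assumed)
print(cosine_lr(0, total))      # 0.001 at the start of training
print(cosine_lr(total, total))  # ~0.0 after 16 epochs
```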
In this embodiment, the condition that the value of the total loss function satisfies the setting condition may include: the value of the total loss function is less than a set threshold. Of course, the specific setting conditions may not be limiting.
In the embodiment of the application, the image segmentation model obtained through training may be stored locally on the mobile terminal, or stored on a server communicatively connected with the electronic device, which reduces the storage space occupied on the electronic device and improves its operating efficiency.
In the embodiment of the present application, the image segmentation model may also periodically or aperiodically acquire new training data, and train and update the image segmentation model.
In summary, the present application provides an image processing method for preventing certificate abuse, which receives a watermark adding request carrying an original certificate image and watermark content information, which is input by a user through a terminal; inputting the original certificate image into an image segmentation model for image segmentation operation to obtain a text segmentation word box; performing position calculation operation based on the original certificate image to obtain an initial watermark region intersected with all the text segmentation word boxes; inputting the text segmentation word box into a word recognition model for word recognition operation to obtain segmentation text information; screening the text segmentation word boxes based on the segmentation text information to obtain key text word boxes; intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box; adding the watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image; and outputting the target certificate image. By calculating an initial watermark region covering all text segmentation word frames and intercepting a region corresponding to a key text word frame in the initial watermark region, a target watermark region which does not shield the information of the key text word frame and intersects with the characters of the certificate image is obtained, so that the added watermark can ensure the integrity of the original certificate information and avoid being abused; meanwhile, the execution process of the image processing is executed at the user terminal, so that the condition that the information leakage occurs in the process of information transmission of the original certificate image is effectively avoided, and the privacy information of the user is effectively protected. 
Meanwhile, graying the certificate image makes the inference of the whole recognition process faster; the segmentation target object occupies a larger proportion of the image, making the segmentation task simpler than before, so segmentation by the whole model can be more efficient once the image segmentation model adopts a U-net network architecture. During iterative training the parameters of the image segmentation network's structure change constantly; the network after iterative training can output results with a smaller total loss function value, and with the parameters obtained at this point the network can output, from an input image, the target object and the rectangular region corresponding to the target object.
It is emphasized that, in order to further ensure the privacy and security of the user's original certificate image information, that information may also be stored in a node of a blockchain.
The blockchain referred to by the application is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated using cryptographic methods, where each data block contains the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware associated with computer readable instructions, which can be stored in a computer readable storage medium, and when executed, can include processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order; unless explicitly stated herein, the steps are not performed in a strict order and may be performed in other orders. Moreover, at least a portion of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and not necessarily in sequence: they may be performed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.
Example two
With further reference to fig. 6, as an implementation of the method shown in fig. 2 described above, the present application provides an embodiment of an image processing apparatus for preventing abuse of documents, which corresponds to the embodiment of the method shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 6, the image processing apparatus 100 for preventing abuse of a document according to the present embodiment includes: a request receiving module 110, a semantic segmentation module 120, a position calculation module 130, a word recognition module 140, a key text box acquisition module 150, a truncation operation module 160, a target certificate acquisition module 170, and a target certificate output module 180. Wherein:
a request receiving module 110, configured to receive a watermark adding request carrying an original certificate image and watermark content information, which is input by a user through a user terminal;
the semantic segmentation module 120 is used for inputting the original certificate image into the image segmentation model for image segmentation to obtain a text segmentation word box;
the position calculation module 130 is configured to perform position calculation operations based on the original certificate image to obtain an initial watermark region intersecting all text segmentation word boxes;
the character recognition module 140 is configured to input the text division box into a character recognition model for performing character recognition operation to obtain division text information;
a key text word box obtaining module 150, configured to perform a screening operation on the text segmentation word box based on the segmentation text information to obtain a key text word box;
an intercepting operation module 160, configured to perform intercepting operation on the initial watermark region based on the key text box to obtain a target watermark region that does not block the key text box;
the target certificate acquisition module 170 is configured to add watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image;
and a target certificate output module 180 for outputting a target certificate image.
In the embodiment of the present application, the user terminal refers to a terminal device for executing the image processing method for preventing abuse of certificates provided by the present application, and the user terminal may be a mobile terminal such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation device, and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like.
In the embodiment of the application, the original certificate image refers to electronic image information consistent with the certificate information to which the watermark is to be added. The original certificate image is required to be clear, and the camera lens should be held parallel to the identity card when shooting, so that the size of the original certificate image meets the processing requirements.
In the embodiment of the present application, the watermark content information is mainly used for identifying the usage purpose of the original document image, and by way of example, the format of the watermark content information may be "xx private", specifically, for example: for use by a network credit company, for use by a purchasing company, etc., it should be understood that the examples of the format of the watermark content information are only for convenience of understanding and are not intended to limit the present application.
In this embodiment, the watermark content information may further include signature information: the user inputs a name, which is rendered in a regular-script font to generate a handwritten-style watermark name. Adding a handwritten signature to the watermark information increases the credibility of the information.
In the embodiment of the present application, the image segmentation operation flow is as follows (the original document image takes an identity card as an example):
the method comprises the steps of collecting a batch of identity card picture data sets including Chinese nationalities and minority nationalities, intercepting the identity cards from original images, and generating corresponding masks by adopting a marking tool. For example, the name field is labeled as type 1, gender as type 2, ethnicity as type 3, address as type 4, etc.
Specifically, the principle of segmentation includes:
1) downsampling + upsampling: Convolution + Deconvolution/Resize;
2) multi-scale feature fusion: adding features point by point / splicing feature channel dimensions;
3) obtaining a pixel-level segmentation map: judging the category of each pixel point.
The segmentation algorithm adopts DeepLabv3 from the DeepLab series as the identity card segmentation algorithm, and full-field identity card segmentation is realized by combining various full-field identity card segmentation data sets. The core of the DeepLab series of algorithms is atrous convolution (Dilated/Atrous Convolution), which in effect inserts a number of holes into an ordinary convolution kernel. Atrous convolutions with different sampling rates can effectively capture multi-scale information. The semantic segmentation area is taken as the input of the model, the full-field mask image of the identity card is obtained from the segmentation model, and the rectangular frame with the maximum outline is found according to the label value in the mask to obtain the corresponding key field.
In the embodiment of the present application, the text division box refers to the above-described rectangular box representing the maximum outline of the text field portion.
In the embodiment of the application, the position calculation operation is mainly used for calculating the adding position and the inclination angle of the watermark content, so that the watermark is intersected with the characters of the text segmentation word box, and the character information of the original certificate image cannot be maliciously tampered.
In the embodiment of the present application, the inclination angle is calculated taking the original document image to be an identity card. Given the identity card's length of 85.6 mm and width of 54 mm, the ratio of the length to the diagonal is:

sin α = 85.6 / √(85.6² + 54²) ≈ 0.8458

so the angle α = arcsin(0.8458) ≈ 57.75775° (calculated using the calculator's inverse trigonometric function, the sin⁻¹ key). The inclination angle is therefore 57.8° down to the right, measured from the vertical.
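The tilt-angle computation above can be reproduced directly; the card dimensions are those given in the text:

```python
import math

# ID card dimensions from the text: length 85.6 mm, width 54 mm.
length, width = 85.6, 54.0
diagonal = math.hypot(length, width)

# Angle between the diagonal and the short (vertical) side:
# sin(alpha) = length / diagonal.
alpha = math.degrees(math.asin(length / diagonal))
print(round(length / diagonal, 4))  # 0.8458
print(round(alpha, 1))              # 57.8
```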
In the embodiment of the present application, the Character Recognition model may be implemented by OCR (Optical Character Recognition), i.e. determining the shape by detecting the dark and light patterns, and then translating the shape into the computer characters by using a Character Recognition method.
In the embodiment of the application, the obtaining of the key text word box corresponding to the key information may be to perform character recognition through text box data, compare whether the recognized character content is consistent with the type of the key information, and if so, confirm that the text box data belongs to the key text word box, otherwise, confirm that the text box data does not belong to the key text word box. The key information refers to the most important information content in the original document image, such as, by way of example, name, identification number, and the like.
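The screening step just described can be sketched as a simple filter over hypothetical OCR results; the field-type names and box coordinates used here are illustrative:

```python
# Key information per the text: the most important fields, e.g. name and ID number.
KEY_FIELD_TYPES = {"name", "id_number"}

def screen_key_boxes(ocr_results):
    """Keep only boxes whose recognized content matches a key field type."""
    return [box for field_type, box in ocr_results if field_type in KEY_FIELD_TYPES]

# Hypothetical (recognized field type, word box) pairs from the OCR model.
ocr_results = [
    ("name", (10, 10, 80, 30)),
    ("gender", (10, 40, 40, 60)),
    ("ethnicity", (60, 40, 100, 60)),
    ("id_number", (10, 200, 300, 230)),
]
key_boxes = screen_key_boxes(ocr_results)
print(len(key_boxes))  # 2
```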
In the embodiment of the application, since the key information is the most important information content in the original certificate image, when the watermark content intersects with the key information, the key information in the original certificate image may be confused, and therefore, the coverage area of the watermark region needs to be adjusted, so that the watermark is prevented from shielding the key text information.
In this embodiment of the present application, the intercepting operation may be to adjust an inclination angle of the initial watermark region, so that the initial watermark region only intersects with the semantic segmentation sub-frame except for the above-mentioned key text word frame; the intercepting operation may also be to intercept an area coinciding with the key text word box on the basis of the initial watermark, so that the added watermark does not cover the key text content.
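A minimal sketch of the second interception variant (finding the areas of the initial watermark region that coincide with key text word boxes, so they can be cut out), with rectangles given as hypothetical (x1, y1, x2, y2) tuples:

```python
def rect_intersection(a, b):
    """Axis-aligned intersection of two (x1, y1, x2, y2) rectangles, or None."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None

def occlusion_areas(watermark_region, key_boxes):
    """Collect the occlusion areas that must be cut out of the initial
    watermark region so the watermark does not cover key text."""
    areas = []
    for box in key_boxes:
        overlap = rect_intersection(watermark_region, box)
        if overlap is not None:
            areas.append(overlap)
    return areas

initial_region = (0, 0, 200, 100)
key_boxes = [(150, 80, 220, 120), (300, 300, 350, 350)]
occlusions = occlusion_areas(initial_region, key_boxes)
print(occlusions)  # [(150, 80, 200, 100)]
```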
In the embodiment of the present application, the year, month, and day information of the system can be automatically identified; for example, the text "8/31/2020" is added to the second row of the watermark, with the position coordinates starting from the left (x = 0) at 2/3 of the height from top to bottom, i.e., (0, 36), at an angle of 57.8°, distributed from top left to bottom right.
In the embodiment of the application, an image processing device for preventing certificate abuse is provided, and by calculating an initial watermark region covering all text segmentation word boxes and intercepting a region corresponding to a key text word box in the initial watermark region, a target watermark region which does not shield information of the key text word box and intersects with certificate image characters is obtained, so that added watermarks can ensure the integrity of original certificate information and avoid being abused; meanwhile, the execution process of the image processing is executed at the user terminal, so that the condition that the information leakage occurs in the process of information transmission of the original certificate image is effectively avoided, and the privacy information of the user is effectively protected.
In some optional implementations of the second embodiment of the present application, the image processing apparatus 100 for preventing document abuse further includes: and a graying module. Wherein:
and the graying module is used for performing graying operation on the original certificate image.
In the embodiment of the present application, in the RGB model, if R = G = B, the color represents a grayscale color, where the value of R = G = B is called the gray value; thus each pixel of a grayscale image needs only one byte to store its gray value (also called intensity value or brightness value), and the grayscale range is 0 to 255.
Specifically, the formula for calculating the gray value is:

Gray = (R^2.2 × 0.2973 + G^2.2 × 0.6274 + B^2.2 × 0.0753)^(1/2.2)

Note the 2.2nd power and the 2.2nd root here: the RGB color values cannot simply be added directly, but must first be converted to physical light power by raising them to the power of 2.2. RGB values are not linear but follow a power function; the exponent of this function is called the Gamma value, typically 2.2, and this conversion process is called Gamma correction.
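A sketch of this Gamma-corrected graying; the channel weights 0.2973/0.6274/0.0753 are the commonly used values for this formula and are an assumption here:

```python
def gamma_gray(r, g, b, gamma=2.2):
    """Gamma-corrected grayscale: raise each channel to the 2.2 power,
    mix with channel weights (assumed values), then take the 2.2th root."""
    mixed = (r ** gamma) * 0.2973 + (g ** gamma) * 0.6274 + (b ** gamma) * 0.0753
    return mixed ** (1.0 / gamma)

# A neutral pixel (R == G == B) maps to the same gray value,
# since the weights sum to 1.
print(round(gamma_gray(128, 128, 128), 1))  # 128.0
print(round(gamma_gray(255, 0, 0), 1))      # gray value of pure red
```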
In the embodiment of the application, graying the certificate image makes the inference of the whole recognition process faster.
In some optional implementations of the second embodiment of the present application, the intercepting operation module 160 specifically includes: a shielding area acquisition submodule and a shielding area interception submodule. Wherein:
the occlusion region acquisition submodule is used for acquiring an occlusion region which is overlapped with the position information of the key text word box in the initial watermark region;
and the blocking area intercepting submodule is used for intercepting the blocking area in the initial watermark area to obtain the target watermark area.
In some optional implementations of the second embodiment of the present application, the image segmentation model uses U-net as a network architecture, and the image processing apparatus 100 for preventing certificate abuse further includes: the system comprises a down-sampling layer building module, an up-sampling layer building module, a connecting module, a Dropout layer building module, an output module building module and a network parameter setting module. Wherein:
the down-sampling layer building module is used for building a down-sampling layer of the U-Net;
the up-sampling layer building module is used for building an up-sampling layer of the U-Net;
a Connection module for connecting the downsampling layer and the upsampling layer based on Skip Connection;
the Dropout layer building module is used for building a Dropout layer;
the output module building module is used for building a network output module of the U-Net;
and the network parameter setting module is used for setting the network parameters of the U-Net.
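The wiring order that these building modules produce can be sketched by tracing feature-map side lengths through the network. This is a structural illustration only (no learned layers), and the helper name and depth are assumptions:

```python
def unet_shapes(size, depth=4):
    """Trace feature-map side lengths through a U-Net: each down-
    sampling step halves the spatial size, each up-sampling step
    doubles it and concatenates the matching encoder output via a
    skip connection. Purely illustrative of the wiring order."""
    encoder = []
    s = size
    for _ in range(depth):          # down-sampling path
        encoder.append(s)
        s //= 2                     # e.g. 2x2 max-pooling
    path = [s]                      # bottleneck; a Dropout layer would
                                    # typically sit in the deepest blocks
    for skip in reversed(encoder):  # up-sampling path
        s *= 2
        assert s == skip            # skip connection needs matching sizes
        path.append(s)
    return encoder, path

enc, dec = unet_shapes(256)
print(enc, dec)  # [256, 128, 64, 32] [16, 32, 64, 128, 256]
```

The size-match assertion shows why Skip Connection ties each up-sampling layer to one specific down-sampling layer: concatenation is only defined when the two feature maps have the same spatial size.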
In some optional implementations of the second embodiment of the present application, the apparatus 100 further includes: the device comprises a training set acquisition module, a segmentation network acquisition module and a network training module. Wherein:
the training set acquisition module is used for acquiring a training data set, wherein the training data set comprises a plurality of input images, a target object in each input image and a rectangular area corresponding to the target object in each input image;
the segmentation network acquisition module is used for acquiring an image segmentation network, the image segmentation network comprises a first sub-network and a second sub-network, the first sub-network is used for outputting a target object in an image, and the second sub-network is used for outputting a rectangular area corresponding to the target object in the image;
and the network training module is used for training the image segmentation network according to the training data set to obtain an image segmentation model.
In some optional implementation manners of the second embodiment of the present application, the network training module specifically includes: a loss function acquisition sub-module and a network training sub-module. Wherein:
the loss function acquisition sub-module is used for acquiring a loss function of the image segmentation network, wherein the loss function comprises a cross-entropy loss for the first sub-network and a regression loss for the second sub-network;
and the network training submodule is used for training the image segmentation network by utilizing a back propagation algorithm according to the loss function and the training data set until the image segmentation network is converged to obtain an image segmentation model.
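A minimal sketch of the combined objective: the application specifies a cross-entropy loss for the first sub-network and a regression loss for the second, but not their exact forms, so binary cross-entropy and Smooth-L1 are assumptions here, as are all function names:

```python
import math

def cross_entropy(p_pred, y):
    """Binary cross-entropy for the segmentation (first) sub-network;
    p_pred is the predicted foreground probability, y is 0 or 1."""
    return -(y * math.log(p_pred) + (1 - y) * math.log(1 - p_pred))

def smooth_l1(pred, target):
    """Smooth-L1, a common choice for box-coordinate regression
    (second sub-network); the application only says 'regression loss'."""
    d = abs(pred - target)
    return 0.5 * d * d if d < 1 else d - 0.5

def total_loss(seg_pairs, box_pairs, w=1.0):
    # Joint objective: mean cross-entropy + w * mean box regression
    # loss; the weighting w is an assumed hyperparameter.
    ce = sum(cross_entropy(p, y) for p, y in seg_pairs) / len(seg_pairs)
    reg = sum(smooth_l1(p, t) for p, t in box_pairs) / len(box_pairs)
    return ce + w * reg
```

Back-propagating the gradient of this single scalar updates both sub-networks at once, which is how training proceeds "until the image segmentation network converges".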
In summary, an image processing apparatus for preventing certificate abuse is provided. By calculating an initial watermark region covering all text segmentation word boxes and intercepting the region corresponding to the key text word box from the initial watermark region, a target watermark region is obtained that does not block the information in the key text word box and that intersects the text of the certificate image, so the added watermark preserves the integrity of the original certificate information while preventing its abuse. Because the image processing is executed on the user terminal, information leakage during transmission of the original certificate image is effectively avoided and the user's private information is effectively protected. In addition, graying the certificate image speeds up inference for the whole recognition process; because the segmentation target occupies a larger proportion of the image, the segmentation task becomes simpler, so adopting the U-net network architecture makes the whole image segmentation model more efficient; and during iterative training the parameters of the image segmentation network are continuously updated until the network outputs results with a smaller total loss function value, at which point the trained image segmentation network can output, for an input image, the target object and the rectangular region corresponding to that target object.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 8, fig. 8 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 200 includes a memory 210, a processor 220, and a network interface 230 communicatively coupled to each other via a system bus. It is noted that only a computer device 200 having components 210-230 is shown, but it is understood that not all of the illustrated components are required, and more or fewer components may alternatively be implemented. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device can interact with a user through a keyboard, a mouse, a remote controller, a touch panel, a voice control device, or the like.
The memory 210 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 210 may be an internal storage unit of the computer device 200, such as a hard disk or memory of the computer device 200. In other embodiments, the memory 210 may also be an external storage device of the computer device 200, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the computer device 200. Of course, the memory 210 may also include both an internal storage unit and an external storage device of the computer device 200. In this embodiment, the memory 210 is generally used for storing an operating system and various application software installed in the computer device 200, such as computer-readable instructions of the image processing method for preventing certificate abuse. In addition, the memory 210 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 220 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 220 is generally operative to control overall operation of the computer device 200. In this embodiment, the processor 220 is configured to execute computer readable instructions or process data stored in the memory 210, for example, execute computer readable instructions of the image processing method for preventing document abuse.
The network interface 230 may include a wireless network interface or a wired network interface, and the network interface 230 is generally used to establish a communication connection between the computer device 200 and other electronic devices.
According to the image processing method for preventing certificate abuse, an initial watermark region covering all text segmentation word boxes is calculated, and the region corresponding to the key text word box is intercepted from the initial watermark region, so that a target watermark region is obtained that does not block the information in the key text word box and that intersects the text of the certificate image; the added watermark therefore preserves the integrity of the original certificate information while preventing its abuse. Moreover, because the image processing is executed on the user terminal, information leakage during transmission of the original certificate image is effectively avoided and the user's private information is effectively protected.
The present application further provides another embodiment, which is a computer-readable storage medium having computer-readable instructions stored thereon which are executable by at least one processor to cause the at least one processor to perform the steps of the image processing method for preventing abuse of documents as described above.
According to the image processing method for preventing certificate abuse, an initial watermark region covering all text segmentation word boxes is calculated, and the region corresponding to the key text word box is intercepted from the initial watermark region, so that a target watermark region is obtained that does not block the information in the key text word box and that intersects the text of the certificate image; the added watermark therefore preserves the integrity of the original certificate information while preventing its abuse. Moreover, because the image processing is executed on the user terminal, information leakage during transmission of the original certificate image is effectively avoided and the user's private information is effectively protected.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It is to be understood that the above-described embodiments are merely illustrative of some, but not all, embodiments of the present application, and the appended drawings illustrate preferred embodiments without limiting the scope of the application. The application is capable of embodiments in many different forms; these embodiments are provided so that the disclosure of the application will be thorough. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, falls within the protection scope of the present application.

Claims (10)

1. An image processing method for preventing abuse of a document, comprising the steps of:
receiving a watermark adding request which is input by a user through a user terminal and carries an original certificate image and watermark content information;
inputting the original certificate image into an image segmentation model for image segmentation operation to obtain a text segmentation word box;
performing position calculation operation based on the original certificate image to obtain an initial watermark region intersected with all the text segmentation word boxes;
inputting the text segmentation word box into a word recognition model for word recognition operation to obtain segmentation text information;
screening the text segmentation word boxes based on the segmentation text information to obtain key text word boxes;
intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box;
adding the watermark content information to the original certificate image based on the target watermark region to obtain a target certificate image;
and outputting the target certificate image.
2. The method of claim 1, wherein before the step of inputting the original document image into an image segmentation model for image segmentation to obtain a text segmentation word box, the method further comprises:
and carrying out graying operation on the original certificate image.
3. The image processing method for preventing document abuse according to claim 1, wherein the step of performing an intercepting operation on the initial watermark region based on the key text box to obtain a target watermark region that does not block the key text box specifically includes:
acquiring an occlusion area which is overlapped with the position information of the key text word box in the initial watermark area;
and intercepting the shielded area in the initial watermark area to obtain the target watermark area.
4. The image processing method for preventing certificate abuse according to claim 1, wherein the image segmentation model takes U-net as a network architecture, and before the step of inputting the original certificate image into the image segmentation model for an image segmentation operation to obtain the text segmentation word box, the method further comprises:
building a down-sampling layer of the U-Net;
building an up-sampling layer of the U-Net;
connecting the downsampling layer and the upsampling layer based on Skip Connection;
building a Dropout layer in the downsampling layer;
building a network output module of the U-Net;
and setting the network parameters of the U-Net.
5. The method of claim 1, wherein before the step of inputting the original document image into an image segmentation model for image segmentation to obtain a text segmentation word box, the method further comprises:
acquiring a training data set, wherein the training data set comprises a plurality of input images, a target object corresponding to each input image and a rectangular area corresponding to the target object in each input image;
acquiring an image segmentation network, wherein the image segmentation network comprises a first sub-network and a second sub-network, the first sub-network is used for outputting a target object in an image, and the second sub-network is used for outputting a rectangular region corresponding to the target object in the image;
and training the image segmentation network according to the training data set to obtain the image segmentation model.
6. The image processing method for preventing document abuse according to claim 5, wherein the step of performing a training operation on the image segmentation network according to the training data set to obtain the image segmentation model specifically comprises:
obtaining a loss function of the image segmentation network, wherein the loss function comprises a cross-entropy loss for the first sub-network and a regression loss for the second sub-network;
and training the image segmentation network by using a back propagation algorithm according to the loss function and the training data set until the image segmentation network is converged to obtain the image segmentation model.
7. An image processing apparatus for preventing abuse of a document, comprising:
the request receiving module is used for receiving a watermark adding request which is input by a user through a user terminal and carries an original certificate image and watermark content information;
the semantic segmentation module is used for inputting the original certificate image into an image segmentation model for image segmentation to obtain a text segmentation word box;
the position calculation module is used for performing position calculation operation based on the original certificate image to obtain an initial watermark area intersected with all the text segmentation word boxes;
the character recognition module is used for inputting the text segmentation word box into a character recognition model for character recognition operation to obtain segmentation text information;
a key text word box obtaining module, configured to perform a screening operation on the text segmentation word box based on the segmentation text information to obtain a key text word box;
the intercepting operation module is used for intercepting the initial watermark region based on the key text word box to obtain a target watermark region which does not shield the key text word box;
the target certificate acquisition module is used for adding the watermark content information to the original certificate image based on the target watermark area to obtain a target certificate image;
and the target certificate output module is used for outputting the target certificate image.
8. The image processing apparatus for preventing abuse of a document according to claim 7, wherein the apparatus further comprises:
and the graying module is used for performing graying operation on the original certificate image.
9. A computer device comprising a memory having computer readable instructions stored therein and a processor which when executed implements the steps of the image processing method of preventing abuse of a document according to any one of claims 1 to 6.
10. A computer-readable storage medium having computer-readable instructions stored thereon which, when executed by a processor, implement the steps of the image processing method of preventing abuse of a document according to any one of claims 1 to 6.
CN202011314189.3A 2020-11-20 2020-11-20 Image processing method and device, computer equipment and storage medium Pending CN112581344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011314189.3A CN112581344A (en) 2020-11-20 2020-11-20 Image processing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112581344A true CN112581344A (en) 2021-03-30

Family

ID=75123297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011314189.3A Pending CN112581344A (en) 2020-11-20 2020-11-20 Image processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112581344A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487473A (en) * 2021-08-03 2021-10-08 北京百度网讯科技有限公司 Method and device for adding image watermark, electronic equipment and storage medium
WO2023168964A1 (en) * 2022-03-07 2023-09-14 华为云计算技术有限公司 Data segmentation method and related apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108550101A (en) * 2018-04-19 2018-09-18 腾讯科技(深圳)有限公司 Image processing method, device and storage medium
CN109615045A (en) * 2018-05-02 2019-04-12 武梓涵 A kind of certificate false proof system and method
CN111310747A (en) * 2020-02-12 2020-06-19 北京小米移动软件有限公司 Information processing method, information processing apparatus, and storage medium


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487473A (en) * 2021-08-03 2021-10-08 北京百度网讯科技有限公司 Method and device for adding image watermark, electronic equipment and storage medium
CN113487473B (en) * 2021-08-03 2024-03-26 北京百度网讯科技有限公司 Method and device for adding image watermark, electronic equipment and storage medium
WO2023168964A1 (en) * 2022-03-07 2023-09-14 华为云计算技术有限公司 Data segmentation method and related apparatus

Similar Documents

Publication Publication Date Title
US11367310B2 (en) Method and apparatus for identity verification, electronic device, computer program, and storage medium
CN109409349B (en) Credit certificate authentication method, credit certificate authentication device, credit certificate authentication terminal and computer readable storage medium
CN110852311A (en) Three-dimensional human hand key point positioning method and device
TW202042105A (en) Method, device and electronic device for document identification and computer readable storage medium thereof
US10133859B2 (en) Managing registration of user identity using handwriting
CN111950355A (en) Seal identification method and device and electronic equipment
CN112330331A (en) Identity verification method, device and equipment based on face recognition and storage medium
CN112016502B (en) Safety belt detection method, safety belt detection device, computer equipment and storage medium
CN112085094B (en) Document image reproduction detection method, device, computer equipment and storage medium
CN112581344A (en) Image processing method and device, computer equipment and storage medium
CN110795714A (en) Identity authentication method and device, computer equipment and storage medium
CN113012075A (en) Image correction method and device, computer equipment and storage medium
CN113111880A (en) Certificate image correction method and device, electronic equipment and storage medium
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN112396060B (en) Identification card recognition method based on identification card segmentation model and related equipment thereof
CN114282258A (en) Screen capture data desensitization method and device, computer equipment and storage medium
CN112651399A (en) Method for detecting same-line characters in oblique image and related equipment thereof
CN112395834B (en) Brain graph generation method, device and equipment based on picture input and storage medium
KR20200046182A (en) Deep-running-based image correction detection system and method for providing non-correction detection service using the same
CN115880702A (en) Data processing method, device, equipment, program product and storage medium
US20220383663A1 (en) Method for obtaining data from an image of an object of a user that has a biometric characteristic of the user
CN113807343A (en) Character recognition method and device, computer equipment and storage medium
WO2021190664A1 (en) Multi-face detection method and system based on key point positioning, and storage medium
CN113362249A (en) Text image synthesis method and device, computer equipment and storage medium
CN113239910A (en) Certificate identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination