CN117253242A - Note processing method and device - Google Patents

Note processing method and device

Info

Publication number
CN117253242A
CN117253242A (application CN202210657938.5A)
Authority
CN
China
Prior art keywords: note, image, training, taking, pixel
Prior art date
Legal status
Pending
Application number
CN202210657938.5A
Other languages
Chinese (zh)
Inventor
林文松
Current Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Wuhan Kingsoft Office Software Co Ltd
Original Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Wuhan Kingsoft Office Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kingsoft Office Software Inc, Zhuhai Kingsoft Office Software Co Ltd, and Wuhan Kingsoft Office Software Co Ltd
Priority to CN202210657938.5A
Publication of CN117253242A

Classifications

    • G06V 30/41: Analysis of document content
    • G06V 10/82: Arrangements for image or video recognition or understanding using neural networks
    • G06V 30/1448: Selective acquisition, locating or processing of specific regions, based on markings or identifiers characterising the document or the area
    • G06V 30/148: Segmentation of character regions
    • G06V 30/19147: Obtaining sets of training patterns; bootstrap methods, e.g. bagging or boosting

Abstract

The application discloses a note processing method, comprising: acquiring a note-taking image; and inputting the note-taking image into a note erasing model, which performs a note erasing operation on the note-taking image to obtain a corresponding note-free image. With this technical scheme, the note erasing model is trained in advance, so that once a note-taking image requiring erasure is obtained, it can be input directly into the note erasing model, which performs the note erasing operation and outputs the note-free image. The application also discloses a note processing apparatus having the same beneficial effects.

Description

Note processing method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for processing notes.
Background
With the rapid development of mobile terminals such as mobile phones and electronic book readers, electronically scanning paper documents and archiving them as scanned images has become increasingly accepted by the public. However, paper documents often accumulate handwritten notes during use, so the handwritten notes need to be erased before the electronically scanned images can be archived. In the related art, this is mostly done by manual operation with image-retouching tools such as Photoshop (PS) and similar image-editing software, which is obviously inefficient.
Therefore, how to erase handwritten notes in electronically scanned images quickly and efficiently is a problem to be solved by those skilled in the art.
Disclosure of Invention
The purpose of the present application is to provide a note processing method that can quickly and efficiently erase handwritten notes from an electronically scanned image; another purpose is to provide a note processing apparatus that achieves the same beneficial effects.
In a first aspect, the present application provides a note processing method, including:
acquiring a note-taking image;
and inputting the note-taking image into a note erasing model, and executing a note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
Optionally, after the obtaining the note-free image corresponding to the note-taking image, the method further includes:
determining a region to be processed in the note-free image;
performing note segmentation on the region to be processed to obtain a region segmentation map;
performing background reduction on the region to be processed by using the region segmentation map to obtain a region reduction map;
and synthesizing the region reduction map and the note-free image to obtain a background reduction image.
Optionally, the inputting the note-taking image into a note erasing model includes:
performing binarization processing on the note-taking image to obtain a binarized image;
and inputting the binarized image into the note erasing model.
Optionally, the performing, by the note erasing model, a note erasing operation on the note-taking image includes:
identifying a pixel point type of each pixel point in the note-taking image through the note erasing model, wherein the pixel point type includes: note and non-note;
obtaining a note pixel point set corresponding to the note-taking image according to the pixel point type;
and restoring, for each note pixel point in the note pixel point set, the note pixel point in the note-taking image to a background pixel point.
Optionally, the identifying, by the note erasing model, a pixel point type of each pixel point in the note-taking image includes:
identifying each pixel point in the note-taking image through the note erasing model to obtain a pixel point set;
for each pixel point in the pixel point set, extracting features of the pixel point in multiple dimensions to obtain pixel point features, wherein the dimensions include at least one of the following: texture, color, or shape;
and determining the pixel point type of the corresponding pixel point according to the pixel point features.
Optionally, the restoring the note pixel point in the note-taking image to a background pixel point includes:
determining a first pixel point feature corresponding to the background pixel point;
determining a second pixel point feature corresponding to the note pixel point;
and updating the second pixel point feature of the note pixel point to the first pixel point feature, so as to restore the note pixel point to the background pixel point.
Optionally, the note erasing model is obtained through training of the following steps:
acquiring a note-free image training set and a note image training set;
generating a note-taking image training set based on the note-free image training set and the note image training set;
and training an initial model by using the note-taking image training set and the note-free image training set to obtain a note erasing model.
Optionally, the generating a note-taking image training set based on the note-free image training set and the note image training set includes:
for any note-free training image in the note-free image training set, combining the note-free training image with each note training image in the note image training set to obtain image pairs;
for any image pair, synthesizing the note-free training image and the note training image in the image pair to obtain a note-taking training image;
and generating the note-taking image training set based on all the note-taking training images.
Optionally, the synthesizing the note-free training image and the note training image in the image pair to obtain a note-taking training image includes:
determining a synthesis region in the note-free training image, and generating a mask template based on the synthesis region;
performing mask extraction on the note training image by using the mask template to obtain a mask image;
and synthesizing the mask image and the note-free training image to obtain the note-taking training image.
In a second aspect, the present application provides a note processing apparatus, comprising:
the acquisition module is used for acquiring the note-taking image;
the processing module is used for inputting the note-taking image into a note erasing model, and executing a note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
Optionally, the note processing device further includes an area processing module, where the area processing module includes:
the area determining unit is used for determining a region to be processed in the note-free image after the note-free image corresponding to the note-taking image is obtained;
the note segmentation unit is used for carrying out note segmentation on the region to be processed to obtain a region segmentation map;
the background reduction unit is used for carrying out background reduction on the region to be processed by utilizing the region segmentation map to obtain a region reduction map;
and the image synthesis unit is used for synthesizing the region reduction map and the note-free image to obtain a background reduction image.
Optionally, the processing module includes:
the binarization unit is used for performing binarization processing on the note-taking image to obtain a binarized image;
and the input unit is used for inputting the binarized image into the note erasing model.
Optionally, the processing module includes:
the type identifying unit is used for identifying the pixel point type of each pixel point in the note-taking image through the note erasing model, where the pixel point type includes: note and non-note;
the set generating unit is used for obtaining a note pixel point set corresponding to the note-taking image according to the pixel point type; and the pixel restoring unit is used for restoring, for each note pixel point in the note pixel point set, the note pixel point in the note-taking image to a background pixel point.
Optionally, the type identifying unit is specifically configured to identify each pixel point in the note-taking image through the note erasing model to obtain a pixel point set; for each pixel point in the pixel point set, extract features of the pixel point in multiple dimensions to obtain pixel point features, where the dimensions include at least one of the following: texture, color, or shape; and determine the pixel point type of the corresponding pixel point according to the pixel point features.
Optionally, the pixel restoring unit is specifically configured to determine a first pixel point feature corresponding to the background pixel point; determine a second pixel point feature corresponding to the note pixel point; and update the second pixel point feature of the note pixel point to the first pixel point feature, so as to restore the note pixel point to the background pixel point.
Optionally, the note processing apparatus further includes a model training module, the model training module including:
the acquisition unit is used for acquiring the note-free image training set and the note image training set;
the generating unit is used for generating a note-taking image training set based on the note-free image training set and the note image training set;
the training unit is used for training the initial model by using the note-taking image training set and the note-free image training set to obtain a note erasing model.
Optionally, the generating unit includes:
a combining subunit, configured to combine, for any note-free training image in the note-free image training set, the note-free training image with each note training image in the note image training set to obtain image pairs;
a synthesis subunit, configured to synthesize, for any image pair, the note-free training image and the note training image in the image pair to obtain a note-taking training image;
and a generation subunit, configured to generate the note-taking image training set based on all the note-taking training images.
Optionally, the synthesis subunit is specifically configured to determine a synthesis region in the note-free training image and generate a mask template based on the synthesis region; perform mask extraction on the note training image by using the mask template to obtain a mask image; and synthesize the mask image and the note-free training image to obtain the note-taking training image.
In a third aspect, the present application provides an electronic device, including:
A memory for storing a computer program;
and a processor for implementing the steps of any of the note processing methods described above when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the note processing methods described above.
According to the note processing scheme provided by the application, a note-taking image is acquired; the note-taking image is input into a note erasing model, and a note erasing operation is performed on it through the note erasing model to obtain a note-free image corresponding to the note-taking image. Because the note erasing model is trained in advance, handwritten notes are erased automatically, which is far more efficient than manual retouching.
The note processing apparatus, the electronic device, and the computer-readable storage medium provided by the application have the beneficial effects described above, which are not repeated here.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the prior art and the technical solutions in the embodiments of the present application, the drawings needed in their description are briefly introduced below. The following drawings relate only to some embodiments of the present application; other drawings can be obtained from them by those of ordinary skill in the art without inventive effort, and such drawings also fall within the protection scope of the present application.
Fig. 1 is a schematic flow chart of a note processing method according to an embodiment of the present application;
FIG. 2 (a) is a diagram of a note-taking image according to an embodiment of the present application;
FIG. 2 (b) is a non-note image corresponding to the note image shown in FIG. 2 (a) provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of another note processing method according to an embodiment of the present disclosure;
FIG. 4 (a) is another note-taking image provided in an embodiment of the present application;
FIG. 4 (b) is a binarized image corresponding to the note-taking image shown in FIG. 4 (a) according to an embodiment of the present application;
FIG. 4 (c) is a note-free image corresponding to the binarized image shown in FIG. 4 (b) according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for performing a note-erasing operation by a note-erasing model according to an embodiment of the present application;
FIG. 6 is a flowchart of a method for training a note erasure model according to an embodiment of the present disclosure;
FIG. 7 is a note-free training image provided in an embodiment of the present application;
FIG. 8 is a note training image provided in an embodiment of the present application;
FIG. 9 is a note-taking training image provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of a note processing apparatus according to an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
The embodiment of the application provides a note processing method, which can be applied to mobile terminals such as mobile phones, as well as to embedded devices, PC devices, and the like.
Referring to fig. 1, fig. 1 is a flow chart of a note processing method according to an embodiment of the present application, where the note processing method may include:
S101: acquiring a note-taking image;
S102: inputting the note-taking image into a note erasing model, and executing a note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
For ease of understanding, the several steps described above are described in combination.
In the implementation process, a note-taking image is first obtained. The note-taking image is an image on which note erasing needs to be performed, and it contains both original printed information and note information to be erased. For example, the note-taking image may be an electronically scanned image of a paper document: the original printed information is the information printed on the paper document, including but not limited to printed text, symbols, numbers, and line segments; the note information to be erased may be handwritten notes made on the original printed information, including but not limited to handwritten text, symbols, numbers, and line segments. Suppose a paper document contains a printed article together with handwritten annotations made on that article; the printed article is then the original printed information, and the handwritten annotations are the note information to be erased.
Note that the method of acquiring the note-taking image does not affect the implementation of the present technical solution. For example, the note-taking image may be an image directly input by the user through a front-end device, an image sent by an electronic scanning device, or an image retrieved from an image library (a database storing images awaiting the note erasing operation); this is not limited in this application.
Further, the note-taking image is input into a note erasing model for processing. Specifically, the note erasing operation is performed on the note-taking image by the note erasing model to obtain the corresponding note-free image, i.e., an image that retains only the original printed information after the note information to be erased has been removed. For example, referring to fig. 2 (a) and fig. 2 (b), fig. 2 (a) is a note-taking image provided in an embodiment of the present application and is the image input to the note erasing model, where "×" indicates the note information to be erased and the rest (asterisks, horizontal lines) is original printed information; fig. 2 (b) is the corresponding note-free image output after the note-taking image shown in fig. 2 (a) is processed by the note erasing model.
The note erasing model is a model trained in advance for performing the note erasing operation on a note-taking image to obtain the corresponding note-free image. After training is completed, the model can be pre-stored in a corresponding storage space, such as internal storage or an external storage device, and called directly when note processing is performed. In addition, during use, the note erasing model can be optimized periodically or on demand, so that a more accurate model is obtained and the accuracy of the note erasing result is further improved.
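By way of non-limiting illustration, the following Python sketch shows how such a pre-stored note erasing model might be loaded and called at inference time; the checkpoint name, grayscale input, and [0, 1] normalization are assumptions made for the sketch, not details specified by this application.

```python
# Illustrative inference sketch (PyTorch + OpenCV). The checkpoint path,
# grayscale input and [0, 1] normalization are assumptions, not details
# fixed by this application.
import cv2
import torch

def erase_notes(image_path: str, model_path: str = "note_eraser.pt"):
    model = torch.jit.load(model_path)  # pre-stored note erasing model
    model.eval()

    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    x = torch.from_numpy(img).float().div(255.0)  # scale to [0, 1]
    x = x.unsqueeze(0).unsqueeze(0)               # shape (1, 1, H, W)

    with torch.no_grad():
        y = model(x)                              # note-free prediction

    return (y.squeeze().clamp(0, 1) * 255).byte().numpy()  # note-free image
```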
Therefore, in the note processing method provided by the embodiment of the application, the note erasing model is trained in advance; when a note-taking image requiring note erasure is obtained, it can be input directly into the note erasing model, which performs the note erasing operation and outputs the note-free image. Compared with manual operation using image-retouching tools, this implementation erases handwritten notes automatically and is clearly more efficient.
The embodiment of the application provides another note processing method.
Referring to fig. 3, fig. 3 is a flowchart of another note processing method according to an embodiment of the present application, where the note processing method may include:
S201: acquiring a note-taking image;
S202: inputting the note-taking image into a note erasing model, and executing a note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image;
S203: determining a region to be processed in the note-free image;
S204: performing note segmentation on the region to be processed to obtain a region segmentation map;
S205: performing background reduction on the region to be processed by using the region segmentation map to obtain a region reduction map;
S206: synthesizing the region reduction map and the note-free image to obtain a background reduction image.
For ease of understanding, the several steps described above are described in combination.
It will be appreciated that after the note erasing operation is performed on the note-taking image using the note erasing model, the erasing effect in the output note-free image may not be ideal, particularly in regions where the printed text is close to or overlaps the handwritten notes; there, handwritten strokes are likely to remain. To address this, the note erasing process can be performed again on the regions of the note-free image with a poor erasing effect, effectively improving the overall result.
Specifically, after the note-free image output by the note erasing model is obtained, it can first be judged whether note erasing needs to be performed again: if the erasing effect is good, reprocessing is unnecessary; if it is poor, reprocessing is needed. When the note-free image needs to be processed again, the region to be processed is determined first, i.e., the region of the note-free image in which note erasing must be redone (the region with a poor erasing effect). For this region, note segmentation is performed first, yielding a segmentation map containing only the handwritten notes in the region, i.e., the region segmentation map. Then, with the region segmentation map as a reference, the region to be processed is carefully filled with background: the handwritten notes indicated by the region segmentation map are filled in with the background color, specifically by replacing note pixel points with background pixel points to achieve a better background reduction effect, which yields the corresponding region reduction map. Finally, the region reduction map is combined with the note-free image to obtain the reprocessed note-free image, i.e., the background reduction image, which clearly has a better erasing effect than the note-free image originally output by the note erasing model.
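As a non-limiting sketch, this S203 to S206 flow can be summarized in a few lines of Python; the images are assumed to be NumPy arrays, the region a rectangle, and `seg_model` / `restore_model` stand-ins for the note segmentation model and background reduction model described below.

```python
# Sketch of the S203-S206 flow; seg_model and restore_model are assumed
# callables standing in for the note segmentation model and the background
# reduction model described below.
def reprocess_region(note_free_img, region, seg_model, restore_model):
    x, y, w, h = region                            # region to be processed
    area = note_free_img[y:y + h, x:x + w]

    area_seg = seg_model(area)                     # region segmentation map
    area_restored = restore_model(area, area_seg)  # region reduction map

    out = note_free_img.copy()                     # synthesize: paste back
    out[y:y + h, x:x + w] = area_restored          # background reduction image
    return out
```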
The note segmentation performed on the region to be processed in S204 to obtain the region segmentation map may be implemented with a note segmentation model: the region map is input into the note segmentation model, which performs the note segmentation operation on it to obtain the corresponding region segmentation map. In one possible implementation, the overall structure of the note segmentation model may be a fully convolutional encoder-decoder, where the encoder may include three convolution layers and four residual modules. Like the note erasing model above, the note segmentation model is trained in advance, can be pre-stored in a corresponding storage space, and can be called directly.
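A minimal PyTorch sketch of such a network follows; the three encoder convolutions and four residual modules match the description above, while the channel widths, strides, and mirrored decoder are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))

class NoteSegNet(nn.Module):
    """Fully convolutional encoder-decoder: three convolution layers plus
    four residual modules in the encoder, as described above; widths,
    strides and the decoder layout are illustrative assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            *[ResidualBlock(128) for _ in range(4)])
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))  # per-pixel note probability
```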
In S205, the background reduction performed on the region to be processed by using the region segmentation map to obtain the region reduction map may be implemented with a background reduction model: the region segmentation map is input into the background reduction model, which performs the background reduction operation to obtain the corresponding region reduction map. In one possible implementation, the background reduction model may consist of an encoder of three downsampling partial-convolution layers, a self-attention module, and a decoder of three upsampling partial-convolution layers. Similarly, like the note erasing model and the note segmentation model above, the background reduction model is trained in advance, can be pre-stored in a corresponding storage space, and can be called directly.
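The sketch below mirrors that description with three downsampling encoder layers, a self-attention module, and three upsampling decoder layers; because partial convolutions require extra mask bookkeeping, plain convolutions are substituted here, which is a simplification of the described design.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Simple spatial self-attention block (SAGAN-style)."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (b, hw, c//8)
        k = self.k(x).flatten(2)                        # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)             # (b, hw, hw)
        v = self.v(x).flatten(2)                        # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out

class BackgroundRestoreNet(nn.Module):
    """Three downsampling encoder layers + self-attention + three
    upsampling decoder layers. The partial convolutions named in the
    description are replaced by plain convolutions in this sketch."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(inplace=True))
        self.attn = SelfAttention(128)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.attn(self.enc(x)))
```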
As a preferred embodiment, the determining the region to be processed in the note-free image may include the following steps:
step one, receiving a trigger selection operation on the note-free image;
step two, taking the region corresponding to the trigger selection operation as the region to be processed.
The preferred embodiment provides an implementation of determining the region to be processed, namely through a user interaction interface. In the implementation process, after the note-free image output by the note erasing model is obtained, it can be displayed on the user interaction interface so that the user can check the erasing result; if the erasing effect is poor, the user can directly frame-select a region requiring secondary processing on the interface. After the user selects the region, the processor receives the trigger selection operation on the note-free image, takes the region corresponding to the operation as the region to be processed, and performs the subsequent processing on that region.
Obviously, in the preferred embodiment, because the user directly frames the region to be processed on the user interaction interface, the region with a poor erasing effect can be determined more accurately and erased a second time, yielding a note-free image that better meets actual requirements.
As a preferred embodiment, the inputting the note-taking image into the note erasing model may include the following steps:
step one, performing binarization processing on the note-taking image to obtain a binarized image;
step two, inputting the binarized image into the note erasing model.
It can be understood that a note-taking image obtained by electronic scanning may contain gray scales of varying degrees, making the image unclear. Therefore, to achieve a better note erasing effect, the note-taking image can first be binarized to obtain a clearer binarized image when it is to be input into the note erasing model; the binarized image is then input into the model, and the note erasing operation is performed on it to obtain the corresponding note-free image, which improves the accuracy of note erasing.
Binarization sets the gray value of each pixel point in the image to 0 or 255, so that the whole image shows an obvious black-and-white effect. For a note-taking image, the gray values of the pixel points of the original printed information and of the note information to be erased are set to 255, and the gray values of the pixel points of the remaining areas (i.e., the background area) are set to 0. The pixel points of the original printed information and of the note information to be erased can be identified according to a gray threshold (set according to historical experience; its value is not limited in this application): when the gray value of a pixel point exceeds the gray threshold, the pixel point can be determined to belong to the original printed information or to the note information to be erased, and its gray value is updated to 255; when the gray value does not exceed the gray threshold, the pixel point can be determined to belong to the background area, and its gray value is updated to 0, so that the note-taking image shows an obvious black-and-white effect. For example, referring to fig. 4 (a), fig. 4 (b) and fig. 4 (c): fig. 4 (a) is another note-taking image provided in an embodiment of the present application; fig. 4 (b) is the binarized image corresponding to it; and fig. 4 (c) is the note-free image corresponding to the binarized image shown in fig. 4 (b). The note-taking image shown in fig. 4 (a) is first binarized to obtain the binarized image shown in fig. 4 (b), and the binarized image is then input into the note erasing model to obtain the note-free image shown in fig. 4 (c).
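A one-call OpenCV sketch of this step follows; the threshold value 127 is an illustrative stand-in for the empirically chosen gray threshold, whose exact value the application does not limit.

```python
import cv2

# Binarization sketch following the convention above: pixels whose gray
# value exceeds the threshold become 255 (printed/note content), the rest
# become 0 (background). The threshold 127 is an illustrative assumption.
gray = cv2.imread("note_image.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
# `binary` is then input into the note erasing model
```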
It is apparent that in the preferred embodiment, binarization is performed on the note-taking image before its notes are erased, which helps to obtain a clearer note-free image.
Based on the above embodiments:
referring to fig. 5, fig. 5 is a flowchart of a method for performing a note-clearing operation through a note-erasing model according to an embodiment of the present application, where the method may include:
S301: identifying the pixel point type of each pixel point in the note-taking image through the note erasing model, where the pixel point type includes: note and non-note;
S302: obtaining a note pixel point set corresponding to the note-taking image according to the pixel point type;
S303: restoring, for each note pixel point in the note pixel point set, the note pixel point in the note-taking image to a background pixel point.
For ease of understanding, the several steps described above are described in combination.
In the implementation process, after a note-taking image is input into the note erasing model, the model first performs pixel point type identification on the image to determine the pixel point type of each pixel point, where the types are note and non-note: a note pixel point is a pixel point whose content is note information, while a non-note pixel point is one whose content is not note information but original printed information or background information. By identifying the type of every pixel point in the note-taking image, all note pixel points in the image can be obtained, and a note pixel point set is generated from them. Further, each note pixel point in the set is restored to a background pixel point, so that the note information in the note-taking image is restored to the background color; that is, the note information in the note-taking image is erased.
Therefore, in the embodiment of the application, each pixel point in the note-taking image is processed by identifying the pixel points one by one and performing background reduction on the note pixel points one by one, so that the note-taking image is restored to a note-free image and a better note erasing effect can be obtained.
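Given a per-pixel note/non-note decision, the restoration step reduces to an indexed assignment; the sketch below assumes the model's output is available as a boolean mask and uses a single scalar background value for brevity.

```python
import numpy as np

def erase_by_pixel_type(img, note_mask, background_value=255):
    """Restore every pixel classified as note to the background value.
    `note_mask` is assumed to be a boolean array derived from the note
    erasing model (True = note pixel); a single scalar background value
    is an illustrative simplification of per-pixel background features."""
    out = img.copy()
    out[note_mask] = background_value  # note pixels -> background color
    return out
```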
As a preferred embodiment, the identifying, by the note erasing model, the pixel point type of each pixel point in the note-taking image may include the following steps:
step one, identifying each pixel point in the note-taking image through the note erasing model to obtain a pixel point set;
step two, for each pixel point in the pixel point set, extracting features of the pixel point in multiple dimensions to obtain pixel point features, where the dimensions include at least one of the following: texture, color, or shape;
step three, determining the pixel point type of the corresponding pixel point according to the pixel point features.
The preferred embodiment provides an implementation of identifying the pixel point type. It can be understood that different types of pixel points necessarily carry different feature information (different colors, textures, shapes, pixel values, etc.); for example, in a note-taking image, the color of the note information, the color of the original printed information, and the color of the background information all differ. On this basis, the pixel point type can be identified by extracting the feature information of each pixel point.
In the implementation process, each pixel point in the note-taking image is first identified, and a pixel point set corresponding to the image is generated from all the pixel points. Feature extraction is then performed on each pixel point in the set, specifically from several preset dimensions, to obtain the pixel point features of each pixel point; the feature extraction dimensions may include, but are not limited to, texture features, color features, and shape features. Finally, the pixel point type of each pixel point, i.e., whether it is a note pixel point or a non-note pixel point, is determined according to the extracted features.
Therefore, the preferred embodiment identifies pixel point types based on the fact that different types of pixel points carry different feature information, which effectively ensures the accuracy of the identification result.
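As an illustration of extracting per-pixel features along the three dimensions named above, the OpenCV/NumPy sketch below uses raw color values, local gray-level variance as a texture proxy, and gradient magnitude as a shape proxy; the window size and these particular proxies are assumptions made for the sketch, not features mandated by the application.

```python
import cv2
import numpy as np

def pixel_features(img_bgr):
    """Illustrative per-pixel features: color (BGR), texture (local
    variance in a 5x5 window) and shape (gradient magnitude). The window
    size and feature choices are assumptions for this sketch."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)

    color = img_bgr.astype(np.float32)                     # (H, W, 3)

    mean = cv2.blur(gray, (5, 5))                          # local mean
    texture = cv2.blur(gray * gray, (5, 5)) - mean * mean  # local variance

    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    shape = cv2.magnitude(gx, gy)                          # edge strength

    # (H, W, 5) feature map: 3 color + 1 texture + 1 shape channels
    return np.dstack([color, texture[..., None], shape[..., None]])
```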
As a preferred embodiment, the restoring the note pixel point in the note-taking image to a background pixel point may include the following steps:
step one, determining a first pixel point feature corresponding to the background pixel point;
step two, determining a second pixel point feature corresponding to the note pixel point;
step three, updating the second pixel point feature of the note pixel point to the first pixel point feature, so as to restore the note pixel point to the background pixel point.
The preferred embodiment provides an implementation of pixel point restoration. In the implementation process, after the pixel point type of each pixel point is identified, the pixel point features of the background pixel points (the first pixel point features) and of the note pixel points (the second pixel point features) are retained separately; the features of each note pixel point are then replaced with the features of a background pixel point. This operation is performed for every note pixel point in the note-taking image, so that all note pixel points are restored to background pixel points; that is, the note information in the note-taking image is restored to the background color, completing the erasure of the note information.
Therefore, in the preferred embodiment, by performing the feature replacement of the pixel points, all the note pixel points in the note-taking image are restored to the background pixel points, so that the note information in the note-taking image is erased.
Based on the above embodiments:
referring to fig. 6, fig. 6 is a flowchart of a method for training a note erasing model according to an embodiment of the present application, where the method may include:
S401: acquiring a note-free image training set and a note image training set;
S402: generating a note-taking image training set based on the note-free image training set and the note image training set;
S403: training the initial model by using the note-taking image training set and the note-free image training set to obtain a note erasing model.
For ease of understanding, the several steps described above are described in combination.
In the implementation process, a note-free image training set and a note image training set are first obtained. The note-free image training set contains a large number of note-free training images, i.e., original images that contain only original printed information and carry no note information to be erased, as shown in fig. 7. The note image training set contains a large number of note training images, i.e., images containing only note information, as shown in fig. 8. Further, note-taking training images are generated from the note-free training images in the note-free image training set and the note training images in the note image training set, and a note-taking image training set is generated from all the note-taking training images; a note-taking training image contains both original printed information and note information to be erased, as shown in fig. 9, which was generated from the note-free training image shown in fig. 7 and the note training image shown in fig. 8. Finally, the note-free image training set and the note-taking image training set are used as model training samples to train an initial model, yielding the note erasing model for performing the note erasing operation.
Because the note-taking image training set is generated automatically from the note-free image training set and the note image training set, each note-taking training image corresponds to a note-free training image, and each note-free training image corresponds to one or more note-taking training images. Based on this, in the model training process, a corresponding note-free training image and note-taking training image first form a training data pair, with the note-taking training image as the original image and the note-free training image as the label image. The original image is then input into the initial model to obtain a model output image, the model output image is compared with the label image, and a loss value is computed with the loss function. The model parameters are then updated according to the loss value and training continues; by iterating this update over many rounds to minimize the loss, a note erasing model meeting the requirements is obtained. With this model, the output image closely approximates the label image, so the model can be applied in the note processing method of each embodiment above: the note erasing operation is performed on a note-taking image to obtain the corresponding note-free image.
In one possible implementation, the note erasing model may be based on a generative adversarial network (GAN) and built with reference to the Pix2PixHD network, including a multi-stage generator and a multi-scale discriminator to facilitate producing high-definition images.
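A condensed training-step sketch follows. The generator G and discriminator D are assumed to be Pix2PixHD-style modules as mentioned above; the BCE adversarial loss and the L1 weight are illustrative choices for the sketch, not the loss composition fixed by this application.

```python
import torch
import torch.nn as nn

def train_step(G, D, opt_g, opt_d, noted, note_free, l1_weight=100.0):
    """One conditional-GAN training step: `noted` is the note-taking
    training image (original image), `note_free` the note-free label.
    The loss composition and weights here are illustrative assumptions."""
    adv, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

    # Discriminator: real (input, label) pair vs. generated pair.
    fake = G(noted).detach()
    d_real, d_fake = D(noted, note_free), D(noted, fake)
    loss_d = adv(d_real, torch.ones_like(d_real)) + \
             adv(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool D while staying close to the note-free label.
    fake = G(noted)
    d_fake = D(noted, fake)
    loss_g = adv(d_fake, torch.ones_like(d_fake)) + l1_weight * l1(fake, note_free)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    return loss_d.item(), loss_g.item()
```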
Therefore, in the embodiment of the application, the note-taking image training set is generated automatically from the note-free image training set and the note image training set. Compared with the traditional approach of obtaining training data by manually annotating note-taking images and manually erasing their notes with retouching tools, this realizes automatic generation of the training data set and is simple and efficient.
As a preferred embodiment, the generating the note-taking image training set based on the note-free image training set and the note image training set may include the following steps:
step one, for any note-free training image in the note-free image training set, combining the note-free training image with each note training image in the note image training set to obtain image pairs;
step two, for any image pair, synthesizing the note-free training image and the note training image in the image pair to obtain a note-taking training image;
step three, generating the note-taking image training set based on all the note-taking training images.
The preferred embodiment provides an implementation of generating the note-taking image training set from the note-free image training set and the note image training set. In the implementation process, each note-free training image in the note-free image training set is first combined with each note training image in the note image training set to obtain image pairs. Then, for each image pair, the note-free training image and the note training image are synthesized into a note-taking training image; the essence of this image synthesis is to composite the note information of the note training image into the note-free training image, producing an image that carries note information. Finally, after the synthesis operation has been performed on all image pairs, the note-taking image training set is generated from all the resulting note-taking training images.
It is conceivable that a much larger number of note-taking training images can be obtained with this synthesis method, which helps to improve the accuracy of the note erasing model.
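In code, this pairing step is a Cartesian product; the sketch below assumes the two training sets are held as Python lists and that a `synthesize` routine such as the one sketched in the next preferred embodiment produces each note-taking training image.

```python
from itertools import product

def build_note_taking_set(note_free_images, note_images, synthesize):
    """Pair every note-free training image with every note training image
    and synthesize one note-taking training image per pair, as described
    above; `synthesize` is an assumed routine implementing the mask-based
    compositing of the following preferred embodiment."""
    return [synthesize(nf, nt) for nf, nt in product(note_free_images, note_images)]
```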
As a preferred embodiment, the above-mentioned combining the note-free training image and the note training image in the image pair to obtain the note-taking training image may include the following steps:
step one, determining a synthesis region in the note-free training image, and generating a mask template based on the synthesis region;
step two, performing mask extraction on the note training image by using the mask template to obtain a mask image;
step three, synthesizing the mask image and the note-free training image to obtain the note-taking training image.
The preferred embodiment provides an implementation of synthesizing a note-free training image and a note training image. In the implementation process, the synthesis region in the note-free training image is determined first, i.e., the region to which the note information of the note training image will be added; this is generally a region (the background region) other than the original printed information in the note-free training image. A mask template is then generated based on the synthesis region and used to perform mask extraction on the note training image, yielding the corresponding mask image. The essence of mask extraction is that the strokes of the note information in the note training image are extracted while everything other than the strokes, such as the background, is removed; the extracted strokes are placed in the synthesis region of the mask template, giving a mask image that contains only the strokes of the note information, located within the synthesis region determined in step one. Finally, the mask image and the note-free training image are synthesized to obtain the note-taking training image corresponding to the image pair; this synthesis can be implemented with the addWeighted algorithm (an image weighted-blending algorithm) in the OpenCV toolkit (a cross-platform computer vision and machine learning software library).
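A compact OpenCV sketch of this synthesis follows. The gray threshold used to isolate strokes, the blending weights passed to addWeighted, and the rectangular synthesis region are all illustrative assumptions; only the use of addWeighted for the final blend is taken from the description above.

```python
import cv2

def synthesize(note_free, note_img, region):
    """Composite note strokes into the synthesis region of a note-free
    training image. The stroke threshold (200) and blending weights are
    illustrative assumptions; `region` is an (x, y, w, h) rectangle."""
    x, y, w, h = region
    patch = cv2.resize(note_img, (w, h))

    # Mask extraction: keep only the (dark) note strokes of the note image.
    gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
    _, strokes = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

    # Blend the strokes into the region with OpenCV's addWeighted.
    roi = note_free[y:y + h, x:x + w]
    blended = cv2.addWeighted(roi, 0.2, patch, 0.8, 0)
    out = note_free.copy()
    out_roi = out[y:y + h, x:x + w]
    out_roi[strokes > 0] = blended[strokes > 0]  # only stroke pixels change
    return out  # note-taking training image
```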
In one possible implementation, to obtain clearer note-taking training images, a text detection algorithm may be applied to the note-free training image to determine its printed-text area, and the areas other than the printed-text area are then used as the synthesis region; that is, the note information of the note training image is added to non-text areas of the note-free training image as far as possible, yielding clearer, higher-quality note-taking training images.
As described above, the note segmentation performed on the region to be processed in S204 to obtain the region segmentation map may be implemented with the note segmentation model. On this basis, after the mask image is obtained by the mask extraction in step two, it may additionally be saved, yielding a mask image training set; the initial model can then be trained with the mask image training set and the note-taking image training set to obtain the note segmentation model, with the training process referring to that of the note erasing model.
The embodiment of the application provides a note processing device.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a note processing device according to an embodiment of the present application, where the note processing device may include:
an acquisition module 1 for acquiring a note-taking image;
and the processing module 2 is used for inputting the note-taking image into the note erasing model, and executing the note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
Therefore, in the note processing apparatus provided by the embodiment of the application, the note erasing model is trained in advance; when a note-taking image requiring note erasure is obtained, it can be input directly into the note erasing model, which performs the note erasing operation and outputs the note-free image. Compared with manual operation using image-retouching tools, this implementation erases handwritten notes automatically and is clearly more efficient.
As a preferred embodiment, the note processing apparatus may further include an area processing module, and the area processing module may include:
the area determining unit is used for determining a region to be processed in the note-free image after the note-free image corresponding to the note-taking image is obtained;
The note segmentation unit is used for carrying out note segmentation on the region to be processed to obtain a region segmentation map;
the background reduction unit is used for carrying out background reduction on the region to be processed by utilizing the region segmentation map to obtain a region reduction map;
and the image synthesis unit is used for synthesizing the region reduction map and the note-free image to obtain a background reduction image.
As a preferred embodiment, the processing module 2 may include:
the binarization unit is used for carrying out binarization processing on the note-taking image to obtain a binarized image;
and an input unit for inputting the binarized image to the note erasure model.
As a preferred embodiment, the processing module 2 may include:
the type identifying unit is used for identifying the pixel point type of each pixel point in the note-taking image through the note erasing model, where the pixel point type includes: note and non-note;
the set generating unit is used for obtaining a note pixel point set corresponding to the note-taking image according to the pixel point type; and the pixel restoring unit is used for restoring, for each note pixel point in the note pixel point set, the note pixel point in the note-taking image to a background pixel point.
As a preferred embodiment, the type identifying unit may be specifically configured to identify each pixel point in the note-taking image through the note erasing model to obtain a pixel point set; for each pixel point in the pixel point set, extract features of the pixel point in multiple dimensions to obtain pixel point features, where the dimensions include at least one of the following: texture, color, or shape; and determine the pixel point type of the corresponding pixel point according to the pixel point features.
As a preferred embodiment, the pixel restoring unit may be specifically configured to determine a first pixel point feature corresponding to the background pixel point; determine a second pixel point feature corresponding to the note pixel point; and update the second pixel point feature of the note pixel point to the first pixel point feature, so as to restore the note pixel point to the background pixel point.
As a preferred embodiment, the note processing apparatus may further include a model training module, and the model training module may include:
the acquisition unit is used for acquiring the note-free image training set and the note image training set;
the generating unit is used for generating a note-taking image training set based on the note-free image training set and the note image training set;
the training unit is used for training the initial model by using the note-taking image training set and the note-free image training set to obtain a note erasing model.
As a preferred embodiment, the generating unit may include:
a combining subunit, configured to combine, for any note-free training image in the note-free image training set, the note-free training image with each note training image in the note image training set to obtain image pairs;
a synthesis subunit, configured to synthesize, for any image pair, the note-free training image and the note training image in the image pair to obtain a note-taking training image;
and a generation subunit, configured to generate the note-taking image training set based on all the note-taking training images.
As a preferred embodiment, the synthesis subunit may be specifically configured to determine a synthesis region in the note-free training image and generate a mask template based on the synthesis region; perform mask extraction on the note training image by using the mask template to obtain a mask image; and synthesize the mask image and the note-free training image to obtain the note-taking training image.
For the description of the note processing device provided in the embodiment of the present application, reference is made to the embodiment of the note processing method, and the embodiment of the present application is not repeated herein.
The embodiment of the application provides electronic equipment.
Referring to fig. 11, fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device may include:
a memory for storing a computer program;
a processor for implementing steps of any of the note processing methods described above when executing a computer program.
As shown in fig. 11, which is a schematic diagram of a composition structure of an electronic device, the electronic device may include: a processor 10, a memory 11, a communication interface 12 and a communication bus 13. The processor 10, the memory 11 and the communication interface 12 all complete communication with each other through a communication bus 13.
In the present embodiment, the processor 10 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), or another programmable logic device. The processor 10 may call a program stored in the memory 11; in particular, the processor 10 may perform the operations in the embodiments of the note processing method.
The memory 11 is used for storing one or more programs, and the programs may include program codes, where the program codes include computer operation instructions, and in this embodiment, at least the programs for implementing the following functions are stored in the memory 11:
acquiring a note-taking image;
and inputting the note-taking image into a note erasing model, and executing a note erasing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
In one possible implementation, the memory 11 may include a program storage area and a data storage area, where the program storage area may store the operating system and at least one application program required for the above functions, and the data storage area may store data created during use.
In addition, the memory 11 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device or another non-volatile solid-state storage device.
The communication interface 12 may be an interface of a communication module for interfacing with other devices or systems.
It should be noted that the structure shown in fig. 11 does not limit the electronic device in the embodiment of the present application; in practical applications, the electronic device may include more or fewer components than shown in fig. 11, or combine certain components.
An embodiment of the present application further provides a computer readable storage medium having a computer program stored thereon which, when executed by a processor, implements the steps of any of the note processing methods described above.
The computer readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
For an introduction to the computer readable storage medium provided in the embodiments of the present application, reference may be made to the above method embodiments; details are not repeated here.
In this description, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; for the parts the embodiments have in common, reference may be made to one another. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
Those skilled in the art will further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative units and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The technical solution provided by the present application has been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the above description of the embodiments is intended only to assist in understanding the method of the present application and its core ideas. It should be noted that various improvements and modifications may be made to the present application by those skilled in the art without departing from its principles, and such improvements and modifications fall within the scope of the present application.

Claims (10)

1. A note processing method, comprising:
acquiring a note-taking image;
and inputting the note-taking image into a note erasing model, and executing a note clearing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
2. The method according to claim 1, further comprising, after the obtaining of the note-free image corresponding to the note-taking image:
determining a region to be processed in the note-free image;
performing note segmentation on the region to be processed to obtain a region segmentation map;
performing background reduction on the region to be processed by using the region segmentation map to obtain a region reduction map;
and synthesizing the area reduction image and the note-free image to obtain a background reduction image.
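For illustration only, a minimal sketch of the post-processing in claim 2: a binary region segmentation map marks residual note pixels inside a region to be processed, and the background is restored there before the patch is composited back. The median-colour background estimate is an assumption, not the claimed method.

import numpy as np

def reduce_background(note_free: np.ndarray, region: tuple,
                      seg_map: np.ndarray) -> np.ndarray:
    """region: (y0, x0, y1, x1); seg_map: boolean map over that region,
    True where the region segmentation marked note residue."""
    y0, x0, y1, x1 = region
    patch = note_free[y0:y1, x0:x1].copy()      # region to be processed
    background = np.median(patch[~seg_map], axis=0)
    patch[seg_map] = background                 # region reduction map
    out = note_free.copy()
    out[y0:y1, x0:x1] = patch                   # background reduction image
    return out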
3. The method of claim 1, wherein the inputting the note-taking image into a note erasing model comprises:
performing binarization processing on the note-taking image to obtain a binarized image;
and inputting the binarized image into the note erasing model.
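A minimal sketch of the binarization step in claim 3, using OpenCV's Otsu thresholding; the claim does not fix a binarization method, so the choice of Otsu is an assumption.

import cv2

def binarize(image_path: str):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # note-taking image
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary  # binarized image fed to the note erasing model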
4. The method of claim 1, wherein the performing a note clearing operation on the note-taking image through the note erasing model comprises:
identifying a pixel point type of each pixel point in the note-taking image through the note erasing model, wherein the pixel point type comprises: note and non-note;
obtaining a note pixel point set corresponding to the note-taking image according to the pixel point type;
and for each note pixel point in the note pixel point set, restoring the note pixel point in the note-taking image to a background pixel point.
5. The method of claim 4, wherein the identifying a pixel point type of each pixel point in the note-taking image through the note erasing model comprises:
identifying each pixel point in the note-taking image through the note erasing model to obtain a pixel point set;
for each pixel point in the pixel point set, extracting features of the pixel point in multiple dimensions to obtain a pixel point feature, wherein the dimensions comprise at least one of the following: texture, color, or shape;
and determining the pixel point type of the corresponding pixel point according to the pixel point feature.
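The multi-dimensional pixel point features of claim 5 could, for instance, be assembled as below; the specific texture and shape operators (local standard deviation, Sobel gradient magnitude) are illustrative assumptions chosen only to cover the three named dimensions.

import numpy as np
from scipy import ndimage

def pixel_features(image: np.ndarray) -> np.ndarray:
    """Stack colour, texture and shape cues into an HxWx5 feature volume."""
    gray = image.mean(axis=2)
    # texture: local standard deviation in a 5x5 window
    mean = ndimage.uniform_filter(gray, size=5)
    sq_mean = ndimage.uniform_filter(gray ** 2, size=5)
    texture = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    # shape: Sobel gradient magnitude
    shape = np.hypot(ndimage.sobel(gray, axis=1), ndimage.sobel(gray, axis=0))
    # colour: the raw channels themselves
    return np.concatenate([image, texture[..., None], shape[..., None]], axis=2)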
6. The method of claim 4, wherein the restoring the note pixel point in the note-taking image to a background pixel point comprises:
determining a first pixel point feature corresponding to the background pixel point;
determining a second pixel point feature corresponding to the note pixel point;
and updating the second pixel point feature of the note pixel point to the first pixel point feature, so as to restore the note pixel point to the background pixel point.
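To make claims 4 and 6 concrete, here is a hedged sketch in which a per-pixel note probability (assumed to be the model's output) is thresholded into the two pixel point types, and each note pixel point's feature is updated to a background feature estimated as the mean of the non-note pixels; the 0.5 threshold and the mean estimate are assumptions.

import numpy as np

def clear_notes(image: np.ndarray, note_prob: np.ndarray,
                thresh: float = 0.5) -> np.ndarray:
    """image: HxWx3; note_prob: HxW per-pixel note probability."""
    is_note = note_prob > thresh            # pixel point type: note / non-note
    note_pixel_set = np.argwhere(is_note)   # note pixel point set
    # first pixel point feature: a background colour estimate (assumption)
    background_feature = image[~is_note].mean(axis=0)
    out = image.copy()
    for y, x in note_pixel_set:
        out[y, x] = background_feature      # restore note pixel to background
    return out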
7. The method according to any one of claims 1 to 6, wherein the note erasing model is trained by:
acquiring a note-free image training set and a note image training set;
generating a note-taking image training set based on the note-free image training set and the note image training set;
and training an initial model by using the note-taking image training set and the note-free image training set to obtain the note erasing model.
8. The method of claim 7, wherein the generating a note-taking image training set based on the note-free image training set and the note image training set comprises:
for any note-free training image in the note-free image training set, combining the note-free training image with each note training image in the note image training set to obtain an image pair;
for any image pair, synthesizing the note-free training image and the note training image in the image pair to obtain a note-taking training image;
and generating the note-taking image training set based on all the note-taking training images.
9. The method of claim 8, wherein the synthesizing the note-free training image and the note training image in the image pair to obtain the note-taking training image comprises:
determining a synthesis region in the note-free training image, and generating a mask template based on the synthesis region;
performing mask extraction on the note training image by using the mask template to obtain a mask image;
and synthesizing the mask image and the note-free training image to obtain the note-taking training image.
10. A note processing apparatus, comprising:
the acquisition module is used for acquiring the note-taking image;
the processing module is used for inputting the note-taking image into a note erasing model, and executing note clearing operation on the note-taking image through the note erasing model to obtain a note-free image corresponding to the note-taking image.
Priority Applications (1)

Application Number: CN202210657938.5A — Priority Date / Filing Date: 2022-06-10 — Title: Note processing method and device — Status: Pending

Publications (1)

Publication Number: CN117253242A — Publication Date: 2023-12-19

Family ID: 89125258

Country Status (1)

Country: CN — Publication: CN117253242A (en)


Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination