CN115035539B

CN115035539B - Document anomaly detection network model construction method and device, electronic equipment and medium

Info

Publication number: CN115035539B
Application number: CN202210964812.2A
Authority: CN
Inventors: 冯德亮; 孙铁; 陈奕均; 毛奔; 冯伟
Original assignee: Ping An Bank Co Ltd
Current assignee: Ping An Bank Co Ltd
Priority date: 2022-08-12
Filing date: 2022-08-12
Publication date: 2022-10-28
Anticipated expiration: 2042-08-12
Also published as: CN115035539A

Abstract

The embodiment of the application provides a method and a device for constructing a document anomaly detection network model, electronic equipment and a medium, and belongs to the technical field of artificial intelligence. The method comprises the following steps: randomly selecting a text area based on a normal document image, and generating a document abnormal image sample set according to the text area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples; carrying out document abnormal marking on each document abnormal image to generate marking information files corresponding to each marked image sample; extracting marking information files with a first sample number from the plurality of marking information files, and generating a training image index list according to the marking information files with the first sample number; and training the initial document abnormality detection network model according to the real bounding box, the training image index list and the document abnormality training image to obtain the document abnormality detection network model. Therefore, the document abnormality detection can be carried out on the document image through the model, and the automation degree and accuracy of the document abnormality detection are improved.

Description

Document anomaly detection network model construction method and device, electronic equipment and medium

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for constructing a document anomaly detection network model, an electronic device, and a medium.

Background

At present, document images include those photographed by users and those obtained by scanning. Document images uploaded by clients may contain overlapping characters and occluded characters, and the industry does not yet offer a scheme that separates the characters and directly restores the occluded information correctly. Therefore, it is desirable to provide a solution for analyzing abnormal text conditions of a document image, such as text overlap and text occlusion.

Disclosure of Invention

In order to solve the technical problem, embodiments of the present application provide a method and an apparatus for constructing a document anomaly detection network model, an electronic device, and a medium.

In a first aspect, an embodiment of the present application provides a method for constructing a document anomaly detection network model, where the method includes:

randomly selecting a character area based on a normal document image, and generating a document abnormal image sample set according to the character area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;

carrying out abnormal marking on each abnormal document image to obtain a plurality of marked image samples, and generating marking information files corresponding to the marked image samples;

determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from a plurality of marking information files, and generating a training image index list according to the marking information files of the first sample number;

and constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain the document anomaly detection network model.

In a second aspect, an embodiment of the present application provides a document abnormality detection method for a document image, where the method includes:

inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by the first aspect;

detecting the document image to be detected through the document abnormality detection network model to obtain a document abnormality output result, wherein the document abnormality output result comprises document abnormality coordinate information, object confidence, category probability and a category to which the document abnormality output result belongs;

and generating a document abnormal detection result according to the document abnormal coordinate information, the object confidence coefficient, the class probability and the belonged class.

In a third aspect, an embodiment of the present application provides a document anomaly detection network model building apparatus, where the apparatus includes:

the selecting module is used for randomly selecting a character area based on a normal document image and generating a document abnormal image sample set according to the character area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;

the marking module is used for carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples and generating marking information files corresponding to the marked image samples;

the determining module is used for determining the first sample number of a document abnormal image training set, extracting marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number;

and the training module is used for constructing an initial document abnormity detection network model based on a YOLO framework, and training the initial document abnormity detection network model according to the size of a real boundary box, the training image index list and a document abnormity training image corresponding to the training image index list to obtain a document abnormity detection network model.

In a fourth aspect, an embodiment of the present application provides an apparatus for detecting document anomalies of a document image, where the apparatus includes:

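the input module is used for inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by the first aspect;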

the detection module is used for detecting the document image to be detected through the document abnormity detection network model to obtain a document abnormity output result, and the document abnormity output result comprises document abnormity coordinate information, object confidence coefficient, class probability and the class of the document abnormity output result;

and the generation module is used for generating a document abnormity detection result according to the document abnormity coordinate information, the object confidence coefficient, the class probability and the belonged class.

In a fifth aspect, an embodiment of the present application provides an electronic device, which includes a memory and a processor, where the memory is used to store a computer program, and when the processor runs the computer program, the computer program executes the method for constructing a document anomaly detection network model according to the first aspect, or executes the method for detecting document anomalies of a document image according to the second aspect.

In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program runs on a processor, the computer program executes the method for constructing a document abnormality detection network model provided in the first aspect, or executes the method for detecting document abnormality of a document image provided in the second aspect.

According to the document anomaly detection network model construction method, the document anomaly detection network model construction device, the electronic equipment and the medium, document anomaly detection can be performed on the document image through the document anomaly detection network model, influences of factors such as difficulty in positioning under the condition that the characters are very small, interference of missing parts of the characters and the like on the document anomaly detection are avoided, and the automation degree and accuracy of the document anomaly detection are improved.

Drawings

In order to more clearly explain the technical solutions of the present application, the drawings needed to be used in the embodiments are briefly introduced below, and it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope of protection of the present application. Like components are numbered similarly in the various figures.

FIG. 1 is a flow chart illustrating a document anomaly detection network model building method according to an embodiment of the present application;

FIG. 2 is another schematic flow chart of a document anomaly detection network model construction method provided in the embodiment of the present application;

FIG. 3 is a flow chart illustrating a document abnormality detection method for a document image according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a document anomaly detection network model building device according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a document abnormality detection apparatus for a document image according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.

Reference numerals: 400-document anomaly detection network model construction device, 401-selection module, 402-marking module, 403-determination module, 404-training module, 500-document image anomaly detection device, 501-input module, 502-detection module, 503-generation module, 600-electronic equipment, 601-transceiver, 602-processor and 603-memory.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments.

The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

Hereinafter, the terms "including", "having", and their derivatives, which may be used in various embodiments of the present application, are intended to indicate only specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be construed as first excluding the existence of, or adding to, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.

Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another, and are not to be construed as indicating or implying relative importance.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the various embodiments of this application belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments.

Example 1

The embodiment of the disclosure provides a method for constructing a document anomaly detection network model.

Referring to fig. 1, the method for constructing the document anomaly detection network model includes:

step S101, randomly selecting a character area based on a normal document image, and generating a document abnormal image sample set according to the character area.

In this embodiment, one or more position areas are automatically and randomly selected in a normal document image; the positions of the characters can be confirmed through optical character recognition (OCR), and a text content set composed of the text contents is obtained. The document abnormal image sample set comprises a plurality of document abnormal image samples. It should be noted that the normal document image may be a normal document scan file obtained by scanning a normal document, or a normal document photo taken by a user, which is not limited herein.

In one embodiment, the generating a document abnormal image sample set according to the text area in step S101 includes:

determining the character position of the character area through an OCR (optical character recognition), and acquiring a text content set corresponding to the character position;

acquiring the background and font color of the text content set in the normal document image through OPENCV image processing, calculating the font size through the width and height of the character position and the line number of the text, and constructing an edge frame according to the background, the font color and the font size;

and generating a document abnormal image sample according to the edge frame and the original text frame of the normal document image.

In this embodiment, OPENCV provides image processing algorithms such as image binarization, erosion, filtering and blurring, and these algorithms can be used to obtain the background and font color of the text content in the normal document image. For example, if the text content is "today's weather is sunny", the OPENCV image processing algorithm may determine that, in the original normal document image, "today's weather is sunny" has a black background and a yellow font color. The font size indicates the text size style, such as font size 4. The edge frame is slid over the original text frame of the normal document image, and different document abnormal image samples are generated by controlling the intersection over union of the edge frame and the original text frame.
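As an illustration of the paragraph above, the following is a minimal sketch of how the background, font color and font size might be estimated for one text box. It uses only the Python standard library instead of OPENCV, and it rests on assumptions that are not stated in this application: the background is taken to be the most frequent pixel value in the box, the font color the second most frequent, and the font size the box height divided by the number of text lines. The function names are hypothetical.

```python
from collections import Counter


def estimate_colors(pixels):
    """Return (background, font_color) for a text box given as rows of RGB tuples.

    Assumption: the most frequent colour is the background and the second
    most frequent colour is the font colour.
    """
    counts = Counter(p for row in pixels for p in row)
    ranked = [color for color, _ in counts.most_common(2)]
    background = ranked[0]
    font_color = ranked[1] if len(ranked) > 1 else ranked[0]
    return background, font_color


def estimate_font_size(box_height, line_count):
    """Approximate the font size in pixels as box height per text line (assumption)."""
    if line_count <= 0:
        raise ValueError("line_count must be positive")
    return box_height / line_count


# Toy 3 x 4 text box: yellow strokes on a black background.
BLACK, YELLOW = (0, 0, 0), (255, 255, 0)
patch = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, YELLOW, YELLOW, BLACK],
    [BLACK, BLACK, YELLOW, BLACK],
]
background, font_color = estimate_colors(patch)
font_size = estimate_font_size(box_height=48, line_count=2)
```

In practice the binarization and filtering steps mentioned above would be applied before counting colors, so that anti-aliased edge pixels do not disturb the estimate.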

In an embodiment, the generating the document abnormal image sample according to the edge frame and the original text frame of the normal document image includes:

constructing the text overlapping sample by calculating the intersection over union of the edge frame and the original text frame; and/or,

and shielding the normal document image through a preset text box according to the text content set to obtain the text shielding sample.

In the present embodiment, the text overlap sample is generated by moving the edge frame and using its intersection over union with the original text frame to control the area where the text overlaps. Using the text content set, a mask is applied through a preset text box at a random offset within a certain range to the left or right, generating the text occlusion sample. Each sample includes one or more abnormal regions, which may be text overlap regions and text occlusion regions.
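The control of overlap by intersection over union can be sketched as follows. This is only an illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples, the edge frame has the same size as the original text frame, and it is shifted horizontally only. For such a shift dx on a box of width w, the intersection over union is (w - dx) / (w + dx), so a target value t is reached with dx = w(1 - t) / (1 + t). The helper names and the offset range are hypothetical.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_edge_frame(text_box, target_iou):
    """Shift a same-sized copy of text_box to the right to reach target_iou.

    For a horizontal shift dx: IoU = (w - dx) / (w + dx),
    hence dx = w * (1 - t) / (1 + t).
    """
    x1, y1, x2, y2 = text_box
    dx = (x2 - x1) * (1.0 - target_iou) / (1.0 + target_iou)
    return (x1 + dx, y1, x2 + dx, y2)


def random_occlusion_box(text_box, max_offset, rng):
    """Place a same-sized mask at a random left/right offset within max_offset."""
    x1, y1, x2, y2 = text_box
    dx = rng.uniform(-max_offset, max_offset)
    return (x1 + dx, y1, x2 + dx, y2)


text_box = (0.0, 0.0, 100.0, 20.0)
edge_frame = shifted_edge_frame(text_box, target_iou=0.5)
overlap = iou(text_box, edge_frame)
mask_box = random_occlusion_box(text_box, max_offset=10.0, rng=random.Random(0))
```

Drawing the text again inside `edge_frame`, or filling `mask_box` with a solid color, would then produce the overlap and occlusion samples respectively.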

Step S102, carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples, and generating marking information files corresponding to each marked image sample.

In one embodiment, the marking information file includes a custom image object name, an image file path, an image size, and document abnormal coordinate information.

Exemplarily, marking is performed with the open-source labelImg tool: the occluded position area is framed with a rectangle, the object name of the occluded area is set to hid, and a marking information file, which may be a marking xml file, is then generated according to the occluded position area and the occluded-area object name. For example, the marking xml file contains the custom image object name photo1, the image file name, the image file path, the image size (width, height and depth), the minimum x coordinate, minimum y coordinate, maximum x coordinate and maximum y coordinate of the occluded area position, and the occluded-area object name hid. The marking xml file is placed in another folder.

It should be noted that the marking and saving process for overlapping text is similar to that for occluded text. All marks made on the same image are saved in the same marking xml file, and the object name labels for occlusion and overlap may be represented as 0 and 1, or given names such as hid, which is not limited herein.
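For illustration, a marking xml file of the kind described above can be written and read back with the Python standard library. The sketch assumes the Pascal VOC layout that labelImg can export (annotation, filename, path, size, object, bndbox). The file name photo1.jpg, the path and the object name overlap are hypothetical placeholders; only hid appears in this application.

```python
import xml.etree.ElementTree as ET


def build_marking_xml(filename, path, width, height, depth, objects):
    """Build a Pascal VOC style marking xml string.

    objects is a list of (object_name, (xmin, ymin, xmax, ymax)).
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    ET.SubElement(root, "path").text = path
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    for name, (xmin, ymin, xmax, ymax) in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_boxes(xml_text):
    """Return [(object_name, (xmin, ymin, xmax, ymax)), ...] from a marking xml string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        coords = tuple(int(bnd.find(tag).text)
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text, coords))
    return boxes


# Occlusion and overlap marks of the same image go into the same file.
xml_text = build_marking_xml(
    "photo1.jpg", "images/photo1.jpg", 416, 416, 3,
    [("hid", (10, 20, 110, 40)), ("overlap", (30, 60, 90, 80))],
)
marked_boxes = read_boxes(xml_text)
```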

Step S103, determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number.

Exemplarily, the ratio of training samples to test samples is set to 9:1, and the first sample number and the second sample number are determined according to this ratio. Marking information files of the first sample number are extracted from the plurality of marking information files, the marked image samples corresponding to these files form the training image index list, the corresponding document abnormal images are looked up according to the training image index list, and the document abnormal image training set is generated from the images found.

The marking information files are extracted at random. The image file name is determined from each extracted marking information file, and the file name with its suffix (such as jpg) removed is used as a training image index. The training image indexes are saved to a txt file, one index per line, to obtain the training image index list. The marked label content and the original image corresponding to each marking information file can both be obtained according to the training image index list.
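The random 9:1 extraction and the index list can be sketched as below. The sketch assumes that the image file names have already been read from the marking information files; the function names, the fixed random seed and the example file names are hypothetical.

```python
import random
from pathlib import Path


def split_index_lists(image_names, train_ratio=0.9, seed=0):
    """Randomly split image file names into training and test index lists.

    Each index is the file name with its suffix (such as .jpg) removed.
    """
    shuffled = list(image_names)
    random.Random(seed).shuffle(shuffled)
    first_sample_number = round(len(shuffled) * train_ratio)
    indexes = [Path(name).stem for name in shuffled]
    return indexes[:first_sample_number], indexes[first_sample_number:]


def to_index_text(indexes):
    """Render an index list as txt content, one index per line."""
    return "\n".join(indexes) + "\n"


image_names = [f"img_{i}.jpg" for i in range(10)]
train_indexes, test_indexes = split_index_lists(image_names)
train_txt = to_index_text(train_indexes)
```

Writing `train_txt` to a file gives a training image index list from which both the marking xml file and the original image of every sample can be located by name.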

Step S104, constructing an initial document abnormality detection network model based on a YOLO framework, and training the initial document abnormality detection network model according to the size of a real boundary box, the training image index list and a document abnormality training image corresponding to the training image index list to obtain a document abnormality detection network model.

Exemplarily, based on the YOLO framework, a cross-stage partial connection (CSP) is added to each large residual block of Darknet53 to form the backbone network (backbone), corresponding to layer0 to layer104. Spatial pyramid pooling is added to enlarge the receptive field of the network: 5 × 5, 9 × 9 and 13 × 13 max pooling is performed on layer107 to obtain layer108, layer110 and layer112 respectively; after pooling, the layers are concatenated (concatenate) to form layer114, which is reduced to 512 channels through a 1 × 1 convolution. On the basis of the FPN, a downsampling (downsample) operation is added after the upsampling (upsample) to realize feature fusion, so as to obtain the initial document anomaly detection network model.
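The spatial pyramid pooling step can be illustrated with a small pure-Python sketch. It assumes, as is common in YOLO implementations but not stated in this application, that each max pooling uses stride 1 with padding of half the kernel size, so the spatial size is unchanged and the branches can be concatenated along the channel axis. The toy input has 2 channels of 13 × 13 (416 / 32 = 13) rather than a real feature map, and the concatenation order is likewise an assumption.

```python
def max_pool_same(feature, kernel):
    """Stride-1 max pooling on a 2-D list; out-of-range cells are ignored,
    which is equivalent to padding kernel // 2 cells with minus infinity."""
    pad = kernel // 2
    h, w = len(feature), len(feature[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                feature[y][x]
                for y in range(max(0, i - pad), min(h, i + pad + 1))
                for x in range(max(0, j - pad), min(w, j + pad + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp(channels, kernels=(5, 9, 13)):
    """Concatenate the input channels with their 5x5, 9x9 and 13x13 pooled versions."""
    branches = [channels]
    for kernel in kernels:
        branches.append([max_pool_same(channel, kernel) for channel in channels])
    return [channel for branch in branches for channel in branch]


# Two toy 13 x 13 channels: a ramp and a constant map.
channels = [
    [[i * 13 + j for j in range(13)] for i in range(13)],
    [[1] * 13 for _ in range(13)],
]
fused = spp(channels)
```

The channel count grows fourfold (here from 2 to 8), which is why a 1 × 1 convolution follows the concatenation to bring the result back down, to 512 channels in the example above.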

Training uses four M60 graphics cards. According to the size of the video memory, the size of the image input into the initial document anomaly detection network model is adjusted to 416 × 416, and the image to be detected is then input into the initial document anomaly detection network model. For training the model parameters, Mosaic data augmentation, label smoothing, CIoU, cosine annealing learning rate decay and the Mish activation function are adopted. In addition, because the features extracted by the backbone network are generic, freezing the backbone during training can speed up training and also prevent the weights from being damaged in the initial stage of training. Exemplarily, 200 epochs are trained (an epoch is one training pass). For the first 100 epochs, the initial learning rate is set to 1e-3 and batch_size (the size of each batch of data) is 4; for the last 100 epochs, in an attempt to increase the training speed and reduce video memory usage, the initial learning rate is set to 1e-4 and batch_size is 2.
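Some of the training components named above have simple closed forms, sketched here with the standard library only. The Mish function, the cosine annealing formula and label smoothing are written in their commonly used forms. The minimum learning rate, the smoothing factor, and the choice to freeze the backbone exactly during the first 100 epochs are assumptions for illustration, since this application gives only the learning rates and batch sizes.

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(math.log1p(math.exp(x)))


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine annealing from lr_max at step 0 to lr_min at total_steps."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def smooth_labels(one_hot, epsilon=0.01):
    """Label smoothing: y * (1 - epsilon) + epsilon / K for K classes."""
    k = len(one_hot)
    return [y * (1.0 - epsilon) + epsilon / k for y in one_hot]


def training_plan(epoch):
    """Two-stage plan for 200 epochs.

    From the text: lr 1e-3 and batch_size 4 for epochs 0-99,
    lr 1e-4 and batch_size 2 for epochs 100-199.
    Assumption: the backbone is frozen during the first stage only.
    """
    if epoch < 100:
        return {"initial_lr": 1e-3, "batch_size": 4, "freeze_backbone": True}
    return {"initial_lr": 1e-4, "batch_size": 2, "freeze_backbone": False}


stage_one = training_plan(0)
stage_two = training_plan(150)
lr_mid_stage_one = cosine_annealing_lr(50, 100, stage_one["initial_lr"])
smoothed = smooth_labels([0.0, 1.0], epsilon=0.1)
```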

Referring to fig. 2, the training the initial document anomaly detection network model according to the size of the real bounding box, the training image index list, and the document anomaly training image corresponding to the training image index list in step S104 includes:

step S1041, loading marking information files corresponding to the training image index list through the initial document anomaly detection network model, acquiring document anomaly coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;

step S1042, training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real bounding box and the document abnormality training image.

Exemplarily, the marking xml files in the marking folder are loaded to obtain the minimum x coordinate, minimum y coordinate, maximum x coordinate and maximum y coordinate of each marked occlusion and overlap region position. The width and height of each real bounding box (ground truth bounding box) derived from these coordinates are then used as input data of the K-means clustering. Since the size of each real bounding box differs across scenes of different sizes, it is necessary to normalize the width and height of the bounding box by the width and height of the image.
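The clustering of real bounding box sizes can be sketched as follows. The normalization by image width and height follows the paragraph above. Assigning each box to the cluster center with which it has the highest intersection over union (the usual choice for YOLO prior boxes) is an assumption, as are the number of clusters, the random initialization and the toy coordinates.

```python
import random


def wh_iou(a, b):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def normalized_sizes(boxes, image_w, image_h):
    """Convert (xmin, ymin, xmax, ymax) boxes to (width, height) relative to the image."""
    return [((x2 - x1) / image_w, (y2 - y1) / image_h) for x1, y1, x2, y2 in boxes]


def kmeans_sizes(sizes, k, iterations=50, seed=0):
    """K-means on box sizes; each box joins the centre with the highest IoU."""
    centers = random.Random(seed).sample(sizes, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for size in sizes:
            best = max(range(k), key=lambda i: wh_iou(size, centers[i]))
            groups[best].append(size)
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                mean_w = sum(w for w, _ in group) / len(group)
                mean_h = sum(h for _, h in group) / len(group)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(center)  # keep an empty cluster's centre
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Toy marks from 416 x 416 images: three small and three large regions.
boxes = [
    (0, 0, 40, 10), (5, 5, 47, 16), (10, 10, 48, 19),
    (0, 0, 200, 40), (10, 10, 214, 52), (20, 20, 216, 58),
]
sizes = normalized_sizes(boxes, image_w=416, image_h=416)
prior_sizes = kmeans_sizes(sizes, k=2)
```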

It should be added that, the method for constructing a document anomaly detection network model provided in this embodiment further includes:

storing each document abnormal image into an image folder;

and storing each marking information file into a marking folder, wherein each marking information file under the marking folder corresponds to each abnormal document image under the image folder one by one.

In this embodiment, the text overlap samples and the text occlusion samples are saved in the same image folder. The automatically generated text overlap samples and text occlusion samples each number in the tens of thousands, and the number may be larger, which is not limited herein. Exemplarily, the marking xml file is placed in a marking folder.

It is further added that the method for constructing a document anomaly detection network model provided in this embodiment further includes:

determining a second sample number of the document abnormal image test set;

extracting marking information files with the second sample number from the plurality of marking information files, and generating a test image index list according to the marking information files with the second sample number;

and determining, through the document anomaly detection network model, the false detection rate and the missed detection rate of the document abnormality results according to the document abnormal coordinate information corresponding to the test image index list and the document abnormal test images.
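One possible way to compute the two rates on the test set is sketched below. This application does not define them, so the sketch assumes that a predicted box counts as correct when its intersection over union with a not yet matched marked box is at least 0.5, that the false detection rate is the share of predictions without a match, and that the missed detection rate is the share of marked boxes without a match. The threshold and all names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def detection_rates(predictions, ground_truths, iou_threshold=0.5):
    """Return (false_detection_rate, missed_detection_rate) for one class."""
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_index, best_iou = None, 0.0
        for index, truth in enumerate(ground_truths):
            if index in matched:
                continue
            overlap = iou(pred, truth)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1
    false_detections = len(predictions) - true_positives
    missed = len(ground_truths) - true_positives
    false_rate = false_detections / len(predictions) if predictions else 0.0
    miss_rate = missed / len(ground_truths) if ground_truths else 0.0
    return false_rate, miss_rate


marked = [(10, 10, 50, 30), (100, 100, 160, 120)]
predicted = [(12, 11, 50, 30), (300, 300, 340, 320)]
false_rate, miss_rate = detection_rates(predicted, marked)
```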

The method for constructing the document abnormality detection network model provided by this embodiment is to construct an initial document abnormality detection network model based on a YOLO framework, train the initial document abnormality detection network model according to the size of a real bounding box, the training image index list, and a document abnormality training image corresponding to the training image index list to obtain the document abnormality detection network model, and perform document abnormality detection on a document image through the document abnormality detection network model to improve the automation degree and accuracy of document abnormality detection.

Example 2

The embodiment of the disclosure provides a document abnormality detection method for a document image.

Referring to fig. 3, the document abnormality detection method of the document image includes:

step S301, inputting a document image to be detected to the document abnormity detection network model.

In this embodiment, the document abnormality detection network model is obtained according to the document abnormality detection network model construction method provided in embodiment 1.

For the detailed process, refer to embodiment 1; it is not repeated here to avoid repetition.

Step S302, detecting the document image to be detected through the document abnormity detection network model to obtain a document abnormity output result, wherein the document abnormity output result comprises document abnormity coordinate information, object confidence, class probability and the class.

Exemplarily, the object confidence (confidence) reflects both the probability that a bounding box contains an object and the accuracy of its position (i.e. whether the occluded area is exactly enclosed), and is expressed by the formula Pr(hid) × IOU, where IOU is the intersection over union between the predicted value and the actual value. During training, Pr(hid) × IOU of the label is 1; during prediction, confidence is the predicted value. When predicting whether there is occlusion (occluded-area object name hid), a document abnormality category conditional probability is also provided, namely the class probability given the confidence, so the final score (scores) is the confidence multiplied by the class probability. Primary screening keeps the predictions whose class probability is greater than the preset parameter 0.5. The remaining predictions under each class are sorted by confidence multiplied by class probability, the prediction with the maximum score (scores) is taken, and non-maximum suppression removes predictions whose overlap with it is greater than the preset parameter 0.4, yielding the final optimal prediction results under each class.
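The screening and non-maximum suppression described above can be sketched as follows, using the preset parameters 0.5 and 0.4 from the text. Each prediction is assumed to be a tuple (x1, y1, x2, y2, obj_conf, class_conf, class_pred); the function names, the class name overlap and the sample values are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def postprocess(predictions, class_threshold=0.5, nms_threshold=0.4):
    """Screen by class probability, then apply per-class non-maximum suppression."""
    candidates = [p for p in predictions if p[5] > class_threshold]
    kept = []
    for cls in sorted({p[6] for p in candidates}):
        # Sort the class's predictions by score = obj_conf * class_conf.
        remaining = sorted((p for p in candidates if p[6] == cls),
                           key=lambda p: p[4] * p[5], reverse=True)
        while remaining:
            best = remaining.pop(0)
            kept.append(best)
            # Drop boxes that overlap the kept box by more than nms_threshold.
            remaining = [p for p in remaining
                         if iou(best[:4], p[:4]) <= nms_threshold]
    return kept


predictions = [
    (10, 10, 110, 30, 0.9, 0.95, "hid"),
    (12, 11, 112, 31, 0.8, 0.90, "hid"),      # near-duplicate of the first box
    (200, 50, 300, 70, 0.7, 0.80, "overlap"),
    (400, 400, 420, 410, 0.9, 0.30, "hid"),   # class probability below 0.5
]
final_results = postprocess(predictions)
```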

Step S303, generating a document abnormal detection result according to the document abnormal coordinate information, the object confidence coefficient, the category probability and the belonged category.

In one embodiment, step S303 includes:

determining an abnormal regression position according to the abnormal coordinate information of the document;

determining a document abnormal score according to the product of the object confidence and the category probability;

and determining the number of abnormal documents in each category according to the category to which the document belongs.

Exemplarily, the document abnormal output result is (x1, y1, x2, y2, obj_conf, class_conf, class_pred), whose elements respectively represent the minimum x coordinate, minimum y coordinate, maximum x coordinate, maximum y coordinate, object confidence, class probability, and the category to which the result belongs. For example, to determine the number of document occlusion anomalies, the results whose category is the occluded-area object hid are looked up. If the number of such results is 0, there is no document occlusion anomaly; if it is N, the number of document occlusion anomalies is N. The position of the document occlusion can be obtained directly from the corresponding x1, y1, x2, y2 of the document abnormal output result, and the final score (scores) is obj_conf × class_conf in the document abnormal output result.

In the above example, the document occlusion regression position, the number of document occlusion anomalies, and the prediction score are obtained. The document overlapping regression position, the number of document overlapping anomalies, and the obtaining manner of the prediction scores are similar to the document shielding regression position, the number of document shielding anomalies, and the prediction score, which are not repeated herein.
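As a sketch of how the detection result might be assembled from such output tuples, the snippet below groups the results by category and records, for each category, the number of anomalies, their positions and their scores. The dictionary layout and the sample values are assumptions for illustration.

```python
from collections import defaultdict


def summarize(outputs):
    """Group (x1, y1, x2, y2, obj_conf, class_conf, class_pred) tuples by category."""
    summary = defaultdict(lambda: {"count": 0, "positions": [], "scores": []})
    for x1, y1, x2, y2, obj_conf, class_conf, class_pred in outputs:
        entry = summary[class_pred]
        entry["count"] += 1
        entry["positions"].append((x1, y1, x2, y2))
        entry["scores"].append(obj_conf * class_conf)  # final score
    return dict(summary)


outputs = [
    (10, 10, 110, 30, 0.9, 0.95, "hid"),
    (200, 50, 300, 70, 0.5, 0.80, "hid"),
]
result = summarize(outputs)
occlusion_count = result.get("hid", {"count": 0})["count"]
overlap_count = result.get("overlap", {"count": 0})["count"]
```

A count of 0 for a category, as for the hypothetical overlap category here, means that no anomaly of that kind was found in the image.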

Therefore, abnormal conditions such as document overlap and document occlusion can be detected, with the false detection rate and the missed detection rate approaching 0.1 or lower. This provides a diagnostic capability for document images uploaded by users and quality assurance at the level of uploaded document images.

According to the document abnormality detection method for the document image, document abnormality detection can be performed on the document image through the document abnormality detection network model, so that the influence of factors such as difficulty in positioning under the condition of very small characters, interference of missing parts of the characters and the like on the document abnormality detection is avoided, and the automation degree and accuracy of the document abnormality detection are improved.

Example 3

In addition, the embodiment of the disclosure provides a document anomaly detection network model construction device.

As shown in fig. 4, the document abnormality detection network model building apparatus 400 includes:

a selecting module 401, configured to randomly select a text region based on a normal document image, and generate a document abnormal image sample set according to the text region, where the document abnormal image sample set includes a plurality of document abnormal image samples;

a marking module 402, configured to perform document abnormal marking on each document abnormal image to obtain multiple marked image samples, and generate marking information files corresponding to each marked image sample;

a determining module 403, configured to determine a first number of samples of a document abnormal image training set, extract the marking information files of the first number of samples from the plurality of marking information files, and generate a training image index list according to the marking information files of the first number of samples;

the training module 404 is configured to construct an initial document anomaly detection network model based on a YOLO framework, and train the initial document anomaly detection network model according to the size of a real bounding box, the training image index list, and a document anomaly training image corresponding to the training image index list, so as to obtain a document anomaly detection network model.
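As a rough illustration of how the determining module might build the training image index list, the following Python sketch randomly draws the first number of samples from the marking information files and records the image names. The `.xml` extension, the file naming, the function name `build_index_lists`, and the choice to draw a disjoint test list for the second sample number in the same pass are all assumptions; the embodiment does not specify them.

```python
import random
from pathlib import Path


def build_index_lists(marking_files, first_n, second_n, seed=0):
    """Split marking files into training and test index lists of image names."""
    if first_n + second_n > len(marking_files):
        raise ValueError("not enough marking information files")
    rng = random.Random(seed)
    # Sort first so the shuffle is reproducible whatever order the files arrive in.
    shuffled = sorted(marking_files)
    rng.shuffle(shuffled)
    # An index entry is the file name without its extension (assumed convention).
    train = [Path(p).stem for p in shuffled[:first_n]]
    test = [Path(p).stem for p in shuffled[first_n:first_n + second_n]]
    return train, test


# Hypothetical marking information files.
files = [f"labels/doc_{i:03d}.xml" for i in range(10)]
train_index, test_index = build_index_lists(files, first_n=8, second_n=2)
```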

In an embodiment, the training module 404 is further configured to load a marking information file corresponding to the training image index list through the initial document anomaly detection network model, obtain document anomaly coordinate information of the loaded marking information file, and use size information of the real bounding box as input data of K-means clustering;

and training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real boundary box and the document abnormality training image.
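The use of real bounding box sizes as K-means input can be sketched as follows. This is a minimal pure-Python version of Lloyd's K-means under assumptions not stated in the embodiment: boxes are given as (x1, y1, x2, y2), the distance is squared Euclidean distance on (width, height), and the resulting cluster centers are taken as candidate anchor sizes. Some YOLO variants use an IoU-based distance instead; the function names `box_sizes` and `kmeans_sizes` are hypothetical.

```python
import random


def box_sizes(boxes):
    """Convert (x1, y1, x2, y2) boxes to (width, height) pairs."""
    return [(x2 - x1, y2 - y1) for x1, y1, x2, y2 in boxes]


def kmeans_sizes(sizes, k, iters=100, seed=0):
    """Lloyd's K-means on (width, height) pairs; returns k centers sorted by area."""
    rng = random.Random(seed)
    centers = rng.sample(sizes, k)  # initialize from k of the samples
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in sizes:
            # Assign each size to the nearest center (squared Euclidean distance).
            j = min(
                range(k),
                key=lambda c: (w - centers[c][0]) ** 2 + (h - centers[c][1]) ** 2,
            )
            clusters[j].append((w, h))
        # Recompute each center as the mean of its cluster; keep empty clusters as-is.
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Illustrative real bounding boxes: three small and three large regions.
boxes = [
    (0, 0, 10, 10), (5, 5, 17, 16), (1, 1, 12, 10),
    (0, 0, 100, 50), (10, 10, 114, 62), (4, 2, 100, 50),
]
anchors = kmeans_sizes(box_sizes(boxes), k=2)
```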

In an embodiment, the selecting module 401 is further configured to determine a text position of the text region through OCR, and obtain a text content set corresponding to the text position;

acquiring the background and font color of the text content set in the normal document image through an OPENCV image processing algorithm, calculating the font size through the width and height of the character position and the line number of the text, and constructing an edge frame according to the background, the font color and the font size;

In an embodiment, the document abnormal image sample includes a text overlapping sample and/or a text occlusion sample, and the selecting module 401 is further configured to construct the text overlapping sample by calculating an intersection ratio of the edge frame and an original text frame; and/or

In one embodiment, the document anomaly detection network model building apparatus 400 further includes:

the storage module is used for storing the abnormal images of the documents into an image folder;

a determining module 403, configured to determine a second sample number of the document abnormal image test set;

and determining, through the document anomaly detection network model, the false detection rate and the omission rate of the document abnormal result according to the document abnormal coordinate information corresponding to the test image index list and the document abnormal test image.
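The embodiment does not define how the false detection rate and the omission rate are computed. The Python sketch below shows one possible reading under explicit assumptions: a prediction matches a real box when the categories agree and the intersection ratio (IoU) is at least 0.5, matching is greedy and one-to-one, the false detection rate is the share of predictions with no match, and the omission rate is the share of real boxes left unmatched. The names `iou` and `error_rates` and the threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def error_rates(predictions, ground_truth, iou_threshold=0.5):
    """Both inputs are lists of (category, box); returns (false_rate, miss_rate)."""
    unmatched = list(ground_truth)
    false_detections = 0
    for cat, box in predictions:
        # Candidates: remaining real boxes of the same category with enough overlap.
        candidates = [
            g for g in unmatched
            if g[0] == cat and iou(box, g[1]) >= iou_threshold
        ]
        if candidates:
            # Greedy one-to-one matching: consume the best-overlapping real box.
            best = max(candidates, key=lambda g: iou(box, g[1]))
            unmatched.remove(best)
        else:
            false_detections += 1
    false_rate = false_detections / len(predictions) if predictions else 0.0
    miss_rate = len(unmatched) / len(ground_truth) if ground_truth else 0.0
    return false_rate, miss_rate


# Illustrative data: one correct occlusion hit, one spurious hit, one missed overlap.
truth = [("hid", (0, 0, 10, 10)), ("overlap", (20, 20, 40, 40))]
preds = [("hid", (1, 1, 10, 10)), ("hid", (50, 50, 60, 60))]
false_rate, miss_rate = error_rates(preds, truth)
```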

The document abnormality detection network model construction device 400 provided in this embodiment can implement the document abnormality detection network model construction method provided in embodiment 1, and is not described herein again to avoid repetition.

The document anomaly detection network model construction device provided by the embodiment constructs an initial document anomaly detection network model based on a YOLO framework, trains the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model, and can perform document anomaly detection on a document image through the document anomaly detection network model to improve the automation degree and accuracy of document anomaly detection.

Example 4

In addition, the embodiment of the disclosure provides a document abnormality detection device for a document image.

As shown in fig. 5, the document abnormality detection apparatus 500 of the document image includes:

an input module 501, configured to input a document image to be detected to a document anomaly detection network model, where the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided in embodiment 1;

the detection module 502 is configured to detect the document image to be detected through the document anomaly detection network model to obtain a document anomaly output result, where the document anomaly output result includes document anomaly coordinate information, object confidence, category probability, and a category to which the document anomaly output result belongs;

and a generating module 503, configured to generate a document anomaly detection result according to the document anomaly coordinate information, the object confidence, the category probability, and the category to which the document anomaly detection result belongs.

In one embodiment, the generating module 503 is further configured to determine an abnormal regression position according to the document abnormal coordinate information;

and determining the number of abnormal documents in each category according to the belonged category.

Example 5

Furthermore, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the computer program executes, when running on the processor, the method for constructing a document abnormality detection network model provided in embodiment 1 or the method for detecting document abnormality of a document image provided in embodiment 2.

Specifically, referring to fig. 6, the electronic device 600 includes a transceiver 601, a bus interface, and a processor 602. When running on the processor 602, the computer program performs the method for constructing the document anomaly detection network model provided in embodiment 1. Specifically, the processor 602 is configured for: randomly selecting a text area based on a normal document image, and generating a document abnormal image sample set according to the text area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;

carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples, and generating marking information files corresponding to the marked image samples;

and constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model.

In addition, the computer program executes the method for detecting document abnormality of a document image provided in embodiment 2 when running on the processor, and specifically, the processor 602 is further configured to: inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by the embodiment 1;

In the embodiment of the present invention, the electronic device 600 further includes: a memory 603. In FIG. 6, the bus architecture may include any number of interconnected buses and bridges, with one or more processors represented by processor 602 and various circuits of memory represented by memory 603 linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface. The transceiver 601 may be a number of elements including a transmitter and a receiver that provide a means for communicating with various other apparatus over a transmission medium. The processor 602 is responsible for managing the bus architecture and general processing, and the memory 603 may store data used by the processor 602 in performing operations.

The electronic device 600 provided in the embodiment of the present invention may execute the steps of the method for constructing a network model for detecting document abnormalities in the foregoing method embodiment 1, or the steps of the method for detecting document abnormalities in a document image in embodiment 2, which are not described again.

The electronic device provided in this embodiment constructs an initial document anomaly detection network model based on a YOLO framework, trains the initial document anomaly detection network model according to the size of a real bounding box, the training image index list, and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model, and can perform document anomaly detection on a document image through the document anomaly detection network model, thereby improving the automation degree and accuracy of document anomaly detection.

Example 6

The present application also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for constructing a document abnormality detection network model provided in embodiment 1, or implements the method for detecting document abnormality of a document image provided in embodiment 2.

In this embodiment, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

The computer-readable storage medium provided in this embodiment may implement the method for constructing a network model for detecting document anomalies provided in embodiment 1, or implement the method for detecting document anomalies of a document image provided in embodiment 2, and is not described herein again to avoid repetition.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or terminal comprising the element.

Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present application or portions thereof that contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for constructing a document anomaly detection network model is characterized by comprising the following steps:

constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real bounding box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model;

the training of the initial document abnormality detection network model according to the size of the real bounding box, the training image index list and the document abnormality image corresponding to the training image index list comprises the following steps:

loading marking information files corresponding to the training image index list through the initial document anomaly detection network model, acquiring document anomaly coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;

2. The method of claim 1, wherein generating a document abnormal image sample according to the text region comprises:

3. The method according to claim 2, wherein the document abnormal image samples comprise text overlapping samples and/or text shading samples, and the generating of the document abnormal image samples according to the edge frame and the original text frame of the normal document image comprises:

4. The method of claim 1, further comprising:

storing each document abnormal image into an image folder;

5. The method of claim 1, wherein the marking information file includes custom image object name, image file path, image size, and document anomaly coordinate information.

6. The method according to claim 1, characterized in that it comprises:

determining a second sample number of the document abnormal image test set;

7. A document abnormality detection method for a document image, characterized by comprising:

inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method of any one of claims 1-6;

detecting the document image to be detected through the document anomaly detection network model to obtain a document anomaly output result, wherein the document anomaly output result comprises document anomaly coordinate information, object confidence, category probability and a category to which the document anomaly output result belongs;

8. The method of claim 7, wherein generating the document anomaly detection result according to the document anomaly coordinate information, the object confidence level, the class probability and the belonged class comprises:

9. An apparatus for constructing a document anomaly detection network model, the apparatus comprising:

the determining module is used for determining the first sample number of the document abnormal image training set, extracting the marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number;

the training module is used for constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real bounding box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model;

the training module is also used for loading marking information files corresponding to the training image index list through the initial document anomaly detection network model, obtaining document anomaly coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;

10. An apparatus for detecting document abnormality of a document image, the apparatus comprising:

an input module, configured to input a document image to be detected to a document anomaly detection network model, where the document anomaly detection network model is obtained according to the document anomaly detection network model construction method according to any one of claims 1 to 6;

the detection module is used for detecting the document image to be detected through the document anomaly detection network model to obtain a document anomaly output result, wherein the document anomaly output result comprises document anomaly coordinate information, object confidence, class probability and the class to which the document anomaly output result belongs;

11. An electronic device comprising a memory and a processor, wherein the memory stores a computer program which, when run on the processor, executes the method for constructing a document abnormality detection network model according to any one of claims 1 to 6 or the method for detecting document abnormality of a document image according to claim 7 or 8.

12. A computer-readable storage medium characterized by storing a computer program which, when run on a processor, executes the document abnormality detection network model construction method of any one of claims 1 to 6, or executes the document abnormality detection method of a document image of claim 7 or 8.