CN110008956B - Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium - Google Patents


Info

Publication number
CN110008956B
CN110008956B
Authority
CN
China
Prior art keywords
neural network
convolutional neural
invoice
key information
training data
Prior art date
Legal status
Active
Application number
CN201910256914.7A
Other languages
Chinese (zh)
Other versions
CN110008956A (en)
Inventor
张欢
李爱林
张仕洋
Current Assignee
Shenzhen Smart Shield Technology Co.,Ltd.
Original Assignee
Shenzhen Huafu Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Huafu Technology Co ltd filed Critical Shenzhen Huafu Technology Co ltd
Priority to CN201910256914.7A priority Critical patent/CN110008956B/en
Publication of CN110008956A publication Critical patent/CN110008956A/en
Application granted granted Critical
Publication of CN110008956B publication Critical patent/CN110008956B/en

Classifications

    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods (neural networks)
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; special marks for positioning

Abstract

The invention relates to an invoice key information positioning method and device, computer equipment, and a storage medium. The method comprises: acquiring an invoice image to be positioned; and inputting the invoice image into a convolutional neural network model for information extraction to obtain the invoice key information, wherein the model is obtained by training a U-Net convolutional neural network on invoice images with feature labels as training data. The invoice image to be positioned is input into a convolutional neural network model built on a U-Net backbone; the model classifies the image, locates its key information regions, and outputs the invoice key information, so that the key information can be positioned rapidly and with high accuracy.

Description

Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium
Technical Field
The invention relates to text detection methods, and in particular to an invoice key information positioning method and device, computer equipment, and a storage medium.
Background
General VAT taxpayers, other than those in commercial retail, issue and manage invoices through the VAT anti-counterfeiting tax-control system; that is, the same system is used to issue special VAT invoices, general VAT invoices, and so on. Enterprises, organizations, individuals, and government bodies handle ever more invoices, and sorting them consumes ever more time; manual entry and manual retrieval are not only time-consuming but also error-prone. Automated recognition equipment is therefore urgently needed.
Traditional character recognition systems mostly rely on classical computer vision algorithms rather than neural networks, so their accuracy is low, and most require a scanner or similar device. They achieve a certain recognition quality when the bill is clean, flat, and clear, but fail on the blurred bills common in natural scenes. Most existing algorithms also perform poorly on real photographs taken casually with a mobile phone at arbitrary angles, especially given that invoices fold and fade easily, and that phone photographs must contend with varied lighting and weather conditions, angles, devices, perspective distortion, and so on.
There are two existing approaches to locating text information. The first locates key information with a scanner and fine-tunes the positions with traditional computer vision. This is inconvenient because it depends on a dedicated scanner, and because the scanned bill must be nearly free of folds, it cannot cope with the folded bills encountered in practice. The second requires the user to pre-align the bill in the picture according to instructions, detects all text lines, sends every detected line into a recognition system, and then applies very complex layout-analysis post-processing to work out what each recognized item is. In general it is difficult to structure the many information fields on a bill this way; because much redundant text is detected, errors in it have unpredictable effects on information extraction and positioning, and the subsequent recognition, processing, and analysis consume substantial computing resources.
Therefore, a new method is needed that can position invoice key information rapidly and with high accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an invoice key information positioning method, an invoice key information positioning device, computer equipment and a storage medium.
In order to achieve the above purpose, the present invention adopts the following technical scheme: the invoice key information positioning method comprises the following steps:
acquiring an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction to obtain invoice key information;
the convolutional neural network model is obtained by training the U-Net convolutional neural network by taking invoice images with characteristic labels as training data.
The further technical scheme is as follows: the convolutional neural network model is obtained by training a U-Net convolutional neural network by using invoice images with characteristic labels as training data, and comprises the following steps:
acquiring invoice images with characteristic labels to obtain training data;
constructing a U-Net convolutional neural network;
and learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data to obtain a convolutional neural network model.
The further technical scheme is as follows: the training data is utilized to learn the U-Net convolutional neural network based on a deep learning framework so as to obtain a convolutional neural network model, and the method comprises the following steps:
inputting training data into a U-Net convolutional neural network to obtain sample key information;
calculating a loss value according to the sample key information;
and learning the U-Net convolutional neural network based on the deep learning framework according to the loss value to obtain a convolutional neural network model.
The further technical scheme is as follows: the U-Net convolutional neural network is a network that produces a multi-layer invoice image quarter resolution feature map.
The further technical scheme is as follows: the step of inputting training data into the U-Net convolutional neural network to obtain sample key information comprises the following steps:
inputting training data into a U-Net convolutional neural network for convolutional processing to obtain a text box;
and merging the text boxes to form sample key information.
The invention also provides an invoice key information positioning device, which comprises:
the data acquisition unit is used for acquiring an invoice image to be positioned;
and the extraction unit is used for inputting the invoice image to be positioned into the convolutional neural network model for information extraction so as to obtain invoice key information.
The further technical scheme is as follows: the apparatus further comprises:
and the model training unit is used for training the U-Net convolutional neural network by taking the invoice image with the characteristic tag as training data so as to obtain a convolutional neural network model.
The further technical scheme is as follows: the model training unit includes:
the training data acquisition subunit is used for acquiring invoice images with characteristic labels so as to obtain training data;
the network construction subunit is used for constructing a U-Net convolutional neural network;
and the learning subunit is used for learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data so as to obtain a convolutional neural network model.
The invention also provides a computer device which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the method when executing the computer program.
The present invention also provides a storage medium storing a computer program which, when executed by a processor, performs the above-described method.
Compared with the prior art, the invention has the following beneficial effects: the invoice image to be positioned is acquired and input into a convolutional neural network model built on a U-Net backbone; the model classifies the image, locates its key information regions, and outputs the invoice key information, so that the key information can be positioned rapidly and with high accuracy.
The invention is further described below with reference to the drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an application scenario of an invoice key information positioning method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of an invoice key information positioning method according to an embodiment of the present invention;
FIG. 3 is a schematic sub-flowchart of an invoice key information positioning method according to an embodiment of the present invention;
FIG. 4 is a schematic sub-flowchart of an invoice key information positioning method according to an embodiment of the present invention;
FIG. 5 is a schematic sub-flowchart of an invoice key information positioning method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an exemplary road-segmentation pixel-level result processed using a U-Net convolutional neural network, according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a visual feature map provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram of sample key information provided by an embodiment of the present invention;
FIG. 9 is a schematic block diagram of an invoice key information positioning device provided by an embodiment of the invention;
fig. 10 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic diagram of an application scenario of the invoice key information positioning method according to an embodiment of the present invention, and fig. 2 is a schematic flow chart of the method. The invoice key information positioning method is applied to a server. The server exchanges data with a terminal: after acquiring an invoice image to be positioned from the terminal, it classifies and positions the image with the convolutional neural network model, outputs the invoice key information, and displays it on the terminal.
Fig. 2 is a flow chart of an invoice key information positioning method according to an embodiment of the present invention. As shown in fig. 2, the method includes the following steps S110 to S120.
S110, acquiring an invoice image to be positioned.
In this embodiment, the invoice image to be located refers to a photograph obtained by taking an invoice by the terminal.
S120, inputting the invoice image to be positioned into a convolutional neural network model for information extraction so as to obtain invoice key information.
In this embodiment, the invoice key information refers to text information such as an invoice code, an invoice number, an invoicing date, an amount, a tax, and a total amount on an invoice.
In this embodiment, the convolutional neural network model is obtained by training the U-Net convolutional neural network by using the invoice image with the feature tag as training data.
Convolutional neural network models are widely used for computer vision tasks such as object detection, instance segmentation, and object classification, where they achieve very good results and show strong adaptability. For key point localization, the present method borrows from image segmentation: a typical pixel-level road-segmentation result produced with a U-Net convolutional neural network is shown in fig. 6.
The U-Net convolutional neural network is used to extract the feature layers. Because of the particularity of invoice image data, training data is not as plentiful as for general street-scene data, and results would otherwise suffer. To strengthen robustness against interference, the features extracted by the U-Net convolutional neural network are constrained, and a dedicated loss function is used to train the network.
In one embodiment, referring to fig. 3, the step S120 may include steps S121 to S123.
S121, acquiring invoice images with characteristic labels to obtain training data.
In this embodiment, the training data refers to an invoice image integrated dataset with text feature labels.
Specifically, a large number of invoice images can be downloaded from websites; the 14 feature layers of each invoice image are labelled, and the images are then fed into the network for training.
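As a rough illustration of what such 14-layer feature labels could look like, the sketch below builds a quarter-resolution label tensor for one annotated text box. The channel layout (eight one-hot class channels followed by six boundary/geometry channels) loosely follows the description given later in this document, but the exact encoding is an assumption, not the patent's disclosed format:

```python
import numpy as np

def make_label_tensor(h, w, boxes):
    """Build a 14-channel label map at one quarter of the image
    resolution.  Assumed channel layout: channels 0-7 one-hot classes
    (0 = background), channels 8-13 reserved for boundary/vertex
    geometry (left zero here for brevity)."""
    fh, fw = h // 4, w // 4
    label = np.zeros((fh, fw, 14), dtype=np.float32)
    label[..., 0] = 1.0  # everything starts as background
    for (x0, y0, x1, y1, cls) in boxes:   # box coords in image space
        fx0, fy0, fx1, fy1 = x0 // 4, y0 // 4, x1 // 4, y1 // 4
        label[fy0:fy1, fx0:fx1, 0] = 0.0  # clear the background flag
        label[fy0:fy1, fx0:fx1, cls] = 1.0  # cls in 1..7
    return label

# One hypothetical text box on a 256x512 invoice photo, class index 2.
lbl = make_label_tensor(256, 512, [(40, 40, 200, 80, 2)])
print(lbl.shape)  # (64, 128, 14)
```

In practice the geometry channels would be filled with boundary and vertex offsets per pixel; only the classification channels are populated here.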
S122, constructing a U-Net convolutional neural network.
In this embodiment, the U-Net convolutional neural network consists of two main parts: a contracting path and an expanding path. The contracting path captures context in the picture, while the symmetric expanding path precisely localizes the parts to be segmented. Deep learning architectures often require large numbers of examples and considerable computational resources, but U-Net improves on the FCN (Fully Convolutional Network) and, with data augmentation, can be trained on relatively few samples. The constructed U-Net convolutional neural network performs convolutional classification on the training data and outputs positioning information for text and other content, forming a feature map.
The U-Net convolutional neural network is a network that produces a multi-layer feature map of the invoice image at one quarter of its resolution.
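The quarter-resolution output can be pictured as the net effect of two 2x2 pooling stages on the contracting path (the real network interleaves convolutions and partially restores resolution on the expanding path with skip connections); a minimal sketch, assuming plain max pooling:

```python
import numpy as np

def max_pool2(x):
    """One 2x2 max-pooling stage with stride 2 (a contracting step)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.random.rand(256, 512)
feat = max_pool2(max_pool2(img))  # two stages -> one quarter resolution
print(feat.shape)  # (64, 128)
```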
The U-Net convolutional neural network is trained continuously by minimizing the difference between the predicted feature maps and the actual feature labels for each image, the standard training procedure in deep learning, until the predicted and actual values are nearly identical. The first eight layers of feature maps perform multi-class classification over the background, invoice code, invoice number, billing date, amount, tax, and total amount. At each position on these eight feature maps there are eight values between 0 and 1 that sum to 1; the layer with the largest value indicates the most probable class for that position, so each pixel can be assigned a category. After background and categories are predicted in this way, for each non-background pixel the ninth feature map predicts whether the corresponding position is close to a boundary, and the tenth predicts whether each boundary position is a left or a right boundary. The eleventh through fourteenth feature maps predict the boundary position and the positions of the two nearest boundary vertices, so that finally each position inside a text region can predict a text box.
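The per-pixel classification over the first eight feature layers amounts to an argmax across eight softmax channels. A minimal sketch; note the text names seven categories for eight channels, so the eighth label here is an assumption:

```python
import numpy as np

CLASSES = ["background", "invoice code", "invoice number", "billing date",
           "amount", "tax", "total amount", "other"]
# Eight labels; the eighth slot ("other") is an assumption, since the
# description names seven categories for eight channels.

def classify_pixels(score_map):
    """score_map: (H, W, 8) per-pixel softmax scores that sum to 1.
    Returns the index of the most probable class at each pixel."""
    assert np.allclose(score_map.sum(axis=-1), 1.0)
    return score_map.argmax(axis=-1)

scores = np.full((2, 2, 8), 0.1, dtype=np.float32)
scores[..., 0] += 0.2           # background most likely everywhere...
scores[0, 0] = 0.0
scores[0, 0, 2] = 1.0           # ...except (0, 0): invoice number
cls = classify_pixels(scores)
print(cls)
```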
The invoice images with 14-layer feature labels are used as training data to train the U-Net convolutional neural network, so that the whole network learns how to generate features. Given a correct 14-layer feature map representation, the text box information can be recovered from it; a visualized feature map is shown in fig. 7.
S123, learning the U-Net convolutional neural network based on the deep learning framework by utilizing training data so as to obtain a convolutional neural network model.
In one embodiment, the step S123 may include steps S1231-S1233.
S1231, inputting training data into the U-Net convolutional neural network to obtain sample key information.
In this embodiment, the sample key information refers to text box information formed after the training data is classified and located by the U-Net convolutional neural network.
In one embodiment, the step S1231 includes steps S1231a to S1231b.
S1231a, inputting training data into the U-Net convolutional neural network to carry out convolutional processing so as to obtain a text box.
In this embodiment, the U-Net convolutional neural network performs the classification of the background and the invoice on the training data, and then performs the text positioning on the invoice itself to obtain the text box.
Specifically, a feature map of the invoice is formed first, and text boxes are then derived from it. In general, the ninth through fourteenth feature maps of the U-Net convolutional neural network are obtained by convolution, each layer's feature map is visualized, and its four endpoints are located to form a text box.
S1231b, merging the text boxes to form sample key information.
In one embodiment, an invoice yields multiple feature maps and therefore multiple text boxes, and the text delimited by these boxes constitutes the key information, so the detected text boxes must be merged to form the sample key information. Finally, the position of each piece of sample key information can be determined from the four vertices of its text box, as shown in fig. 8.
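A simplified, hypothetical version of the box-merging step: overlapping axis-aligned boxes are greedily merged into their common bounding box. The patent does not disclose its exact merging rule, so this is only an illustration:

```python
def boxes_overlap(a, b):
    """Axis-aligned overlap test; boxes are (x0, y0, x1, y1)."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def merge_boxes(boxes):
    """Greedily merge overlapping boxes into their common bounding box,
    a simplified stand-in for the patent's text-box merging step."""
    boxes = [list(b) for b in boxes]
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if boxes_overlap(boxes[i], boxes[j]):
                    a, b = boxes[i], boxes[j]
                    boxes[i] = [min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3])]
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return [tuple(b) for b in boxes]

print(merge_boxes([(0, 0, 10, 5), (8, 0, 20, 5), (30, 0, 40, 5)]))
# -> [(0, 0, 20, 5), (30, 0, 40, 5)]
```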
S1232, calculating a loss value according to the sample key information.
In this embodiment, a loss value is calculated from the sample key information using a loss function. The loss value measures the difference between the sample key information output by the U-Net convolutional neural network and the feature labels in the training data. The larger the loss value, the greater the difference between the sample key information and the feature labels, and a model formed by the current network is not yet fit for use; the smaller the loss value, the smaller the difference. When the loss value approaches a certain threshold, the model formed by the current U-Net convolutional neural network is fit for use, where the threshold may be set according to the practical situation.
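The patent does not disclose its "special loss function"; as an illustration of how a loss value measures the gap between predicted feature maps and feature labels, here is the standard per-pixel cross-entropy:

```python
import numpy as np

def pixel_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel cross-entropy between predicted softmax score maps
    (H, W, C) and one-hot feature labels (H, W, C).  The patent's actual
    loss function is not disclosed; this is the standard choice for
    per-pixel classification, shown for illustration only."""
    pred = np.clip(pred, eps, 1.0)
    return float(-(target * np.log(pred)).sum(axis=-1).mean())

# A confident correct prediction should score a far smaller loss
# than a uniform (uninformative) one.
target = np.zeros((1, 1, 8)); target[0, 0, 3] = 1.0
good = np.full((1, 1, 8), 0.01); good[0, 0, 3] = 0.93
bad = np.full((1, 1, 8), 0.125)
print(pixel_cross_entropy(good, target))  # ~0.073
print(pixel_cross_entropy(bad, target))   # ~2.079
```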
S1233, learning the U-Net convolutional neural network based on the deep learning framework according to the loss value to obtain a convolutional neural network model.
In this embodiment, the TensorFlow deep learning framework is used to train the U-Net convolutional neural network.
During learning, extensive and thorough experiments are performed to tune the network's parameters so that the U-Net convolutional neural network described herein meets the specific task and speed requirements. The network is highly efficient: a single forward pass needs only about 1.28 GFLOPs, so forward computation can handle a large number of text detection tasks in real time and in parallel.
Because a fully convolutional network is adopted, pictures of arbitrary resolution can be processed. In addition, a bill detection algorithm fit for practical use must face blurred pictures, poor illumination, physical deformation, and related problems. Through fine-grained and extensive image augmentation and generation, these problems are handled carefully, so the algorithm performs very well on real scenes and in business-specific testing. The method can run on Android devices, iOS devices, and servers, and rapidly locates the invoice key information text.
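Image augmentation of the kind described (blur, poor lighting, deformation) might be sketched as follows; the specific perturbations and their ranges are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """A few of the photometric/geometric perturbations hinted at in the
    text (poor lighting, sensor noise, varied orientation); the exact
    set and ranges here are assumptions, not the patent's pipeline."""
    out = img.astype(np.float32)
    out *= rng.uniform(0.6, 1.4)             # global brightness / exposure
    out += rng.normal(0.0, 5.0, out.shape)   # sensor noise
    k = int(rng.integers(0, 4))
    out = np.rot90(out, k)                   # coarse rotation stand-in
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((32, 32), 128, dtype=np.uint8)
aug = augment(img)
print(aug.shape, aug.dtype)
```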
With the invoice key information positioning method described above, the invoice image to be positioned is acquired and input into a convolutional neural network model built on a U-Net backbone; the model classifies the image, locates its key information regions, and outputs the invoice key information, achieving rapid positioning with high accuracy.
Fig. 9 is a schematic block diagram of an invoice key information positioning apparatus 300 according to an embodiment of the present invention. As shown in fig. 9, the present invention also provides an invoice key information positioning device 300 corresponding to the above invoice key information positioning method. The invoice key information positioning apparatus 300 includes a unit for performing the invoice key information positioning method described above, and may be configured in a server.
Specifically, referring to fig. 9, the invoice key information positioning apparatus 300 includes:
a data acquisition unit 301, configured to acquire an invoice image to be positioned;
and the extracting unit 302 is used for inputting the invoice image to be positioned into the convolutional neural network model for information extraction so as to obtain invoice key information.
In an embodiment, the device further comprises:
and the model training unit is used for training the U-Net convolutional neural network by taking the invoice image with the characteristic tag as training data so as to obtain a convolutional neural network model.
In an embodiment, the model training unit comprises:
the training data acquisition subunit is used for acquiring invoice images with characteristic labels so as to obtain training data;
the network construction subunit is used for constructing a U-Net convolutional neural network;
and the learning subunit is used for learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data so as to obtain a convolutional neural network model.
In an embodiment, the learning subunit comprises:
the sample key information forming module is used for inputting training data into the U-Net convolutional neural network to obtain sample key information;
the loss value calculation module is used for calculating a loss value according to the sample key information;
and the deep learning module is used for learning the U-Net convolutional neural network based on the deep learning framework according to the loss value so as to obtain a convolutional neural network model.
In one embodiment, the sample key information forming module includes:
the text box forming sub-module is used for inputting training data into the U-Net convolutional neural network to carry out convolutional processing so as to obtain a text box;
and the merging sub-module is used for merging the text boxes to form sample key information.
It should be noted that, as will be clearly understood by those skilled in the art, the specific implementation process of the invoice key information positioning device 300 and each unit may refer to the corresponding description in the foregoing method embodiment, and for convenience and brevity of description, the detailed description is omitted herein.
The invoice key information positioning apparatus 300 described above may be implemented in the form of a computer program which can be run on a computer device as shown in fig. 10.
Referring to fig. 10, fig. 10 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server.
With reference to FIG. 10, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a method of locating invoice key information.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform an invoice key information locating method.
The network interface 505 is used for network communication with other devices. Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of a portion of the architecture in connection with the present application and is not intended to limit the computer device 500 to which the present application is applied, and that a particular computer device 500 may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
acquiring an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction to obtain invoice key information;
the convolutional neural network model is obtained by training the U-Net convolutional neural network by taking invoice images with characteristic labels as training data.
In one embodiment, when implementing the step in which the convolutional neural network model is obtained by training the U-Net convolutional neural network with feature-labelled invoice images as training data, the processor 502 specifically implements the following steps:
acquiring invoice images with characteristic labels to obtain training data;
constructing a U-Net convolutional neural network;
and learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data to obtain a convolutional neural network model.
In one embodiment, when the step of learning the U-Net convolutional neural network based on the deep learning framework by using the training data to obtain the convolutional neural network model is implemented by the processor 502, the following steps are specifically implemented:
inputting training data into a U-Net convolutional neural network to obtain sample key information;
calculating a loss value according to the sample key information;
and learning the U-Net convolutional neural network based on the deep learning framework according to the loss value to obtain a convolutional neural network model.
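A single iteration of the loss-driven learning described above could be sketched as follows; the per-pixel binary cross-entropy loss and the optimizer choice are assumptions, since this passage does not fix a particular loss function or deep learning framework:

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, images, labels):
    """One gradient step: forward pass on training data, compute a loss
    value from the predicted sample key information, then update the
    network parameters from that loss value."""
    model.train()
    logits = model(images)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Repeating this step over the training data until the loss converges yields the convolutional neural network model.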
The U-Net convolutional neural network is a network that produces multi-layer feature maps at one quarter of the invoice image resolution.
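The quarter-resolution property can be illustrated with two stride-2 convolutions, each of which halves the spatial size, so the output feature maps are one quarter of the input resolution in each dimension; the 14 output channels mirror the 14 prediction maps described elsewhere, and the exact layer layout here is an assumption:

```python
import torch
import torch.nn as nn

# Two stride-2 convolutions reduce H and W by a factor of 4 overall,
# so every output feature map is at quarter resolution.
quarter_res_net = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 14, 3, stride=2, padding=1),
)
```

For a 64x64 invoice image crop, the output maps are 16x16.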
In one embodiment, the processor 502 performs the following steps when implementing the step of inputting training data into the U-Net convolutional neural network to obtain sample key information:
inputting training data into a U-Net convolutional neural network for convolutional processing to obtain a text box;
and merging the text boxes to form sample key information.
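For illustration, the text-box merging step could union overlapping axis-aligned boxes as below; the actual merging rule is not specified in this passage, so this single-pass merge is only a stand-in:

```python
def merge_boxes(boxes):
    """Merge overlapping axis-aligned text boxes (x1, y1, x2, y2) into
    larger regions. A single left-to-right pass; chains of boxes that
    only overlap transitively would need repeated passes."""
    merged = []
    for box in sorted(boxes):
        for i, m in enumerate(merged):
            # Boxes intersect if neither lies fully to one side of the other.
            if box[0] <= m[2] and m[0] <= box[2] and box[1] <= m[3] and m[1] <= box[3]:
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged
```

Merged regions would then be treated as candidate key-information fields.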
It should be appreciated that in embodiments of the present application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
Those skilled in the art will appreciate that all or part of the flow of the methods in the above embodiments may be accomplished by a computer program instructing the relevant hardware. The computer program comprises program instructions and may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the method embodiments described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction to obtain invoice key information;
the convolutional neural network model is obtained by training a U-Net convolutional neural network with feature-labeled invoice images as training data.
In one embodiment, when the processor executes the computer program to implement the step of obtaining the convolutional neural network model by training the U-Net convolutional neural network with feature-labeled invoice images as training data, the following steps are specifically implemented:
acquiring invoice images with feature labels to obtain training data;
constructing a U-Net convolutional neural network;
and learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data to obtain a convolutional neural network model.
In one embodiment, when the processor executes the computer program to realize the step of learning the U-Net convolutional neural network based on the deep learning framework by using training data to obtain a convolutional neural network model, the processor specifically realizes the following steps:
inputting training data into a U-Net convolutional neural network to obtain sample key information;
calculating a loss value according to the sample key information;
and learning the U-Net convolutional neural network based on the deep learning framework according to the loss value to obtain a convolutional neural network model.
The U-Net convolutional neural network is a network that produces multi-layer feature maps at one quarter of the invoice image resolution.
In one embodiment, the processor, when executing the computer program to implement the step of inputting training data into the U-Net convolutional neural network to obtain sample key information, specifically implements the following steps:
inputting training data into a U-Net convolutional neural network for convolutional processing to obtain a text box;
and merging the text boxes to form sample key information.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two. To clearly illustrate the interchangeability of hardware and software, the elements and steps of the examples have been described above generally in terms of their functions. Whether such functions are implemented in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The device embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be combined, divided and deleted according to actual needs. In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (6)

1. An invoice key information positioning method, characterized by comprising the following steps:
acquiring an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction to obtain invoice key information;
the convolutional neural network model is obtained by training a U-Net convolutional neural network with feature-labeled invoice images as training data;
the obtaining of the convolutional neural network model by training a U-Net convolutional neural network with feature-labeled invoice images as training data comprises the following steps:
acquiring invoice images with feature labels to obtain training data;
constructing a U-Net convolutional neural network;
learning the U-Net convolutional neural network based on a deep learning framework by utilizing training data to obtain a convolutional neural network model;
the first eight feature maps of the U-Net convolutional neural network are used for multi-classification; the ninth feature map predicts whether the corresponding position is close to a boundary, and the tenth feature map predicts whether each boundary position is a left boundary or a right boundary; the eleventh through fourteenth feature maps predict the position offset of the boundary and the positions of the two nearest boundary vertices; the U-Net convolutional neural network is trained with invoice images carrying these 14 layers of feature labels as training data;
the training data is utilized to learn the U-Net convolutional neural network based on a deep learning framework so as to obtain a convolutional neural network model, and the method comprises the following steps:
inputting training data into a U-Net convolutional neural network to obtain sample key information;
calculating a loss value according to the sample key information;
and learning the U-Net convolutional neural network based on the deep learning framework according to the loss value to obtain a convolutional neural network model.
2. The invoice key information positioning method according to claim 1, wherein the U-Net convolutional neural network is a network that produces multi-layer feature maps at one quarter of the invoice image resolution.
3. The invoice key information locating method according to claim 1, wherein inputting training data into a U-Net convolutional neural network to obtain sample key information, comprises:
inputting training data into a U-Net convolutional neural network for convolutional processing to obtain a text box;
and merging the text boxes to form sample key information.
4. An invoice key information positioning device, characterized by comprising:
the data acquisition unit is used for acquiring an invoice image to be positioned;
the extraction unit is used for inputting the invoice image to be positioned into the convolutional neural network model for information extraction so as to obtain invoice key information;
the apparatus further comprises:
the model training unit is used for training the U-Net convolutional neural network with feature-labeled invoice images as training data to obtain a convolutional neural network model;
the model training unit includes:
the training data acquisition subunit is used for acquiring invoice images with feature labels to obtain training data;
the network construction subunit is used for constructing a U-Net convolutional neural network;
the learning subunit is used for learning the U-Net convolutional neural network based on the deep learning framework by utilizing the training data so as to obtain a convolutional neural network model;
the first eight feature maps of the U-Net convolutional neural network are used for multi-classification; the ninth feature map predicts whether the corresponding position is close to a boundary, and the tenth feature map predicts whether each boundary position is a left boundary or a right boundary; the eleventh through fourteenth feature maps predict the position offset of the boundary and the positions of the two nearest boundary vertices; the U-Net convolutional neural network is trained with invoice images carrying these 14 layers of feature labels as training data;
the learning subunit includes:
the sample key information forming module is used for inputting training data into the U-Net convolutional neural network to obtain sample key information;
the loss value calculation module is used for calculating a loss value according to the sample key information;
and the deep learning module is used for learning the U-Net convolutional neural network based on the deep learning framework according to the loss value so as to obtain a convolutional neural network model.
5. A computer device, characterized in that it comprises a memory storing a computer program and a processor which, when executing the computer program, implements the method according to any one of claims 1 to 3.
6. A storage medium storing a computer program which, when executed by a processor, performs the method of any one of claims 1 to 3.
CN201910256914.7A 2019-04-01 2019-04-01 Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium Active CN110008956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910256914.7A CN110008956B (en) 2019-04-01 2019-04-01 Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910256914.7A CN110008956B (en) 2019-04-01 2019-04-01 Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110008956A CN110008956A (en) 2019-07-12
CN110008956B 2023-07-07

Family

ID=67169206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910256914.7A Active CN110008956B (en) 2019-04-01 2019-04-01 Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110008956B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516541B (en) * 2019-07-19 2022-06-10 金蝶软件(中国)有限公司 Text positioning method and device, computer readable storage medium and computer equipment
CN110458162B (en) * 2019-07-25 2023-06-23 上海兑观信息科技技术有限公司 Method for intelligently extracting image text information
CN110517186B (en) * 2019-07-30 2023-07-07 金蝶软件(中国)有限公司 Method, device, storage medium and computer equipment for eliminating invoice seal
CN110738092B (en) * 2019-08-06 2024-04-02 深圳市华付信息技术有限公司 Invoice text detection method
CN110674815A (en) * 2019-09-29 2020-01-10 四川长虹电器股份有限公司 Invoice image distortion correction method based on deep learning key point detection
CN110751088A (en) * 2019-10-17 2020-02-04 深圳金蝶账无忧网络科技有限公司 Data processing method and related equipment
CN111652232B (en) * 2020-05-29 2023-08-22 泰康保险集团股份有限公司 Bill identification method and device, electronic equipment and computer readable storage medium
CN112069893A (en) * 2020-08-03 2020-12-11 中国铁道科学研究院集团有限公司电子计算技术研究所 Bill processing method and device, electronic equipment and storage medium
CN112115934A (en) * 2020-09-16 2020-12-22 四川长虹电器股份有限公司 Bill image text detection method based on deep learning example segmentation
CN112257712B (en) * 2020-10-29 2024-02-27 湖南星汉数智科技有限公司 Train ticket image alignment method and device, computer device and computer readable storage medium
CN112686307A (en) * 2020-12-30 2021-04-20 平安普惠企业管理有限公司 Method, device and storage medium for obtaining invoice based on artificial intelligence
CN116311297A (en) * 2023-04-12 2023-06-23 国网河北省电力有限公司 Electronic evidence image recognition and analysis method based on computer vision

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845440B (en) * 2017-02-13 2020-04-10 山东万腾电子科技有限公司 Augmented reality image processing method and system
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
CN108256555B (en) * 2017-12-21 2020-10-16 北京达佳互联信息技术有限公司 Image content identification method and device and terminal
CN108921163A (en) * 2018-06-08 2018-11-30 南京大学 A kind of packaging coding detection method based on deep learning
CN109214382A (en) * 2018-07-16 2019-01-15 顺丰科技有限公司 A kind of billing information recognizer, equipment and storage medium based on CRNN
CN109345553B (en) * 2018-08-31 2020-11-06 厦门熵基科技有限公司 Palm and key point detection method and device thereof, and terminal equipment
CN109345540B (en) * 2018-09-15 2021-07-13 北京市商汤科技开发有限公司 Image processing method, electronic device and storage medium

Also Published As

Publication number Publication date
CN110008956A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
CN110008956B (en) Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium
CN110689037B (en) Method and system for automatic object annotation using deep networks
CN110135411B (en) Business card recognition method and device
CN108229303B (en) Detection recognition and training method, device, equipment and medium for detection recognition network
EP3844669A1 (en) Method and system for facilitating recognition of vehicle parts based on a neural network
CN111695486B (en) High-precision direction signboard target extraction method based on point cloud
CN108229418B (en) Human body key point detection method and apparatus, electronic device, storage medium, and program
CN111488873B (en) Character level scene text detection method and device based on weak supervision learning
CN113723377A (en) Traffic sign detection method based on LD-SSD network
CN110751146A (en) Text region detection method, text region detection device, electronic terminal and computer-readable storage medium
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN114677596A (en) Remote sensing image ship detection method and device based on attention model
CN110942456B (en) Tamper image detection method, device, equipment and storage medium
CN110781195B (en) System, method and device for updating point of interest information
Salunkhe et al. Recognition of multilingual text from signage boards
CN114708582B (en) AI and RPA-based electric power data intelligent inspection method and device
CN111178200A (en) Identification method of instrument panel indicator lamp and computing equipment
CN113065559B (en) Image comparison method and device, electronic equipment and storage medium
Mubarak et al. Effect of Gaussian filtered images on Mask RCNN in detection and segmentation of potholes in smart cities
CN112699898B (en) Image direction identification method based on multi-layer feature fusion
Xu et al. Pushing the envelope of thin crack detection
CN111680691B (en) Text detection method, text detection device, electronic equipment and computer readable storage medium
Rani et al. Object Detection in Natural Scene Images Using Thresholding Techniques
CN113628113A (en) Image splicing method and related equipment thereof
CN113408502B (en) Gesture recognition method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant after: Shenzhen Huafu Technology Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant before: SHENZHEN HUAFU INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231214

Address after: 518000, 1702, Dashi Building, No. 28 Keji South 1st Road, Gaoxin District Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Smart Shield Technology Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: Shenzhen Huafu Technology Co.,Ltd.
