Summary of the invention
An object of the invention is to overcome the deficiencies of the prior art by providing an invoice key information positioning method, a device, computer equipment and a storage medium.
To achieve the above object, the invention adopts the following technical solution: an invoice key information positioning method, comprising:
obtaining an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information;
wherein the convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data.
In a further technical solution, obtaining the convolutional neural network model by training a U-Net convolutional neural network with invoice images carrying feature labels as training data comprises:
obtaining invoice images carrying feature labels, to obtain training data;
constructing a U-Net convolutional neural network;
training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
In a further technical solution, training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model, comprises:
inputting the training data into the U-Net convolutional neural network, to obtain sample key information;
calculating a loss value according to the sample key information;
training the U-Net convolutional neural network according to the loss value based on the deep learning framework, to obtain the convolutional neural network model.
In a further technical solution, the U-Net convolutional neural network is a network that generates multi-layer feature maps at one quarter of the resolution of the invoice image.
In a further technical solution, inputting the training data into the U-Net convolutional neural network, to obtain sample key information, comprises:
inputting the training data into the U-Net convolutional neural network for convolution processing, to obtain text boxes;
merging the text boxes, to form the sample key information.
The present invention also provides an invoice key information positioning device, comprising:
a data acquisition unit, configured to obtain an invoice image to be positioned;
an extraction unit, configured to input the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information.
In a further technical solution, the device further comprises:
a model training unit, configured to train a U-Net convolutional neural network with invoice images carrying feature labels as training data, to obtain the convolutional neural network model.
In a further technical solution, the model training unit comprises:
a training data acquisition subunit, configured to obtain invoice images carrying feature labels, to obtain training data;
a network construction subunit, configured to construct a U-Net convolutional neural network;
a learning subunit, configured to train the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
The present invention also provides computer equipment comprising a memory and a processor, wherein a computer program is stored on the memory and the processor implements the above method when executing the computer program.
The present invention also provides a storage medium storing a computer program which, when executed by a processor, implements the above method.
Compared with the prior art, the invention has the following advantages: the invention obtains an invoice image to be positioned and inputs it into a convolutional neural network model that uses a U-Net convolutional neural network as its base network; the convolutional neural network model first classifies the invoice image to be positioned, then positions the key information portion of the invoice image to be positioned, and outputs the invoice key information, so that the invoice key information is located quickly and with high positioning accuracy.
The invention is further described below with reference to the drawings and specific embodiments.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be understood that, when used in this specification and the appended claims, the terms "includes" and "comprises" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terms used in this description of the invention are for the purpose of describing specific embodiments only and are not intended to limit the present invention. As used in the description of the invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in the description of the invention and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
Referring to Fig. 1 and Fig. 2, Fig. 1 is a schematic diagram of an application scenario of the invoice key information positioning method provided by an embodiment of the present invention, and Fig. 2 is a schematic flowchart of the invoice key information positioning method provided by an embodiment of the present invention. The invoice key information positioning method is applied in a server. The server exchanges data with a terminal: after obtaining the invoice image to be positioned from the terminal, the server classifies and positions it through the convolutional neural network model and outputs the invoice key information to the terminal for display.
As shown in Fig. 2, the method includes the following steps S110 to S120.
S110, obtaining an invoice image to be positioned.
In this embodiment, the invoice image to be positioned refers to a photo obtained by photographing an invoice with the terminal.
S120, inputting the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information.
In this embodiment, the invoice key information refers to text information on the invoice such as the invoice code, invoice number, invoice date, amount, tax amount and total amount.
In this embodiment, the above convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data.
Convolutional neural network models are widely used in computer vision tasks such as object detection, instance segmentation and object classification, and have achieved very good results, which shows that they adapt well to computer vision tasks. For key point positioning, the approach used for image segmentation tasks is borrowed; a typical pixel-level road segmentation result produced by a U-Net convolutional neural network can be seen in Fig. 6.
When a U-Net convolutional neural network is used directly to extract feature layers, the results are poor, because invoice image data is specialized and its training data is less plentiful than street-scene data. Therefore, to enhance robustness to interference, the features extracted by the U-Net convolutional neural network are constrained, and a dedicated loss function is used to train the U-Net convolutional neural network.
In one embodiment, referring to Fig. 3, the convolutional neural network model used in the above step S120 may be obtained through steps S121 to S123.
S121, obtaining invoice images carrying feature labels, to obtain training data.
In this embodiment, the training data refers to a data set composed of invoice images carrying text feature labels.
Specifically, a large number of invoice images can be downloaded from websites; after the invoice images are annotated with labels for 14 layers of features, they are input into the network for training.
S122, constructing a U-Net convolutional neural network.
In this embodiment, the U-Net convolutional neural network mainly consists of two parts: a contracting path and an expanding path. The contracting path is mainly used to capture the contextual information in the picture, while the symmetric expanding path is used to precisely position the part of the picture that needs to be segmented. Deep learning architectures often require a large number of samples and substantial computing resources, but U-Net is an improvement based on the FCN (Fully Convolutional Network) and, with data augmentation, can be trained on data sets with relatively few samples. The constructed U-Net convolutional neural network can perform convolutional classification on the training data and output the position information of text and similar information, to form feature maps.
The U-Net convolutional neural network is a network that generates multi-layer feature maps at one quarter of the resolution of the invoice image.
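As a quick arithmetic illustration of the quarter-resolution statement above, the following Python sketch computes the size of the feature maps for a given input size. The function name, the ceiling rounding for sizes not divisible by 4, and the default of 14 channels (taken from the 14 feature layers mentioned in step S121) are assumptions made for illustration; the source does not specify how odd sizes are handled.

```python
def feature_map_shape(height, width, channels=14, stride=4):
    """Return (channels, out_height, out_width) for an input image.

    Assumption: each spatial dimension is divided by `stride` and
    rounded up; the source only states "one quarter of the resolution".
    """
    out_height = -(-height // stride)  # ceiling division
    out_width = -(-width // stride)
    return channels, out_height, out_width


if __name__ == "__main__":
    # A 512 x 768 invoice photo would give fourteen 128 x 192 feature maps.
    print(feature_map_shape(512, 768))
```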
The U-Net convolutional neural network is trained continuously by minimizing the difference between the predicted feature maps and the actual feature labels of the image; this is the general training method in deep learning and brings the predicted values close to the actual values. The first eight feature maps of the U-Net convolutional neural network are used for multi-class classification into background, invoice code, invoice number, invoice date, amount, tax amount and total amount. Each position has eight values, one on each of the first eight feature maps; each value lies between 0 and 1, and the eight values sum to 1. The layer on which the value is largest indicates the class to which that position most probably belongs, so the class of each pixel can be found in this way. After the background and the classes have been predicted, for each non-background pixel the ninth feature map predicts, at the corresponding position, whether the pixel is close to a boundary, and the tenth feature map predicts whether each boundary position is a left boundary or a right boundary. The eleventh to fourteenth feature maps predict the positional offsets between a boundary position and its two nearest boundary vertices. In the end, each feature-map position inside a text box can predict one text box.
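The per-position classification described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the nested-list layout, the function names, and the convention that channel 0 is the background class are all hypothetical, since the source does not fix the channel order.

```python
BACKGROUND = 0  # assumption: channel 0 holds the background probability


def classify_positions(class_maps):
    """Pick the most probable class at every feature-map position.

    class_maps: list of 8 maps, each a list of rows of probabilities;
    the 8 values at one position are expected to sum to 1.
    Returns a map of class indices with the same height and width.
    """
    num_classes = len(class_maps)
    height = len(class_maps[0])
    width = len(class_maps[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = [class_maps[c][y][x] for c in range(num_classes)]
            row.append(scores.index(max(scores)))  # argmax over channels
        labels.append(row)
    return labels


def non_background_positions(labels):
    """List the (y, x) positions whose class is not background."""
    return [(y, x)
            for y, row in enumerate(labels)
            for x, label in enumerate(row)
            if label != BACKGROUND]
```

Only the positions returned by `non_background_positions` would then be examined on the ninth to fourteenth feature maps.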
Invoice images with 14 layers of feature labels are used as training data to train the U-Net convolutional neural network, so that the whole network learns how to generate features and produces a correct 14-layer feature map representation, from which the text box information can then be recovered. The visualized feature maps are shown in Fig. 7.
S123, training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
In one embodiment, the above step S123 may include steps S1231 to S1233.
S1231, inputting the training data into the U-Net convolutional neural network, to obtain sample key information.
In this embodiment, the sample key information refers to the text box information formed after the training data has been classified and positioned by the U-Net convolutional neural network.
In one embodiment, the above step S1231 may include steps S1231a to S1231b.
S1231a, inputting the training data into the U-Net convolutional neural network for convolution processing, to obtain text boxes.
In this embodiment, the U-Net convolutional neural network first classifies the training data into background and invoice, and then positions the text on the invoice itself, to obtain text boxes.
Specifically, the feature maps of the invoice are formed first, and text boxes are then formed from the feature maps. In general, the ninth to fourteenth feature maps of the U-Net convolutional neural network are obtained by convolution processing of the invoice, yielding a visualized feature map for each layer, and a text box is then formed from the four endpoint positions given by the feature maps.
S1231b, merging the text boxes, to form the sample key information.
In one embodiment, an invoice has multiple feature maps and therefore multiple text boxes, and the text delimited by these text boxes forms the key information. The obtained text boxes therefore need to be merged to form the sample key information. The position of the sample key information can finally be determined from the positions of the four vertices of the text boxes, as shown in Fig. 8.
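The source does not state how the text boxes are merged. One plausible reading, sketched below in Python, is to group the predicted boxes by their class and take the smallest axis-aligned rectangle enclosing all four-vertex boxes of the same class. The function name, the data layout and the class names used in the example are assumptions for illustration only.

```python
def merge_text_boxes(boxes):
    """Merge four-vertex text boxes that share a class label.

    boxes: list of (label, vertices), where vertices is a list of
    four (x, y) tuples.
    Returns a dict mapping label -> (min_x, min_y, max_x, max_y),
    the enclosing rectangle of all boxes with that label.
    """
    merged = {}
    for label, vertices in boxes:
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        box = (min(xs), min(ys), max(xs), max(ys))
        if label in merged:
            old = merged[label]
            box = (min(old[0], box[0]), min(old[1], box[1]),
                   max(old[2], box[2]), max(old[3], box[3]))
        merged[label] = box
    return merged


if __name__ == "__main__":
    predicted = [
        ("invoice_number", [(10, 5), (50, 5), (50, 15), (10, 15)]),
        ("invoice_number", [(48, 6), (90, 6), (90, 16), (48, 16)]),
    ]
    print(merge_text_boxes(predicted))
```

An enclosing rectangle is the simplest choice; a production system might instead keep rotated quadrilaterals or apply a voting scheme over overlapping predictions.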
S1232, calculating a loss value according to the sample key information.
In this embodiment, the loss value is calculated from the sample key information using a loss function. The loss value represents the difference between the sample key information output by the U-Net convolutional neural network and the feature labels in the training data. The larger the loss value, the larger the difference between the two, and the model formed by the current U-Net convolutional neural network is not suitable for use; the smaller the loss value, the smaller the difference between the two. When the loss value approaches a certain threshold, the model formed by the current U-Net convolutional neural network is suitable for use. The threshold can be set according to the actual situation.
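The source does not name the loss function. As a hedged illustration for the classification layers only, the Python sketch below uses the mean cross-entropy between predicted class probabilities and labelled classes, and treats the model as usable once the loss is at or below a threshold. Both the choice of cross-entropy and the "at or below" comparison are assumptions, as are all function and parameter names.

```python
import math


def cross_entropy_loss(predicted, target_labels, eps=1e-12):
    """Mean cross-entropy over positions.

    predicted: list of per-position probability lists.
    target_labels: list of the labelled class index for each position.
    eps guards against log(0) for a zero predicted probability.
    """
    total = 0.0
    for probabilities, label in zip(predicted, target_labels):
        total += -math.log(max(probabilities[label], eps))
    return total / len(target_labels)


def model_is_usable(loss, threshold):
    """Assumed stopping rule: the loss has reached the chosen threshold."""
    return loss <= threshold
```

A prediction that matches the labels closely gives a small loss, and a prediction that disagrees with the labels gives a large one, which is the relationship described in step S1232.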
S1233, training the U-Net convolutional neural network according to the loss value based on the deep learning framework, to obtain the convolutional neural network model.
In this embodiment, the TensorFlow deep learning framework is used for training and learning of the U-Net convolutional neural network.
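Steps S1231 to S1233 follow the usual forward pass, loss calculation and parameter update cycle. The Python sketch below shows that cycle on a deliberately tiny stand-in model with a single parameter, so that it runs without any framework. It is not a U-Net and does not use TensorFlow; the model, the mean squared error loss, the learning rate and the threshold are all illustrative assumptions.

```python
def train(samples, learning_rate=0.1, threshold=1e-4, max_steps=1000):
    """Fit y = w * x by gradient descent.

    samples: list of (x, y) pairs.
    Returns the learned parameter w and the last computed loss.
    """
    w = 0.0
    loss = float("inf")
    for _ in range(max_steps):
        # Forward pass (analogous to S1231): compute the outputs.
        outputs = [w * x for x, _ in samples]
        # Loss calculation (analogous to S1232): mean squared error.
        loss = sum((o - y) ** 2
                   for o, (_, y) in zip(outputs, samples)) / len(samples)
        if loss <= threshold:
            break  # the loss has reached the chosen threshold
        # Parameter update (analogous to S1233): one gradient step.
        gradient = sum(2 * (o - y) * x
                       for o, (x, y) in zip(outputs, samples)) / len(samples)
        w -= learning_rate * gradient
    return w, loss
```

In a real system the framework would compute the gradients of all network weights automatically; the loop structure stays the same.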
During learning, extensive and thorough experiments are carried out to tune the parameters of the network, so that the U-Net convolutional neural network described in the literature is adapted to the specific task and speed requirements. The U-Net convolutional neural network is efficient: a single forward pass requires only about 1.28 GFLOPs, so the forward computation can process a large number of text detection tasks in parallel in real time. Because a fully convolutional network is used, pictures of arbitrary resolution can be processed. In addition, a bill detection algorithm that is to be put into practical use has to cope with a series of problems such as blurred pictures, poor illumination and physical deformation. These problems are handled carefully through fine-grained and extensive picture augmentation and generation, so that the algorithm achieves very good results in real scenes and in tests on specific services. The method can run on Android devices, iOS devices and servers, and can quickly position the text of the invoice key information.
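The source does not list the augmentation operations it uses. As a minimal sketch of what augmentation against poor illumination and blur could look like, the Python functions below rescale brightness and apply a 3 x 3 box blur to a grayscale image stored as nested lists of 0 to 255 values. The function names, the image layout and the choice of these two operations are assumptions; physical deformation is not covered here.

```python
def adjust_brightness(image, factor):
    """Simulate brighter or darker lighting by scaling pixel values.

    image: list of rows of integer pixel values in 0..255.
    Results are rounded and clipped to the 0..255 range.
    """
    return [[min(255, max(0, round(pixel * factor))) for pixel in row]
            for row in image]


def box_blur(image):
    """Simulate an out-of-focus photo with a 3 x 3 mean filter.

    Border pixels average only over the neighbours that exist.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [image[j][i]
                          for j in range(max(0, y - 1), min(height, y + 2))
                          for i in range(max(0, x - 1), min(width, x + 2))]
            row.append(sum(neighbours) // len(neighbours))
        blurred.append(row)
    return blurred
```

Because these operations change only pixel values and not geometry, the text box labels of an augmented image remain valid without adjustment.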
The above invoice key information positioning method obtains an invoice image to be positioned and inputs it into a convolutional neural network model that uses a U-Net convolutional neural network as its base network; the convolutional neural network model first classifies the invoice image to be positioned, then positions the key information portion of the invoice image to be positioned, and outputs the invoice key information, so that the invoice key information is located quickly and with high positioning accuracy.
Fig. 9 is a schematic block diagram of an invoice key information positioning device 300 provided by an embodiment of the present invention. As shown in Fig. 9, corresponding to the above invoice key information positioning method, the present invention also provides an invoice key information positioning device 300. The invoice key information positioning device 300 includes units for executing the above invoice key information positioning method, and the device can be configured in a server.
Specifically, referring to Fig. 9, the invoice key information positioning device 300 includes:
a data acquisition unit 301, configured to obtain an invoice image to be positioned;
an extraction unit 302, configured to input the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information.
In one embodiment, the device further includes:
a model training unit, configured to train a U-Net convolutional neural network with invoice images carrying feature labels as training data, to obtain the convolutional neural network model.
In one embodiment, the model training unit includes:
a training data acquisition subunit, configured to obtain invoice images carrying feature labels, to obtain training data;
a network construction subunit, configured to construct a U-Net convolutional neural network;
a learning subunit, configured to train the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
In one embodiment, the learning subunit includes:
a sample key information forming module, configured to input the training data into the U-Net convolutional neural network, to obtain sample key information;
a loss value calculation module, configured to calculate a loss value according to the sample key information;
a deep learning module, configured to train the U-Net convolutional neural network according to the loss value based on the deep learning framework, to obtain the convolutional neural network model.
In one embodiment, the sample key information forming module includes:
a text box forming submodule, configured to input the training data into the U-Net convolutional neural network for convolution processing, to obtain text boxes;
a merging submodule, configured to merge the text boxes, to form the sample key information.
It should be noted that, as those skilled in the art will clearly understand, for the specific implementation of the above invoice key information positioning device 300 and of each unit, reference can be made to the corresponding descriptions in the foregoing method embodiments; for convenience and brevity of description, details are not repeated here.
The above invoice key information positioning device 300 can be implemented in the form of a computer program, and the computer program can run on computer equipment as shown in Fig. 10.
Referring to Fig. 10, Fig. 10 is a schematic block diagram of computer equipment provided by an embodiment of the present application. The computer equipment 500 can be a server.
Referring to Fig. 10, the computer equipment 500 includes a processor 502, a memory and a network interface 505 connected through a system bus 501, wherein the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions which, when executed, cause the processor 502 to execute an invoice key information positioning method.
The processor 502 provides computing and control capabilities to support the operation of the entire computer equipment 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 is caused to execute an invoice key information positioning method.
The network interface 505 is used for network communication with other equipment. Those skilled in the art will understand that the structure shown in Fig. 10 is only a block diagram of the part of the structure relevant to the solution of the present application, and does not limit the computer equipment 500 to which the solution of the present application is applied; specific computer equipment 500 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the following steps:
obtaining an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information;
wherein the convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data.
In one embodiment, when implementing the step in which the convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data, the processor 502 specifically implements the following steps:
obtaining invoice images carrying feature labels, to obtain training data;
constructing a U-Net convolutional neural network;
training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
In one embodiment, when implementing the step of training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model, the processor 502 specifically implements the following steps:
inputting the training data into the U-Net convolutional neural network, to obtain sample key information;
calculating a loss value according to the sample key information;
training the U-Net convolutional neural network according to the loss value based on the deep learning framework, to obtain the convolutional neural network model.
The U-Net convolutional neural network is a network that generates multi-layer feature maps at one quarter of the resolution of the invoice image.
In one embodiment, when implementing the step of inputting the training data into the U-Net convolutional neural network, to obtain sample key information, the processor 502 specifically implements the following steps:
inputting the training data into the U-Net convolutional neural network for convolution processing, to obtain text boxes;
merging the text boxes, to form the sample key information.
It should be understood that, in the embodiments of the present application, the processor 502 can be a central processing unit (Central Processing Unit, CPU), and can also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor can be a microprocessor, or the processor can be any conventional processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program includes program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the process steps of the above method embodiments.
Therefore, the present invention also provides a storage medium. The storage medium can be a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to execute the following steps:
obtaining an invoice image to be positioned;
inputting the invoice image to be positioned into a convolutional neural network model for information extraction, to obtain invoice key information;
wherein the convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data.
In one embodiment, when executing the computer program to implement the step in which the convolutional neural network model is obtained by training a U-Net convolutional neural network with invoice images carrying feature labels as training data, the processor specifically implements the following steps:
obtaining invoice images carrying feature labels, to obtain training data;
constructing a U-Net convolutional neural network;
training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model.
In one embodiment, when executing the computer program to implement the step of training the U-Net convolutional neural network with the training data based on a deep learning framework, to obtain the convolutional neural network model, the processor specifically implements the following steps:
inputting the training data into the U-Net convolutional neural network, to obtain sample key information;
calculating a loss value according to the sample key information;
training the U-Net convolutional neural network according to the loss value based on the deep learning framework, to obtain the convolutional neural network model.
The U-Net convolutional neural network is a network that generates multi-layer feature maps at one quarter of the resolution of the invoice image.
In one embodiment, when executing the computer program to implement the step of inputting the training data into the U-Net convolutional neural network, to obtain sample key information, the processor specifically implements the following steps:
inputting the training data into the U-Net convolutional neural network for convolution processing, to obtain text boxes;
merging the text boxes, to form the sample key information.
The storage medium can be any computer-readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk or an optical disc.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method can be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components can be combined or integrated into another system, or some features can be omitted or not executed.
The steps in the methods of the embodiments of the present invention can be reordered, merged and deleted according to actual needs. The units in the devices of the embodiments of the present invention can be combined, divided and deleted according to actual needs. In addition, the functional units in the embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically on its own, or two or more units can be integrated into one unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing computer equipment (which can be a personal computer, a terminal, network equipment, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.