CN110390320A - Method and system for recognizing image information containing multiple documents - Google Patents
Method and system for recognizing image information containing multiple documents
- Publication number
- CN110390320A CN110390320A CN201910729840.4A CN201910729840A CN110390320A CN 110390320 A CN110390320 A CN 110390320A CN 201910729840 A CN201910729840 A CN 201910729840A CN 110390320 A CN110390320 A CN 110390320A
- Authority
- CN
- China
- Prior art keywords
- document
- region
- model
- image
- regions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1475—Inclination or skew detection or correction of characters or of image to be recognised
- G06V30/1478—Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
Abstract
The present invention proposes a method and system for recognizing image information containing multiple documents. The method comprises: identifying each document region on the image based on the image and a pre-trained first model, wherein the first model is a neural-network-based model; identifying each region on a document based on the image of that document and a pre-trained second model, each region being associated with all or part of the information recorded on the document, wherein the second model is a neural-network-based model; cutting out and obtaining the image of each region, the image of each region being delimited by a rectangle parallel to the horizontal or inclined relative to the horizontal; and identifying the characters in each region based on the image of the region and a pre-trained third model, thereby determining the information recorded on the document, wherein the third model is a neural-network-based model. The invention can efficiently and accurately recognize the information recorded on a wide variety of documents.
Description
Technical field
This disclosure relates to a method and system for recognizing image information containing multiple documents.
Background art
Accurately recognizing the information recorded on various documents is a non-trivial task.
Accordingly, there is a demand for new techniques.
Summary of the invention
One object of the disclosure is to provide a method and system for recognizing image information containing multiple documents.
According to a first aspect of the disclosure, a method for recognizing image information containing multiple documents is provided, comprising: identifying, based on the image and a pre-trained first model, each document region among one or more document regions on the image, wherein the first model is a neural-network-based model; identifying, based on the image of each document and a pre-trained second model, each region among one or more regions on the document, wherein each of the one or more regions is associated with all or part of the information recorded on the document, and the second model is a neural-network-based model; cutting out and obtaining the image of each of the one or more regions, wherein the image of each region is delimited by a rectangle parallel to the horizontal or inclined relative to the horizontal; and identifying, based on the image of each of the one or more regions and a pre-trained third model, the characters in each of the one or more regions, thereby determining the information recorded on the document, wherein the third model is a neural-network-based model.
According to a second aspect of the disclosure, a system for recognizing image information containing multiple documents is provided, comprising: a first model, which is a neural-network-based model; a second model, which is a neural-network-based model; a third model, which is a neural-network-based model; and one or more first devices configured to: identify, based on the image and the pre-trained first model, each document region among one or more document regions on the image; identify, based on the image of each document and the second model, each region among one or more regions on the document, each of the one or more regions being associated with all or part of the information recorded on the document; cut out and obtain the image of each of the one or more regions, the image of each region being delimited by a rectangle parallel to the horizontal or inclined relative to the horizontal; and identify, based on the image of each of the one or more regions and the pre-trained third model, the characters in each of the one or more regions, thereby determining the information recorded on the document.
According to a third aspect of the disclosure, a non-transitory computer-readable storage medium is provided, storing a series of computer-executable instructions that, when executed by one or more computing devices, cause the one or more computing devices to: identify, based on the image and a pre-trained first model, each document region among one or more document regions on the image, wherein the first model is a neural-network-based model; identify, based on the image of the document and a pre-trained second model, each region among one or more regions on the document, each of the one or more regions being associated with all or part of the information recorded on the document, wherein the second model is a neural-network-based model; cut out and obtain the image of each of the one or more regions, the image of each region being delimited by a rectangle parallel to the horizontal or inclined relative to the horizontal; and identify, based on the image of each of the one or more regions and a pre-trained third model, the characters in each of the one or more regions, thereby determining the information recorded on the document, wherein the third model is a neural-network-based model.
Further features and advantages of the disclosure will become apparent from the following detailed description of exemplary embodiments thereof with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which constitute a part of the specification, illustrate embodiments of the disclosure and, together with the description, serve to explain the principles of the disclosure.
The disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of an image containing multiple documents, suitable for some embodiments of the present disclosure.
Fig. 2 is a schematic diagram of at least part of an example of a document suitable for some embodiments of the present disclosure.
Figs. 3A and 3B are block diagrams, each schematically showing at least part of a method for recognizing the information recorded on a document according to some embodiments of the present disclosure.
Fig. 4 is a flowchart schematically showing at least part of a method for recognizing the information recorded on a document according to some embodiments of the present disclosure.
Fig. 5 is a structural diagram schematically showing at least part of a system for recognizing the information recorded on a document according to some embodiments of the present disclosure.
Fig. 6 is a structural diagram schematically showing at least part of a system for recognizing the information recorded on a document according to some embodiments of the present disclosure.
Fig. 7 is a schematic diagram of an image containing multiple documents, suitable for some embodiments of the present disclosure.
Fig. 8 is a schematic diagram of at least part of an example of the image of a document suitable for some embodiments of the present disclosure.
Fig. 9A is a schematic diagram of a region identified in the document shown in Fig. 8.
Fig. 9B is a schematic diagram of the region shown in Fig. 9A after slant correction.
Fig. 9C is a schematic diagram of another region identified in the document shown in Fig. 8.
Fig. 9D is a schematic diagram of the region shown in Fig. 9C after slant correction.
Note that in the embodiments described below, the same reference numerals are sometimes used across different drawings to denote the same parts or parts having the same function, and their repeated description is omitted. In this specification, similar reference numerals and letters denote similar items; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Detailed description of the embodiments
Various exemplary embodiments of the disclosure will now be described in detail with reference to the drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the disclosure. In the following description, numerous specific details are set forth in order to better explain the disclosure; it should be understood, however, that the disclosure may also be practiced without these specific details.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended as a limitation of the disclosure or of its application or uses. In all examples shown and discussed herein, any specific value should be interpreted as merely exemplary and not as a limitation.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
The present disclosure provides a method for recognizing image information containing multiple documents, comprising: identifying, based on the image and a pre-trained first model, each document region among one or more document regions on the image, wherein the first model is a neural-network-based model; identifying, based on the image of each document and a pre-trained second model, each region among one or more regions on the document, each of the one or more regions being associated with all or part of the information recorded on the document, wherein the second model is a neural-network-based model; cutting out and obtaining the image of each of the one or more regions, the image of each region being delimited by a rectangle parallel to the horizontal or inclined relative to the horizontal; and identifying, based on the image of each of the one or more regions and a pre-trained third model, the characters in each of the one or more regions, thereby determining the information recorded on the document, wherein the third model is a neural-network-based model.
In some embodiments, the third model identifies the characters in each of the one or more regions based on the image of each region and its position within the whole document.
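The three-stage flow described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the functions `detect_documents`, `detect_regions`, and `recognize_text` are hypothetical stand-ins for the trained first, second, and third models.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of an axis-aligned region

def crop(image, box: Box):
    """Cut out a rectangular region from a row-major nested-list 'image'."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

def recognize_image(image,
                    detect_documents: Callable,  # stand-in for the first model
                    detect_regions: Callable,    # stand-in for the second model
                    recognize_text: Callable     # stand-in for the third model
                    ) -> List[dict]:
    """Run the three-stage pipeline: documents -> regions -> characters."""
    results = []
    for doc_box in detect_documents(image):          # stage 1: find each document
        doc_img = crop(image, doc_box)
        fields = {}
        for region_box in detect_regions(doc_img):   # stage 2: find each region
            region_img = crop(doc_img, region_box)
            # stage 3: recognize characters; the region position may also be
            # passed in, as some embodiments describe
            fields[region_box] = recognize_text(region_img, region_box)
        results.append({"document": doc_box, "fields": fields})
    return results
```

With the three detectors stubbed out, the pipeline reduces to nested cropping plus a per-region recognition call, which is the structure the method claims describe.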
It should be understood that a "document" in this disclosure refers to an entity on which information is recorded; such information is arranged on the document in some manner and is carried in one or more forms, such as Chinese text, foreign-language text, numerals, symbols, and figures. Specific examples of a "document" in this disclosure include invoices, bills, tax receipts, receipts, shopping lists, dining receipts, insurance policies, expense reports, deposit statements, credit card statements, express delivery waybills, itineraries, tickets, boarding passes, patent publication information pages, ballots, questionnaires, evaluation forms, attendance sheets, application forms, and other documents filled in manually and/or by machine. An expense report may be regarded as a document form containing multiple invoices pasted on one sheet of paper. Those skilled in the art will appreciate that a "document" in this disclosure is not limited to the specific examples listed here; it is not limited to financial or commercial bills, nor to documents bearing an official seal; it may be a document with printed type or with handwritten script; and it may or may not have a prescribed and/or general format.
In some embodiments, after the step of identifying each document region among the one or more document regions on the image, each document region is cut out to obtain the image of each document, and the image of each document is then input into the second model for processing. For example, the document image 10 shown in Fig. 1 contains multiple documents 100, 200, 300 distributed at different locations on the document image 10. Based on the document image 10 and the pre-trained first model, each document region 100, 200, 300 on the document image 10 is identified; each document region is cut out to obtain the image of each document, and the image of each document is then input into the second model for processing.
Once the characters in each of the one or more regions on a document have been identified, the information recorded on the document can be determined from the information carried by those characters. For example, for the document 100 shown in Fig. 2, the regions 110, 120, 130, 140 on the document 100 are first identified based on the pre-trained second model, each region being associated with one kind of information recorded on the document 100; the image of each region 110, 120, 130, 140 on the document 100 is cut out and obtained; the characters in each region 110, 120, 130, 140 are then identified based on the pre-trained third model, so that the content of the information recorded in each region on the document 100 can be determined. For example, each region includes at least the area enclosed by the minimum bounding box of the characters contained in that region. In some embodiments, what is input into the pre-trained third model is the image of each of the one or more regions together with its position within the whole document, so that the third model identifies the characters in each of the one or more regions.
Those skilled in the art will understand that the document image 10 and the document 100 shown in Figs. 1 and 2 are merely schematic and cannot be used to limit the disclosure. Although only a few corresponding regions are shown in Figs. 1 and 2, a document of this disclosure may obviously have fewer or more regions. Although the boundaries of the regions 110, 120, 130, 140 shown in Fig. 2 are delimited by rectangles parallel to the horizontal, their boundaries may also be delimited by rectangles inclined relative to the horizontal, parallelograms, arbitrary quadrilaterals, and the like, or by circles, ellipses, other polygons (such as triangles, trapezoids, and arbitrary polygons), irregular shapes, and so on. Any region on a document of this disclosure may be arranged at any position on the document; for example, in Fig. 2, regions 110 and 120 may be close to or even adjacent to each other, region 130 may be located at the edge of the document 100, and region 140 may be smaller than the other regions. Of course, those skilled in the art will also understand that the arrangement, positional relationships, and sizes of the regions on a document of this disclosure are not limited to the manners shown in Figs. 1 and 2; they depend on the specifics of the document, and the documents shown in Figs. 1 and 2 are merely examples.
The image of a document refers to the document presented in a visual form, such as a picture or video of the document. Identifying each of the one or more regions on a document includes identifying the boundary of the region. For example, where the boundary of a region is delimited by a rectangle parallel to the horizontal, the region can be determined by determining at least two vertices of the rectangle. Where the boundary of a region is delimited by a rectangle inclined relative to the horizontal, the region can be determined by determining at least three vertices of the rectangle. Object detection methods based on R-CNN, object detection methods based on YOLO, text detection based on detected primitives (e.g., character-based, word-based, or text-line-based), and text detection based on the shape of the object bounding box (horizontal or near-horizontal text detection, multi-oriented text detection, etc.) may be used.
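Why two vertices suffice for an axis-aligned rectangle but three are needed for an inclined one can be shown with a short geometric sketch (the helper names are illustrative, not from the patent):

```python
def fourth_vertex(a, b, c):
    """Given three consecutive vertices a-b-c of a rectangle (possibly
    inclined relative to the horizontal), recover the fourth vertex d.
    The diagonals of a rectangle share a midpoint, so d = a + c - b."""
    return (a[0] + c[0] - b[0], a[1] + c[1] - b[1])

def axis_aligned_from_two(p, q):
    """An axis-aligned rectangle is fully determined by two opposite
    vertices p and q: the other two corners mix their coordinates."""
    return [(p[0], p[1]), (q[0], p[1]), (q[0], q[1]), (p[0], q[1])]
```

For an axis-aligned box the missing corners are just coordinate recombinations of the two given ones; for an inclined rectangle a third vertex is needed to fix the orientation, after which the fourth follows from the parallelogram identity.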
In some embodiments, the position of each region needs to be input into the third model in order to identify the characters in the region. The position of a region may take any form that can indicate the position of the region within the document; for example, it may be the coordinates (absolute or relative) of one or more vertices of the region within the document, or the coordinates (absolute or relative) of one or more vertices within the document together with one or more side lengths, or the coordinates (absolute or relative) of one or more centers of the region within the document together with one or more radii. The characters in each region may be one or more of Chinese text, foreign-language text, numerals, symbols, figures, and the like.
In some embodiments, the image in each region in one or more regions is input to third model, to know
Character not in the region.The image in each region in one or more of regions be by be parallel to horizontal rectangle or
There is inclined rectangle to define relative to horizontal line.The standard defined above is to be in level in image according to whole document
Or heeling condition determines, and when the states such as inclination or distortion are presented in document, identified by the second model one or more
Each region in a region can also show horizontally or diagonally equal different conditions.
In some cases, for example where the boundary of a region is delimited by a rectangle inclined relative to the horizontal, slant correction may also be applied to the image of the region, so that the image input into the third model is the slant-corrected image. For example, the angle by which the rectangle delimiting the region's boundary is inclined relative to the horizontal can be determined, and the image of the region can then be rotated by that angle so that the rectangle delimiting the region's boundary becomes parallel to the horizontal, thereby performing slant correction. The tilt angle can be calculated from the vertex coordinates of the rectangle delimiting the region's boundary.
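The angle-from-vertices computation can be sketched with standard-library math. This is a minimal illustration of the described correction, operating on corner points; a real implementation would rotate the region's pixels with an image library in the same way.

```python
import math

def tilt_angle(v0, v1):
    """Angle (radians) between the horizontal and the rectangle edge
    running from vertex v0 to vertex v1, computed from the vertex
    coordinates, as the text describes."""
    return math.atan2(v1[1] - v0[1], v1[0] - v0[0])

def rotate_point(p, angle, center=(0.0, 0.0)):
    """Rotate point p about 'center' by 'angle' radians (counterclockwise).
    Rotating a region's vertices by -tilt_angle brings its delimiting
    rectangle parallel to the horizontal."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = p[0] - center[0], p[1] - center[1]
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)
```

Rotating every vertex (or pixel) by the negative of the measured tilt angle leaves the bottom edge of the delimiting rectangle horizontal, which is exactly the slant-corrected input the third model expects.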
Fig. 7 schematically shows an image containing multiple documents, suitable for some embodiments of the present disclosure. As described above, the image containing multiple documents is input into the first model, and the first model identifies one or more document regions 600 therein; the multiple document regions 600 may be arranged in any manner, and each identified document region is delimited by the frame indicated by reference numeral 600. The image of each document region 600 is cut out and obtained, and the image of each document region 600 can then be input into the second model for processing.
Fig. 8 schematically shows an example of the image of one document suitable for the disclosure. As described above, the image of the document is input into the second model, and the second model identifies one or more regions 610 to 690, each identified region being delimited by the frame indicated by reference numerals 610 to 690. Those skilled in the art will understand that the second model may identify more or fewer regions than those outlined in the drawing.
When identifying each region, the second model may also identify the type of information associated with the region. For example, the information associated with region 610 is the merchant's name and number; the information associated with region 620 is the time the bill was generated; the information associated with regions 630 and 640 is the itemized consumption details and the amount of each item; the information associated with region 650 is the subtotal of the spending amount; the information associated with region 660 is the tax; the information associated with region 670 is the total spending amount; the information associated with region 680 is the amount collected; and the information associated with region 690 is the change amount.
The image of each of the regions 610 to 690 is cut out and obtained, and the image of each of the regions 610 to 690 can then be input in turn into the third model to identify the characters in the region. The image of region 610 shown in Fig. 9A, or the image of region 650 shown in Fig. 9C, can be input into the third model to identify the characters in the region. In addition, since the frames delimiting regions 610 to 690 are inclined relative to the horizontal, in some embodiments the image of each region may also be slant-corrected before being input into the third model. For example, for the image of region 610 shown in Fig. 9A and the image of region 650 shown in Fig. 9C, the vertex coordinates of the inclined rectangle delimiting the region's boundary are obtained; the angle by which the rectangle is inclined relative to the horizontal is calculated; and the image of the region is then rotated by that angle so that the rectangle delimiting the region's boundary becomes parallel to the horizontal, thereby performing slant correction. Afterwards, the slant-corrected image of region 610 shown in Fig. 9B, or the slant-corrected image of region 650 shown in Fig. 9D, can be input into the third model to identify the characters in the region.
Afterwards, the information recorded on the document can be determined from the identified characters in each region and the type of information associated with each region. In the present embodiment, this means finally determining, for the relevant document: the text and numeric content of the merchant name in region 610, the numeric content of the bill generation time in region 620, the consumption details and per-item amounts in regions 630 and 640, the numeric content of the spending subtotal in region 650, the numeric content of the tax in region 660, the numeric content of the total spending amount in region 670, the numeric content of the amount collected in region 680, and the numeric content of the change amount in region 690. The information identified above may be displayed directly on the relevant regions of the document image, or may additionally be output and displayed in the form of a list or segmented text.
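Combining the per-region character strings with the information types identified by the second model yields a structured record, which is what the list-style output described above amounts to. The region-id-to-field mapping below is illustrative only; the field names are assumptions, not terms from the patent.

```python
# Map each region's identified information type to a field name. The region
# ids match the figures; the English field names are illustrative.
REGION_TYPES = {
    610: "merchant_name", 620: "bill_time",
    630: "item_details", 640: "item_amounts",
    650: "subtotal", 660: "tax", 670: "total",
    680: "amount_collected", 690: "change",
}

def assemble_record(recognized: dict) -> dict:
    """Combine per-region recognized character strings into one structured
    record, keyed by the type of information associated with each region."""
    return {REGION_TYPES[rid]: text
            for rid, text in recognized.items() if rid in REGION_TYPES}
```

A record like this can be displayed alongside the document image or exported as a list, as the text describes.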
Using neural-network-based models, the present disclosure first identifies one or more bill regions in the image to be recognized, then identifies one or more regions in each bill image, and then identifies the characters in each region, thereby recognizing the information recorded on each document. It is thus possible to efficiently and accurately recognize the information recorded on a wide variety of documents. For example, images of non-standard documents, such as those with low resolution, skew, illegibility, stains, curled paper, or misplaced (manual and/or machine) fill-ins, can be recognized using the disclosed method and the system described below.
The first model is obtained by the following process: labeling each image sample containing multiple documents in a first document-image training set, so as to mark out each document region among the one or more document regions in each image sample; and training a first neural network with the labeled first document-image training set to obtain the first model. The first neural network is a neural network based on an object detection algorithm; in some embodiments, the first neural network is built on models such as convolutional neural networks (CNN), R-CNN, or Mask R-CNN.
The second model can be obtained by the following process: labeling each document-image sample in a second document-image training set, so as to mark out each region among the one or more regions in each document-image sample, each of the one or more regions being associated with all or part of the information in the document-image sample; and training a second neural network with the labeled second document-image training set to obtain the second model. For example, the example shown in Fig. 8 can be the image of a labeled document. When marking out each of the regions 610 to 690, the type of information associated with each of the regions 610 to 690 can also be marked. In some embodiments, the second neural network is built on a deep residual network (ResNet).
Training the second neural network may further include: testing the output accuracy of the trained second neural network on a second document-image test set; if the output accuracy is less than a predetermined first threshold, increasing the number of document-image samples in the second document-image training set, each added document-image sample having been labeled; and training the second neural network again with the enlarged second document-image training set. The output accuracy of the retrained second neural network is then tested again on the second document-image test set, until the output accuracy of the second neural network meets the requirement, that is, is not less than the predetermined first threshold. In this way, a second neural network whose output accuracy meets the requirement can be used as the trained second model in the recognition process described above.
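The test-and-augment loop above can be sketched with stub functions. The names `train_fn`, `eval_fn`, and `grow_fn` are hypothetical stand-ins for the real training, held-out testing, and labeled-data-collection steps; the sketch only captures the control flow the text describes.

```python
def train_until_accurate(train_fn, eval_fn, grow_fn, dataset,
                         threshold=0.95, max_rounds=10):
    """Train, test on a held-out set, and, while accuracy is below the
    predetermined first threshold, add labeled samples and retrain."""
    for _ in range(max_rounds):
        model = train_fn(dataset)
        if eval_fn(model) >= threshold:   # accuracy meets the requirement
            return model
        dataset = grow_fn(dataset)        # add more labeled samples
    raise RuntimeError("accuracy did not reach the threshold")
```

The same loop applies whether the network being trained is the first, second, or combined model, since the text gives them the same training and testing process.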
The first model and the second model use the same training and testing process, and may be combined and completed in a single training or testing process, so as to establish one general recognition model; after an image containing multiple documents undergoes recognition processing, the region of each document and each of the one or more regions on each document can be recognized directly in a single pass.
The third model can be obtained by the following process: annotating each document image sample in a third document image sample training set, so as to mark out each region of one or more regions in each document image sample and the characters in each region, where each region of the one or more regions is associated with all or part of the information in the document image sample; and training the third neural network with the annotated third document image sample training set to obtain the third model. In some embodiments, the third neural network is trained to obtain the third model based on the images of the documents in the third document image sample training set and the positions of each region of the one or more regions on the documents. In some embodiments, the third neural network is trained to obtain the third model based on the images of each region of the one or more regions on the documents in the third document image sample training set. In some cases of these embodiments, for example where the boundary of a region is delimited by a rectangle inclined relative to the horizontal, the image of the region that is input to the third neural network for training is an image after tilt correction. For example, the angle by which the rectangle delimiting the boundary of the region is inclined relative to the horizontal can be determined, and the image of the region is then rotated by that angle so that the rectangle delimiting the boundary of the region becomes parallel to the horizontal, thereby performing the tilt correction. The tilt angle can be calculated from the vertex coordinates of the rectangle delimiting the boundary of the region. In some embodiments, the third neural network is built on a recurrent neural network (RNN).
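As a hedged illustration of the tilt correction described above, the tilt angle can be computed from two adjacent vertex coordinates of the delimiting rectangle, after which the region image would be rotated by the negative of that angle (for example with an image-processing library such as OpenCV or Pillow). The sketch below covers only the angle computation; the vertex ordering is an assumption:

```python
import math

def tilt_angle(p0, p1):
    """Angle, in degrees, of the rectangle edge p0->p1 relative to the
    horizontal, computed from two adjacent vertex coordinates of the
    rectangle delimiting the region boundary. Rotating the region image
    by the negative of this angle makes that edge horizontal."""
    (x0, y0), (x1, y1) = p0, p1
    return math.degrees(math.atan2(y1 - y0, x1 - x0))
```

For example, a rectangle whose top edge runs from (0, 0) to (2, 2) is inclined 45 degrees, so the region image would be rotated by -45 degrees before being input to the third neural network.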
Training the third neural network may further include: testing the output accuracy of the trained third neural network based on a third document image test sample set; if the output accuracy is less than a predetermined threshold, increasing the number of document image samples in the third document image sample training set, where each added document image sample is likewise annotated; and training the third neural network again with the enlarged third document image sample training set. The output accuracy of the retrained third neural network is then tested again based on the third document image test sample set, until the output accuracy of the third neural network meets the requirement, that is, is not less than the predetermined threshold. In this way, a third neural network whose output accuracy meets the requirement may be used as the trained third model in the recognition process described above.
Those skilled in the art will understand that the first document image sample training set used to train the first neural network, the second document image sample training set used to train the second neural network, and the third document image sample training set used to train the third neural network may be the same set or different sets; that is, they may contain the same document image samples or may contain different or not entirely identical document image samples. Likewise, the first document image test sample set used to test the first neural network, the second document image test sample set used to test the second neural network, and the third document image test sample set used to test the third neural network may be the same set or different sets; that is, they may contain the same document image samples or may contain different or not entirely identical document image samples. The predetermined first threshold used in testing to judge whether the output accuracy of the second neural network meets the requirement and the predetermined second threshold used in testing to judge whether the output accuracy of the third neural network meets the requirement may be the same value or different values. The numbers of document image samples in the first and second document image sample training sets and in the first, second, and third document image test sample sets can be selected as needed. A document image that has been recognized may also be added, as a document image sample, to any one or more of the above training sets or test sets, so that the number of document image samples used for training and/or testing keeps growing and the precision of the trained models improves.
Figs. 3A and 3B are block diagrams each schematically showing at least part of a method of recognizing the information recorded on a document according to some embodiments of the present disclosure. Based on the image containing multiple documents and the first model trained in advance, each document region of one or more document regions in the image is recognized; then, based on the image 210 of each document to be recognized and the second model 220 trained in advance, each region 230 of one or more regions on the document is recognized; and based on the image of each region 230 of the one or more regions cut from the document and the third model 250 trained in advance, the characters 260 in each region of the one or more regions on the document are recognized. In some embodiments, based on the image of each region 230 of the one or more regions and the second model 220, the information type 240 of the information associated with each region of the one or more regions is also recognized; and the information recorded on the document is determined based on the recognized information type 240 of the information associated with each region and the recognized characters 260 in each region of the one or more regions. In some embodiments, the characters 260 in each region of the one or more regions are recognized based on the image of each region 230 of the one or more regions and its position in the whole document.
In some embodiments, the characters 260 in each region of the one or more regions on the document are recognized based on the image 210 of each document to be recognized, each region 230 of the one or more regions on the document, and the third model 250 trained in advance. In some embodiments, based on the image 210 of the document and the second model 220, the information type 240 of the information associated with each region of the one or more regions is also recognized; and the information recorded on the document is determined based on the recognized information type 240 of the information associated with each region and the recognized characters 260 in each region of the one or more regions. In some embodiments, the characters 260 in each region of the one or more regions on the document are recognized based on the image of the whole document and the position of each region 230 of the one or more regions.
The information type of the information associated with each region may be one or more types. For example, when the document is some kind of application form, in one case the information type of the information associated with one region on the document may be the applicant's name, and the information type of the information associated with another region on the document may be an ID card number; in another case, the information type of the information associated with some region on the document may be both the applicant's name and the ID card number. For example, when the document is some kind of invoice, in one case the information type of the information associated with one region on the document may be the invoice code, and the information type of the information associated with another region on the document may be the pre-tax amount; in another case, the information type of the information associated with some region on the document may be both the invoice code and the pre-tax amount. The information types of the information associated with different regions of the one or more regions may be the same or different. For example, when the document is a shopping list, in one case the information type of the information associated with multiple different regions may all be the purchased goods.
Fig. 4 is a flowchart schematically showing at least part of a method of recognizing the information recorded on a document according to some embodiments of the present disclosure. In this embodiment, the disclosed method includes: recognizing, based on the image and the first model trained in advance, each document region of one or more document regions in the image (310); recognizing, based on the image of each document and the second model trained in advance, each region of one or more regions on the document and the information type of the information associated with each region of the one or more regions (320); recognizing, based on the image of each region of the one or more regions and the third model trained in advance, the characters in each region of the one or more regions (330); and determining the information recorded on the document based on the recognized information type of the information associated with each region of the one or more regions and the recognized characters in each region of the one or more regions (340).
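Steps 310 to 340 above can be sketched as a pipeline. All model and cropping callables below are placeholders (assumptions), so this shows only how the outputs of one stage feed the next, not the disclosed implementation:

```python
def recognize(image, model1, model2, model3, crop):
    """Sketch of steps 310-340: model1 finds document regions in the image;
    model2 finds the regions on each document together with their information
    types; model3 reads the characters in each cropped region; the pairing of
    information type and characters determines the recorded information."""
    results = []
    for doc_region in model1(image):               # step 310
        doc_image = crop(image, doc_region)
        for region, info_type in model2(doc_image):  # step 320
            chars = model3(crop(doc_image, region))  # step 330
            results.append((info_type, chars))       # step 340
    return results

# Toy stand-ins exercising the data flow only.
out = recognize(
    image="img",
    model1=lambda img: ["doc0"],
    model2=lambda doc: [("region0", "applicant_name")],
    model3=lambda region_img: "ZHANG SAN",
    crop=lambda img, r: (img, r),
)
```

With the toy stand-ins, `out` is `[("applicant_name", "ZHANG SAN")]`, i.e. one (information type, characters) pair per recognized region.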
In these embodiments, the first model is obtained by the following process: annotating each image sample containing multiple documents in a first document image sample training set, so as to mark out each document region of one or more document regions in each image sample; and training the first neural network with the annotated first document image sample training set to obtain the first model. The second model can be obtained by the following process: annotating each document image sample in a second document image sample training set, so as to mark out each region of the one or more regions in each document image sample and the information type of the information associated with each region, where each region of the one or more regions is associated with all or part of the information in the document image sample; and training the second neural network with the annotated second document image sample training set to obtain the second model. The output accuracy of the trained second neural network can be tested based on a second document image test sample set; if the accuracy does not meet the requirement, that is, is less than the predetermined first threshold, the number of document image samples in the second document image sample training set is increased and the second neural network is trained again, until the output accuracy of the second neural network meets the requirement, that is, is not less than the predetermined first threshold. In this way, a second neural network whose output accuracy meets the requirement may be used as the trained second model in the recognition process described above.
In some embodiments, before each region of the one or more regions on each document is recognized, the disclosed method further includes: recognizing the category of the document based on the image of the document and a fourth model trained in advance, where the fourth model is a neural-network-based model; and selecting the second model and/or the third model to be used according to the recognized category. In some embodiments, the category of the document includes at least the language of the document. For example, the type of the language of the information recorded on the document may be recognized based on the image of the document and the fourth model trained in advance, and may be one or more of: languages such as Chinese, English, and Japanese, and languages presented in some coded form, such as Morse code, pictographs, or ASCII. The second model and/or the third model to be used are then selected according to the recognition result. In these cases, different second models and/or third models may be trained in advance for different languages, which helps improve model accuracy.
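The category-based selection described above can be pictured as a lookup keyed by the language that the fourth model recognizes. The registry contents and model names below are illustrative assumptions, not part of the disclosure:

```python
# Hypothetical per-language model registry: the fourth model classifies the
# document's language, and that class selects which second/third models to use.
MODELS_BY_LANGUAGE = {
    "zh": ("second_model_zh", "third_model_zh"),
    "en": ("second_model_en", "third_model_en"),
}

def select_models(document_image, fourth_model, registry=MODELS_BY_LANGUAGE):
    """Return the (second model, third model) pair for the language the
    fourth model recognizes in the document image."""
    language = fourth_model(document_image)
    return registry[language]
```

Training a dedicated second/third model per language, and dispatching like this, is one plausible reading of why per-language models "help improve model accuracy".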
The fourth model can be obtained by the following process: annotating each document image sample in a fourth document image sample training set, so as to mark out the category of each document image sample; and training the fourth neural network with the annotated fourth document image sample training set to obtain the fourth model. In some embodiments, the fourth neural network is built on a deep convolutional neural network (CNN). Training the fourth neural network may further include: testing the output accuracy of the trained fourth neural network based on a fourth document image test sample set; if the output accuracy is less than a predetermined threshold, increasing the number of document image samples in the fourth document image sample training set, where each added document image sample is likewise annotated; and training the fourth neural network again with the enlarged fourth document image sample training set. The output accuracy of the retrained fourth neural network is then tested again based on the fourth document image test sample set, until the output accuracy of the fourth neural network meets the requirement, that is, is not less than a predetermined third threshold. In this way, a fourth neural network whose output accuracy meets the requirement may be used as the trained fourth model in the recognition process described above.
Those skilled in the art will understand that the fourth document image sample training set and the first, second, and third document image sample training sets may be the same set or different sets. The fourth document image test sample set and the first, second, and third document image test sample sets may be the same set or different sets. The third threshold and the first and second thresholds may be the same value or different values. The numbers of document image samples in the fourth document image sample training set and the fourth document image test sample set can be selected as needed. A document image that has been recognized may also be added, as a document image sample, to any one or more of the above training sets or test sets, so that the number of document image samples used for training and/or testing keeps growing and the precision of the trained models improves.
Fig. 5 is a structural diagram schematically showing at least part of a system 400 for recognizing the information recorded on a document according to some embodiments of the present disclosure. Those skilled in the art will understand that system 400 is only an example and should not be regarded as limiting the scope of the present disclosure or of the features described herein. In this example, system 400 may include a first model 410, a second model 420, a third model 430, and one or more first devices 440, where the first model 410 is a neural-network-based model, the second model 420 is a neural-network-based model, and the third model 430 is a neural-network-based model. The one or more first devices 440 are configured to: recognize, based on the image and the first model 410 trained in advance, each document region of one or more document regions in the image; recognize, based on the image of each document and the second model 420, each region of one or more regions on the document, where each region of the one or more regions is associated with all or part of the information recorded on the document; cut out the image of each region of the one or more regions, where the image of each region of the one or more regions is delimited by a rectangle parallel to the horizontal or by a rectangle inclined relative to the horizontal; and recognize, based on the image of each region of the one or more regions and the third model 430, the characters in each region of the one or more regions, thereby determining the information recorded on the document. In some embodiments, the one or more first devices 440 are configured to recognize the characters in each region of the one or more regions on the document based on the image of the document, each region of the one or more regions, and the third model 430.
The image of each region of the one or more regions is delimited by a rectangle parallel to the horizontal or by a rectangle inclined relative to the horizontal, and the one or more first devices 440 are further configured to, in response to a rectangle inclined relative to the horizontal, recognize the characters in each region of the one or more regions by the third model 430 based on the tilt-corrected image of each region.
As can be seen from the above description, the one or more first devices 440 may be further configured to: recognize, based on the image and the first model trained in advance, each document region of the one or more document regions in the image; based on the image of each document and the second model, also recognize the information type of the information associated with each region of the one or more regions; and determine the information recorded on the document based on the recognized information type of the information associated with each region of the one or more regions and the recognized characters in each region of the one or more regions. The system 400 of the present disclosure for recognizing the information recorded on a document may further include a neural-network-based fourth model (not shown). The one or more first devices 440 may be further configured to: before recognizing each region of the one or more regions on the document, recognize the category of the document based on the image of the document and the fourth model; and select the second model and/or the third model to be used according to the recognized category.
Those skilled in the art will understand that the various operations described above with respect to the one or more first devices 440 may be configured to be carried out in a single first device 440 or distributed across multiple first devices 440. Each of the one or more first devices 440 may be a computing device, a storage device, or a device having both computing and storage functions.
Although in the system 400 shown in Fig. 5 the first model 410, the second model 420, the third model 430, and the one or more first devices 440 are each represented by a separate rectangular box, the first model 410, the second model 420, and the third model 430 may also be stored in the one or more first devices 440. For example, the first model 410, the second model 420, and the third model 430 may be stored in the same first device 440; or the first model 410, the second model 420, and the third model 430 may be stored in different first devices 440 respectively; or part of any one of the first model 410, the second model 420, and the third model 430 may be stored in one first device 440 and the other parts stored in other first devices 440. Of course, the first model 410, the second model 420, and the third model 430 may also be stored not in the one or more first devices 440 but in other devices.
The information recorded on a recognized document can be used for further downstream processing. The downstream processing can be carried out by one or more second devices (see 520 shown in Fig. 6). The one or more second devices can be configured to: send the image of a document to the one or more first devices 440; and obtain, from the one or more first devices 440, the digitized information recorded on the recognized document. The digitized information obtained by the one or more second devices can be used for downstream processing. For example, with the information recognized from an attendance sheet, the one or more second devices can compute an attendance rate and the like; with the information recognized from a catering receipt, the one or more second devices can log it into an expense record and the like.
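As a hedged example of such downstream processing on a second device, an attendance rate might be computed from the digitized records returned by the first devices; the record format below is an assumption for illustration:

```python
def attendance_rate(records):
    """Illustrative downstream computation on a second device: the fraction
    of digitized attendance-sheet records whose status is "present"."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get("status") == "present")
    return present / len(records)
```

For instance, two records with statuses "present" and "absent" yield an attendance rate of 0.5; an expense-record logger for receipts would follow the same pattern of consuming the digitized information.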
Those skilled in the art will understand that the second device that sends the image of a document to the one or more first devices 440 and the second device that obtains the digitized information from the one or more first devices 440 may be the same device or different devices.
Fig. 6 is a structural diagram schematically showing at least part of a system 500 for recognizing the information recorded on a document according to some embodiments of the present disclosure. System 500 includes one or more first devices 510 and one or more second devices 520, where the one or more first devices 510 are connected with the one or more second devices 520 through a network 530. Each of the one or more first devices 510 and one or more other first devices 510, or each element of one and one or more elements of the others, and each of the one or more second devices 520 and one or more other second devices 520, or each element of one and one or more elements of the others, may also be connected through the network 530.
Each of the one or more first devices 510 may be a computing device, a storage device, or a device having both computing and storage functions. Each of the one or more first devices 510 may contain one or more processors 511, one or more memories 512, and other components typically found in devices such as computers. Each of the one or more memories 512 in the one or more first devices 510 can store content accessible by the one or more processors 511, including instructions 513 executable by the one or more processors 511 and data 514 that can be retrieved, manipulated, or stored by the one or more processors 511.
The instructions 513 can be any instruction set to be executed directly by the one or more processors 511, such as machine code, or any instruction set to be executed indirectly, such as a script. The terms "instructions", "application", "process", "step", and "program" may be used interchangeably herein. The instructions 513 can be stored in object code format for direct processing by the one or more processors 511, or stored in any other computer language, including scripts or collections of independent source code modules that are interpreted on demand or compiled in advance. The instructions 513 may include instructions that cause one or more computing devices, such as the one or more first devices 510, to act as the first, second, and/or third neural networks. The functions, methods, and routines of the instructions 513 are explained in more detail elsewhere herein.
The one or more memories 512 can be any transitory or non-transitory computer-readable storage medium capable of storing content accessible by the one or more processors 511, such as a hard disk drive, a memory card, ROM, RAM, DVD, CD, USB memory, writable memory, read-only memory, and the like. One or more of the one or more memories 512 may include a distributed storage system, where the instructions 513 and/or the data 514 can be stored on multiple different storage devices that may be physically located at the same or different geographic locations. One or more of the one or more memories 512 can be connected to the one or more first devices 510 via the network 530 shown in Fig. 6, and/or can be directly connected to or incorporated into any of the one or more first devices 510.
The one or more processors 511 can retrieve, store, or modify the data 514 according to the instructions 513. The data 514 stored in the one or more memories 512 may include the image of a document to be recognized, various document image sample sets, parameters for the first, second, and/or third neural networks, and the like. Other data not associated with the image of the document or with the neural networks may also be stored in the one or more memories 512. For example, although the subject matter described herein is not limited by any particular data structure, the data 514 may also be stored in computer registers (not shown), or stored in a relational database as a table or an XML document with many different fields and records. The data 514 can be formatted in any computing-device-readable format, such as, but not limited to, binary values, ASCII, or Unicode. In addition, the data 514 may include any information sufficient to identify related information, such as numbers, descriptive text, proprietary codes, pointers, references to data stored in other memories such as at other network locations, or information used by a function to calculate related data.
The one or more processors 511 can be any conventional processor, such as a commercially available central processing unit (CPU), graphics processing unit (GPU), or the like. Alternatively, the one or more processors 511 can also be dedicated components, such as an application-specific integrated circuit (ASIC) or another hardware-based processor. Although not required, the one or more first devices 510 may include dedicated hardware components to carry out specific computing processes faster or more efficiently, such as performing image processing on the image of a document.
Although in Fig. 6 the one or more first devices 510 are schematically shown in the same box as the one or more processors 511, the one or more memories 512, and other elements, a first device, processor, computing device, or memory may actually include multiple processors, computing devices, or memories that may or may not be located in the same physical housing. For example, one of the one or more memories 512 may be a hard disk drive or other storage medium located in a housing different from the housing of each of the one or more first devices 510. Therefore, references to a processor, computing device, or memory are to be understood as including references to a collection of processors, computing devices, or memories that may or may not operate in parallel. For example, the one or more first devices 510 may include server computing devices operating as a load-balanced server farm. In addition, although some of the functions described above are indicated as occurring on a single computing device with a single processor, the various aspects of the subject matter described herein can be implemented by multiple first devices 510 communicating with each other, for example through the network 530.
Each of the one or more first devices 510 can be located at a different node of the network 530 and can communicate directly or indirectly with other nodes of the network 530. Although only the first and second devices 510 and 520 are shown in Fig. 6, those skilled in the art will understand that system 500 may also include other devices, with each different device located at a different node of the network 530. The network 530 and the component parts of the systems described herein (for example, the first and second devices, the first, second, and third models, and the like) can be interconnected using various protocols and systems, so that the network 530 can be part of the Internet, the World Wide Web, a specific intranet, a wide area network, or a local area network. The network 530 can use standard communication protocols such as Ethernet, WiFi, and HTTP, protocols proprietary to one or more companies, and various combinations of the foregoing. Although certain advantages are obtained when information is transmitted or received as described above, the subject matter described herein is not limited to any specific manner of information transmission.
Each of the one or more second devices 520 can be configured similarly to each of the one or more first devices 510, that is, with one or more processors 521, one or more memories 522, and instructions and data as described above. Each second device 520 can be a personal computing device intended for use by a user or a business computing device used by an enterprise, and have all the components normally used in combination with a personal computing device or a business computing device, such as a central processing unit (CPU), a memory storing data and instructions (for example, RAM and an internal hard disk drive), one or more I/O devices 523 such as a display (for example, a monitor with a screen, a touch screen, a projector, a television, or another device operable to display information), a mouse, a keyboard, a touch screen, a microphone, a speaker, and/or a network interface device, one or more cameras 524 for capturing still images or recording video streams, and all the components used for connecting these elements to each other.
Although the one or more second devices 520 may each include a full-size personal computing device, they may alternatively include mobile computing devices capable of wirelessly exchanging data with a server over a network such as the Internet. For example, the one or more second devices 520 can be mobile phones, or devices such as wireless-enabled PDAs, tablet PCs, or netbooks capable of obtaining information via the Internet. In another example, the one or more second devices 520 can be wearable computing systems.
The phrase "A or B" in the specification and claims includes "A and B" as well as "A or B", and does not exclusively mean only "A" or only "B", unless otherwise specifically stated.
In the present disclosure, a reference to "one embodiment" or "some embodiments" means that a feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment or at least some embodiments of the present disclosure. Therefore, the appearances of the phrases "in one embodiment" and "in some embodiments" in various places in the present disclosure do not necessarily refer to the same embodiment or embodiments. Furthermore, in one or more embodiments, the features, structures, or characteristics may be combined in any suitable combination and/or sub-combination.
As used herein, the word "exemplary" means "serving as an example, instance, or illustration", not as a "model" to be reproduced exactly. Any implementation exemplarily described herein is not necessarily to be construed as preferred or advantageous over other implementations. Moreover, the present disclosure is not limited by any expressed or implied theory given in the above technical field, background, summary, or detailed description.
As used herein, the word "substantially" means including any minor variation caused by defects of design or manufacture, tolerances of a device or element, environmental influences, and/or other factors. The word "substantially" also allows for differences from a perfect or ideal situation caused by parasitic effects, noise, and other practical considerations that may be present in an actual implementation.
The foregoing description may refer to elements or nodes or features being "connected" or "coupled" together. As used herein, unless expressly stated otherwise, "connected" means that one element/node/feature is directly connected to (or directly communicates with) another element/node/feature, electrically, mechanically, logically, or in other ways. Similarly, unless expressly stated otherwise, "coupled" means that one element/node/feature can be linked with another element/node/feature, directly or indirectly, mechanically, electrically, logically, or in other ways, so as to allow interaction, even if the two features may not be directly connected. That is, "coupled" is intended to encompass both direct and indirect connections of elements or other features, including connections using one or more intermediate elements.
In addition, certain terminology may be used in the following description for the purpose of reference only, and is thus not intended to be limiting. For example, unless clearly indicated by the context, words such as "first", "second", and other such numerical terms referring to structures or elements do not imply a sequence or order.
It should also be understood that the term "comprises/comprising", as used herein, specifies the presence of stated features, integers, steps, operations, units, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, units, and/or components, and/or combinations thereof.
In the disclosure, term " component " and " system ", which are intended that, is related to an entity related with computer, or hard
Part, the combination of hardware and software, software or software in execution.For example, a component can be, but it is not limited to, is locating
Process, object, executable, execution thread, and/or the program etc. run on reason device.It is illustrated with, in a server
Both the application program of upper operation and the server can be a component.One or more components can reside in one
The process of execution and/or the inside of thread, and a component can be located on a computer and/or be distributed on two
Between platform or more.
Those skilled in the art should appreciate that the boundaries between the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments. However, other modifications, variations, and alternatives are also possible. The specification and drawings are, accordingly, to be regarded as illustrative rather than restrictive.
Although some specific embodiments of the disclosure have been described in detail by way of example, those skilled in the art should understand that the above examples are provided for illustration only, and not to limit the scope of the disclosure. The embodiments disclosed herein may be combined arbitrarily without departing from the spirit and scope of the disclosure. Those skilled in the art will appreciate that various modifications can be made to the embodiments without departing from the scope and spirit of the disclosure. The scope of the disclosure is defined by the appended claims.
Claims (17)
1. A recognition method for an image containing multiple documents, characterized by comprising:
identifying, based on the image and a pre-trained first model, each document region among one or more document regions on the image, wherein the first model is a neural-network-based model;
identifying, based on the image of each document and a pre-trained second model, each region among one or more regions on the document, wherein each region of the one or more regions is associated with all or part of the information recorded on the document, and the second model is a neural-network-based model;
cropping and obtaining the image of each region of the one or more regions, wherein the image of each region of the one or more regions is defined by a rectangle parallel to the horizontal or a rectangle inclined relative to the horizontal; and
identifying, based on the image of each region of the one or more regions and a pre-trained third model, the characters in each region of the one or more regions, so as to determine the information recorded on the document, wherein the third model is a neural-network-based model.
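Outside the claim language, the three-stage flow of claim 1 can be sketched roughly as follows. The model objects and their method names (`detect_documents`, `detect_regions`, `read_text`) are hypothetical stand-ins for the trained first, second, and third neural-network models; they are not interfaces taken from the patent.

```python
# Minimal sketch of the three-model pipeline of claim 1, under the assumption
# that each trained model exposes a single detection/recognition call.

def recognize_image(image, first_model, second_model, third_model):
    """Identify documents, their regions, and the characters in each region."""
    results = []
    # Stage 1: locate each document region on the multi-document image.
    for doc_box in first_model.detect_documents(image):
        doc_image = crop(image, doc_box)
        doc_info = {}
        # Stage 2: locate the information-bearing regions on the document.
        for region_box in second_model.detect_regions(doc_image):
            region_image = crop(doc_image, region_box)
            # Stage 3: recognize the characters in the cropped region.
            doc_info[tuple(region_box)] = third_model.read_text(region_image)
        results.append(doc_info)
    return results

def crop(image, box):
    """Crop an axis-aligned box (x, y, w, h) from a nested-list image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]
```

In a real system, `crop` would operate on pixel arrays and the inclined-rectangle case of the claim would require the slant correction discussed under claim 3.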
2. The recognition method according to claim 1, characterized in that after the step of identifying each document region among the one or more document regions on the image, each document region is cropped to obtain the image of each document, and the image of each document is then input into the second model separately for processing.
3. The recognition method according to claim 1, characterized in that after the step of cropping and obtaining the image of each region of the one or more regions, in response to a rectangle being inclined relative to the horizontal, slant-correction processing is performed on the image of each region, and the processed image of each region is input into the third model to identify the characters in each region of the one or more regions.
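The slant correction of claim 3 amounts to rotating an inclined region about its centre until its sides align with the horizontal. The sketch below illustrates this on corner coordinates only, under that assumption; a real implementation would apply the same affine transform to the pixels, for example with an image-warping routine from an image-processing library.

```python
import math

def deskew_points(points, angle_deg, center):
    """Rotate points by -angle_deg about center, undoing the measured slant."""
    a = math.radians(-angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    out = []
    for x, y in points:
        # Translate to the centre, rotate, translate back.
        dx, dy = x - cx, y - cy
        out.append((cx + dx * cos_a - dy * sin_a,
                    cy + dx * sin_a + dy * cos_a))
    return out
```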
4. The recognition method according to claim 1, characterized in that the method identifies, by means of the third model, the characters in each region of the one or more regions based on the image of each region of the one or more regions and its position within the whole document.
5. The recognition method according to claim 1, characterized in that the method further comprises:
further identifying, based on the image of the document and the second model, the information type of the information associated with each region of the one or more regions; and
determining the information recorded on the document based on the identified information type of the information associated with each region of the one or more regions, and the identified characters in each region of the one or more regions.
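Claim 5 pairs two per-region outputs: the information type from the second model and the recognized characters from the third model. A minimal sketch of that merge step is shown below; the field names (`invoice_number`, `amount`) are hypothetical examples, not taken from the patent.

```python
# Merge per-region information types and recognized characters into the
# structured information recorded on the document (sketch of claim 5).

def assemble_document_info(region_types, region_texts):
    """Map each region's information type to that region's recognized text."""
    info = {}
    for region_id, info_type in region_types.items():
        info[info_type] = region_texts.get(region_id, "")
    return info
```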
6. The recognition method according to claim 1, characterized in that before identifying each region among the one or more regions on the document, the method further comprises:
identifying, based on the image of the document and a pre-trained fourth model, the category of the document, wherein the fourth model is a neural-network-based model; and
selecting, according to the identified category, the second model and/or the third model to be used.
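The selection step of claim 6 can be pictured as a lookup from the category produced by the fourth model to the second/third models to use. The registry below is a hypothetical illustration (the category keys and model names are invented, though claim 7 states that the category includes at least a language).

```python
# Sketch of claim 6's model selection: document category -> (second, third)
# model identifiers. Keys and names are illustrative only.
MODEL_REGISTRY = {
    "zh": ("region_detector_zh", "ocr_zh"),
    "en": ("region_detector_en", "ocr_en"),
}

def select_models(category, registry=MODEL_REGISTRY, default="en"):
    """Pick the second and third models for the identified document category."""
    return registry.get(category, registry[default])
```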
7. The recognition method according to claim 6, characterized in that the category of the document includes at least a language.
8. The recognition method according to claim 1, characterized in that the first model is obtained by the following process:
performing annotation processing on each image sample containing multiple documents in a first document-image-sample training set, so as to mark each document region among one or more document regions in each image sample; and
training a first neural network with the annotated first document-image-sample training set to obtain the first model.
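The annotation step of claim 8 produces, for each training image, one bounding box per document region; the resulting pairs then train the first (object-detection style) neural network. The sample structure below is a hypothetical illustration of such an annotated set, not a format specified by the patent.

```python
# Sketch of claim 8's annotation processing: each image sample is labelled
# with one axis-aligned box (x, y, w, h) per document region it contains.

def annotate_sample(image_id, document_boxes):
    """Build one annotated sample: an image id plus its document boxes."""
    return {
        "image": image_id,
        "documents": [tuple(box) for box in document_boxes],
    }

def build_training_set(raw_annotations):
    """Turn raw (image_id, boxes) pairs into an annotated training set."""
    return [annotate_sample(i, b) for i, b in raw_annotations]
```

The same pattern extends to claims 9 to 11, with the labels changed to information-bearing regions, per-region characters, or document categories respectively.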
9. The recognition method according to claim 1, characterized in that the second model is obtained by the following process:
performing annotation processing on each document image sample in a second document-image-sample training set, so as to mark each region among one or more regions in each document image sample, each region of the one or more regions being associated with all or part of the information in the document image sample; and
training a second neural network with the annotated second document-image-sample training set to obtain the second model.
10. The recognition method according to claim 1, characterized in that the third model is obtained by the following process:
performing annotation processing on each document image sample in a third document-image-sample training set, so as to mark each region among one or more regions in each document image sample and the characters in each region, each region of the one or more regions being associated with all or part of the information in the document image sample; and
training a third neural network with the annotated third document-image-sample training set to obtain the third model.
11. The recognition method according to claim 6, characterized in that the fourth model is obtained by the following process:
performing annotation processing on each document image sample in a fourth document-image-sample training set, so as to mark the category of each document image sample; and
training a fourth neural network with the annotated fourth document-image-sample training set to obtain the fourth model.
12. The recognition method according to claim 8, characterized in that the first neural network is a neural network based on an object-detection algorithm.
13. The recognition method according to claim 9, characterized in that the second neural network is established based on a deep residual network.
14. The recognition method according to claim 10, characterized in that the third neural network is established based on a recurrent neural network.
15. The recognition method according to claim 11, characterized in that the fourth neural network is based on a deep convolutional neural network.
16. A document information recognition system, characterized by comprising:
a first model, the first model being a neural-network-based model;
a second model, the second model being a neural-network-based model;
a third model, the third model being a neural-network-based model; and
one or more first devices, the one or more first devices being configured to:
identify, based on the image and the pre-trained first model, each document region among one or more document regions on the image;
identify, based on the image of each document and the second model, each region among one or more regions on the document, each region of the one or more regions being associated with all or part of the information recorded on the document;
crop and obtain the image of each region of the one or more regions, the image of each region of the one or more regions being defined by a rectangle parallel to the horizontal or a rectangle inclined relative to the horizontal; and
identify, based on the image of each region of the one or more regions and the pre-trained third model, the characters in each region of the one or more regions, so as to determine the information recorded on the document.
17. A non-transitory computer-readable storage medium, characterized in that a series of computer-executable instructions are stored in the non-transitory computer-readable storage medium, and when executed by one or more computing devices, the series of computer-executable instructions cause the one or more computing devices to:
identify, based on the image and a pre-trained first model, each document region among one or more document regions on the image, wherein the first model is a neural-network-based model;
identify, based on the image of each document and a pre-trained second model, each region among one or more regions on the document, wherein each region of the one or more regions is associated with all or part of the information recorded on the document, and the second model is a neural-network-based model;
crop and obtain the image of each region of the one or more regions, wherein the image of each region of the one or more regions is defined by a rectangle parallel to the horizontal or a rectangle inclined relative to the horizontal; and
identify, based on the image of each region of the one or more regions and a pre-trained third model, the characters in each region of the one or more regions, so as to determine the information recorded on the document, wherein the third model is a neural-network-based model.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810917232.1A CN109241857A (en) | 2018-08-13 | 2018-08-13 | Document information recognition method and system |
CN2018109172321 | 2018-08-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110390320A true CN110390320A (en) | 2019-10-29 |
Family
ID=65070522
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810917232.1A Pending CN109241857A (en) | 2018-08-13 | 2018-08-13 | Document information recognition method and system |
CN201910729840.4A Pending CN110390320A (en) | 2018-08-13 | 2019-08-08 | Recognition method and system for an image containing information of multiple documents |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810917232.1A Pending CN109241857A (en) | 2018-08-13 | 2018-08-13 | Document information recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN109241857A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866495A (en) * | 2019-11-14 | 2020-03-06 | 杭州睿琪软件有限公司 | Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium |
CN111160188A (en) * | 2019-12-20 | 2020-05-15 | 中国建设银行股份有限公司 | Financial bill identification method, device, equipment and storage medium |
CN112241727A (en) * | 2020-10-30 | 2021-01-19 | 深圳供电局有限公司 | Multi-ticket identification method and system and readable storage medium |
CN112699860A (en) * | 2021-03-24 | 2021-04-23 | 成都新希望金融信息有限公司 | Method for automatically extracting and sorting effective information in personal tax APP operation video |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919125B (en) * | 2019-03-19 | 2021-03-26 | 厦门商集网络科技有限责任公司 | Travel route restoration method and system based on bill recognition |
CN110956739A (en) * | 2019-05-09 | 2020-04-03 | 杭州睿琪软件有限公司 | Bill identification method and device |
CN111967286A (en) | 2019-05-20 | 2020-11-20 | 京东方科技集团股份有限公司 | Method and device for identifying information bearing medium, computer equipment and medium |
CN110796145B (en) * | 2019-09-19 | 2024-01-19 | 平安科技(深圳)有限公司 | Multi-certificate segmentation association method and related equipment based on intelligent decision |
CN111461133B (en) * | 2020-04-20 | 2023-04-18 | 上海东普信息科技有限公司 | Express waybill item name identification method, device, equipment and storage medium |
CN111768820A (en) * | 2020-06-04 | 2020-10-13 | 上海森亿医疗科技有限公司 | Paper medical record digitization and target detection model training method, device and storage medium |
CN112395995A (en) * | 2020-11-19 | 2021-02-23 | 深圳供电局有限公司 | Method and system for automatically filling and checking bill according to mobile financial bill |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105303363A (en) * | 2015-09-28 | 2016-02-03 | 四川长虹电器股份有限公司 | Data processing method and data processing system |
CN105354177A (en) * | 2015-09-28 | 2016-02-24 | 四川长虹电器股份有限公司 | Data processing system and data processing method |
CN107220648A (en) * | 2017-04-11 | 2017-09-29 | 平安科技(深圳)有限公司 | The character identifying method and server of Claims Resolution document |
- 2018-08-13: CN application CN201810917232.1A, published as CN109241857A, status: active, Pending
- 2019-08-08: CN application CN201910729840.4A, published as CN110390320A, status: active, Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866495A (en) * | 2019-11-14 | 2020-03-06 | 杭州睿琪软件有限公司 | Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium |
CN110866495B (en) * | 2019-11-14 | 2022-06-28 | 杭州睿琪软件有限公司 | Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium |
CN111160188A (en) * | 2019-12-20 | 2020-05-15 | 中国建设银行股份有限公司 | Financial bill identification method, device, equipment and storage medium |
CN112241727A (en) * | 2020-10-30 | 2021-01-19 | 深圳供电局有限公司 | Multi-ticket identification method and system and readable storage medium |
CN112699860A (en) * | 2021-03-24 | 2021-04-23 | 成都新希望金融信息有限公司 | Method for automatically extracting and sorting effective information in personal tax APP operation video |
CN112699860B (en) * | 2021-03-24 | 2021-06-22 | 成都新希望金融信息有限公司 | Method for automatically extracting and sorting effective information in personal tax APP operation video |
Also Published As
Publication number | Publication date |
---|---|
CN109241857A (en) | 2019-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110390320A (en) | Recognition method and system for an image containing information of multiple documents | |
CN108564035A (en) | Method and system for identifying information recorded on a document | |
Abbott | Applied predictive analytics: Principles and techniques for the professional data analyst | |
CN109101469B (en) | Extracting searchable information from digitized documents | |
CN111428599B (en) | Bill identification method, device and equipment | |
US10957086B1 (en) | Visual and digital content optimization | |
WO2020143377A1 (en) | Industry recognition model determination method and apparatus | |
CN108132887B (en) | User interface method of calibration, device, software testing system, terminal and medium | |
CN110363084A (en) | Class state detection method, device, storage medium and electronic apparatus | |
CN107657056A (en) | Method and apparatus based on artificial intelligence displaying comment information | |
JPWO2019008766A1 (en) | Voucher processing system and voucher processing program | |
US20220292861A1 (en) | Docket Analysis Methods and Systems | |
CN105843818A (en) | Training device, training method, determining device, and recommendation device | |
US20240202433A1 (en) | Standardized form recognition method, associated computer program product, processing and learning systems | |
CN113807066A (en) | Chart generation method and device and electronic equipment | |
US9070010B2 (en) | Image check content estimation and use | |
US9418308B2 (en) | Automatic evaluation of line weights | |
JP6810303B1 (en) | Data processing equipment, data processing method and data processing program | |
Robiolo | How simple is it to measure software size and complexity for an it practitioner? | |
US7676427B1 (en) | System and method of continuous assurance | |
JP6844076B1 (en) | Data processing equipment, data processing methods and programs | |
JP6784788B2 (en) | Information processing equipment, information processing methods and programs | |
CN112766391B (en) | Method, system, equipment and medium for making document | |
WO2022049689A1 (en) | Data processing device, data processing method, and program | |
WO2022054136A1 (en) | Data processing device, data processing method, and program |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20191029 |