CN107958204A - Credit report recognition method, apparatus, computer device and storage medium - Google Patents
- Publication number
- CN107958204A (application number CN201711020665.9A)
- Authority
- CN
- China
- Prior art keywords
- recognition result
- report
- reference report
- recognition
- ocr
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Character Discrimination (AREA)
Abstract
The present invention relates to a credit report recognition method, apparatus, computer device and storage medium. The method comprises: obtaining a credit report, where the credit report is a photocopy containing credit information and carries a unique identifier; recognizing the credit report using OCR to convert the photocopy into text information, and outputting the text information as a recognition result; detecting the accuracy of the recognition result; and outputting the recognition result when its accuracy meets a preset condition. Because OCR is used, checking and auditing a credit report only requires the terminal to operate automatically, without large amounts of manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the credit report is less likely to be leaked.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a credit report recognition method, apparatus, computer device and storage medium.
Background
Credit reports are the primary source of information and the basis for credit evaluation in the financial industry. In the conventional approach, the information in credit reports is used mainly through manual checking and auditing. Because each credit report contains different information, manual checking and auditing usually requires reviewing and classifying the reports one by one and then storing the audit results one at a time.
This conventional way of handling credit reports is cumbersome and consumes substantial manpower, material and financial resources. In addition, the results of manual checking and auditing are often not accurate enough, and the information in the credit reports is easily leaked.
Summary of the invention
In view of this, and to address the problems that audit results are not accurate enough and that information in credit reports is easily leaked, it is necessary to provide a credit report recognition method, apparatus, computer device and storage medium.
A credit report recognition method, the method comprising:
obtaining a credit report, where the credit report is a photocopy containing credit information and carries a unique identifier;
recognizing the credit report using OCR to convert the photocopy into text information, and outputting the text information as a recognition result;
detecting the accuracy of the recognition result; and
outputting the recognition result when its accuracy meets a preset condition.
In one embodiment, obtaining the credit report comprises:
obtaining the unique identifier of the credit report;
when the unique identifier of the obtained credit report is not present in a database, determining that the obtained credit report has not yet been downloaded; and
checking the log for a record of the unique identifier of the not-yet-downloaded credit report, where the credit report has been obtained successfully if such a log record exists.
In one embodiment, recognizing the credit report using OCR to convert the photocopy into text information comprises:
obtaining the category of the credit report, obtaining the corresponding preset OCR recognition template according to the category, and recognizing the credit report according to the preset OCR recognition template to convert the photocopy into text information.
In one embodiment, before the credit report is obtained, the method further comprises:
obtaining credit report samples and classifying the samples; and
setting a corresponding OCR recognition template for each category, and setting template positioning characters, character dependencies and a recognition result output structure in the OCR recognition template.
In one embodiment, detecting the accuracy of the recognition result comprises:
calculating the confidence of the characters in the recognition result; and
deriving the accuracy of the recognition result from the confidence of the characters.
In one embodiment, outputting the recognition result when its accuracy meets the preset condition comprises:
when the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result output structure.
In one embodiment, after the recognition result is output, the method further comprises:
when the output recognition results reach a set quantity, storing the recognition results in a batch.
A credit report recognition apparatus, the apparatus comprising:
a report acquisition module, configured to obtain a credit report, where the credit report is a photocopy containing credit information and carries a unique identifier;
an information conversion module, configured to recognize the credit report using OCR to convert the photocopy into text information, and to output the text information as a recognition result;
a result detection module, configured to detect the accuracy of the recognition result; and
a result output module, configured to output the recognition result when its accuracy meets a preset condition.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor implements the steps of the method described above when executing the computer program.
A computer-readable storage medium storing a computer program, where the computer program implements the steps of the method described above when executed by a processor.
With the above credit report recognition method, apparatus, computer device and storage medium, a credit report is obtained, where the credit report is a photocopy containing credit information and carries a unique identifier; the credit report is recognized using OCR to convert the photocopy into text information, and the text information is output as a recognition result; the accuracy of the recognition result is detected; and the recognition result is output when its accuracy meets a preset condition. Because OCR is used, checking and auditing a credit report only requires the terminal to operate automatically, without large amounts of manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the credit report is less likely to be leaked.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the credit report recognition method in one embodiment;
Fig. 2 is a diagram of the internal structure of the terminal of Fig. 1 in one embodiment;
Fig. 3 is a flowchart of the credit report recognition method in one embodiment;
Fig. 4 is a flowchart of the method for obtaining a credit report in one embodiment;
Fig. 5 is a flowchart of the method for setting templates in one embodiment;
Fig. 6 is a flowchart of the method for detecting the accuracy of the recognition result in one embodiment;
Fig. 7 is a structural block diagram of the credit report recognition apparatus in one embodiment;
Fig. 8 is a structural block diagram of the credit report recognition apparatus in another embodiment;
Fig. 9 is a structural block diagram of the credit report recognition apparatus in yet another embodiment.
Detailed description
To make the objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the drawings. Many specific details are set out in the following description to provide a thorough understanding of the invention. However, the invention can be implemented in many ways other than those described here, and those skilled in the art can make similar improvements without departing from the spirit of the invention. The invention is therefore not limited to the specific embodiments disclosed below.
Fig. 1 is a diagram of the application environment of the credit report recognition method in one embodiment. As shown in Fig. 1, the application environment includes a terminal 110 and a server 120, which communicate with each other over a network.
The terminal 110 may be, without limitation, a laptop, a desktop computer, a personal digital computer, a portable notebook computer, or the like. The terminal 110 obtains the photocopied file of a credit report through the server 120 and checks the obtained file to confirm that the photocopied file of the credit report has been obtained successfully. After the terminal 110 recognizes the credit report using OCR to obtain a recognition result and checks the accuracy of that result, it can output the recognition result in a fixed format and then store the output recognition results in batches. The terminal 110 can also upload the stored recognition results to the server 120 for storage.
In one embodiment, a computer device is provided, which may be the terminal 110. The internal structure of the terminal 110 of Fig. 1 is shown in Fig. 2: the terminal 110 includes a processor, a storage medium, memory, a display and a network interface connected through a system bus. The storage medium of the terminal 110 stores an operating system, a database, and a computer program for implementing the credit report recognition method and apparatus. The processor provides computing and control capability and supports the operation of the whole terminal 110. The display of the terminal 110 is used to show information; for example, when obtaining the photocopy of a credit report fails, a notification email can be received, and the display shows that email. The memory provides the runtime environment for the computer program in the storage medium that implements the credit report recognition method and apparatus. The network interface is used for network communication with the server 120; for example, recognition results that have been output in the given format and stored in batches can be uploaded through the network interface to the server 120 for storage. The structure shown in Fig. 2 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the terminal to which the solution is applied. A specific terminal may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a credit report recognition method is provided. Taking its application to the terminal in the above application environment as an example, as shown in Fig. 3, the method includes the following steps:
Step S302: obtain a credit report, where the credit report is a photocopy containing credit information and carries a unique identifier.
Credit reports are the main source and basis of credit evaluation in the financial industry. They are divided into personal credit reports and enterprise credit reports and are used to look up the creditworthiness of an individual or an enterprise. The terminal obtains the credit report from the database through the server. The credit report obtained here is in pure image format, and every credit report carries a unique identifier.
Step S304: recognize the credit report using OCR to convert the photocopy into text information, and output the text information as a recognition result.
OCR (Optical Character Recognition) is a computer input technique in which the text of bills, newspapers, books, manuscripts and other printed matter is converted into image information by optical input such as scanning, and character recognition is then used to convert the image information into usable computer input.
When the credit report is recognized using OCR, the text recognition in OCR converts the photocopy of the credit report into text information. This text information is the recognition result of applying OCR to the credit report, and the terminal can output it as the recognition result.
Step S306: detect the accuracy of the recognition result.
Accuracy reflects how accurately the credit report has been recognized. After the recognition result is produced, the terminal can detect its accuracy based on the characters in the recognition result.
Step S308: output the recognition result when its accuracy meets a preset condition.
The preset condition is a preset range of values, determined according to the actual requirements for recognition precision and recognition coverage. When the recognition result falls within this range, it is qualified, meaning that the credit report has been recognized accurately enough, and the terminal can output the qualifying recognition result.
In summary, a credit report is obtained, where the credit report is a photocopy containing credit information and carries a unique identifier; the credit report is recognized using OCR to convert the photocopy into text information, which is output as a recognition result; the accuracy of the recognition result is detected; and the recognition result is output when its accuracy meets the preset condition. Because OCR is used, checking and auditing a credit report only requires the terminal to operate automatically, without large amounts of manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the credit report is less likely to be leaked.
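The flow of steps S302 to S308 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names CreditReport, recognize_report, run_ocr and estimate_accuracy are hypothetical, the OCR engine and the accuracy estimate are injected as plain functions, and the preset condition is assumed to be a simple lower bound of 0.9.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed OCR output: one (character, confidence in [0, 1]) pair per character.
OcrChars = List[Tuple[str, float]]


@dataclass
class CreditReport:
    report_id: str  # unique identifier carried by the report
    image: bytes    # the photocopy, i.e. an image-only document


def recognize_report(
    report: CreditReport,
    run_ocr: Callable[[bytes], OcrChars],            # hypothetical OCR backend
    estimate_accuracy: Callable[[OcrChars], float],  # hypothetical accuracy estimate
    min_accuracy: float = 0.9,                       # assumed preset condition
) -> Optional[str]:
    """Return the recognized text, or None if the accuracy check fails."""
    chars = run_ocr(report.image)                      # S304: photocopy -> text
    text = "".join(ch for ch, _ in chars)
    accuracy = estimate_accuracy(chars)                # S306: detect accuracy
    return text if accuracy >= min_accuracy else None  # S308: conditional output


# Stand-ins used only to exercise the flow; a real system would call an OCR engine.
def fake_ocr(image: bytes) -> OcrChars:
    return [(ch, 0.95) for ch in image.decode("ascii")]


def lowest_confidence(chars: OcrChars) -> float:
    return min((conf for _, conf in chars), default=0.0)


report = CreditReport(report_id="R-001", image=b"NAME: LI")
accepted = recognize_report(report, fake_ocr, lowest_confidence)
rejected = recognize_report(report, fake_ocr, lowest_confidence, min_accuracy=0.99)
```

Injecting the OCR backend and the accuracy estimate keeps the control flow testable without committing to a particular OCR engine, which the text does not name.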
In one embodiment, the credit report recognition method further includes a process of obtaining the credit report which, as shown in Fig. 4, includes the following steps:
Step S402: obtain the unique identifier of the credit report.
Every credit report contains a unique identifier that distinguishes it from other credit reports. When obtaining a credit report, the terminal first obtains its unique identifier.
Step S404: when the unique identifier of the obtained credit report is not present in the database, the obtained credit report is one that has not yet been downloaded.
The database stores the unique identifiers of the many credit reports that have already been downloaded. After the unique identifier of a credit report is obtained, it is looked up among the identifiers in the database. If it is present, the obtained credit report has already been downloaded. Conversely, if the unique identifier is not present among the identifiers in the database, the obtained credit report has not yet been downloaded.
Step S406: check the log for a record of the unique identifier of the not-yet-downloaded credit report; if a log record exists, the credit report has been obtained successfully.
When the obtained credit report is one that has not yet been downloaded, its unique identifier is recorded in the log. The terminal checks the records in this log; if a log record exists, the credit report has been obtained successfully. When obtaining the credit report fails, the terminal can pop up a warning message, or it can report the failure by sending an email.
The terminal obtains the unique identifier of the credit report, determines whether the obtained credit report is one that has not yet been downloaded, and then checks whether it was obtained successfully. None of these steps requires manual operation, which improves the efficiency of obtaining credit reports. Because the terminal performs these operations, the information is not leaked.
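The checks in steps S402 to S406 can be sketched as below. This is an illustration only: the function name acquire_status, the three status strings, the in-memory set standing in for the database, and the log line format are all assumptions, since the text does not specify them.

```python
from typing import Iterable, Set


def acquire_status(report_id: str,
                   downloaded_ids: Set[str],
                   log_lines: Iterable[str]) -> str:
    """Classify one acquisition attempt following steps S402 to S406."""
    # S404: an identifier already in the database means the report
    # was downloaded before.
    if report_id in downloaded_ids:
        return "already_downloaded"
    # S406: a log record containing the identifier as a whole token
    # means the report was obtained successfully.
    if any(report_id in line.split() for line in log_lines):
        return "acquired"
    # Otherwise the caller would show a warning or send an email.
    return "failed"


# Made-up sample data, for illustration only.
downloaded = {"R-001", "R-002"}
log = ["fetched R-003", "fetched R-004"]
statuses = [acquire_status(rid, downloaded, log)
            for rid in ("R-001", "R-003", "R-009")]
```

Matching the identifier as a whole token rather than as a substring avoids treating "R-00" as present just because "R-003" was logged.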
In one embodiment, the credit report recognition method further includes a process of converting the photocopy into text information, which specifically includes: obtaining the category of the credit report, obtaining the corresponding preset OCR recognition template according to the category, and recognizing the credit report according to the preset OCR recognition template to convert the photocopy into text information.
The categories of credit reports include personal basic information, credit transaction information, and other information. The information involved differs from one category to another, so the preset OCR recognition templates can include a personal basic information template, a credit transaction information template, and an other information template. With the preset OCR recognition template and OCR recognition, the photocopy of the credit report can be converted into text information.
Converting the photocopy into text information with OCR in this way not only improves the efficiency of recognizing credit reports but also avoids consuming large amounts of manpower and material resources.
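Selecting a preset template by report category amounts to a lookup, sketched below. The enum values and template names are hypothetical; only the three categories themselves come from the text.

```python
from enum import Enum
from typing import Dict


class ReportCategory(Enum):
    PERSONAL_BASIC_INFO = "personal_basic_info"
    CREDIT_TRANSACTION_INFO = "credit_transaction_info"
    OTHER_INFO = "other_info"


# Hypothetical registry: one preset OCR template name per category.
PRESET_TEMPLATES: Dict[ReportCategory, str] = {
    ReportCategory.PERSONAL_BASIC_INFO: "personal_basic_info_template",
    ReportCategory.CREDIT_TRANSACTION_INFO: "credit_transaction_info_template",
    ReportCategory.OTHER_INFO: "other_info_template",
}


def select_template(category: ReportCategory) -> str:
    """Return the preset template registered for a report category."""
    return PRESET_TEMPLATES[category]


def select_template_by_name(name: str) -> str:
    """Same lookup from a raw string; unknown names raise ValueError."""
    return select_template(ReportCategory(name))


chosen = select_template_by_name("credit_transaction_info")
```

Using an enum makes an unrecognized category fail loudly at lookup time instead of silently falling back to the wrong template.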
In one embodiment, as shown in Fig. 5, the credit report recognition method further includes a process of setting templates, with the following specific steps:
Step S502: obtain credit report samples and classify them.
The terminal can obtain credit report samples from the server and classify the samples according to the categories of credit reports. For example, credit reports can be divided into personal basic information reports, credit transaction information reports, and other information reports.
Step S504: set a corresponding OCR recognition template for each category, and set template positioning characters, character dependencies and a recognition result output structure in the OCR recognition template.
Different credit report categories correspond to different OCR recognition templates, and template positioning characters, character dependencies and a recognition result output structure can be set in each template. Template positioning characters are used to determine the positions of characters in the OCR recognition template, so that characters are output at specific positions when the recognition result is output. Character dependencies are the contextual relationships between characters, so that characters are output in a specific order when the recognition result is output. Together, the template positioning characters and character dependencies are used to set up different OCR recognition templates according to the positions and context of characters, which facilitates output of the recognition result. The recognition result output structure defines the data structure in which the recognition result is output. For example, the recognition result is output according to the structure of the OCR recognition template.
Because every credit report category has its own OCR recognition template, and these templates only need to be set once per category, the efficiency of credit report recognition is improved.
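One possible shape for such a template is sketched below. It is an interpretation, not the structure used in the patent: OcrTemplate and its fields are hypothetical, positioning characters are modeled as field labels that start a line, the character dependency is modeled as the expected order of those labels, and the sample labels and values are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OcrTemplate:
    category: str
    # Template positioning characters, modeled here as field labels.
    # Their order in the tuple stands in for the character dependency.
    anchors: Tuple[str, ...]
    # Recognition result output structure: which fields to emit, in order.
    output_fields: Tuple[str, ...]


def labels_in_expected_order(template: OcrTemplate, lines: List[str]) -> bool:
    """Check that the labels found in the OCR text follow the template order."""
    seen = [a for line in lines for a in template.anchors if line.startswith(a)]
    expected = [a for a in template.anchors if a in seen]
    return seen == expected


def extract_fields(template: OcrTemplate, lines: List[str]) -> Dict[str, str]:
    """Locate each label in the OCR text and keep the text that follows it."""
    found: Dict[str, str] = {}
    for line in lines:
        for anchor in template.anchors:
            if line.startswith(anchor):
                found[anchor] = line[len(anchor):].strip(" :")
    # Emit fields in the order fixed by the output structure.
    return {name: found.get(name, "") for name in template.output_fields}


template = OcrTemplate(
    category="personal_basic_info",
    anchors=("Name", "ID number", "Address"),
    output_fields=("Name", "ID number"),
)
ocr_lines = ["Name: Li Lei", "ID number: 1234", "Address: Shenzhen"]
fields = extract_fields(template, ocr_lines)
```

Keeping the output structure separate from the anchors lets a template locate more labels than it finally emits, as with the address line here.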
In one embodiment, the credit report recognition method further includes a process of detecting the accuracy of the recognition result which, as shown in Fig. 6, includes the following steps:
Step S602: calculate the confidence of the characters in the recognition result.
Confidence, also called reliability, is the probability that an estimate agrees with the population parameter within a given allowable error range. The terminal can calculate the corresponding confidence for the characters in the recognition result.
Step S604: derive the accuracy of the recognition result from the confidence of the characters.
The confidence of a character is a probability, and the terminal can derive the accuracy of the recognition result from this probability.
Because the position of each character is relatively fixed, deriving the accuracy from character confidence makes the recognition result more accurate.
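The text does not say how per-character confidences are combined into an accuracy value. Two plausible aggregations are sketched below as assumptions: the mean confidence, and the share of characters whose confidence reaches a per-character threshold. The function names and the 0.8 threshold are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def mean_confidence(confidences: Sequence[float]) -> float:
    """Accuracy as the average per-character confidence (0.0 if empty)."""
    return fmean(confidences) if confidences else 0.0


def share_above(confidences: Sequence[float], char_threshold: float = 0.8) -> float:
    """Accuracy as the fraction of characters at or above a threshold."""
    if not confidences:
        return 0.0
    return sum(c >= char_threshold for c in confidences) / len(confidences)


# Made-up confidences for a four-character result with one weak character.
scores = [0.99, 0.97, 0.62, 0.95]
by_mean = mean_confidence(scores)
by_share = share_above(scores)
```

The two measures react differently to a single badly recognized character: the mean is pulled down in proportion to how low that one confidence is, while the share only counts that the character missed the threshold.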
In one embodiment, the credit report recognition method further includes outputting the recognition result when its accuracy meets the preset condition, which specifically includes: when the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result output structure.
The preset character precision has a certain value range, determined according to the actual requirements for recognition precision and recognition coverage. When the accuracy of the recognition result falls within the preset character precision range, the recognition result obtained according to the template positioning characters and character dependencies is output according to the recognition result output structure.
Checking the accuracy of the recognition result and outputting it only when the required precision is reached improves the accuracy of the output results.
In one embodiment, the credit report recognition method further includes: when the output recognition results reach a set quantity, storing the recognition results in a batch.
When there are multiple output recognition results, the terminal can store them in batches of a predetermined quantity. For example, whenever the number of recognition results reaches 20, they are stored in a batch. The terminal can also store recognition results in batches by number of days. For example, the recognition results are stored in a batch every other day.
Storing recognition results only after a certain number have accumulated not only reduces the network communication load but also improves efficiency.
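Quantity-based batch storage can be sketched with a small buffer, as below. The class name BatchStore and the write_batch callback are hypothetical; the batch size of 20 follows the example in the text, and the actual database write is replaced by appending to a list.

```python
from typing import Callable, List


class BatchStore:
    """Buffer recognition results and write them out in fixed-size batches."""

    def __init__(self, write_batch: Callable[[List[str]], None],
                 batch_size: int = 20) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._write_batch = write_batch  # stands in for the database write
        self._batch_size = batch_size
        self._pending: List[str] = []

    def add(self, result: str) -> None:
        """Queue one result and flush once the set quantity is reached."""
        self._pending.append(result)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Write out whatever is pending, e.g. on a daily schedule."""
        if self._pending:
            self._write_batch(list(self._pending))
            self._pending.clear()

    @property
    def pending(self) -> int:
        return len(self._pending)


written: List[List[str]] = []
store = BatchStore(written.append)
for i in range(45):
    store.add(f"result-{i}")
```

Calling flush() from a timer would cover the time-based variant, where results are stored after a number of days regardless of how many have accumulated.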
In one embodiment, a credit report recognition method is provided, and the specific steps for implementing it are as follows:
First, the terminal obtains credit report samples and classifies them. The terminal can obtain the samples from the server and classify them according to the categories of credit reports. It then sets a corresponding OCR recognition template for each category, and sets template positioning characters, character dependencies and a recognition result output structure in each OCR recognition template. Different credit report categories correspond to different OCR recognition templates. The template positioning characters and character dependencies are used to set up different OCR recognition templates according to the positions and context of characters, which facilitates output of the recognition result. These templates only need to be made once.
Next, the terminal obtains the unique identifier of the credit report. When the unique identifier of the obtained credit report is not present in the database, the obtained credit report is one that has not yet been downloaded. The terminal then checks the log for a record of the unique identifier of the not-yet-downloaded credit report; if a log record exists, the credit report has been obtained successfully.
Then, after the credit report has been obtained successfully, the terminal obtains its category, obtains the corresponding preset OCR recognition template according to the category, and recognizes the credit report according to the preset OCR recognition template to convert the photocopy into text information. The information involved differs from one category to another, so the preset OCR recognition templates can include a personal basic information template, a credit transaction information template, and an other information template. With the preset OCR recognition template and OCR recognition, the photocopy of the credit report can be converted into text information.
Further, the accuracy of the recognition result is detected. This specifically includes calculating the confidence of the characters in the recognition result and deriving the accuracy of the recognition result from the confidence of the characters. The terminal can calculate the corresponding confidence for the characters in the recognition result and then derive the accuracy of the recognition result from this probability.
When the accuracy of the recognition result meets the preset condition, the recognition result is output. The preset condition is a preset range of values, determined according to the actual requirements for recognition precision and recognition coverage. When the recognition result falls within this range, it is qualified, and the terminal can output the qualifying recognition result. Specifically, when the accuracy of the recognition result reaches the preset character precision, the recognition result is output according to the recognition result output structure. The preset character precision has a certain value range, determined according to the actual requirements for recognition precision and recognition coverage. When the accuracy of the recognition result falls within the preset character precision range, the recognition result obtained according to the template positioning characters and character dependencies is output according to the recognition result output structure.
When the output recognition results reach a set requirement, they are stored in a batch. When there are multiple output recognition results, the terminal can store them in batches of a predetermined quantity. For example, whenever the number of recognition results reaches 20, they are stored in a batch. The terminal can also store recognition results in batches by number of days. For example, the recognition results are stored in a batch every two days.
As shown in fig. 7, in one embodiment, there is provided a kind of reference reports identification device, including:
Report acquisition module 710, for obtaining reference report, reference is reported as the photocopy data containing credit information, levies
Letter report carries unique mark.
Info conversion module 720, for using OCR technique identification reference report that photocopy data is converted to text envelope
Breath, exports text message as recognition result.
As a result detection module 730, for detecting the accuracy rate of recognition result.
As a result output module 740, for when the accuracy rate of recognition result meets preset condition, exporting recognition result.
In one embodiment, the report acquisition module 710 is configured to obtain the unique identifier of a credit reference report. When the obtained unique identifier does not exist in the database, the obtained credit reference report is a report that has not yet been downloaded. The module then checks the log for a record of the unique identifier of that not-yet-downloaded report, and when such a log record exists, the credit reference report has been acquired successfully.
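The acquisition check can be sketched as a small function. The function name, the use of in-memory sets for the database and the log, and the returned status strings are all assumptions made for illustration; the embodiment also does not say what happens when no log record exists, so the "not_logged" branch is a guess.

```python
def check_acquisition(report_id, database_ids, log_ids):
    """Classify a report by its unique identifier (hypothetical helper).

    database_ids: identifiers of reports already stored in the database.
    log_ids: identifiers that have a record in the download log.
    """
    if report_id in database_ids:
        # The identifier is already known, so the report was downloaded before.
        return "already_downloaded"
    if report_id in log_ids:
        # Not in the database but present in the log: acquisition succeeded.
        return "acquired"
    # Assumed outcome: no log record yet, so success cannot be confirmed.
    return "not_logged"
```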
In one embodiment, the information conversion module 720 is configured to obtain the category of the credit reference report, obtain the corresponding preset OCR recognition template according to the category, and recognize the credit reference report according to the preset OCR recognition template so as to convert the photocopy data into text information.
As shown in FIG. 8, in one embodiment, the credit reference report recognition device further includes:
A sample acquisition module 750, configured to obtain credit reference report samples and classify the samples.
A template setting module 760, configured to set a corresponding OCR recognition template for each category, and to set template positioning characters, character dependencies and a recognition result export structure in the OCR recognition template.
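One possible shape for a per-category template is sketched below. The text does not define how positioning characters, character dependencies or the export structure are represented, so everything here is an assumption: positioning characters are treated as label prefixes, a character dependency as a mapping from a label to the field that follows it, and the export structure as an ordered list of field names. The names `OcrTemplate`, `register` and `extract` are hypothetical.

```python
from dataclasses import dataclass

TEMPLATES = {}  # category -> OcrTemplate


@dataclass
class OcrTemplate:
    category: str
    positioning_chars: list  # labels used to locate fields in the recognized text
    dependencies: dict       # label -> name of the field whose value follows it
    export_fields: list      # field order of the recognition result export structure


def register(template):
    TEMPLATES[template.category] = template


def extract(category, lines):
    """Apply the template for `category` to already-recognized text lines."""
    tpl = TEMPLATES[category]
    values = {}
    for line in lines:
        for label in tpl.positioning_chars:
            if line.startswith(label):
                # The value is whatever follows the label on the same line.
                values[tpl.dependencies[label]] = line[len(label):].strip(" :：")
    # Fields that were not found are exported as None.
    return {name: values.get(name) for name in tpl.export_fields}
```

The sketch works on text lines that an OCR engine has already produced; the OCR call itself is deliberately left out.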
In one embodiment, the result detection module 730 is configured to calculate the confidence of the characters in the recognition result, and to derive the accuracy of the recognition result from the character confidences.
In one embodiment, the result output module 740 is configured to output the recognition result according to the recognition result export structure when the accuracy of the recognition result reaches the preset character precision.
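The step from character confidences to an accuracy value, and the gate on the preset character precision, can be sketched as follows. The embodiment does not state how the confidences are combined, so the arithmetic mean used here is an assumption, as are the function names and the default threshold of 0.95.

```python
def accuracy_from_confidences(confidences):
    """Combine per-character confidences (0.0 to 1.0) into one accuracy value.

    The mean is an assumed aggregation; the source only says the accuracy
    is derived from the character confidences.
    """
    if not confidences:
        return 0.0
    return sum(confidences) / len(confidences)


def output_if_accurate(fields, confidences, preset_precision=0.95):
    """Return the recognized fields only when the accuracy reaches the preset precision."""
    if accuracy_from_confidences(confidences) >= preset_precision:
        return fields
    return None
```

A stricter aggregation, such as the minimum confidence, would reject a result on a single doubtful character; which behaviour is appropriate depends on the required accuracy and coverage.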
As shown in FIG. 9, in one embodiment, the credit reference report recognition device further includes:
A result storage module 770, configured to store the recognition results in batches when the output recognition results meet a set requirement.
In one embodiment, a computer-readable storage medium is also provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the credit reference report recognition method in each of the above embodiments.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium. In the embodiments of the present invention, the program can be stored in a non-volatile storage medium of a computer system and executed by at least one processor in the computer system, so as to implement the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments are described. However, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
1. A credit reference report recognition method, characterized in that the method comprises:
obtaining a credit reference report, wherein the credit reference report is photocopy data containing credit information and carries a unique identifier;
recognizing the credit reference report using OCR technology so as to convert the photocopy data into text information, and outputting the text information as a recognition result;
detecting the accuracy of the recognition result;
when the accuracy of the recognition result meets a preset condition, outputting the recognition result.
2. The method according to claim 1, characterized in that obtaining the credit reference report comprises:
obtaining the unique identifier of the credit reference report;
when the unique identifier of the obtained credit reference report does not exist in a database, treating the obtained credit reference report as a credit reference report that has not been downloaded;
checking a log for a record of the unique identifier of the credit reference report that has not been downloaded, wherein when a log record exists, the credit reference report has been acquired successfully.
3. The method according to claim 1, characterized in that recognizing the credit reference report using OCR technology so as to convert the photocopy data into text information comprises:
obtaining the category of the credit reference report, obtaining a corresponding preset OCR recognition template according to the category, and recognizing the credit reference report according to the preset OCR recognition template so as to convert the photocopy data into text information.
4. The method according to claim 3, characterized in that, before obtaining the credit reference report, the method further comprises:
obtaining credit reference report samples and classifying the credit reference report samples;
setting a corresponding OCR recognition template according to the classification, and setting template positioning characters, character dependencies and a recognition result export structure in the OCR recognition template.
5. The method according to claim 1, characterized in that detecting the accuracy of the recognition result comprises:
calculating the confidence of the characters in the recognition result;
deriving the accuracy of the recognition result from the confidence of the characters.
6. The method according to claim 5, characterized in that outputting the recognition result when the accuracy of the recognition result meets the preset condition comprises:
when the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result export structure.
7. The method according to claim 1, characterized in that, after outputting the recognition result, the method further comprises:
when the output recognition results meet a set requirement, storing the recognition results in batches.
8. A credit reference report recognition device, characterized in that the device comprises:
a report acquisition module, configured to obtain a credit reference report, wherein the credit reference report is photocopy data containing credit information and carries a unique identifier;
an information conversion module, configured to recognize the credit reference report using OCR technology so as to convert the photocopy data into text information, and to output the text information as a recognition result;
a result detection module, configured to detect the accuracy of the recognition result;
a result output module, configured to output the recognition result when the accuracy of the recognition result meets a preset condition.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711020665.9A CN107958204A (en) | 2017-10-27 | 2017-10-27 | Reference report recognition methods, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711020665.9A CN107958204A (en) | 2017-10-27 | 2017-10-27 | Reference report recognition methods, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107958204A (en) | 2018-04-24 |
Family
ID=61964053
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711020665.9A Pending CN107958204A (en) | 2017-10-27 | 2017-10-27 | Reference report recognition methods, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107958204A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232328A (en) * | 2019-05-21 | 2019-09-13 | 深圳壹账通智能科技有限公司 | A kind of reference report analytic method, device and computer readable storage medium |
CN111383124A (en) * | 2020-05-29 | 2020-07-07 | 支付宝(杭州)信息技术有限公司 | User material verification method and device |
CN112581699A (en) * | 2020-12-23 | 2021-03-30 | 华言融信科技成都有限公司 | Credit report self-service interpretation equipment |
CN112598503A (en) * | 2020-12-25 | 2021-04-02 | 四川享宇金信金融科技有限公司 | OCR recognition system and method based on credit investigation recognition |
CN112819003A (en) * | 2021-04-19 | 2021-05-18 | 北京妙医佳健康科技集团有限公司 | Method and device for improving OCR recognition accuracy of physical examination report |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106572100A (en) * | 2016-10-25 | 2017-04-19 | 中国建设银行股份有限公司 | Service data transfer audit method, device and system |
CN106911751A (en) * | 2015-12-23 | 2017-06-30 | 北京奇虎科技有限公司 | File acquisition method, device and system |
CN107067228A (en) * | 2017-03-31 | 2017-08-18 | 南京钧元网络科技有限公司 | A kind of hand-held authentication intelligent checks system and its checking method |
CN107145562A (en) * | 2017-05-02 | 2017-09-08 | 北京奇艺世纪科技有限公司 | A kind of method of data synchronization, apparatus and system |
-
2017
- 2017-10-27 CN CN201711020665.9A patent/CN107958204A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106911751A (en) * | 2015-12-23 | 2017-06-30 | 北京奇虎科技有限公司 | File acquisition method, device and system |
CN106572100A (en) * | 2016-10-25 | 2017-04-19 | 中国建设银行股份有限公司 | Service data transfer audit method, device and system |
CN107067228A (en) * | 2017-03-31 | 2017-08-18 | 南京钧元网络科技有限公司 | A kind of hand-held authentication intelligent checks system and its checking method |
CN107145562A (en) * | 2017-05-02 | 2017-09-08 | 北京奇艺世纪科技有限公司 | A kind of method of data synchronization, apparatus and system |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232328A (en) * | 2019-05-21 | 2019-09-13 | 深圳壹账通智能科技有限公司 | A kind of reference report analytic method, device and computer readable storage medium |
CN111383124A (en) * | 2020-05-29 | 2020-07-07 | 支付宝(杭州)信息技术有限公司 | User material verification method and device |
CN112581699A (en) * | 2020-12-23 | 2021-03-30 | 华言融信科技成都有限公司 | Credit report self-service interpretation equipment |
CN112598503A (en) * | 2020-12-25 | 2021-04-02 | 四川享宇金信金融科技有限公司 | OCR recognition system and method based on credit investigation recognition |
CN112819003A (en) * | 2021-04-19 | 2021-05-18 | 北京妙医佳健康科技集团有限公司 | Method and device for improving OCR recognition accuracy of physical examination report |
CN112819003B (en) * | 2021-04-19 | 2021-08-27 | 北京妙医佳健康科技集团有限公司 | Method and device for improving OCR recognition accuracy of physical examination report |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107958204A (en) | Reference report recognition methods, device, computer equipment and storage medium | |
US20190286898A1 (en) | System and method for data extraction and searching | |
CN112613501A (en) | Information auditing classification model construction method and information auditing method | |
JP4829920B2 (en) | Form automatic embedding method and apparatus, graphical user interface apparatus | |
US7983468B2 (en) | Method and system for extracting information from documents by document segregation | |
CN111046784A (en) | Document layout analysis and identification method and device, electronic equipment and storage medium | |
CN107220648A (en) | The character identifying method and server of Claims Resolution document | |
CN107622263B (en) | The character identifying method and device of document image | |
US8099384B2 (en) | Operation procedure extrapolating system, operation procedure extrapolating method, computer-readable medium and computer data signal | |
US20160092730A1 (en) | Content-based document image classification | |
US20050207635A1 (en) | Method and apparatus for printing documents that include MICR characters | |
CN110110726A (en) | Power equipment nameplate identification method and device, computer equipment and storage medium | |
CN108595544A (en) | A kind of document picture classification method | |
CN112668640B (en) | Text image quality evaluation method, device, equipment and medium | |
CN111598099B (en) | Image text recognition performance testing method, device, testing equipment and medium | |
CN116740723A (en) | PDF document identification method based on open source Paddle framework | |
CN112668444A (en) | Bird detection and identification method based on YOLOv5 | |
CN113496115B (en) | File content comparison method and device | |
CN117371049A (en) | Machine-generated text detection method and system based on blockchain and generated countermeasure network | |
CN112307101A (en) | Project pricing auditing method, device, computer equipment and system | |
CN107705000A (en) | Choosing method, device, storage medium and the computer equipment of scanning device | |
CN110188073A (en) | Method, apparatus, storage medium and the computer equipment of In vivo detection log parsing | |
CN109145308B (en) | Secret-related text recognition method based on improved naive Bayes | |
TWI768744B (en) | Reference document generation method and system | |
EP4167139A1 (en) | Method and apparatus for data augmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180424 |