CN107958204A - Reference report recognition methods, device, computer equipment and storage medium - Google Patents


Publication number
CN107958204A
Authority
CN
China
Prior art keywords
recognition result
report
reference report
recognition
ocr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711020665.9A
Other languages
Chinese (zh)
Inventor
秦祎晗
刘奕慧
郭玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Dingfeng Cattle Technology Co Ltd
Original Assignee
Shenzhen Dingfeng Cattle Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dingfeng Cattle Technology Co Ltd
Priority to CN201711020665.9A
Publication of CN107958204A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to a credit reference report recognition method, device, computer equipment and storage medium. The method includes: obtaining a credit reference report, where the credit reference report is photocopy data containing credit information and carries a unique identifier; recognizing the credit reference report using OCR technology to convert the photocopy data into text information, and outputting the text information as a recognition result; detecting the accuracy of the recognition result; and, when the accuracy of the recognition result meets a preset condition, outputting the recognition result. Because OCR technology is used, inspecting and auditing a credit reference report only requires the terminal to operate automatically, without consuming substantial manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the credit reference report is less likely to be leaked.

Description

Credit reference report recognition method, device, computer equipment and storage medium
Technical field
The present invention relates to the field of data processing, and in particular to a credit reference report recognition method, device, computer equipment and storage medium.
Background
Credit reference reports are the primary source of information and the basis for credit evaluation in the financial industry. In the conventional approach, the information in credit reference reports is used mainly through manual inspection and auditing. Because each report contains different information, manual inspection and auditing usually requires examining and classifying the reports one by one and then storing the audit results one by one.
This conventional way of processing credit reference reports is complicated and consumes substantial manpower, material resources and money. Moreover, the results of manual inspection and auditing are often not accurate enough, and the information in the reports is easily leaked.
Summary of the invention
In view of this, and to address the problems that audit results are not accurate enough and that the information in credit reference reports is easily leaked, a credit reference report recognition method, device, computer equipment and storage medium are provided.
A credit reference report recognition method, the method including:
Obtaining a credit reference report, where the credit reference report is photocopy data containing credit information and carries a unique identifier;
Recognizing the credit reference report using OCR technology to convert the photocopy data into text information, and outputting the text information as a recognition result;
Detecting the accuracy of the recognition result;
When the accuracy of the recognition result meets a preset condition, outputting the recognition result.
In one embodiment, obtaining the credit reference report includes:
Obtaining the unique identifier of the credit reference report;
When the unique identifier of the obtained credit reference report is not present in the database, determining that the obtained credit reference report is a report that has not been downloaded;
Checking the log for a record of the unique identifier of the not-yet-downloaded credit reference report; when a log record exists, the credit reference report has been obtained successfully.
In one embodiment, recognizing the credit reference report using OCR technology to convert the photocopy data into text information includes:
Obtaining the category of the credit reference report, obtaining the corresponding preset OCR recognition template according to the category, and recognizing the credit reference report according to the preset OCR recognition template to convert the photocopy data into text information.
In one embodiment, before the credit reference report is obtained, the method further includes:
Obtaining credit reference report samples and classifying the samples;
Setting a corresponding OCR recognition template for each category, and setting template locating characters, character dependencies and a recognition result output structure in the OCR recognition template.
In one embodiment, detecting the accuracy of the recognition result includes:
Calculating the confidence of the characters in the recognition result;
Deriving the accuracy of the recognition result from the confidence of the characters.
In one embodiment, outputting the recognition result when the accuracy of the recognition result meets the preset condition includes:
When the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result output structure.
In one embodiment, after the recognition result is output, the method further includes:
When the number of output recognition results reaches a set quantity, storing the recognition results in a batch.
A credit reference report recognition device, the device including:
A report acquisition module, configured to obtain a credit reference report, where the credit reference report is photocopy data containing credit information and carries a unique identifier;
An information conversion module, configured to recognize the credit reference report using OCR technology to convert the photocopy data into text information, and to output the text information as a recognition result;
A result detection module, configured to detect the accuracy of the recognition result;
A result output module, configured to output the recognition result when the accuracy of the recognition result meets a preset condition.
A computer device, including a memory, a processor and a computer program that is stored in the memory and can run on the processor, where the processor implements the steps of the method described above when executing the computer program.
A computer-readable storage medium storing a computer program, where the computer program implements the steps of the method described above when executed by a processor.
In the above credit reference report recognition method, device, computer equipment and storage medium, a credit reference report is obtained, where the report is photocopy data containing credit information and carries a unique identifier; the report is recognized using OCR technology to convert the photocopy data into text information, and the text information is output as a recognition result; the accuracy of the recognition result is detected; and, when the accuracy meets a preset condition, the recognition result is output. Because OCR technology is used, inspecting and auditing a credit reference report only requires the terminal to operate automatically, without consuming substantial manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the report is less likely to be leaked.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the credit reference report recognition method in one embodiment;
Fig. 2 is a diagram of the internal structure of the terminal of Fig. 1 in one embodiment;
Fig. 3 is a flow chart of the credit reference report recognition method in one embodiment;
Fig. 4 is a flow chart of the method for obtaining a credit reference report in one embodiment;
Fig. 5 is a flow chart of the method for setting templates in one embodiment;
Fig. 6 is a flow chart of the method for detecting the accuracy of the recognition result in one embodiment;
Fig. 7 is a structural block diagram of the credit reference report recognition device in one embodiment;
Fig. 8 is a structural block diagram of the credit reference report recognition device in another embodiment;
Fig. 9 is a structural block diagram of the credit reference report recognition device in yet another embodiment.
Detailed description
To make the objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. Many specific details are set out in the following description to provide a full understanding of the present invention. However, the present invention can be implemented in many ways other than those described here, and those skilled in the art can make similar improvements without departing from the spirit of the present invention; the present invention is therefore not limited to the specific embodiments disclosed below.
Fig. 1 is a diagram of the application environment of the credit reference report recognition method in one embodiment. As shown in Fig. 1, the application environment includes a terminal 110 and a server 120, which communicate with each other over a network.
The terminal 110 may be, but is not limited to, a notebook computer, a desktop computer, a personal digital assistant or a portable laptop computer. The terminal 110 obtains the photocopy file of a credit reference report through the server 120 and checks the obtained file to confirm that it was obtained successfully. After the terminal 110 recognizes the report using OCR technology to obtain a recognition result and detects the accuracy of that result, it can output the recognition result in a fixed format, store the output results in batches, and upload the stored results to the server 120 for storage.
In one embodiment, a computer device is provided, which may be the terminal 110. The internal structure of the terminal 110 of Fig. 1 is shown in Fig. 2: the terminal 110 includes a processor, a storage medium, a memory, a display and a network interface connected through a system bus. The storage medium of the terminal 110 stores an operating system and a database, as well as a computer program for implementing the credit reference report recognition method and device. The processor provides computing and control capability and supports the operation of the entire terminal 110. The display of the terminal 110 is used to show information; for example, when acquisition of the photocopy data of a credit reference report fails, a notification email can be received, and the display shows the received email. The memory provides an environment for running the computer program in the storage medium that implements the method and device. The network interface is used for network communication with the server 120; for example, recognition results output in a fixed format can be stored in batches and uploaded through the network interface to the server 120 for storage. The structure shown in Fig. 2 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the terminal to which the solution is applied; a specific terminal may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a credit reference report recognition method is provided. Taking its application to the terminal in the above application environment as an example, and as shown in Fig. 3, the method includes the following steps:
Step S302: obtain a credit reference report, where the credit reference report is photocopy data containing credit information and carries a unique identifier.
A credit reference report is the main source and basis for credit evaluation in the financial industry. Reports are divided into personal credit reports and enterprise credit reports and are used to query the credit standing of an individual or an enterprise. The terminal obtains the credit reference report from the database through the server. The report obtained here is in a pure image format, and every report carries a unique identifier.
Step S304: recognize the credit reference report using OCR technology to convert the photocopy data into text information, and output the text information as a recognition result.
OCR is the abbreviation of Optical Character Recognition, a computer input technology that converts the text of bills, newspapers, books, manuscripts and other printed matter into image information through optical input such as scanning, and then uses character recognition technology to convert the image information into text that a computer can use.
When the credit reference report is recognized using OCR technology, the text recognition in OCR converts the photocopy data of the report into text information. This text information is the recognition result of applying OCR technology to the report, and the terminal can output it as the recognition result.
Step S306: detect the accuracy of the recognition result.
Accuracy reflects how correctly the credit reference report has been recognized. After the recognition result is output, the terminal can detect the accuracy of the recognition result from the characters it contains.
Step S308: when the accuracy of the recognition result meets a preset condition, output the recognition result.
The preset condition is a preset range of values, determined from the actual requirements for recognition precision and recognition coverage. When the accuracy of the recognition result falls within this range, the result is qualified, meaning the report was recognized accurately enough, and the terminal can output this qualified recognition result.
In summary, a credit reference report is obtained, where the report is photocopy data containing credit information and carries a unique identifier; the report is recognized using OCR technology to convert the photocopy data into text information, and the text information is output as a recognition result; the accuracy of the recognition result is detected; and, when the accuracy meets the preset condition, the recognition result is output. Because OCR technology is used, inspecting and auditing a credit reference report only requires the terminal to operate automatically, without consuming substantial manpower and material resources. Moreover, results produced automatically by the terminal are more accurate, and the information in the report is less likely to be leaked.
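Steps S302 to S308 can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the implementation described in the patent: the names `recognize_report`, `RecognitionResult`, `fake_ocr` and `lowest_confidence` are hypothetical, the OCR engine is replaced by a stand-in function that returns (character, confidence) pairs, and the preset condition is assumed to be a simple lower bound on the accuracy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# (character, confidence) pairs, as a hypothetical OCR engine might return them.
OcrChars = List[Tuple[str, float]]


@dataclass
class RecognitionResult:
    report_id: str   # unique identifier carried by the report
    text: str        # text information converted from the photocopy data
    accuracy: float  # estimated accuracy of the recognition result


def recognize_report(
    report_id: str,
    image: bytes,
    ocr: Callable[[bytes], OcrChars],
    estimate_accuracy: Callable[[OcrChars], float],
    min_accuracy: float = 0.9,  # assumed form of the preset condition
) -> Optional[RecognitionResult]:
    """Recognize one report and return the result only if it is accurate enough."""
    chars = ocr(image)                      # S304: image to text
    text = "".join(ch for ch, _ in chars)
    accuracy = estimate_accuracy(chars)     # S306: detect accuracy
    if accuracy < min_accuracy:             # S308: check the preset condition
        return None
    return RecognitionResult(report_id, text, accuracy)


def fake_ocr(image: bytes) -> OcrChars:
    """Stand-in for a real OCR engine, used only for this example."""
    return [("A", 0.99), ("B", 0.95)]


def lowest_confidence(chars: OcrChars) -> float:
    """One possible accuracy estimate: the weakest character confidence."""
    return min((conf for _, conf in chars), default=0.0)


result = recognize_report("R-001", b"image bytes", fake_ocr, lowest_confidence)
```

Passing the OCR engine and the accuracy estimate in as functions keeps the sketch independent of any particular OCR library.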
In one embodiment, the credit reference report recognition method further includes a process of obtaining the credit reference report which, as shown in Fig. 4, includes the following steps:
Step S402: obtain the unique identifier of the credit reference report.
Every credit reference report carries a unique identifier that distinguishes it from other reports. When obtaining a report, the terminal first obtains the unique identifier of the report.
Step S404: when the unique identifier of the obtained credit reference report is not present in the database, the obtained report is a report that has not been downloaded.
The database stores the unique identifiers of the many credit reference reports that have already been downloaded. After the unique identifier of a report is obtained, it is looked up among the identifiers in the database. If it is found, the obtained report has already been downloaded. Conversely, if the unique identifier of the obtained report is not among the identifiers in the database, the obtained report has not been downloaded.
Step S406: check the log for a record of the unique identifier of the not-yet-downloaded credit reference report; when a log record exists, the report has been obtained successfully.
When the obtained report is one that has not been downloaded, its unique identifier is recorded in the log. The terminal checks the log for this record; if the log record exists, the report has been obtained successfully. When the report has not been obtained successfully, the terminal can pop up a warning message or send an email to report the failure.
By obtaining the unique identifier of the credit reference report, determining whether the obtained report has not yet been downloaded, and then checking whether the report was obtained successfully, the whole process runs without manual operation, which improves the efficiency of obtaining reports. Because the terminal performs these operations, the information is not leaked.
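The checks in steps S402 to S406 can be sketched as below. This is a minimal illustration under assumptions: the database of downloaded identifiers is modelled as an in-memory set, the log as a list of text lines in which the identifier appears as a whitespace-separated token, and the function names and status strings are hypothetical.

```python
from typing import Iterable, Set


def is_not_downloaded(report_id: str, downloaded_ids: Set[str]) -> bool:
    """S404: a report has not been downloaded if its identifier is not in the database."""
    return report_id not in downloaded_ids


def has_log_record(report_id: str, log_lines: Iterable[str]) -> bool:
    """S406: look for the identifier as a whole token in any log line."""
    return any(report_id in line.split() for line in log_lines)


def acquire(report_id: str, downloaded_ids: Set[str], log_lines: Iterable[str]) -> str:
    """Return a hypothetical status string for one report identifier."""
    if not is_not_downloaded(report_id, downloaded_ids):
        return "already_downloaded"
    if has_log_record(report_id, log_lines):
        return "acquired"
    # Here the terminal would show a warning or send an email.
    return "failed"


downloaded = {"R-001"}
log = ["downloaded R-002"]
statuses = [acquire(rid, downloaded, log) for rid in ("R-001", "R-002", "R-003")]
```

Matching whole tokens rather than substrings avoids treating `R-1` as present when the log only mentions `R-10`.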
In one embodiment, the credit reference report recognition method further includes a process of converting the photocopy data into text information, which specifically includes: obtaining the category of the credit reference report, obtaining the corresponding preset OCR recognition template according to the category, and recognizing the report according to the preset OCR recognition template to convert the photocopy data into text information.
The categories of credit reference reports include a personal basic information category, a credit transaction information category and an other information category. Because the information involved in each category is different, the preset OCR recognition templates can include a personal basic information template, a credit transaction information template and an other information template. With the preset OCR recognition template and OCR recognition technology, the photocopy data of the report can be converted into text information.
Converting photocopy data into text information in this way, using a category-specific template together with OCR recognition technology, improves the efficiency of recognizing credit reference reports and avoids consuming substantial manpower and material resources.
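The mapping from report category to preset template can be illustrated with a simple lookup. The three categories come from the text above; the enum, the template names and the function `select_template` are hypothetical and only stand in for whatever template objects an implementation would use.

```python
from enum import Enum


class ReportCategory(Enum):
    PERSONAL_BASIC_INFO = "personal_basic_info"
    CREDIT_TRANSACTION_INFO = "credit_transaction_info"
    OTHER_INFO = "other_info"


# Hypothetical template names; the patent does not name its templates.
PRESET_TEMPLATES = {
    ReportCategory.PERSONAL_BASIC_INFO: "personal_basic_info_template",
    ReportCategory.CREDIT_TRANSACTION_INFO: "credit_transaction_info_template",
    ReportCategory.OTHER_INFO: "other_info_template",
}


def select_template(category: ReportCategory) -> str:
    """Return the preset OCR recognition template for a report category."""
    return PRESET_TEMPLATES[category]


template_name = select_template(ReportCategory.CREDIT_TRANSACTION_INFO)
```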
In one embodiment, as shown in Fig. 5, the credit reference report recognition method further includes a process of setting templates, with the following steps:
Step S502: obtain credit reference report samples and classify the samples.
The terminal can obtain samples of credit reference reports from the server and classify them by report category. For example, the reports can be divided into personal basic information reports, credit transaction information reports and other information reports.
Step S504: set a corresponding OCR recognition template for each category, and set template locating characters, character dependencies and a recognition result output structure in the OCR recognition template.
Different report categories correspond to different OCR recognition templates, and each template can define template locating characters, character dependencies and a recognition result output structure. Template locating characters are used to determine the positions of characters in the template, so that characters are output at specific positions when the recognition result is output. Character dependencies are the contextual relationships between characters, so that characters are output in a specific order when the recognition result is output. Together, template locating characters and character dependencies allow different OCR recognition templates to be set according to character position and context, which makes the recognition result easier to output. The recognition result output structure defines the data structure in which the recognition result is output; for example, the recognition result is output according to the structure of the OCR recognition template.
By obtaining credit reference report samples, classifying them, setting a corresponding OCR recognition template for each category, and setting template locating characters, character dependencies and a recognition result output structure in each template, every report category has a matching OCR recognition template. These templates only need to be set once per category, which improves the efficiency of report recognition.
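One way to picture the three template settings is as a small data structure. This is a sketch under assumptions, since the patent does not specify how templates are stored: the class `OcrTemplate`, its field types (page coordinates for locating characters, ordered pairs for dependencies, an ordered list of field names for the output structure) and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class OcrTemplate:
    category: str
    # Template locating characters: label text mapped to an assumed (x, y) page position.
    locating_chars: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # Character dependencies: (earlier, later) pairs describing output order.
    dependencies: List[Tuple[str, str]] = field(default_factory=list)
    # Recognition result output structure: ordered field names.
    output_fields: List[str] = field(default_factory=list)

    def format_output(self, recognized: Dict[str, str]) -> Dict[str, str]:
        """Arrange recognized values in the order given by the output structure."""
        return {name: recognized.get(name, "") for name in self.output_fields}


template = OcrTemplate(
    category="personal_basic_info",
    locating_chars={"Name": (40, 120), "ID number": (40, 160)},
    dependencies=[("Name", "ID number")],
    output_fields=["Name", "ID number"],
)

# Values arrive in arbitrary order and leave in template order.
formatted = template.format_output({"ID number": "X", "Name": "Zhang San"})
```

Because the output structure is stored with the template, a template defined once per category can be reused for every report of that category.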
In one embodiment, the credit reference report recognition method further includes a process of detecting the accuracy of the recognition result which, as shown in Fig. 6, includes the following steps:
Step S602: calculate the confidence of the characters in the recognition result.
Confidence, also called reliability, is the probability that an estimate falls within a given permitted error range of the corresponding population parameter. The terminal can calculate the confidence corresponding to the characters in the recognition result.
Step S604: derive the accuracy of the recognition result from the confidence of the characters.
The confidence of a character is a probability, and the terminal can derive the accuracy of the recognition result from this probability.
By calculating the confidence of the characters in the recognition result and deriving the accuracy of the recognition result from it, the terminal obtains an accuracy value for each result. Because the position of each character is relatively fixed, using confidence can make the recognition result more accurate.
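The text does not say how character confidences are combined into an accuracy value. The sketch below assumes the simplest choice, the arithmetic mean of the per-character confidences, and adds a helper that lists the characters most likely to be wrong; both function names and the 0.8 threshold are hypothetical.

```python
from typing import List, Tuple

CharConfidence = Tuple[str, float]  # (character, confidence in [0, 1])


def accuracy_from_confidence(chars: List[CharConfidence]) -> float:
    """Assumed aggregation: mean confidence over all recognized characters."""
    if not chars:
        return 0.0
    return sum(conf for _, conf in chars) / len(chars)


def low_confidence_chars(
    chars: List[CharConfidence], threshold: float = 0.8
) -> List[Tuple[int, str]]:
    """Return (index, character) for characters below the confidence threshold."""
    return [(i, ch) for i, (ch, conf) in enumerate(chars) if conf < threshold]


chars = [("2", 0.98), ("0", 0.96), ("1", 0.70), ("7", 0.96)]
accuracy = accuracy_from_confidence(chars)   # about 0.90
suspects = low_confidence_chars(chars)       # the "1" at index 2
```

Other aggregations, such as the minimum confidence or a weighted mean, would fit the same description; the mean is used here only because it is the most direct reading.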
In one embodiment, the credit reference report recognition method further includes outputting the recognition result when its accuracy meets the preset condition, which specifically includes: when the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result output structure.
The preset character precision has a certain value range, determined from the actual requirements for recognition precision and recognition coverage. When the accuracy of the recognition result reaches the preset character precision range, the recognition result obtained using the template locating characters and character dependencies is output according to the recognition result output structure.
Checking the accuracy of the recognition result and outputting it only when the required precision is reached improves the accuracy of the output results.
In one embodiment, the credit reference report recognition method further includes: when the number of output recognition results reaches a set quantity, storing the recognition results in a batch.
When there are multiple output recognition results, the terminal can store them in batches according to a predetermined quantity. For example, each time the number of recognition results reaches 20, the results are stored in a batch. The terminal can also store recognition results in batches by number of days; for example, a batch storage operation is performed every other day.
Storing recognition results only after a certain number have accumulated reduces the network communication load and improves efficiency.
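Quantity-based batch storage can be sketched with a small buffer that writes out whenever it holds the set number of results. The class `BatchStore` and its callback are hypothetical; the batch size of 20 follows the example in the text, and the actual database write is represented by a function supplied by the caller.

```python
from typing import Callable, List


class BatchStore:
    """Buffer recognition results and store them in batches."""

    def __init__(self, store: Callable[[List[dict]], None], batch_size: int = 20):
        self._store = store            # hypothetical function that writes one batch
        self._batch_size = batch_size
        self._pending: List[dict] = []

    def add(self, result: dict) -> None:
        """Queue one result and store a batch once the set quantity is reached."""
        self._pending.append(result)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Store whatever is pending, for example from a day-based schedule."""
        if self._pending:
            self._store(list(self._pending))
            self._pending.clear()


stored_batches: List[List[dict]] = []
batch_store = BatchStore(stored_batches.append, batch_size=20)
for i in range(45):
    batch_store.add({"report_id": f"R-{i:03d}"})
batch_store.flush()  # store the remaining 5 results
```

The day-based variant mentioned in the text would call `flush()` from a scheduler instead of relying on the quantity check.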
In one embodiment, a credit reference report recognition method is provided, and the specific steps for implementing the method are as follows:
First, the terminal obtains credit reference report samples and classifies them. The terminal can obtain the samples from the server and classify them by report category. A corresponding OCR recognition template is then set for each category, and template locating characters, character dependencies and a recognition result output structure are set in the template. Different report categories correspond to different OCR recognition templates. Template locating characters and character dependencies allow different templates to be set according to character position and context, which makes the recognition result easier to output. These templates only need to be made once.
Next, the terminal obtains the unique identifier of a credit reference report. When the unique identifier of the obtained report is not present in the database, the obtained report is one that has not been downloaded. The terminal then checks the log for a record of the unique identifier of the not-yet-downloaded report; when a log record exists, the report has been obtained successfully.
After the report has been obtained successfully, the category of the report is obtained, the corresponding preset OCR recognition template is obtained according to the category, and the report is recognized according to the preset template to convert the photocopy data into text information. Because the information involved in each report category is different, the preset OCR recognition templates can include a personal basic information template, a credit transaction information template and an other information template. With the preset OCR recognition template and OCR recognition technology, the photocopy data of the report can be converted into text information.
Then the accuracy of the recognition result is detected. This specifically includes calculating the confidence of the characters in the recognition result and deriving the accuracy of the recognition result from that confidence. The terminal calculates the confidence corresponding to the characters in the recognition result and derives the accuracy of the recognition result from this probability.
When the accuracy of the recognition result meets the preset condition, the recognition result is output. The preset condition is a preset range of values, determined from the actual requirements for recognition precision and recognition coverage. When the accuracy of the recognition result falls within this range, the result is qualified and the terminal can output it. Specifically, when the accuracy of the recognition result reaches the preset character precision, the recognition result is output according to the recognition result output structure. The preset character precision has a certain value range, determined from the actual requirements for recognition precision and recognition coverage. When the accuracy reaches the preset character precision range, the recognition result obtained using the template locating characters and character dependencies is output according to the recognition result output structure. When the number of output recognition results reaches the set quantity, the results are stored in a batch. When there are multiple output recognition results, the terminal can store them in batches according to a predetermined quantity; for example, each time the number of recognition results reaches 20, the results are stored in a batch. The terminal can also store recognition results in batches by number of days; for example, a batch storage operation is performed every two days.
As shown in Fig. 7, in one embodiment, a reference report recognition device is provided, including:
Report acquisition module 710, configured to obtain a reference report, the reference report being photocopy data containing credit information and carrying a unique identifier.
Information conversion module 720, configured to recognize the reference report using OCR technology so as to convert the photocopy data into text information, and to output the text information as a recognition result.
Result detection module 730, configured to detect the accuracy of the recognition result.
Result output module 740, configured to output the recognition result when the accuracy of the recognition result meets a preset condition.
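As a rough sketch of how the four modules above might be chained, the function below takes each module as a callable. The function name `recognize_report`, the callable signatures, and the single lower threshold `min_accuracy` are all assumptions made for illustration; the source specifies only the modules' responsibilities.

```python
from typing import Callable, Optional


def recognize_report(report_id: str,
                     fetch: Callable[[str], bytes],
                     ocr: Callable[[bytes], str],
                     score: Callable[[str], float],
                     min_accuracy: float) -> Optional[str]:
    """Run acquisition, conversion, detection, and output in sequence."""
    image = fetch(report_id)   # report acquisition module (710)
    text = ocr(image)          # information conversion module (720)
    accuracy = score(text)     # result detection module (730)
    # Result output module (740): output only when the condition is met.
    return text if accuracy >= min_accuracy else None


# Example with stand-in callables; a real system would plug in an
# actual download step, OCR engine, and accuracy estimator.
demo = recognize_report(
    "report-001",
    fetch=lambda rid: b"scanned-bytes",
    ocr=lambda img: "recognized text",
    score=lambda text: 0.95,
    min_accuracy=0.9,
)
```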
In one embodiment, the report acquisition module 710 is configured to obtain the unique identifier of a reference report. When the unique identifier of the obtained reference report does not exist in the database, the obtained reference report is a reference report that has not yet been downloaded. The module then checks the log for a record of the unique identifier of that reference report; when a log record exists, the reference report has been acquired successfully.
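The acquisition check described above can be sketched with two sets of identifiers standing in for the database and the log. The names `AcquisitionStatus` and `check_acquisition` are hypothetical, and the handling of an identifier that is already in the database (reported here as `ALREADY_STORED`) is an assumption, since the source describes only the not-yet-downloaded case.

```python
from enum import Enum
from typing import AbstractSet


class AcquisitionStatus(Enum):
    ALREADY_STORED = "already_stored"  # identifier found in the database
    ACQUIRED = "acquired"              # new report with a log record
    NOT_LOGGED = "not_logged"          # new report without a log record


def check_acquisition(report_id: str,
                      stored_ids: AbstractSet[str],
                      logged_ids: AbstractSet[str]) -> AcquisitionStatus:
    """Classify a report by its unique identifier."""
    if report_id in stored_ids:
        # Assumption: a known identifier means the report was handled before.
        return AcquisitionStatus.ALREADY_STORED
    if report_id in logged_ids:
        # Not in the database but present in the log: acquisition succeeded.
        return AcquisitionStatus.ACQUIRED
    return AcquisitionStatus.NOT_LOGGED


status = check_acquisition("r2", stored_ids={"r1"}, logged_ids={"r2"})
```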
In one embodiment, the information conversion module 720 is configured to obtain the category of the reference report, obtain the corresponding preset OCR recognition template according to the category, and recognize the reference report according to the preset OCR recognition template so as to convert the photocopy data into text information.
As shown in Fig. 8, in one embodiment, the reference report recognition device further includes:
Sample acquisition module 750, configured to obtain reference report samples and classify the reference report samples.
Template setting module 760, configured to set a corresponding OCR recognition template for each category, and to set template locating characters, character dependencies, and a recognition result output structure in the OCR recognition template.
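One way to picture a per-category OCR recognition template is as a small record held in a registry keyed by category. The sketch below is illustrative only: `OcrTemplate`, `register_template`, and `template_for` are hypothetical names, and modeling character dependencies as a mapping from a locating label to the field that follows it is an assumption, because the source does not define their form.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OcrTemplate:
    category: str
    locating_characters: List[str]          # anchor text used to locate fields
    character_dependencies: Dict[str, str]  # assumed: label -> dependent field
    output_structure: List[str]             # ordered field names for output


_TEMPLATES: Dict[str, OcrTemplate] = {}


def register_template(template: OcrTemplate) -> None:
    """Store the template under its report category."""
    _TEMPLATES[template.category] = template


def template_for(category: str) -> OcrTemplate:
    """Look up the preset template for a report category."""
    try:
        return _TEMPLATES[category]
    except KeyError:
        raise LookupError(f"no OCR template for category {category!r}") from None


# Example registration for a made-up "personal" report category.
register_template(OcrTemplate(
    category="personal",
    locating_characters=["Name", "ID number"],
    character_dependencies={"Name": "name", "ID number": "id_number"},
    output_structure=["name", "id_number"],
))
```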
In one embodiment, the result detection module 730 is configured to calculate the confidence of each character in the recognition result and to derive the accuracy of the recognition result from the character confidences.
In one embodiment, the result output module 740 is configured to output the recognition result according to the recognition result output structure when the accuracy of the recognition result reaches the preset character precision.
As shown in Fig. 9, in one embodiment, the reference report recognition device further includes:
Result storage module 770, configured to store the recognition results in batches when the output recognition results meet a set requirement.
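Batch storage by count or by elapsed days could be sketched as a small buffer that flushes to a storage callback. This is a sketch under assumptions: `BatchStore` and its parameters are hypothetical names, the defaults of 20 results and two days mirror the examples given in the description, and combining both triggers in one class is a choice made here, whereas the description presents them as alternatives. The age check runs only when a new result is added.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional


class BatchStore:
    """Buffer recognition results and store them in batches."""

    def __init__(self,
                 sink: Callable[[List[str]], None],
                 max_count: int = 20,
                 max_age: timedelta = timedelta(days=2)) -> None:
        self.sink = sink              # receives each completed batch
        self.max_count = max_count
        self.max_age = max_age
        self._pending: List[str] = []
        self._batch_started: Optional[datetime] = None

    def add(self, result: str, now: Optional[datetime] = None) -> bool:
        """Add one result; return True if this call triggered a flush."""
        now = now or datetime.now()
        if not self._pending:
            self._batch_started = now  # first result opens a new batch
        self._pending.append(result)
        too_many = len(self._pending) >= self.max_count
        too_old = now - self._batch_started >= self.max_age
        if too_many or too_old:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Send all pending results to the sink as one batch."""
        if self._pending:
            self.sink(list(self._pending))
            self._pending.clear()
```

A caller would pass a function that writes a list of results to the database as `sink`, and could call `flush()` explicitly at shutdown so that a partially filled batch is not lost.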
In one embodiment, a computer-readable storage medium is also provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the reference report recognition method in each of the above embodiments.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium; in the embodiments of the present invention, the program can be stored in a non-volatile storage medium of a computer system and executed by at least one processor in the computer system to implement the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims (10)

1. A reference report recognition method, characterized in that the method comprises:
obtaining a reference report, the reference report being photocopy data containing credit information, the reference report carrying a unique identifier;
recognizing the reference report using OCR technology so as to convert the photocopy data into text information, and outputting the text information as a recognition result;
detecting the accuracy of the recognition result;
outputting the recognition result when the accuracy of the recognition result meets a preset condition.
2. The method according to claim 1, characterized in that obtaining the reference report comprises:
obtaining the unique identifier of the reference report;
when the unique identifier of the obtained reference report does not exist in the database, determining that the obtained reference report is a reference report that has not been downloaded;
checking the log for a record of the unique identifier of the reference report that has not been downloaded, and when a log record exists, determining that the reference report has been acquired successfully.
3. The method according to claim 1, characterized in that recognizing the reference report using OCR technology so as to convert the photocopy data into text information comprises:
obtaining the category of the reference report, obtaining the corresponding preset OCR recognition template according to the category, and recognizing the reference report according to the preset OCR recognition template so as to convert the photocopy data into text information.
4. The method according to claim 3, characterized in that, before obtaining the reference report, the method further comprises:
obtaining reference report samples, and classifying the reference report samples;
setting a corresponding OCR recognition template according to the category, and setting template locating characters, character dependencies, and a recognition result output structure in the OCR recognition template.
5. The method according to claim 1, characterized in that detecting the accuracy of the recognition result comprises:
calculating the confidence of the characters in the recognition result;
deriving the accuracy of the recognition result from the confidence of the characters.
6. The method according to claim 5, characterized in that outputting the recognition result when the accuracy of the recognition result meets the preset condition comprises:
when the accuracy of the recognition result reaches a preset character precision, outputting the recognition result according to the recognition result output structure.
7. The method according to claim 1, characterized in that, after outputting the recognition result, the method further comprises:
when the output recognition results meet a set requirement, storing the recognition results in batches.
8. A reference report recognition device, characterized in that the device comprises:
a report acquisition module, configured to obtain a reference report, the reference report being photocopy data containing credit information, the reference report carrying a unique identifier;
an information conversion module, configured to recognize the reference report using OCR technology so as to convert the photocopy data into text information, and to output the text information as a recognition result;
a result detection module, configured to detect the accuracy of the recognition result;
a result output module, configured to output the recognition result when the accuracy of the recognition result meets a preset condition.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711020665.9A CN107958204A (en) 2017-10-27 2017-10-27 Reference report recognition methods, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711020665.9A CN107958204A (en) 2017-10-27 2017-10-27 Reference report recognition methods, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN107958204A true CN107958204A (en) 2018-04-24

Family

ID=61964053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711020665.9A Pending CN107958204A (en) 2017-10-27 2017-10-27 Reference report recognition methods, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN107958204A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180424