CN103246953A - Document audit method - Google Patents

Info

Publication number
CN103246953A
Authority
CN
China
Prior art keywords
document
current
numbering
business
bar code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101486845A
Other languages
Chinese (zh)
Inventor
杨嘉琛
许龙胜
杨柳
王斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN2013101486845A
Publication of CN103246953A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of computer information management and relates to a document audit method. The method comprises the following steps. When document business is handled at the front desk, an electronic document form is generated from a file template chosen according to the business need and business category; a bar code containing the relevant information is merged into the generated form at a reserved position; and a black horizontal discriminant line running across the whole page and a vertical line segment are generated in a designated area of each page of the form, with the intersection of the two lines serving as the origin of coordinates. Before the generated document is printed, its serial number and the recognition position of each data form are extracted and stored on the server. In the back office, the filled-in paper document is scanned, and the image of each paper document is analyzed and its information extracted. The invention improves the degree of automation and the reconciliation (tick-matching) efficiency of document audit.

Description

A document audit method
Technical field
The invention belongs to the technical field of computer information management and relates to a document audit method.
Background art
In recent years, with social progress, the variety and volume of documents have grown rapidly, whether they are bank documents, government documents for public or private business, or documents produced inside large enterprises when handling various kinds of business. In the traditional business document audit workflow, operators usually work manually, re-entering the business vouchers produced at the front desk into a computer in order to verify the accounting of the transaction journal.
Existing automated document audit systems use scanning to automate the whole process of collecting, processing, storing and retrieving business voucher images and to make those images electronic. They combine automatic OCR recognition with manual supplementary entry to capture audit data from the vouchers and to build an index. The document images and the accounting data are stored on mass storage devices, forming a document archive storage and query system that covers all kinds of business and automatically reconciles the transaction journal of the integrated accounting system against the key elements of the vouchers.
However, such document audit systems have certain limitations:
1. Because there are many kinds of documents and each kind has a fixed layout, the documents must be sorted by type before back-office scanning and then scanned and entered batch by batch, which wastes labor and time.
2. The OCR recognition regions used during entry must be set manually at fixed positions, which is inefficient and cannot adapt automatically.
3. The paper documents that link the front-desk and back-office systems carry no additional information for checking and querying.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art and to provide a method that improves the efficiency of document reconciliation. The technical scheme of the invention is as follows:
A document audit method comprises the following steps:
(1) When document business is handled at the front desk, an electronic document form is generated from a file template selected by business category according to the business need. A bar code containing the relevant information is merged into the generated form at its reserved position. In a designated area of each page of the form, a black horizontal discriminant line running across the whole page and a vertical line segment are generated, and the intersection of the two lines is taken as the origin of coordinates;
(2) Before the generated document is printed, the serial number of the current document and the recognition position of each data form are extracted and stored on the server;
(3) In the back office, the filled-in paper document is scanned, and the image of each paper document is analyzed and its information extracted as follows:
1) First, the black horizontal discriminant line and the vertical line segment are identified in the image file of the current paper document, and their intersection is taken as the new origin of coordinates;
2) The tilt angle of the black horizontal discriminant line relative to the current scan background is calculated;
3) From the new origin and the tilt angle, together with the original bar code position, the position and extent of the current bar code are calculated; bar code recognition is performed on this area and the serial number of the document is extracted;
4) Using the serial number extracted from the bar code, the position information of every data form of the document is retrieved from the database;
5) The position and area of each form in the current coordinates are calculated;
6) The text in each form is recognized and extracted and, via the current document serial number, associated with and reconciled against the business data recorded during the front-desk operation.
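Steps 1) and 2) can be illustrated with a minimal sketch. It assumes the scan has already been binarized into rows of 0/1 pixels (1 = dark) and measures the tilt from the center of the horizontal line in two sample columns. The function names, the `min_run` threshold and the choice of sample columns are illustrative assumptions; the source does not say which image recognition functions it relies on, and locating the vertical segment is omitted here.

```python
import math

def line_center_in_column(img, x, min_run=3):
    """Return the vertical center of the first dark run of at least
    min_run pixels in column x, or None if no such run exists."""
    run_start = None
    for y in range(len(img) + 1):
        dark = y < len(img) and img[y][x] == 1
        if dark and run_start is None:
            run_start = y
        elif not dark and run_start is not None:
            if y - run_start >= min_run:
                return (run_start + y - 1) / 2.0
            run_start = None  # run too short: treat it as noise
    return None

def tilt_angle(img, x_left, x_right):
    """Tilt (radians) of the horizontal line, measured between two columns.
    Positive means the line drops toward the right (image y grows downward)."""
    y_left = line_center_in_column(img, x_left)
    y_right = line_center_in_column(img, x_right)
    if y_left is None or y_right is None:
        raise ValueError("horizontal discriminant line not found")
    return math.atan2(y_right - y_left, x_right - x_left)

# Hypothetical usage: a 100 x 40 scan whose 4-pixel-thick line drops
# one pixel every ten columns.
scan = [[0] * 100 for _ in range(40)]
for x in range(100):
    for dy in range(4):
        scan[10 + x // 10 + dy][x] = 1
angle = tilt_angle(scan, 10, 90)
```

Sampling only two columns keeps the sketch short; a production system would more likely fit a line through many samples to reduce the effect of noise.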
In a preferred embodiment, the bar code integrates information such as the document type number, document serial number, document operator number, document operating unit number, document storage classification number and operation time number.
The invention combines the front-desk business platform with the back-office scanning and audit platform. Through unified, interrelated information, the front-desk business data of each document are reconciled efficiently with the data captured in the back office. In common document audit systems the front desk and the back office are decoupled, so the back-office audit system cannot make full use of the data and information already held by the front desk. In the invention, by contrast, the front-desk system automatically generates the document data forms from a template according to the business type and saves all form positions and area information, together with the current document serial number, in the database. At the same time, a bar code containing the document serial number is generated on the electronic document. After scanning the document and reading the bar code, the back office automatically associates it with the front-desk transaction record and reconciles the two, which greatly improves the automation and reconciliation efficiency of document audit.
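The association by serial number can be sketched as a simple lookup and field-by-field comparison. This is only an illustration under stated assumptions: the record layout, the field names and the `reconcile` function are hypothetical, and the source does not specify how mismatches are reported.

```python
def reconcile(serial, extracted, front_records):
    """Compare fields extracted from a scan with the front-desk record
    stored under the same document serial number."""
    expected = front_records.get(serial)
    if expected is None:
        # No front-desk transaction carries this serial number.
        return {"serial": serial, "found": False, "mismatches": {}}
    mismatches = {
        name: {"front": value, "scan": extracted.get(name)}
        for name, value in expected.items()
        if extracted.get(name) != value
    }
    return {"serial": serial, "found": True, "mismatches": mismatches}

# Hypothetical front-desk records keyed by serial number.
front = {"0000000000001": {"name": "Zhang San", "amount": "1200.00"}}
ok = reconcile("0000000000001",
               {"name": "Zhang San", "amount": "1200.00"}, front)
bad = reconcile("0000000000001",
                {"name": "Zhang San", "amount": "1280.00"}, front)
```

An empty `mismatches` dictionary with `found` set to true corresponds to a document that reconciles cleanly; anything else would be routed to manual review.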
Description of drawings
Fig. 1 shows the overall flow structure of the system according to the invention.
Fig. 2 is a schematic diagram of a document before and after scanning.
Embodiment
The invention is described below with reference to the drawings and an embodiment.
Referring to Fig. 1, the invention is implemented in the following steps:
1. Efficient document audit starts with how the document is created: a standardized workflow and standardized forms make back-office scanning, information extraction and archive management easier. The front-desk document handling system therefore generates the document forms automatically from a file template selected by business category according to the business need, for example forms for name, company name and amount.
2. After the electronic document form is generated, and to make the printed paper document easier to store and manage, the system automatically merges a generated bar code into the electronic document at the reserved position. The bar code integrates information such as the document type number, document serial number, document operator number, document operating unit number, document storage classification number and operation time number. The code used in the invention has 13 digits. After printing, the document can be tracked or archived by scanning its bar code. Each document serial number is unique, and the system can retrieve the information of each document by its serial number.
3. The text of every form in the document is laid out below the bar code position. The system reserves a fixed position for the bar code before the document is generated.
The invention places the upper-right vertex of the bar code at a distance of (15 pixels, 50 pixels) from the upper-right vertex of the page. The bar code size is fixed at 70 pixels high and 452 pixels wide. The encoding scheme is Code 39.
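The reserved bar code area and the 13-character payload can be sketched as follows. The sketch assumes that the offset (15, 50) means 15 pixels in from the right edge and 50 pixels down from the top edge, which the source does not state explicitly; the function names and the page width in the usage lines are hypothetical. The character set and the `*` start/stop character are those of standard Code 39.

```python
# The 43 data characters of standard Code 39.
CODE39_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%")

BAR_WIDTH, BAR_HEIGHT = 452, 70      # fixed size given in the description
OFFSET_RIGHT, OFFSET_TOP = 15, 50    # assumed meaning of (15 px, 50 px)

def barcode_rect(page_width):
    """Return (left, top, right, bottom) of the reserved area in pixels."""
    right = page_width - OFFSET_RIGHT
    return (right - BAR_WIDTH, OFFSET_TOP, right, OFFSET_TOP + BAR_HEIGHT)

def code39_payload(code):
    """Check a 13-character code and wrap it in Code 39 start/stop marks."""
    if len(code) != 13:
        raise ValueError("the code must have exactly 13 characters")
    invalid = sorted({c for c in code if c not in CODE39_CHARS})
    if invalid:
        raise ValueError("characters not encodable in Code 39: %r" % invalid)
    return "*" + code + "*"

# Hypothetical usage for a page 1240 pixels wide.
rect = barcode_rect(1240)
payload = code39_payload("0120130425001")
```

How the six numbers listed in step 2 are packed into the 13 positions is not given in the source, so the sketch treats the code as an opaque string.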
4. When the back office recognizes the text in each region of the scanned electronic image, documents that were fed into the scanner at an angle must be corrected so that the system can run at high speed without interruption. When the front desk generates the current business document, the system therefore generates a black horizontal discriminant line in a designated area of the document page, which the back office uses to level tilted documents during scanning. To make back-office recognition easy, the invention places this horizontal discriminant line starting 25 pixels from the top edge of the page, 10 pixels thick, running across the full page from left to right.
5. To make it easy to locate each recognition region, the system generates an origin of coordinates in the upper-left corner of the document page, formed by the intersection of a vertical line segment with the black horizontal discriminant line. To make recognition by the back-office system easy, the vertical line segment is placed starting 20 pixels from the left edge of the page, with a height of 80 pixels and a width of 5 pixels. The intersection of the two lines is the origin of coordinates (ox1, oy1).
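The two reference marks can be rendered on a blank page as in the sketch below, which uses the pixel dimensions given above. Two points are assumptions rather than facts from the source: the vertical segment is taken to start at the top edge of the page (`v_top=0`), and the origin (ox1, oy1) is taken to be the top-left corner of the area where the two marks overlap. The function name is hypothetical.

```python
H_TOP, H_THICK = 25, 10                   # horizontal discriminant line
V_LEFT, V_WIDTH, V_HEIGHT = 20, 5, 80     # vertical line segment

def draw_marks(page_w, page_h, v_top=0):
    """Return (page, origin): a 0/1 raster with both marks drawn, and the
    assumed origin (ox1, oy1)."""
    page = [[0] * page_w for _ in range(page_h)]
    # Horizontal line across the full page width.
    for y in range(H_TOP, H_TOP + H_THICK):
        for x in range(page_w):
            page[y][x] = 1
    # Vertical segment near the left edge; v_top is an assumption.
    for y in range(v_top, v_top + V_HEIGHT):
        for x in range(V_LEFT, V_LEFT + V_WIDTH):
            page[y][x] = 1
    origin = (V_LEFT, H_TOP)  # top-left corner of the overlap (assumed)
    return page, origin

page, origin = draw_marks(200, 120)
```

Any fixed convention for the origin works, provided the back office applies the same convention when it locates (ox2, oy2) on the scan.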
6. Before the generated electronic document is printed, the system extracts the serial number of the current document and the recognition position of each data form and stores them on the server. The position information is stored as pixel coordinates with (ox1, oy1) as the origin of coordinates.
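Storing the form positions under the serial number might look like the following sketch. It uses an in-memory SQLite database as a stand-in for the server; the table layout, the JSON encoding and the function names are illustrative assumptions, since the source does not describe the storage format beyond pixel coordinates relative to (ox1, oy1).

```python
import json
import sqlite3

def open_store():
    """Create an in-memory stand-in for the server-side database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE layout (serial TEXT PRIMARY KEY, fields TEXT NOT NULL)"
    )
    return conn

def save_layout(conn, serial, fields, origin):
    """Store each form rectangle (x, y, w, h) relative to the origin."""
    ox1, oy1 = origin
    relative = {
        name: [x - ox1, y - oy1, w, h]
        for name, (x, y, w, h) in fields.items()
    }
    conn.execute("INSERT INTO layout VALUES (?, ?)",
                 (serial, json.dumps(relative)))
    conn.commit()

def load_layout(conn, serial):
    """Return the stored rectangles for a serial number, or None."""
    row = conn.execute("SELECT fields FROM layout WHERE serial = ?",
                       (serial,)).fetchone()
    return json.loads(row[0]) if row else None

# Hypothetical usage: one "amount" form on a page whose origin is (20, 25).
conn = open_store()
save_layout(conn, "0120130425001", {"amount": (120, 225, 200, 30)}, (20, 25))
stored = load_layout(conn, "0120130425001")
```

Keeping the rectangles relative to the printed origin is what lets the back office reuse them on a shifted or tilted scan.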
7. The back-office system scans the paper documents at high speed, then analyzes the image of each document and extracts its information:
1) Using mature image recognition functions, the system first identifies the black horizontal discriminant line in the current document image file, then identifies the vertical line segment, and takes the intersection of the two lines as the new origin of coordinates (ox2, oy2);
2) The system automatically calculates the tilt angle a between the horizontal reference line and the current scan background;
3) From the new origin (ox2, oy2), the horizontal tilt angle a and the original bar code position, the position and extent of the current bar code are calculated; the system performs bar code recognition on this area and extracts the serial number of the document;
4) Using the serial number extracted from the bar code, the system retrieves from the database the position information of every data form of the document;
5) The retrieved position information gives the original position of each data form relative to the origin (ox1, oy1). Combining it with the new origin (ox2, oy2) of the current scan and the horizontal tilt angle a, the system calculates the position and area of each form in the current coordinates. It then recognizes and extracts the text in each form and, via the current document serial number, associates it with and reconciles it against the business data recorded during the front-desk operation.

Claims (2)

1. A document audit method, comprising the following steps:
(1) When document business is handled at the front desk, an electronic document form is generated from a file template selected by business category according to the business need. A bar code containing the relevant information is merged into the generated form at its reserved position. In a designated area of each page of the form, a black horizontal discriminant line running across the whole page and a vertical line segment are generated, and the intersection of the two lines is taken as the origin of coordinates;
(2) Before the generated document is printed, the serial number of the current document and the recognition position of each data form are extracted and stored on the server;
(3) In the back office, the filled-in paper document is scanned, and the image of each paper document is analyzed and its information extracted as follows:
1) First, the black horizontal discriminant line and the vertical line segment are identified in the image file of the current paper document, and their intersection is taken as the new origin of coordinates;
2) The tilt angle of the black horizontal discriminant line relative to the current scan background is calculated;
3) From the new origin and the tilt angle, together with the original bar code position, the position and extent of the current bar code are calculated; bar code recognition is performed on this area and the serial number of the document is extracted;
4) Using the serial number extracted from the bar code, the position information of every data form of the document is retrieved from the database;
5) The position and area of each form in the current coordinates are calculated;
6) The text in each form is recognized and extracted and, via the current document serial number, associated with and reconciled against the business data recorded during the front-desk operation.
2. The document audit method according to claim 1, characterized in that the bar code integrates information such as the document type number, document serial number, document operator number, document operating unit number, document storage classification number and operation time number.
CN2013101486845A 2013-04-25 2013-04-25 Document audit method Pending CN103246953A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101486845A CN103246953A (en) 2013-04-25 2013-04-25 Document audit method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013101486845A CN103246953A (en) 2013-04-25 2013-04-25 Document audit method

Publications (1)

Publication Number Publication Date
CN103246953A true CN103246953A (en) 2013-08-14

Family

ID=48926463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013101486845A Pending CN103246953A (en) 2013-04-25 2013-04-25 Document audit method

Country Status (1)

Country Link
CN (1) CN103246953A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004079634A1 (en) * 2003-03-03 2004-09-16 Bradley A W System, method, and apparatus for identifying and authenticating the presence of high value assets at remote locations
CN101059885A (en) * 2006-11-03 2007-10-24 朱杰 A ticket true/false verifying system and method
CN102567764A (en) * 2012-01-13 2012-07-11 中国工商银行股份有限公司 Bill certificate and system for improving electronic image recognition efficiency

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902995A (en) * 2014-03-17 2014-07-02 西安汇龙科技股份有限公司 Method and device for automatically typing in form content
CN103902995B (en) * 2014-03-17 2017-11-07 西安汇龙科技股份有限公司 Table content method for automatically inputting and device
CN104077682A (en) * 2014-06-30 2014-10-01 昆山云景网络科技有限公司 Document data entry method based on OCR and task fragmentization
CN106095826A (en) * 2016-05-31 2016-11-09 杭州云为科技有限公司 A kind of method and system uploading papery document
CN107463868A (en) * 2016-06-02 2017-12-12 阿里巴巴集团控股有限公司 A kind of electronic spreadsheet verification method and device
CN107463868B (en) * 2016-06-02 2021-02-23 阿里巴巴集团控股有限公司 Electronic form verification method and device
CN108021340A (en) * 2016-10-31 2018-05-11 北京京东尚科信息技术有限公司 A kind of label printing method and system
CN108021340B (en) * 2016-10-31 2021-08-17 北京京东振世信息技术有限公司 Label printing method and system
CN112364790A (en) * 2020-11-16 2021-02-12 中国民航大学 Airport work order information identification method and system based on convolutional neural network
CN112364790B (en) * 2020-11-16 2022-10-25 中国民航大学 Airport work order information identification method and system based on convolutional neural network
CN114331292A (en) * 2022-01-05 2022-04-12 成都以专信息技术有限公司 Management information system for sorting examination papers of student admission examination

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130814