CN106529386A - Paper archive digitization method and system - Google Patents

Info

Publication number
CN106529386A
CN106529386A CN201610780075.5A CN201610780075A CN106529386A CN 106529386 A CN106529386 A CN 106529386A CN 201610780075 A CN201610780075 A CN 201610780075A CN 106529386 A CN106529386 A CN 106529386A
Authority
CN
China
Prior art keywords
picture
archives
paper quality
corresponding entry
generation module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610780075.5A
Other languages
Chinese (zh)
Inventor
陈宁斌
陈高蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Thousand Wave Wave Information Technology Service Co Ltd
Original Assignee
Suzhou Thousand Wave Wave Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Thousand Wave Wave Information Technology Service Co Ltd
Priority to CN201610780075.5A
Publication of CN106529386A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40: Document-oriented image-based pattern recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/174: Form filling; Merging
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/10: Image acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides a paper archive digitization method and system. The method includes the following steps: a picture corresponding to a paper archive is obtained; corresponding entries are generated according to the information content shown on the picture; and the related content shown on the picture is acquired according to the corresponding entries and an electronic file is generated. The system includes a picture acquisition module, a corresponding entry generation module and an electronic file generation module. Because the entries are generated from the picture and then filled with the content shown on it, operation is simple, the efficiency of paper archive digitization is improved, and the misoperation rate is reduced.

Description

Paper archive digitization method and system
Technical field
The present invention relates to the field of electronic information technology, and in particular to a paper archive digitization method and system.
Background
Digitizing paper archives is the most basic work in building a large archive database. The process includes steps such as sorting and classifying the archives, scanning images, entering text, and cataloguing the results for storage. At present, digitizing a paper archive means converting the physical paper archive into an electronic document (in a format such as JPG, PDF or TIFF) for storage. Because its purpose is to serve information services, the result must be readable and usable by the relevant software systems.
For this reason, when an electronic archive database is built, two electronic files must be generated for each paper archive: one is the picture of the paper archive, and the other is the information that corresponds one-to-one with that picture. The current solution is to produce an "electronic picture" plus "EXCEL entries". For example, scanning one physical paper archive generates an electronic picture named "031-053-01-019-01.jpg", but the name "031-053-01-019-01.jpg" alone does not reveal the content of the archive. The information on the paper archive (such as the file number, class number, time, archive type, page name, reporting unit, department, category, and number of pages) therefore has to be entered into the "corresponding entries" of an EXCEL file. Completing the "digitization" of one paper archive thus requires two tasks: first, scanning the paper archive, and second, entering the archive content into the "corresponding entries" of the EXCEL file. The workload is enormous.
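The one-to-one pairing between pictures and entry rows described above can be checked mechanically. The following is only a minimal sketch, not part of the patent: the function name `unmatched_files` and the `picture` column that stores the picture name in each entry row are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def unmatched_files(
    picture_names: Iterable[str],
    entry_rows: Iterable[Dict[str, str]],
    key: str = "picture",  # hypothetical column holding the picture name
) -> Tuple[List[str], List[str]]:
    """Return (pictures without an entry row, entry rows without a picture)."""
    pictures = set(picture_names)
    recorded = {row[key] for row in entry_rows}
    missing_entries = sorted(pictures - recorded)
    orphaned_entries = sorted(recorded - pictures)
    return missing_entries, orphaned_entries
```

If both returned lists are empty, every scanned picture has exactly the entry information that the two-file scheme requires.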
Although the scanners (document cameras) commonly available on the market can do some processing of the scanned picture, they generally cannot capture the content information and generate it in the "corresponding entries" of an EXCEL file. With technical progress, high-end scanners with optical character recognition (OCR) have appeared, but so far their misoperation rate cannot meet the "less than 0.5%" requirement of the national archive digitization regulations. Even imported high-end scanners, which can reduce the misoperation rate by several orders of magnitude, still cannot meet the requirement, and they are expensive, easily costing hundreds of thousands or even more than a million per unit.
As a result, the archive digitization procedures used by most companies to date involve either one person working in two passes or two people working one after the other on a pipeline. The procedure is complicated, which leads to low efficiency and high personnel costs.
Summary of the invention
The object of the present invention is to provide a paper archive digitization method and system that solve the problem in the prior art that the paper archive digitization procedure is complicated and therefore inefficient.
To achieve this object, a first aspect of the invention provides a paper archive digitization method comprising the following steps:
Obtaining a picture corresponding to the paper archive;
Generating corresponding entries according to the information content shown on the picture;
Obtaining the related content shown on the picture according to the corresponding entries, and generating an electronic file.
Further, obtaining the picture corresponding to the paper archive specifically includes scanning the paper archive with a scanner to obtain the scanned picture of the paper archive.
Another aspect of the present invention provides a paper archive digitization system comprising a picture acquisition module, a corresponding entry generation module and an electronic file generation module, wherein:
The picture acquisition module is used to obtain the picture corresponding to the paper archive;
The corresponding entry generation module is used to generate corresponding entries according to the information content shown on the picture;
The electronic file generation module is used to obtain the related content shown on the picture according to the corresponding entries and to generate an electronic file.
Further, the picture acquisition module is a scanner.
The beneficial effect of the above technical solution is as follows: the picture corresponding to the paper archive is obtained, corresponding entries are generated according to the information content shown on the picture, and the related content shown on the picture is then obtained according to the corresponding entries to generate an electronic file. Operation is therefore simple, which improves the efficiency of paper archive digitization and reduces the misoperation rate.
Brief description of the drawings
Fig. 1 is a flow chart of the paper archive digitization method of the present invention;
Fig. 2 is a structural diagram of the paper archive digitization system of the present invention;
Fig. 3 is a schematic view of a scan of the "Wujiang City Survey Form (II) on Urban Unemployed (Laid-off) Persons Targeted for Key Assistance" used in the present invention.
Detailed description
To make the purpose, technical solution and advantages of the embodiments of the present invention clearer, the technical solution in the embodiments is described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them.
The invention discloses a paper archive digitization method which, as shown in Fig. 1, comprises the following steps:
Step S101: obtain the picture corresponding to the paper archive;
Specifically, in this embodiment, the paper archive to be digitized is scanned with a scanner to obtain the scanned picture of the paper archive.
Step S102: generate corresponding entries according to the information content shown on the picture;
Specifically, as shown in Fig. 3, this embodiment takes the "Wujiang City Survey Form (II) on Urban Unemployed (Laid-off) Persons Targeted for Key Assistance" as an example. After the form is scanned, corresponding entries can be generated according to the information content shown on it, such as file number ID, time, page name, reporting unit, department, archive type, class number, classification number, and number of pages. In practice, the corresponding entries are generated according to the content contained in the picture that is actually scanned.
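Step S102 can be pictured as building an empty record whose keys are the entries found on the scanned form. The patent does not say how the entries are detected, so the following is a minimal sketch under stated assumptions: the labels present on the picture are supplied by some hypothetical upstream step (for example OCR or an operator), and the English entry names and the function `generate_entries` are illustrative only.

```python
from typing import Dict, Iterable, List

# Entry names from the example form in this embodiment (English renderings are assumptions).
KNOWN_ENTRIES: List[str] = [
    "file number ID",
    "time",
    "page name",
    "reporting unit",
    "department",
    "archive type",
    "class number",
    "classification number",
    "number of pages",
]


def generate_entries(
    labels_on_picture: Iterable[str],
    known_entries: List[str] = KNOWN_ENTRIES,
) -> Dict[str, str]:
    """Build an empty record containing only the entries present on the picture.

    Labels that are not known entries are ignored; the order of known_entries is kept.
    """
    present = set(labels_on_picture)
    return {name: "" for name in known_entries if name in present}
```

A form that shows only a time and a department would yield a record with just those two empty entries, which matches the statement that the entries depend on the content of the picture actually scanned.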
Step S103: obtain the related content shown on the picture according to the corresponding entries, and generate an electronic file.
In this embodiment, the related content shown on the picture is obtained according to the corresponding entries generated above. For example, the file number ID is "031-053", the class number is "01-019-01", the time is "2003", the archive type is "long-term", the page name is "Taoyuan Town urban unemployed persons census form and rural labourer survey form", the reporting unit is "Wujiang urban unemployed persons census registration form - Taoyuan Labour and Social Security Office", the department is "South neighbourhood committee", the archive type is "social security", and the number of pages is "1". This content is written automatically in the background into the "corresponding entries" of the EXCEL file, which completes the digitization of the paper archive.
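Writing the obtained content into tabular "corresponding entries" can be sketched as follows. This is an illustration rather than the patented implementation: it writes CSV with Python's standard `csv` module (a format that spreadsheet programs such as Excel can open) instead of a native EXCEL file, and the `picture` column and the function name `write_entries` are assumptions.

```python
import csv
from typing import Dict, Iterable, List, TextIO


def write_entries(
    records: Iterable[Dict[str, str]],
    fieldnames: List[str],
    stream: TextIO,
) -> None:
    """Write one header row and one row per digitized archive to a CSV stream."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Values taken from the example in this embodiment; the column names are illustrative.
EXAMPLE_FIELDS = ["picture", "file number ID", "class number", "time", "department", "number of pages"]
EXAMPLE_RECORD = {
    "picture": "031-053-01-019-01.jpg",
    "file number ID": "031-053",
    "class number": "01-019-01",
    "time": "2003",
    "department": "South neighbourhood committee",
    "number of pages": "1",
}
```

To write a real file, open it with `open(path, "w", newline="", encoding="utf-8")` and pass the file object as `stream`; the `newline=""` argument is what the `csv` module documentation recommends for files it writes.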
Specifically, in this embodiment, the related content shown on the picture can be obtained in either of two ways. In the first, the scanner captures the respective fields into the corresponding entries and the result is then proofread manually, which improves efficiency while reducing the misoperation rate. In the second, the content for each entry is read from the scanned picture and typed into the corresponding entry. In this embodiment, the input process can be configured so that when the same word has been entered three times in a row, or ten times in total within one power-on cycle, a drop-down suggestion for that word is generated automatically. This reduces the input workload, improves input efficiency, and helps ensure input accuracy.
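The drop-down rule (the same word three times in a row, or ten times in total) can be sketched as a small counter. This is a minimal illustration, not the patent's implementation: the class name `SuggestionTracker` is hypothetical, and one tracker instance is assumed to live for one power-on cycle.

```python
from collections import Counter
from typing import List, Optional


class SuggestionTracker:
    """Tracks typed words and promotes frequent ones to a drop-down list."""

    def __init__(self, consecutive_limit: int = 3, cumulative_limit: int = 10) -> None:
        self.consecutive_limit = consecutive_limit
        self.cumulative_limit = cumulative_limit
        self.totals: Counter = Counter()   # total inputs per word in this cycle
        self.last_word: Optional[str] = None
        self.run_length = 0                # length of the current run of identical words
        self.suggestions: List[str] = []   # words offered in the drop-down, in promotion order

    def record(self, word: str) -> bool:
        """Register one input; return True if the word is now in the drop-down."""
        self.totals[word] += 1
        if word == self.last_word:
            self.run_length += 1
        else:
            self.last_word = word
            self.run_length = 1
        promoted = (
            self.run_length >= self.consecutive_limit
            or self.totals[word] >= self.cumulative_limit
        )
        if promoted and word not in self.suggestions:
            self.suggestions.append(word)
        return word in self.suggestions
```

Keeping both thresholds as constructor parameters makes it easy to tune the rule; the defaults of 3 and 10 are the values given in this embodiment.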
In the method of the present invention, the picture corresponding to the paper archive is obtained, corresponding entries are generated according to the information content shown on the picture, and the related content shown on the picture is then obtained according to the corresponding entries to generate an electronic file. Operation is therefore simple, which improves the efficiency of paper archive digitization and reduces the misoperation rate.
The invention also discloses a paper archive digitization system which, as shown in Fig. 2, comprises a picture acquisition module 201, a corresponding entry generation module 202 and an electronic file generation module 203. The picture acquisition module is used to obtain the picture corresponding to the paper archive; the corresponding entry generation module is used to generate corresponding entries according to the information content shown on the picture; and the electronic file generation module is used to obtain the related content shown on the picture according to the corresponding entries and to generate an electronic file. Specifically, the picture acquisition module can be implemented with a scanner.
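The three-module structure of Fig. 2 can be sketched as three small classes wired into one pipeline. This is only an illustrative skeleton: the patent does not define any programming interface, so the class names, method names, and the injected callbacks (`scan`, `detect_labels`, `read_value`) are all hypothetical stand-ins for a scanner driver and a recognition or manual-input step.

```python
from typing import Callable, Dict, List


class PictureAcquisitionModule:
    """Module 201: obtains the picture corresponding to a paper archive."""

    def __init__(self, scan: Callable[[str], bytes]) -> None:
        self._scan = scan  # hypothetical scanner callback

    def acquire(self, archive_id: str) -> bytes:
        return self._scan(archive_id)


class EntryGenerationModule:
    """Module 202: generates the entries from what is shown on the picture."""

    def __init__(self, detect_labels: Callable[[bytes], List[str]]) -> None:
        self._detect_labels = detect_labels  # hypothetical label detector

    def generate(self, picture: bytes) -> List[str]:
        return list(self._detect_labels(picture))


class ElectronicFileGenerationModule:
    """Module 203: obtains the content for each entry and builds the record."""

    def __init__(self, read_value: Callable[[bytes, str], str]) -> None:
        self._read_value = read_value  # hypothetical OCR or manual-input callback

    def generate(self, picture: bytes, entries: List[str]) -> Dict[str, str]:
        return {entry: self._read_value(picture, entry) for entry in entries}


def digitize(
    archive_id: str,
    acquisition: PictureAcquisitionModule,
    entry_generation: EntryGenerationModule,
    file_generation: ElectronicFileGenerationModule,
) -> Dict[str, str]:
    """Run steps S101 to S103 in order and return the filled record."""
    picture = acquisition.acquire(archive_id)
    entries = entry_generation.generate(picture)
    return file_generation.generate(picture, entries)
```

Passing the hardware and recognition steps in as callbacks keeps the sketch runnable without a scanner and mirrors the separation of responsibilities among modules 201, 202 and 203.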
The paper archive digitization system of this embodiment can be used to carry out the technical solution of the method embodiment shown in Fig. 1. Its implementation principle and technical effect are similar and are not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps of the method embodiments above can be carried out by hardware controlled by program instructions. The program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments above. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments above only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, or some or all of their technical features can be replaced with equivalents, and that such modifications or replacements do not take the essence of the corresponding technical solutions outside the scope of the technical solutions of the embodiments of the invention.

Claims (4)

1. A paper archive digitization method, characterized in that it comprises the following steps:
Obtaining a picture corresponding to a paper archive;
Generating corresponding entries according to the information content shown on the picture;
Obtaining the related content shown on the picture according to the corresponding entries, and generating an electronic file.
2. The paper archive digitization method according to claim 1, characterized in that obtaining the picture corresponding to the paper archive specifically includes: scanning the paper archive with a scanner to obtain the scanned picture of the paper archive.
3. A paper archive digitization system, characterized in that it comprises a picture acquisition module, a corresponding entry generation module and an electronic file generation module, wherein:
The picture acquisition module is used to obtain a picture corresponding to a paper archive;
The corresponding entry generation module is used to generate corresponding entries according to the information content shown on the picture;
The electronic file generation module is used to obtain the related content shown on the picture according to the corresponding entries and to generate an electronic file.
4. The paper archive digitization system according to claim 3, characterized in that the picture acquisition module is a scanner.
CN201610780075.5A 2016-08-31 2016-08-31 Paper archive digitization method and system Pending CN106529386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610780075.5A CN106529386A (en) 2016-08-31 2016-08-31 Paper archive digitization method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610780075.5A CN106529386A (en) 2016-08-31 2016-08-31 Paper archive digitization method and system

Publications (1)

Publication Number Publication Date
CN106529386A true CN106529386A (en) 2017-03-22

Family

ID=58343696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610780075.5A Pending CN106529386A (en) 2016-08-31 2016-08-31 Paper archive digitization method and system

Country Status (1)

Country Link
CN (1) CN106529386A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169543A (en) * 2017-05-17 2017-09-15 苏州市千尺浪信息科技服务有限公司 A kind of reduction image, the acquisition system of word accounting
CN108197119A (en) * 2018-02-05 2018-06-22 成都卓观信息技术有限公司 Paper archive digitization method based on a knowledge graph

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102081615A (en) * 2009-11-28 2011-06-01 东莞市万维网络科技信息有限公司 Archives arranging and digital processing system based on information resource planning of archives
CN102968426A (en) * 2012-07-04 2013-03-13 南京斯谱蓝自动化科技有限公司 Archive comprehensive management system
CN103077625A (en) * 2013-01-30 2013-05-01 中国盲文出版社 Blind electronic reader and blind assistance reading method
CN105740857A (en) * 2016-01-31 2016-07-06 华南理工大学 OCR based automatic acquisition and recognition system for fast pencil-and-paper voting result

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
俞传正 等: "《图书馆实用信息技术》", 30 April 2010 *
吕奇 等: "《计算机辅助翻译入门》", 31 May 2015 *
徐德光 等: "基于FlexPaper的高校纸质档案数字化平台建设", 《中国信息技术教育》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170322

RJ01 Rejection of invention patent application after publication