CN106529386A - Paper archive digitization method and system - Google Patents
- Publication number
- CN106529386A (application CN201610780075.5A)
- Authority
- CN
- China
- Prior art keywords
- picture
- archives
- paper quality
- corresponding entry
- generation module
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/174—Form filling; Merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention provides a paper archive digitization method and system. The method includes the following steps: obtaining a picture corresponding to a paper archive; generating corresponding entries according to the information content displayed on the picture; and acquiring, according to the corresponding entries, the relevant content displayed on the picture and generating an electronic file. The system includes a picture acquisition module, a corresponding entry generation module, and an electronic file generation module. Because the entries are generated from the picture and then filled with the content displayed on it, operation is simple, the efficiency of paper archive digitization is improved, and the misoperation rate is reduced.
Description
Technical field
The present invention relates to the field of electronic information technology, and more particularly to a paper archive digitization method and system.
Background
Paper archive digitization is the most basic work in building a large archive database. Its workflow includes steps such as classifying and sorting the archives, scanning images, entering text, and organizing the results for storage. At present, digitizing a paper archive means converting the physical paper archive into an electronic document (in a format such as JPG, PDF, or TIFF) for storage. Its purpose is to serve informatization, so the result must be readable and usable by the relevant software systems.
For this reason, when an electronic archive database is built, two electronic files must be generated for each paper archive: one is the picture of the paper archive, and the other is the information that corresponds one-to-one with that picture. The current solution is to produce an "electronic picture" plus "EXCEL entries". For example, after one physical paper archive is scanned, an electronic picture named "031-053-01-019-01.jpg" is generated, but the name "031-053-01-019-01.jpg" alone does not reveal all of the archive's content. The information carried on the paper archive (such as file number, class number, time, archive type, page name, reporting unit, department, category, and number of pages) therefore has to be entered into the "corresponding entries" of an EXCEL file. As can be seen, "digitizing" one paper archive requires two tasks: first, scanning the paper archive; second, entering the archive content into the "corresponding entries" of the EXCEL file. The workload is very large.
Although the scanners (document cameras) commonly available on the market can apply some processing to a scanned picture, they generally cannot capture the content information and write it into the "corresponding entries" of an EXCEL file. With technical progress, high-end scanners with optical character recognition (OCR) have also appeared, but so far their misoperation rate cannot meet the "less than 0.5%" requirement of the national archive digitization regulations. Even an imported high-end scanner, whose misoperation rate may be several orders of magnitude lower, still cannot meet the requirement, and such imported scanners are expensive, easily costing hundreds of thousands or even more than a million per unit, which is too high.
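The 0.5% ceiling can be made concrete with a little arithmetic. The sketch below assumes, since the text does not define it, that the misoperation rate is the fraction of entered fields that turn out to be wrong; the function names and the sample counts in the usage note are illustrative only.

```python
# Assumption: misoperation rate = wrong fields / total fields entered.
# The text cites the "less than 0.5%" limit but does not define the rate.
MAX_MISOPERATION_RATE = 0.005


def misoperation_rate(wrong_fields: int, total_fields: int) -> float:
    """Return the fraction of entered fields that were wrong."""
    if total_fields <= 0:
        raise ValueError("total_fields must be positive")
    if not 0 <= wrong_fields <= total_fields:
        raise ValueError("wrong_fields must be between 0 and total_fields")
    return wrong_fields / total_fields


def meets_requirement(wrong_fields: int, total_fields: int) -> bool:
    """True only when the rate is strictly below the 0.5% limit."""
    return misoperation_rate(wrong_fields, total_fields) < MAX_MISOPERATION_RATE
```

Under this reading, 4 wrong fields out of 1000 would pass, while 5 out of 1000 would not, because the requirement is "less than" 0.5% rather than "at most".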
As a result, in the archive digitization workflow used by companies in general, either the same person performs two passes or two people work one after the other in a pipeline. The workflow is complicated, which leads to low efficiency and excessive personnel cost.
Summary of the invention
The object of the present invention is to provide a paper archive digitization method and system, so as to solve the problem in the prior art that the paper archive digitization workflow is complicated and therefore inefficient.
To achieve the above object, a first aspect of the invention provides a paper archive digitization method, including the following steps:
obtaining a picture corresponding to a paper archive;
generating corresponding entries according to the information content displayed on the picture;
acquiring, according to the corresponding entries, the relevant content displayed on the picture, and generating an electronic file.
Further, the operation of obtaining the picture corresponding to the paper archive specifically includes: scanning the paper archive with a scanner to obtain the scanned picture of the paper archive.
Another aspect of the invention provides a paper archive digitization system, including a picture acquisition module, a corresponding entry generation module, and an electronic file generation module, wherein:
the picture acquisition module is used to obtain a picture corresponding to a paper archive;
the corresponding entry generation module is used to generate corresponding entries according to the information content displayed on the picture;
the electronic file generation module is used to acquire, according to the corresponding entries, the relevant content displayed on the picture and to generate an electronic file.
Further, the picture acquisition module is a scanner.
The beneficial effect of the above technical solution is as follows: a picture corresponding to the paper archive is obtained, corresponding entries are generated according to the information content displayed on the picture, and the relevant content displayed on the picture is then acquired according to the corresponding entries to generate an electronic file. Operation is therefore simple, which improves the efficiency of paper archive digitization and reduces the misoperation rate.
Brief description of the drawings
Fig. 1 is a flow chart of the paper archive digitization method of the present invention;
Fig. 2 is a structural diagram of the paper archive digitization system of the present invention;
Fig. 3 is a schematic diagram of a scan of the "Wujiang Urban Key Assistance Unemployed (Laid-off) Personnel Situation Survey Form (II)" used in the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention.
The invention discloses a paper archive digitization method which, as shown in Fig. 1, includes the following steps:
Step S101: obtain a picture corresponding to the paper archive.
Specifically, in this embodiment, the paper archive to be digitized is scanned with a scanner to obtain the scanned picture of the paper archive.
Step S102: generate corresponding entries according to the information content displayed on the picture.
Specifically, as shown in Fig. 3, this embodiment uses the "Wujiang Urban Key Assistance Unemployed (Laid-off) Personnel Situation Survey Form (II)" as an example. After the form is scanned, corresponding entries can be generated according to the information content displayed on it, such as file number ID, time, page name, reporting unit, department, archive type, class number, classification number, and number of pages. The specific entries are generated according to the content actually contained in the scanned picture.
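One way to read step S102 is as choosing column headers from the labels that actually appear on the scanned form. The sketch below is an illustration under assumptions: `KNOWN_ENTRIES` reuses the entry names listed in this step, while `generate_entries` and the idea of receiving the labels as plain strings are hypothetical and not taken from the patent.

```python
# Entry names from the embodiment; the order here fixes the column order.
KNOWN_ENTRIES = [
    "file number ID", "time", "page name", "reporting unit", "department",
    "archive type", "class number", "classification number", "number of pages",
]


def generate_entries(labels_on_picture):
    """Return the entries to create for one picture, in a stable order.

    Known entries come first, in KNOWN_ENTRIES order. Labels outside that
    list are appended in the order seen, so the entries follow the content
    that is actually on the picture.
    """
    normalized = {label.strip().lower() for label in labels_on_picture}
    entries = [entry for entry in KNOWN_ENTRIES if entry.lower() in normalized]

    known = {entry.lower() for entry in KNOWN_ENTRIES}
    for label in labels_on_picture:
        cleaned = label.strip()
        if cleaned and cleaned.lower() not in known and cleaned not in entries:
            entries.append(cleaned)
    return entries
```

For instance, `generate_entries(["Department", " time ", "Household address"])` would yield the two known entries followed by the extra label; "Household address" is a made-up label used only to show the fallback path.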
Step S103: acquire, according to the corresponding entries, the relevant content displayed on the picture and generate an electronic file.
In this embodiment, the relevant content displayed on the picture is acquired according to the corresponding entries generated above. For example, the file number ID is "031-053", the class number is "01-019-01", the time is "2003", the archive type is long-term, the page name is "Taoyuan Town Urban Unemployed Persons Census Form and Rural Labor Force Survey Form", the reporting unit is "Wujiang Urban Unemployed Persons Census Registration Form - Taoyuan Labor and Social Security Office", the department is "South neighbourhood committee", the archive type is "social security", and the number of pages is "1". This content is written automatically by the back end into the "corresponding entries" of an EXCEL file, thereby digitizing the paper archive.
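The example values suggest that the picture name "031-053-01-019-01.jpg" from the background section is the file number ID "031-053" followed by the class number "01-019-01". The sketch below assumes that split (the first two hyphen-separated groups versus the rest), which is inferred from this single example and not stated in the patent. It also writes the row as CSV, which EXCEL can open, rather than as a native EXCEL workbook, and fills in only a subset of the entries. `split_picture_name` and `write_entry_row` are hypothetical names.

```python
import csv
import io
from pathlib import PurePath


def split_picture_name(picture_name):
    """Split a picture name into (file number ID, class number).

    Assumption: the first two hyphen-separated groups are the file number
    and the remaining groups are the class number.
    """
    stem = PurePath(picture_name).stem
    groups = stem.split("-")
    if len(groups) < 3:
        raise ValueError(f"unexpected picture name: {picture_name!r}")
    return "-".join(groups[:2]), "-".join(groups[2:])


def write_entry_row(out, entries, values):
    """Write a header line (the entries) and one data row (the content)."""
    writer = csv.DictWriter(out, fieldnames=entries)
    writer.writeheader()
    writer.writerow(values)


file_number, class_number = split_picture_name("031-053-01-019-01.jpg")
row = {
    "file number ID": file_number,
    "class number": class_number,
    "time": "2003",
    "department": "South neighbourhood committee",
    "number of pages": "1",
}
buffer = io.StringIO()
write_entry_row(buffer, list(row), row)
csv_text = buffer.getvalue()  # header line followed by one data line
```

Keeping the entry list separate from the values mirrors the two steps of the method: S102 decides the columns, and S103 fills them.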
Specifically, in this embodiment, the relevant content displayed on the picture can be acquired in either of two ways. In the first, the scanner captures the respective fields into the corresponding entries and the result is then proofread manually, which improves operating efficiency while reducing the misoperation rate. In the second, the content corresponding to each entry is typed into the corresponding entries from the information on the scanned picture. In this embodiment, for example, the input process can be configured so that when the same text is entered three times in a row, or the same text is entered a cumulative ten times within one power-on session, a drop-down text option is generated automatically. This reduces the input workload, improves input efficiency, and ensures input accuracy.
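The drop-down rule can be sketched as a small counter. The class below is a hypothetical illustration, not the patent's implementation: it treats the lifetime of one object as one power-on session and uses the two thresholds given above (three consecutive identical inputs, or ten cumulative identical inputs).

```python
from collections import Counter


class DropdownSuggester:
    """Track typed values and promote frequent ones to drop-down options."""

    CONSECUTIVE_THRESHOLD = 3   # same text entered three times in a row
    CUMULATIVE_THRESHOLD = 10   # same text entered ten times in the session

    def __init__(self):
        self._totals = Counter()   # cumulative count per text
        self._last_text = None     # most recently entered text
        self._run_length = 0       # length of the current identical run
        self.options = []          # drop-down texts, in the order generated

    def record_input(self, text):
        """Record one typed value.

        Returns True if this input just turned the text into a drop-down
        option, and False otherwise.
        """
        self._totals[text] += 1
        if text == self._last_text:
            self._run_length += 1
        else:
            self._last_text = text
            self._run_length = 1

        qualifies = (
            self._run_length >= self.CONSECUTIVE_THRESHOLD
            or self._totals[text] >= self.CUMULATIVE_THRESHOLD
        )
        if qualifies and text not in self.options:
            self.options.append(text)
            return True
        return False
```

A real input form would probably keep one such tracker per entry, so that a value suggested for "department" is not offered under "time"; the patent does not say, so that choice is left open here.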
The method of the present invention obtains a picture corresponding to the paper archive, generates corresponding entries according to the information content displayed on the picture, and then acquires the relevant content displayed on the picture according to the corresponding entries to generate an electronic file. Operation is therefore simple, which improves the efficiency of paper archive digitization and reduces the misoperation rate.
The invention also discloses a paper archive digitization system which, as shown in Fig. 2, includes a picture acquisition module 201, a corresponding entry generation module 202, and an electronic file generation module 203. The picture acquisition module is used to obtain a picture corresponding to the paper archive; the corresponding entry generation module is used to generate corresponding entries according to the information content displayed on the picture; and the electronic file generation module is used to acquire, according to the corresponding entries, the relevant content displayed on the picture and to generate an electronic file. Specifically, the picture acquisition module can be implemented with a scanner.
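The three modules can be wired together as in the following sketch. It is an illustration under assumptions: the patent names the modules but not their interfaces, so the `Picture` type, the callable signatures, and the stub implementations in the usage part are hypothetical, and no real scanner or OCR library is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Picture:
    name: str
    fields: Dict[str, str]  # label -> text; a stand-in for recognized content


@dataclass
class ArchiveDigitizationSystem:
    acquire_picture: Callable[[str], Picture]                       # module 201
    generate_entries: Callable[[Picture], List[str]]                # module 202
    generate_file: Callable[[Picture, List[str]], Dict[str, str]]   # module 203

    def digitize(self, archive_id: str) -> Dict[str, str]:
        picture = self.acquire_picture(archive_id)     # step S101
        entries = self.generate_entries(picture)       # step S102
        return self.generate_file(picture, entries)    # step S103


# Stub modules for demonstration only.
def fake_scanner(archive_id: str) -> Picture:
    return Picture(
        name=f"{archive_id}.jpg",
        fields={"time": "2003", "number of pages": "1"},
    )


def entries_from_labels(picture: Picture) -> List[str]:
    return list(picture.fields)


def fill_entries(picture: Picture, entries: List[str]) -> Dict[str, str]:
    record = {"picture": picture.name}
    record.update({entry: picture.fields.get(entry, "") for entry in entries})
    return record


system = ArchiveDigitizationSystem(fake_scanner, entries_from_labels, fill_entries)
record = system.digitize("031-053-01-019-01")
```

Passing the modules in as callables keeps the system indifferent to how each one is realized, which matches the statement that the picture acquisition module "can be" a scanner.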
The paper archive digitization system of this embodiment can be used to carry out the technical solution of the method embodiment shown in Fig. 1. Its implementation principle and technical effect are similar and are not repeated here.
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, or some or all of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (4)
1. A paper archive digitization method, characterized by comprising the following steps:
obtaining a picture corresponding to a paper archive;
generating corresponding entries according to the information content displayed on the picture;
acquiring, according to the corresponding entries, the relevant content displayed on the picture, and generating an electronic file.
2. The paper archive digitization method according to claim 1, characterized in that the operation of obtaining the picture corresponding to the paper archive specifically comprises: scanning the paper archive with a scanner to obtain the scanned picture of the paper archive.
3. A paper archive digitization system, characterized by comprising a picture acquisition module, a corresponding entry generation module, and an electronic file generation module, wherein
the picture acquisition module is used to obtain a picture corresponding to a paper archive;
the corresponding entry generation module is used to generate corresponding entries according to the information content displayed on the picture;
the electronic file generation module is used to acquire, according to the corresponding entries, the relevant content displayed on the picture and to generate an electronic file.
4. The paper archive digitization system according to claim 3, characterized in that the picture acquisition module is a scanner.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610780075.5A CN106529386A (en) | 2016-08-31 | 2016-08-31 | Paper archive digitization method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610780075.5A CN106529386A (en) | 2016-08-31 | 2016-08-31 | Paper archive digitization method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106529386A (en) | 2017-03-22 |
Family
ID=58343696
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610780075.5A (pending; published as CN106529386A) | Paper archive digitization method and system | 2016-08-31 | 2016-08-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106529386A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169543A (en) * | 2017-05-17 | 2017-09-15 | 苏州市千尺浪信息科技服务有限公司 | A kind of reduction image, the acquisition system of word accounting |
CN108197119A (en) * | 2018-02-05 | 2018-06-22 | 成都卓观信息技术有限公司 | The archives of paper quality digitizing solution of knowledge based collection of illustrative plates |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102081615A (en) * | 2009-11-28 | 2011-06-01 | 东莞市万维网络科技信息有限公司 | Archives arranging and digital processing system based on information resource planning of archives |
CN102968426A (en) * | 2012-07-04 | 2013-03-13 | 南京斯谱蓝自动化科技有限公司 | Archive comprehensive management system |
CN103077625A (en) * | 2013-01-30 | 2013-05-01 | 中国盲文出版社 | Blind electronic reader and blind assistance reading method |
CN105740857A (en) * | 2016-01-31 | 2016-07-06 | 华南理工大学 | OCR based automatic acquisition and recognition system for fast pencil-and-paper voting result |
Non-Patent Citations (3)
Title |
---|
俞传正 等: "《图书馆实用信息技术》", 30 April 2010 * |
吕奇 等: "《计算机辅助翻译入门》", 31 May 2015 * |
徐德光 等: "基于FlexPaper的高校纸质档案数字化平台建设", 《中国信息技术教育》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
RU2284670C2 (en) | Mobile digital camera recognizing text and graphic information in image | |
CN108108342B (en) | Structured text generation method, search method and device | |
Terras | Digitization and digital resources in the humanities | |
CN105913218A (en) | Electronic invoice invoicing reimbursement method and electronic invoice reimbursement information extraction method | |
CN112560411A (en) | Intelligent personnel information input method and system | |
CN105335453B (en) | Image file dividing method | |
CN105160466A (en) | Method for applying two-dimensional code to highway engineering | |
JP7082333B2 (en) | Question automatic generation program and question automatic generation device | |
CN106529386A (en) | Paper archive digitization method and system | |
Thammarak et al. | Automated data digitization system for vehicle registration certificates using google cloud vision API | |
JP2014220708A (en) | Document computerization method and document computerization system | |
CN114359533A (en) | Page number identification method based on page text and computer equipment | |
CN111626029B (en) | Project consultation budget method and device and electronic equipment | |
KR102328034B1 (en) | Database building device that can build a knowledge database from a table-inserted image and operating method thereof | |
CN108121960A (en) | A kind of standard resource processes overall process electronization management-control method | |
CN115828856A (en) | Test paper generation method, device, equipment and storage medium | |
CN112149679B (en) | Method and device for extracting document elements based on OCR character recognition | |
CN207249688U (en) | A kind of computer marking device based on artificial intelligence character recognition technology | |
Ansari et al. | Library automation in Indian central universities: Issues and challenges | |
CN109376554B (en) | Multi-terminal electronic document examination and signature method and system based on labels and views | |
CN114328804A (en) | Method and system for searching key words containing character pictures | |
CN206431622U (en) | A kind of assistant learning system based on Quick Response Code | |
KR20200106397A (en) | Automatic eCRF generation from pCRF using AItechnology | |
US10606928B2 (en) | Assistive technology for the impaired | |
Bautista et al. | Adoption of an Open Source Optical Character Recognition (OCR) for Database Buildup of the Students' Scholastic Records |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170322 |