CN111460786A - Technical method for analyzing traditional document structure - Google Patents
Technical method for analyzing traditional document structure
- Publication number
- CN111460786A
- Authority
- CN
- China
- Prior art keywords
- document
- document information
- unit
- information
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
Abstract
The invention discloses a technical method for analyzing the structure of a traditional document. The method comprises a document information input module, a document database module, a document data analysis module and a document data summarization module. The document information input module, which is used for entering document information, comprises a document information scanning unit, a document information correction unit and a document information storage unit. The document database module comprises a document information presetting unit, a document information classification unit and a document information comparison unit. The document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit. The method uses big data investigation and comparison to analyze documents quickly, which helps the user analyze the prepared information quickly and makes the method more convenient to use.
Description
Technical Field
The invention relates to a technical method for analyzing a traditional document structure, belonging to the technical field of information.
Background
Document analysis is a technique for gathering requirements during the requirements elicitation phase of a project. It involves examining the existing documents of comparable business processes or systems to extract information relevant to the current project, so document analysis should take the project requirements into account. A business analyst can elicit requirements in many ways, typically using questionnaires, interviews or meetings to draw out the needs of stakeholders. Document analysis is particularly valuable when one or more existing systems are being replaced with new systems that will provide enhanced functionality or a better overall user experience: existing documents can be searched for key functions, business rules, business entities and business entity attributes. Document analysis may also be required when stakeholders are unable to provide insight into existing business processes or systems.
Existing document analysis methods are complex, cannot help a user reach an analysis result quickly, and have poor analysis efficiency. In view of this situation, the invention makes a technical innovation on the basis of the existing methods.
Disclosure of Invention
The invention aims to provide a technical method for analyzing the structure of a traditional document, which comprises a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correction unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classification unit and a document information comparison unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
Preferably, the document information scanning unit of the document information input module is configured to scan a paper document by laser scanning and convert it into an electronic document format; the document information correction unit is configured to correct, by manual review, erroneous information in the electronic document produced by the document information scanning unit; and the document information storage unit is configured to store the entered document information.
Preferably, the document information presetting unit in the document database module is used for presetting, within the unit, documents with different topics and documents with different core words; the document information classification unit is used for performing basic classification according to the main idea of the document information; and the document information comparison unit is used for comparing the entered document information with the documents in the original database, finding approximate documents and retrieving the information characteristics of those approximate documents.
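As an illustration only, the comparison performed by the document information comparison unit could be sketched as follows. The patent does not name a similarity measure; Jaccard similarity over word sets, the function names `tokenize`, `jaccard` and `find_approximate_documents`, and the threshold value are all assumptions made for this sketch.

```python
import re


def tokenize(text: str) -> set[str]:
    """Split text into a set of lowercase alphanumeric words (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_approximate_documents(
    entered: str, database: dict[str, str], threshold: float = 0.3
) -> list[tuple[str, float]]:
    """Return (document id, score) pairs at or above the threshold, best match first.

    `database` stands in for the original database of preset documents;
    the threshold of 0.3 is a hypothetical value, not taken from the patent.
    """
    entered_tokens = tokenize(entered)
    scored = []
    for doc_id, text in database.items():
        score = jaccard(entered_tokens, tokenize(text))
        if score >= threshold:
            scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A word-set measure is used here because it needs no training data; a real system handling Chinese text would first need word segmentation, which this sketch does not attempt.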
Preferably, the document data analysis module includes a document information data extraction unit, a document information data big data analysis unit, and a document information data integration unit.
Preferably, the document information data extraction unit is configured to extract the topics and core words in the document, determine the frequency of occurrence of each core word, and determine the topics of the document data according to the several core words that occur most frequently.
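The frequency-based step described above could be sketched as follows. This is a minimal sketch under stated assumptions: the stop-word list, the Latin-alphabet tokenization, the names `core_word_frequencies` and `document_topic`, and the default of three topic words are hypothetical choices that the patent does not specify.

```python
import re
from collections import Counter

# Assumed stop-word list; the patent does not say how core words are selected.
STOP_WORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"})


def core_word_frequencies(text: str) -> Counter:
    """Count how often each candidate core word occurs in the document text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


def document_topic(text: str, top_n: int = 3) -> list[str]:
    """Take the top_n most frequent core words as the topic of the document."""
    return [word for word, _ in core_word_frequencies(text).most_common(top_n)]
```

For example, in a text where "payment" occurs three times and "contract" twice, `document_topic(text, top_n=2)` yields those two words in that order.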
Preferably, the document information data big data analysis unit analyzes the document information using big data.
Preferably, the document information data integration unit is configured to integrate the document information analyzed by each unit.
Preferably, the document information analysis report generation unit is configured to present the data obtained from analyzing the document information as a preset table and as a corresponding graphical report, and the document information cloud storage unit is configured to transmit the document information analysis report and the document information to the cloud for storage.
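One possible form of the preset table is sketched below as CSV text. The column names, the function name `build_report_table` and the choice of CSV are assumptions made for illustration; the patent does not define the table layout, and the graphical report and the cloud upload are left out of this sketch.

```python
import csv
import io


def build_report_table(
    doc_id: str, topic_words: list[str], frequencies: dict[str, int]
) -> str:
    """Render analysis results as CSV text with a fixed (preset) header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document", "core_word", "frequency", "is_topic_word"])
    # Most frequent words first; ties are broken alphabetically.
    rows = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    for word, count in rows:
        flag = "yes" if word in topic_words else "no"
        writer.writerow([doc_id, word, count, flag])
    return buffer.getvalue()
```

The returned string could then be written to a file or passed to whatever storage service plays the role of the cloud storage unit.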
Compared with the prior art, the invention has the following beneficial effects:
According to the technical method for analyzing the traditional document structure, documents are analyzed quickly by means of big data investigation and comparison, which helps the user analyze the prepared information quickly and makes the method more convenient to use. Entering documents by laser scanning greatly improves the efficiency of document entry, making the process more convenient and faster.
Detailed Description
In order to make the technical solutions of the present invention more clear and definite for those skilled in the art, the present invention is further described in detail with reference to the following examples, but the embodiments of the present invention are not limited thereto.
The technical method for analyzing the traditional document structure provided by this embodiment comprises a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correction unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classification unit and a document information comparison unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
The document information scanning unit of the document information input module is used for scanning paper documents by laser scanning and converting them into an electronic document format; the document information correction unit is used for correcting, by manual review, erroneous information in the electronic documents produced by the document information scanning unit; and the document information storage unit is used for storing the entered document information.
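The correction and storage steps of the input module could look roughly like the sketch below. It assumes the scanned text is already available as a string and that a human reviewer supplies (wrong, right) replacement pairs; the names `ScannedDocument`, `apply_manual_corrections` and `DocumentStore` are hypothetical, and the in-memory dictionary merely stands in for the storage unit.

```python
from dataclasses import dataclass, field


@dataclass
class ScannedDocument:
    """Electronic form of a scanned paper document (assumed structure)."""
    doc_id: str
    text: str
    correction_log: list[tuple[str, str]] = field(default_factory=list)


def apply_manual_corrections(
    doc: ScannedDocument, corrections: list[tuple[str, str]]
) -> ScannedDocument:
    """Apply reviewer-supplied (wrong, right) pairs and log the ones that matched."""
    for wrong, right in corrections:
        if wrong in doc.text:
            doc.text = doc.text.replace(wrong, right)
            doc.correction_log.append((wrong, right))
    return doc


class DocumentStore:
    """In-memory stand-in for the document information storage unit."""

    def __init__(self) -> None:
        self._docs: dict[str, ScannedDocument] = {}

    def save(self, doc: ScannedDocument) -> None:
        self._docs[doc.doc_id] = doc

    def load(self, doc_id: str) -> ScannedDocument:
        return self._docs[doc_id]
```

Keeping a log of the corrections actually applied is a design choice of this sketch, made so that manual edits remain traceable after storage.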
The document information presetting unit in the document database module is used for presetting, within the unit, documents with different topics and documents with different core words; the document information classification unit is used for performing basic classification according to the main idea of the document information; and the document information comparison unit is used for comparing the entered document information with the documents in the original database, finding approximate documents and retrieving the information characteristics of those approximate documents.
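A basic classification against preset topics could be sketched as below. The preset topics and their core words, and the rule of picking the topic with the most core-word hits, are assumptions made for this example; the patent only states that topics and core words are preset and that a basic classification is performed.

```python
import re
from typing import Optional

# Hypothetical preset topics and core words, standing in for the presetting unit.
PRESET_TOPICS: dict[str, set[str]] = {
    "finance": {"invoice", "payment", "tax"},
    "personnel": {"employee", "salary", "leave"},
}


def classify_document(text: str, preset_topics: dict[str, set[str]]) -> Optional[str]:
    """Return the preset topic whose core words occur most often, or None if none occur."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = {
        topic: sum(1 for word in words if word in core_words)
        for topic, core_words in preset_topics.items()
    }
    best_topic = max(hits, key=hits.get)
    return best_topic if hits[best_topic] > 0 else None
```

Returning `None` when no preset core word occurs leaves unclassifiable documents for manual handling rather than forcing them into a topic.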
The document data analysis module comprises a document information data extraction unit, a document information data big data analysis unit and a document information data integration unit.
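The module and unit structure of this embodiment can be summarized in a small data model. This is only a sketch: the `Module` class, the `PIPELINE` constant and the shortened unit names are illustrative, and the ordering follows the order in which the modules are listed, which the embodiment does not state to be an execution order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One module of the method together with the units it comprises."""
    name: str
    units: tuple[str, ...]


# Modules and units as listed in the embodiment; the names are shortened for readability.
PIPELINE: tuple[Module, ...] = (
    Module("document information input", ("scanning", "correction", "storage")),
    Module("document database", ("presetting", "classification", "comparison")),
    Module(
        "document data analysis",
        ("data extraction", "big data analysis", "data integration"),
    ),
    Module("document data summarization", ("analysis report generation", "cloud storage")),
)


def describe(pipeline: tuple[Module, ...]) -> list[str]:
    """Return one summary line per module, for example for a log or a report header."""
    return [f"{module.name}: {', '.join(module.units)}" for module in pipeline]
```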
The document information data extraction unit is used for extracting the topics and core words in the document, determining the frequency of occurrence of each core word, and determining the topics of the document data according to the several core words that occur most frequently.
The document information data big data analysis unit analyzes the document information by using the big data.
The document information data integration unit is used for integrating the document information analyzed by each unit.
The document information analysis report generation unit is used for presenting the data obtained from analyzing the document information as a preset table and as a corresponding graphical report, and the document information cloud storage unit is used for transmitting the document information analysis report and the document information to the cloud for storage.
The above description only illustrates the present invention and is not intended to limit its scope; any substitution or change that a person skilled in the art makes to the technical solution of the present invention and its conception, within the scope disclosed by the present invention, falls within the scope of protection of the present invention.
Claims (8)
1. A technical method for traditional document structure analysis, characterized by comprising: a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correction unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classification unit and a document information comparison unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
2. The technical method of traditional document structure analysis according to claim 1, wherein: the document information scanning unit of the document information input module is used for scanning paper documents by laser scanning and converting them into an electronic document format; the document information correction unit is used for correcting, by manual review, erroneous information in the electronic documents produced by the document information scanning unit; and the document information storage unit is used for storing the entered document information.
3. The technical method of traditional document structure analysis according to claim 1, wherein: the document information presetting unit in the document database module is used for presetting, within the unit, documents with different topics and documents with different core words; the document information classification unit is used for performing basic classification according to the main idea of the document information; and the document information comparison unit is used for comparing the entered document information with the documents in the original database, finding approximate documents and retrieving the information characteristics of those approximate documents.
4. The technical method of traditional document structure analysis according to claim 1, wherein: the document data analysis module comprises a document information data extraction unit, a document information data big data analysis unit and a document information data integration unit.
5. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data extraction unit is used for extracting the topics and core words in the document, determining the frequency of occurrence of each core word, and determining the topics of the document data according to the several core words that occur most frequently.
6. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data big data analysis unit analyzes the document information by using the big data.
7. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data integration unit is used for integrating the document information analyzed by each unit.
8. The technical method of traditional document structure analysis according to claim 1, wherein: the document information analysis report generation unit is used for presenting the data obtained from analyzing the document information as a preset table and as a corresponding graphical report, and the document information cloud storage unit is used for transmitting the document information analysis report and the document information to the cloud for storage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010272203.1A CN111460786A (en) | 2020-04-09 | 2020-04-09 | Technical method for analyzing traditional document structure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111460786A (en) | 2020-07-28 |
Family
ID=71681229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010272203.1A Pending CN111460786A (en) | 2020-04-09 | 2020-04-09 | Technical method for analyzing traditional document structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111460786A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112688925A (en) * | 2020-12-17 | 2021-04-20 | 崔强 | Enterprise storage system state monitoring method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101055581A (en) * | 2006-04-13 | 2007-10-17 | Lg电子株式会社 | Document management system and method |
CN107886309A (en) * | 2017-12-15 | 2018-04-06 | 四川汉科计算机信息技术有限公司 | Document examines instrument automatically |
CN110135264A (en) * | 2019-04-16 | 2019-08-16 | 深圳壹账通智能科技有限公司 | Data entry method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20190286898A1 (en) | | System and method for data extraction and searching |
CN108920675B (en) | | Information processing method and device, computer storage medium and terminal |
US20070237427A1 (en) | | Method and system for simplified recordkeeping including transcription and voting based verification |
WO2006002009A2 (en) | | Document management system with enhanced intelligent document recognition capabilities |
CN111126952A (en) | | Electronic file filing processing system and method |
CN108664635A (en) | | Acquisition methods, device, equipment and the storage medium of statistics of database information |
CN108897862A (en) | | One kind being based on government document picture retrieval method and system |
US20230177267A1 (en) | | Automated classification and interpretation of life science documents |
CN111460786A (en) | | Technical method for analyzing traditional document structure |
US20060210171A1 (en) | | Image processing apparatus |
CN112464907A (en) | | Document processing system and method |
CN112182174A (en) | | Business question-answer knowledge query method and device, computer equipment and storage medium |
KR102496620B1 (en) | | AI-based search function and OCR electronic research note management system |
CN115310423A (en) | | Document multi-mode information extraction and association method |
CN111221777A (en) | | Data record matching method and device |
CN116450717B (en) | | Data integration method and information management system for cross-service modules |
KR20190076302A (en) | | Apparatus for document classification processing using the machine learning and publishing apparatus using the same |
CN117112846B (en) | | Multi-information source license information management method, system and medium |
CN117216015A (en) | | Structured data extraction method and system |
CN110309384B (en) | | Management method for classifying patent files by using dates |
Jahangiri et al. | | Development of Indigenous Indicators of Knowledge Management Evaluation in a Military Center |
Weber | | Extracting retrievable information from archival documents |
Räisänen et al. | | AI Powered Tools for Improving Usability in Digital Archiving |
CN117574857A (en) | | Basic-level data acquisition and management system and method for electronic ink screen equipment |
CN114139526A (en) | | New credit investigation report PDF analysis method, processing and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200728 |