CN111460786A - Technical method for analyzing traditional document structure - Google Patents


Publication number
CN111460786A
Authority
CN
China
Prior art keywords
document
document information
unit
information
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010272203.1A
Other languages
Chinese (zh)
Inventor
李玉峰
吴小虎
凌霄汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seu Intelligece System Co ltd
Original Assignee
Seu Intelligece System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seu Intelligece System Co ltd
Priority to CN202010272203.1A
Publication of CN111460786A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files

Abstract

The invention discloses a technical method for analyzing the structure of a traditional document, comprising a document information input module, a document database module, a document data analysis module and a document data summarization module. The document information input module comprises a document information scanning unit, a document information correcting unit and a document information storage unit; the document database module comprises a document information presetting unit, a document information classifying unit and a document information comparing unit; the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit. The method uses big data comparison techniques to analyze documents quickly, which helps the user obtain the information they need rapidly and makes the method more convenient to use.

Description

Technical method for analyzing traditional document structure
Technical Field
The invention relates to a technical method for analyzing a traditional document structure, belonging to the technical field of information.
Background
Document analysis is a technique for gathering requirements during the requirements elicitation phase of a project. It involves examining existing documents of comparable business processes or systems to extract information relevant to the current project, so document analysis should take the project requirements into account. A business analyst can elicit requirements in many ways, typically using questionnaires, interviews or meetings to draw out the needs of stakeholders. Document analysis is particularly valuable when one or more existing systems are being replaced with new systems that will provide enhanced functionality or a better overall user experience: existing documents can be searched for key functions, business rules, business entities and business entity attributes. Document analysis may also be required when stakeholders are unable to provide insight into existing business processes or systems.
Existing document analysis methods are complex, cannot help the user reach an analysis result quickly, and are inefficient. The present invention proposes a technical improvement on the existing methods to address this situation.
Disclosure of Invention
The invention aims to provide a technical method for analyzing the structure of a traditional document, comprising a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correcting unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classifying unit and a document information comparing unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
Preferably, the document information scanning unit of the document information input module is configured to scan a paper document by laser scanning and convert it into an electronic document format; the document information correcting unit is configured to correct, manually, erroneous information in the electronic document produced by the document information scanning unit; and the document information storage unit is configured to store the entered document information.
Preferably, the document information presetting unit in the document database module is configured to preset, within the unit, documents with different topics and documents with different core words; the document information classifying unit is configured to perform basic classification according to the main idea of the document information; and the document information comparing unit is configured to compare the entered document information with the documents in the original database, find similar documents, and retrieve the information features of those similar documents.
Preferably, the document data analysis module includes a document information data extraction unit, a document information data big data analysis unit, and a document information data integration unit.
Preferably, the document information data extraction unit is configured to extract topics and core words in the document, determine a frequency of occurrence of each core word, and determine the topics of the document data according to a plurality of core words with high frequency of occurrence.
Preferably, the document information data big data analysis unit analyzes the document information using big data.
Preferably, the document information data integration unit is configured to integrate the document information analyzed by each unit.
Preferably, the document information analysis report generation unit is configured to render the data obtained from the document information analysis into a preset table and a corresponding chart-based report, and the document information cloud storage unit is configured to transmit the document information analysis report and the document information to the cloud for storage.
Compared with the prior art, the invention has the following beneficial effects:
The technical method for analyzing the traditional document structure uses big data comparison techniques to analyze documents quickly, which helps the user obtain the information they need rapidly and makes the method more convenient to use. Entering documents by laser scanning greatly improves the efficiency of document entry, making the process more convenient and faster.
Detailed Description
In order to make the technical solutions of the present invention clearer to those skilled in the art, the present invention is further described in detail with reference to the following examples, but the embodiments of the present invention are not limited thereto.
The technical method for analyzing the traditional document structure provided by this embodiment comprises a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correcting unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classifying unit and a document information comparing unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
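The module and unit composition listed above can be sketched as a small data structure. This is only an illustrative layout, not the patented implementation: the `Module` class, the `SYSTEM` list and the `unit_count` helper are hypothetical names, and the units of the document data analysis module are taken from the paragraphs that follow in this description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    """One module of the system and the names of the units it contains."""
    name: str
    units: List[str] = field(default_factory=list)


# Module and unit names follow the description; the layout itself is an assumption.
SYSTEM = [
    Module("document information input",
           ["scanning", "correcting", "storage"]),
    Module("document database",
           ["presetting", "classifying", "comparing"]),
    Module("document data analysis",
           ["data extraction", "big data analysis", "data integration"]),
    Module("document data summarization",
           ["analysis report generation", "cloud storage"]),
]


def unit_count(modules):
    """Total number of units across all modules."""
    return sum(len(m.units) for m in modules)


if __name__ == "__main__":
    for module in SYSTEM:
        print(f"{module.name} module: {', '.join(module.units)}")
```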
The document information scanning unit of the document information input module is used for scanning paper documents by laser scanning and converting them into an electronic document format; the document information correcting unit is used for manually correcting erroneous information in the electronic documents produced by the document information scanning unit; and the document information storage unit is used for storing the entered document information.
The document information presetting unit in the document database module is used for presetting, within the unit, documents with different topics and documents with different core words; the document information classifying unit is used for performing basic classification according to the main idea of the document information; and the document information comparing unit is used for comparing the entered document information with the documents in the original database, finding similar documents, and retrieving the information features of those similar documents.
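The description does not say how the comparing unit measures how close two documents are. As one possible reading, the sketch below compares an entered document against stored documents using cosine similarity over word counts. The function names, the tokenization rule and the `threshold` value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(entered_text, database, threshold=0.3):
    """Return (doc_id, score) pairs scoring at least `threshold`, best first.

    `database` maps a document id to its text; the threshold is an assumed value.
    """
    query = term_counts(entered_text)
    scored = [(doc_id, cosine_similarity(query, term_counts(text)))
              for doc_id, text in database.items()]
    matches = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

A real system would likely need language-specific word segmentation (for example for Chinese text) before counting terms; the regular expression here only handles space-delimited Latin text.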
The document data analysis module comprises a document information data extraction unit, a document information data big data analysis unit and a document information data integration unit.
The document information data extraction unit is used for extracting the topics and the core words in the document, determining the frequency of occurrence of each core word, and determining the topic of the document data from the several core words that occur most frequently.
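The frequency-based step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the description does not define how core words are selected, so the example simply treats every word outside a small, hypothetical stop-word list as a core word, and it represents the topic as the list of the most frequent core words.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the description does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is",
              "for", "on", "by", "with"}


def core_word_frequencies(text):
    """Count how often each candidate core word occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def document_topic(text, top_n=3):
    """Return the top_n most frequent core words as a stand-in for the topic."""
    return [word for word, _ in core_word_frequencies(text).most_common(top_n)]


if __name__ == "__main__":
    sample = ("The contract sets the payment terms. Payment is due on delivery. "
              "The contract covers delivery and payment.")
    print(core_word_frequencies(sample).most_common(3))
    print(document_topic(sample, top_n=1))
```

The `top_n` parameter corresponds to the "several core words" in the description; its default of 3 is arbitrary.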
The document information data big data analysis unit analyzes the document information by using big data.
The document information data integration unit is used for integrating the document information analyzed by each unit.
The document information analysis report generation unit is used for rendering the data obtained from the document information analysis into a preset table and a corresponding chart-based report, and the document information cloud storage unit is used for transmitting the document information analysis report and the document information to the cloud for storage.
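As a rough illustration of the report generation step, the sketch below renders core-word frequencies as a fixed-column table with a text bar standing in for the chart. The table layout, the `render_report` name and the `bar_width` parameter are assumptions; the description does not specify the preset table format or the chart type, and the cloud upload step is not shown.

```python
def render_report(title, frequencies, bar_width=20):
    """Render word frequencies as a table with one text bar per row.

    Rows are ordered by descending count, then alphabetically.
    """
    if not frequencies:
        return f"{title}\n(no data)"
    peak = max(frequencies.values())
    lines = [title, f"{'core word':<15}{'count':>6}  chart"]
    ordered = sorted(frequencies.items(), key=lambda pair: (-pair[1], pair[0]))
    for word, count in ordered:
        # Scale each bar relative to the most frequent word.
        bar = "#" * max(1, round(bar_width * count / peak))
        lines.append(f"{word:<15}{count:>6}  {bar}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_report("Document analysis report",
                        {"payment": 4, "contract": 2, "delivery": 1}))
```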
The above description only illustrates the present invention and is not intended to limit its scope; any substitution or change that a person skilled in the art makes to the technical solution of the present invention and its inventive concept remains within the scope of the present invention.

Claims (8)

1. A technical method for traditional document structure analysis, characterized by comprising a document information input module, a document database module, a document data analysis module and a document data summarization module, wherein:
the document information input module comprises a document information scanning unit, a document information correcting unit and a document information storage unit;
the document database module comprises a document information presetting unit, a document information classifying unit and a document information comparing unit;
the document data summarization module comprises a document information analysis report generation unit and a document information cloud storage unit.
2. The technical method of traditional document structure analysis according to claim 1, wherein: the document information scanning unit of the document information input module is used for scanning paper documents by laser scanning and converting them into an electronic document format; the document information correcting unit is used for manually correcting erroneous information in the electronic documents produced by the document information scanning unit; and the document information storage unit is used for storing the entered document information.
3. The technical method of traditional document structure analysis according to claim 1, wherein: the document information presetting unit in the document database module is used for presetting, within the unit, documents with different topics and documents with different core words; the document information classifying unit is used for performing basic classification according to the main idea of the document information; and the document information comparing unit is used for comparing the entered document information with the documents in the original database, finding similar documents, and retrieving the information features of those similar documents.
4. The technical method of traditional document structure analysis according to claim 1, wherein: the document data analysis module comprises a document information data extraction unit, a document information data big data analysis unit and a document information data integration unit.
5. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data extraction unit is used for extracting the topics and the core words in the document, determining the frequency of occurrence of each core word, and determining the topic of the document data from the several core words that occur most frequently.
6. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data big data analysis unit analyzes the document information by using big data.
7. The technical method of traditional document structure analysis according to claim 4, wherein: the document information data integration unit is used for integrating the document information analyzed by each unit.
8. The technical method of traditional document structure analysis according to claim 1, wherein: the document information analysis report generation unit is used for rendering the data obtained from the document information analysis into a preset table and a corresponding chart-based report, and the document information cloud storage unit is used for transmitting the document information analysis report and the document information to the cloud for storage.
CN202010272203.1A 2020-04-09 2020-04-09 Technical method for analyzing traditional document structure Pending CN111460786A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010272203.1A CN111460786A (en) 2020-04-09 2020-04-09 Technical method for analyzing traditional document structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010272203.1A CN111460786A (en) 2020-04-09 2020-04-09 Technical method for analyzing traditional document structure

Publications (1)

Publication Number Publication Date
CN111460786A (en) 2020-07-28

Family

ID=71681229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010272203.1A Pending CN111460786A (en) 2020-04-09 2020-04-09 Technical method for analyzing traditional document structure

Country Status (1)

Country Link
CN (1) CN111460786A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112688925A (en) * 2020-12-17 2021-04-20 崔强 Enterprise storage system state monitoring method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101055581A (en) * 2006-04-13 2007-10-17 Lg电子株式会社 Document management system and method
CN107886309A (en) * 2017-12-15 2018-04-06 四川汉科计算机信息技术有限公司 Document examines instrument automatically
CN110135264A (en) * 2019-04-16 2019-08-16 深圳壹账通智能科技有限公司 Data entry method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200728