CN114491016A - Automatic document classification and automatic maintenance method - Google Patents

Info

Publication number
CN114491016A
Authority
CN
China
Prior art keywords
classification
data
information
document
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111577605.3A
Other languages
Chinese (zh)
Inventor
江梅
胡煜
牛丽阳
张悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Flight Automatic Control Research Institute of AVIC
Original Assignee
Xian Flight Automatic Control Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Flight Automatic Control Research Institute of AVIC

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Abstract

Compared with traditional document management, this method for automatically classifying and maintaining document data eliminates the time-consuming recording and classification steps involved in building a document database by hand, and reduces the errors that manual recording and classification tend to introduce. Document data can be published through a common data publishing module and a small amount of parameter configuration, which avoids much of the programming work otherwise required for publishing.

Description

Automatic document classification and automatic maintenance method
Technical Field
The invention belongs to the field of document classification, and particularly relates to an automatic document data classification and automatic maintenance method.
Background
Knowledge is a core asset of modern research-oriented enterprises and a source of their continuous innovation and sustainable development, so many enterprises treat knowledge management as a basic management function and give it corresponding attention and standing. Document data is an important carrier of knowledge. It includes material acquired from outside, by purchase or otherwise, as well as the documents and data generated in the enterprise's own research, development, manufacturing and operations. Together these documents constitute the intellectual assets of the enterprise. The management of these intellectual assets has several distinct features:
research-oriented enterprises have a large and varied demand for document management, and as electronic publications gradually replace paper books, managing large numbers of electronic publications in a database has become an established way of working;
meanwhile, enterprises produce a large number of electronic files of their own, and custody of these files is slowly shifting from individuals to a central database. Correct classification is one of the foundations of centralized file management: the classification must remain stable over the long term, and must be recognizable by both people and computers and easy to manage;
traditional electronic document management is inefficient and error-prone. As enterprises develop faster, manual management can no longer keep up, electronic documents need to be put to use quickly, and the electronic document resource service is expected to respond rapidly and accurately.
Document management inevitably involves classifying a large number of documents, and an administrator has to classify each document manually, according to some classification scheme, when recording it. Building the database manually has the following problems. First, to manage documents by class, the grouped documents must be organized in a hierarchy; during manual recording, an administrator without sufficient subject expertise can hardly classify every document correctly, and such errors are hard to detect in practice, which causes problems when the documents are used later. Second, efficiency is low: a skilled document administrator can probably maintain only about ten records per hour, so a manual approach cannot meet an enterprise's rapidly growing demand for electronic document management.
Disclosure of Invention
In view of the problems described in the background, the invention aims to provide a method for automatically classifying and maintaining document data. Document data is collected, classified and organized locally, and uploaded in batches; the documents are then classified automatically, the hierarchy of categories is established automatically, and the data is published after a small amount of website configuration, giving an efficient way to classify and publish documents.
In order to solve the technical problem, the technical scheme of the invention is as follows:
a method for automatically classifying and maintaining document data comprises the following steps: step S1: initialization; step S2: processing and publishing document data in batches; step S3: processing and publishing a single document; step S4: maintaining the classification structure of the document data; step S5: synchronizing the local classification structure with that of the database server;
in step S1, the initialization includes the steps of:
S101: first, a database server is established; the fields of the electronic file registration table in the database server correspond essentially one-to-one to the bibliographic information of the document data, and some fields are reserved for system expansion;
S102: an electronic file classification table is established in the database server for each type of document data; its fields include the classification structure code, the parent classification structure code and the classification name, and the parent-child relationship between classifications is established through the classification structure code and the parent classification structure code; the classification structure code may be a GUID, an incrementing number or a code following a user-defined rule;
S103: the classification structure code in the classification table corresponds to the classification structure code in the electronic file registration table, so the electronic files belonging to a document classification can be filtered by the classification structure code in the registration table;
S104: the electronic file classification table is initialized and the top-level classifications are created; users can freely create their own classifications under a top-level classification;
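
The two tables described in S101 to S104 can be sketched as follows. This is a minimal illustration using SQLite, not the schema of the patent: the table names, column names and sample rows are all assumptions, and only the three classification fields (code, parent code, name) and the shared classification code come from the text.

```python
import sqlite3

# Hypothetical table and column names; the text only names the fields conceptually.
SCHEMA = """
CREATE TABLE file_classification (
    class_code  TEXT PRIMARY KEY,   -- classification structure code (GUID, number, ...)
    parent_code TEXT REFERENCES file_classification(class_code),  -- NULL for a top-level class
    class_name  TEXT NOT NULL
);
CREATE TABLE file_registry (
    file_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,      -- stands in for the bibliographic fields
    class_code  TEXT NOT NULL REFERENCES file_classification(class_code),
    reserved_1  TEXT                -- reserved field for later extension
);
"""


def init_db(conn):
    """Create both tables and seed one top-level and one user classification."""
    conn.executescript(SCHEMA)
    # S104: a top-level classification, then a user-defined one below it.
    conn.execute("INSERT INTO file_classification VALUES ('1', NULL, 'Patents')")
    conn.execute("INSERT INTO file_classification VALUES ('2', '1', 'Flight control')")
    conn.execute(
        "INSERT INTO file_registry (title, class_code) VALUES ('Example patent', '2')"
    )
    conn.commit()


def files_in_class(conn, class_code):
    """S103: filter registered files by their classification structure code."""
    rows = conn.execute(
        "SELECT title FROM file_registry WHERE class_code = ? ORDER BY file_id",
        (class_code,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
init_db(conn)
print(files_in_class(conn, "2"))  # ['Example patent']
```

Storing only a parent code per row keeps the hierarchy open-ended: a new sub-classification is a single insert, and no existing row has to change.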
in step S2, processing and publishing document data in batches includes the following steps:
S201: organizing the document data: the document data is organized on the document resource management client, a classification structure is established, and the electronic files of each classification are stored in a corresponding folder; the folders are named according to the classification structure defined by the user, and in subsequent processing the folder names are automatically converted into classification names in the electronic file classification table, classification structure codes are generated automatically, and the parent-child classification structure is established automatically in the database server according to the folder hierarchy;
S202: preparing the upload: the user selects an organized folder in the document resource management client and prepares to upload it to the location on the file server that corresponds to a classification in the electronic file classification table;
S203: selecting a data category: the data categories are initialized in the system of the document resource management client and can be freely customized; the user selects one data category, which associates the electronic files with that category;
S204: selecting data management options: the user selects whether the electronic files need to be uploaded one by one, and selects the visibility and security level of the files to be uploaded; after the user has selected the classification and the management metadata, the batch operation is executed, which has two parts: uploading the document data to the file server, stored by classification, and saving the classification information of the document data to the electronic file classification table of the database server;
S205: when the electronic files are uploaded, the file server automatically generates the classification data of each document and automatically registers its classification information in the electronic file registration table;
S206: uploading the document data to the file server: according to the selected major category of document data, the program automatically creates a directory structure on the file server that mirrors the file storage structure on the document resource management client, and stores the electronic files in their respective directories;
S207: saving the classification information to the database server: when an electronic file is uploaded, the program automatically reads the management information set by the user for that file together with its classification information, and writes both to the database server;
S208: checking the classification information: after steps S206 and S207 are completed, the classification information of the document data can be reviewed to find errors;
S209: changing the classification: when the classification information of a document is found to be wrong, the document can be reclassified;
S210: WEB parameter configuration, performed before publishing: after the classification information of the document data and its storage location are configured on the database server, a stored procedure for document output is generated, which meets the requirements for publishing the document data;
S211: data publishing: a data publishing page is created that calls the stored procedure formed in step S210, reads the user-set management information of each electronic file and the classification information of the document data that the stored procedure outputs, and displays them to users on the data publishing page through the WEB server, which completes the data publishing function;
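
The conversion from folder hierarchy to classification records described in S201 can be sketched as below. This is only an illustration under stated assumptions: the function name, the record shape `(code, parent_code, name)` and the use of relative folder paths as input are hypothetical, and GUIDs are chosen here as one of the three coding options the text allows.

```python
import uuid
from pathlib import PurePosixPath


def folders_to_classifications(folder_paths, target_code):
    """Turn relative folder paths into (code, parent_code, name) records.

    folder_paths: iterable of relative paths such as "Papers/2021".
    target_code:  code of the existing classification the tree is placed under.
    """
    all_folders = set()
    for raw in folder_paths:
        path = PurePosixPath(raw)
        # Include every ancestor so intermediate folders also become classifications.
        for depth in range(1, len(path.parts) + 1):
            all_folders.add(PurePosixPath(*path.parts[:depth]))

    codes = {}    # folder path -> generated classification code
    records = []
    # Shallow folders first, so a parent always has a code before its children.
    for folder in sorted(all_folders, key=lambda p: (len(p.parts), p.parts)):
        code = str(uuid.uuid4())
        codes[folder] = code
        parent = codes[folder.parent] if len(folder.parts) > 1 else target_code
        records.append((code, parent, folder.name))
    return records


records = folders_to_classifications(["Papers/2021", "Papers/2022", "Reports"], "TOP")
for code, parent, name in records:
    print(name, "->", "TOP" if parent == "TOP" else "child of another folder")
```

Each folder name becomes a classification name and each folder's parent becomes its parent classification, which is the mapping the batch upload relies on.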
in step S5, synchronizing the local and database server classification structures includes the following steps:
S51: reading the local directories: the management system of the document data management client is started; it first retrieves, based on the currently logged-in user, the local directories from which data has already been uploaded, then traverses those directories on the client and temporarily stores first information in the system, the first information comprising the folder structure of the client and the list of files in each folder;
S52: reading the directories of the database server: based on the currently logged-in user, the system reads the top-level classifications of the document data that the user is responsible for, reads the classification structure level by level starting from each top-level classification, traverses it, reads the document information under each classification, and temporarily stores second information in the system, the second information comprising the classification structure and the file lists;
S53: comparing the temporarily stored first and second information: checking whether the local classification structure on the document data management client contains classifications that do not exist on the data server, and whether existing local classifications contain documents that do not exist on the data server;
S54: synchronizing classifications: classifications that are not in the database server are created on the file server;
S55: synchronizing the document data: the file list of each folder on the document data management client is compared, one by one, with the file list under the corresponding classification on the data server; documents that are not on the file server are uploaded, and their metadata and related information are recorded in the database server; the classification structure and the documents under each classification are processed in a loop;
S56: after the classifications have been created and the documents uploaded, the WEB server automatically publishes the classification structure and the documents under it online.
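
The comparison in S53 to S55 amounts to a set difference between the two temporarily stored snapshots. The sketch below assumes, purely for illustration, that each snapshot is a dictionary mapping a classification path to the set of file names under it; the function name and data shapes do not come from the patent.

```python
def plan_sync(local, server):
    """Compare two snapshots and return what the server is missing.

    local, server: dict mapping a classification path (str) to the set of
    file names stored under it. Both shapes are assumptions for this sketch.
    """
    # S53/S54: classifications present locally but unknown to the server.
    new_classes = sorted(set(local) - set(server))
    # S53/S55: files present locally but missing under the same classification.
    uploads = {}
    for path, files in local.items():
        missing = sorted(files - server.get(path, set()))
        if missing:
            uploads[path] = missing
    return new_classes, uploads


local = {"Papers": {"a.pdf", "b.pdf"}, "Papers/2021": {"c.pdf"}}
server = {"Papers": {"a.pdf"}}
new_classes, uploads = plan_sync(local, server)
print(new_classes)  # ['Papers/2021']
print(uploads)      # {'Papers': ['b.pdf'], 'Papers/2021': ['c.pdf']}
```

Files with the same name on both sides are not covered here; they need the version check by modification time described for step S55.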
Further, the data categories include: patents, academic reports, achievements, papers, and journal documents.
Further, the step S203 specifically includes:
when the system in the database server runs for the first time, it reads the top-level classifications of the electronic file classification table and creates top-level folders on the file server according to the classification names;
the user can create their own classifications under a top-level folder, which creates the corresponding folders on the file server;
the user selects a classification and can upload electronic files to it in batches; the system in the document resource management client recursively reads the client's folders, creates the same folders on the file server, and at the same time creates the corresponding classifications in the electronic file classification table.
Further, in step S3, processing and publishing a single document includes the following steps:
S31: classifying the document locally: the classification of the single document to be processed is determined by reference to the locally managed document classification structure in the database server;
S32: uploading: the single document is uploaded to the file server;
S33: selecting a classification node: according to the classification determined for the document, the corresponding classification is selected in the system of the document management client;
S34: after the classification node has been selected, the document is first saved to the corresponding location on the file server, and its metadata is then extracted and saved to the database server.
Further, in step S4, maintaining the classification structure of the document data includes the following steps:
S41: classifying locally: folders that identify the classification structure are created on the document data management client;
S42: uploading the classification structure: a target classification is selected, that is, the classification under which the uploaded structure will sit, and the folders that identify the classification structure are then uploaded;
S43: the system of the document data management client creates classification records in the database server according to the structure and names of the uploaded folders, ensuring that the hierarchy of the new classifications matches the local folder structure uploaded by the user and sits under the selected target classification;
S44: adjusting the classification structure: all sub-classifications of the same top-level classification in the database server are adjusted, so that the classification data of the same top-level classification is stored together in the database server.
Furthermore, the folders that identify the classification structure may be multi-level folders with a hierarchy or a single folder; the name of each folder represents one file classification, and the folders need not contain any files.
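
Because every classification record carries its parent code, all sub-classifications under one top-level classification can be collected with a recursive query. The sketch below shows one way to do this with SQLite's `WITH RECURSIVE`; the table name, column names and sample rows are assumptions for illustration, and the text does not say how the adjustment in S44 is actually implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout: code, parent code, name (the three fields from S102).
conn.execute(
    "CREATE TABLE file_classification ("
    "class_code TEXT PRIMARY KEY, parent_code TEXT, class_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO file_classification VALUES (?, ?, ?)",
    [
        ("P", None, "Papers"),
        ("P1", "P", "Control theory"),
        ("P2", "P1", "Adaptive control"),
        ("R", None, "Academic reports"),
    ],
)


def subtree(conn, top_code):
    """Return (name, depth) for a classification and all of its descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(class_code, class_name, depth) AS (
            SELECT class_code, class_name, 0
            FROM file_classification WHERE class_code = ?
            UNION ALL
            SELECT c.class_code, c.class_name, t.depth + 1
            FROM file_classification AS c
            JOIN tree AS t ON c.parent_code = t.class_code
        )
        SELECT class_name, depth FROM tree ORDER BY depth, class_name
        """,
        (top_code,),
    )
    return rows.fetchall()


print(subtree(conn, "P"))
# [('Papers', 0), ('Control theory', 1), ('Adaptive control', 2)]
```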
Further, if the local classification structure contains a classification that does not exist on the data server, it is treated as a newly added classification: it is created on the database server, and the new classification and its hierarchical relationship are written to the database server.
Further, the step S55 specifically includes:
checking whether a new file exists in a directory of the document data management client;
if a file on the document data management client matches a file on the database server, the versions of the two files are compared on the basis of modification time: if the modification times are the same, the files are regarded as the same file; if the local file's modification time is newer than that of the file on the database server, the versions are regarded as inconsistent and version control is required;
uploading the file in the document data management client to a file server, updating the version metadata of the file in a database server, and recording version change history information.
Further, if there is a new file in a folder of the document management client, the steps S206 and S207 are performed.
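
The per-file decision described for step S55 can be summarized as a small function. This is a sketch under assumptions: the names `SyncAction` and `decide_action` are hypothetical, and the case where the local copy is older than the server copy is not covered by the text, so it is treated here as requiring no action.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    UPLOAD_NEW = "upload as a new file (steps S206 and S207)"
    SAME_FILE = "same file, nothing to do"
    NEW_VERSION = "upload and record a new version"


def decide_action(local_mtime: datetime, server_mtime: Optional[datetime]) -> SyncAction:
    """Decide what to do with one local file.

    server_mtime is None when no file of that name exists on the server.
    """
    if server_mtime is None:
        return SyncAction.UPLOAD_NEW
    if local_mtime > server_mtime:
        # Local copy is newer: versions differ, so version control applies.
        return SyncAction.NEW_VERSION
    # Equal times mean the same file. An older local copy is not covered by
    # the text; treating it as "nothing to do" is an assumption of this sketch.
    return SyncAction.SAME_FILE


print(decide_action(datetime(2021, 12, 2), datetime(2021, 12, 1)).name)  # NEW_VERSION
```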
Further, the database server is an electronic document management database.
Compared with the prior art, the invention has the advantages that:
compared with the traditional method, this method for automatically classifying and maintaining document data eliminates the time-consuming recording and classification steps involved in building a document database by hand, and reduces the errors that manual recording and classification tend to introduce. Document data can be published through a common data publishing module and a small amount of parameter configuration, which avoids much of the programming work otherwise required for publishing.
Drawings
FIG. 1 is a block diagram of a server for automatic document classification and automatic maintenance according to the present invention;
FIG. 2 is a process of batch document classification according to an automatic document classification and automatic maintenance method of the present invention;
FIG. 3 is a single document management flow of an automatic document classification and automatic maintenance method of the present invention;
FIG. 4 is a document classification management flow of an automatic document classification and automatic maintenance method according to the present invention;
FIG. 5 is a local and server classification structure synchronization process of an automatic document classification and automatic maintenance method according to the present invention.
Detailed Description
The following describes embodiments of the present invention with reference to examples:
it should be noted that the structures, proportions, sizes and other elements shown in the specification are provided only to aid understanding and reading and are not intended to limit the scope of the invention, which is defined by the claims; any modification of structure, change of proportion or adjustment of size that does not affect the efficacy or purpose of the invention still falls within that scope.
In addition, the terms "upper", "lower", "left", "right", "middle" and "one" used in the present specification are for clarity of description only and are not intended to limit the scope of the invention; changes or adjustments to the relative relationships they describe, without substantive technical change, are likewise regarded as within the scope of the invention.
The first embodiment is as follows:
as shown in FIG. 1, a method for automatically classifying and maintaining document data comprises the following steps: step S1: initialization; step S2: processing and publishing document data in batches; step S3: processing and publishing a single document; step S4: maintaining the classification structure of the document data; step S5: synchronizing the local classification structure with that of the database server;
in step S1, the initialization includes the steps of:
S101: first, a database server is established; the fields of the electronic file registration table in the database server correspond essentially one-to-one to the bibliographic information of the document data, and some fields are reserved for system expansion;
S102: an electronic file classification table is established in the database server for each type of document data; its fields include the classification structure code, the parent classification structure code and the classification name, and the parent-child relationship between classifications is established through the classification structure code and the parent classification structure code; the classification structure code may be a GUID, an incrementing number or a code following a user-defined rule;
S103: the classification structure code in the classification table corresponds to the classification structure code in the electronic file registration table, so the electronic files belonging to a document classification can be filtered by the classification structure code in the registration table;
S104: the electronic file classification table is initialized and the top-level classifications are created; users can freely create their own classifications under a top-level classification;
in step S2, processing and publishing document data in batches includes the following steps:
S201: organizing the document data: the document data is organized on the document resource management client, a classification structure is established, and the electronic files of each classification are stored in a corresponding folder; the folders are named according to the classification structure defined by the user, and in subsequent processing the folder names are automatically converted into classification names in the electronic file classification table, classification structure codes are generated automatically, and the parent-child classification structure is established automatically in the database server according to the folder hierarchy;
S202: preparing the upload: the user selects an organized folder in the document resource management client and prepares to upload it to the location on the file server that corresponds to a classification in the electronic file classification table;
S203: selecting a data category: the data categories are initialized in the system of the document resource management client and can be freely customized; the user selects one data category, which associates the electronic files with that category;
S204: selecting data management options: the user selects whether the electronic files need to be uploaded one by one, and selects the visibility and security level of the files to be uploaded; after the user has selected the classification and the management metadata, the batch operation is executed, which has two parts: uploading the document data to the file server, stored by classification, and saving the classification information of the document data to the electronic file classification table of the database server;
S205: when the electronic files are uploaded, the file server automatically generates the classification data of each document and automatically registers its classification information in the electronic file registration table;
S206: uploading the document data to the file server: according to the selected document data category, the program automatically creates a directory structure on the file server that mirrors the file storage structure on the document resource management client, and stores the electronic files in their respective directories;
S207: saving the classification information to the database server: when an electronic file is uploaded, the program automatically reads the management information set by the user for that file together with its classification information, and writes both to the database server;
S208: checking the classification information: after steps S206 and S207 are completed, the classification information of the document data can be reviewed to find errors;
S209: changing the classification: when the classification information of a document is found to be wrong, the document can be reclassified;
S210: WEB parameter configuration, performed before publishing: after the classification information of the document data and its storage location are configured on the database server, a stored procedure for document output is generated, which meets the requirements for publishing the document data;
S211: data publishing: a data publishing page is created that calls the stored procedure formed in step S210, reads the user-set management information of each electronic file and the classification information of the document data that the stored procedure outputs, and displays them to users on the data publishing page through the WEB server, which completes the data publishing function;
in step S5, synchronizing the local and database server classification structures includes the following steps:
S51: reading the local directories: the management system of the document data management client is started; it first retrieves, based on the currently logged-in user, the local directories from which data has already been uploaded, then traverses those directories on the client and temporarily stores first information in the system, the first information comprising the folder structure of the client and the list of files in each folder;
S52: reading the directories of the database server: based on the currently logged-in user, the system reads the top-level classifications of the document data that the user is responsible for, reads the classification structure level by level starting from each top-level classification, traverses it, reads the document information under each classification, and temporarily stores second information in the system, the second information comprising the classification structure and the file lists;
S53: comparing the temporarily stored first and second information: checking whether the local classification structure on the document data management client contains classifications that do not exist on the data server, and whether existing local classifications contain documents that do not exist on the data server;
S54: synchronizing classifications: classifications that are not in the database server are created on the file server;
S55: synchronizing the document data: the file list of each folder on the document data management client is compared, one by one, with the file list under the corresponding classification on the data server; documents that are not on the file server are uploaded, and their metadata and related information are recorded in the database server; the classification structure and the documents under each classification are processed in a loop;
S56: after the classifications have been created and the documents uploaded, the WEB server automatically publishes the classification structure and the documents under it online.
Further, the data categories include: patents, academic reports, achievements, papers, and journal documents.
Further, the step S203 specifically includes:
when the system in the database server runs for the first time, it reads the top-level classifications of the electronic file classification table and creates top-level folders on the file server according to the classification names;
the user can create their own classifications under a top-level folder, which creates the corresponding folders on the file server;
the user selects a classification and can upload electronic files to it in batches; the system in the document resource management client recursively reads the client's folders, creates the same folders on the file server, and at the same time creates the corresponding classifications in the electronic file classification table.
Further, in the step S3, the processing and issuing of the single document data includes the following steps:
s31: local literature data classification, namely determining the classification of a single literature data to be processed by contrasting a literature data classification structure locally managed in a database server;
s32: uploading the document data, and uploading the single document data to a file server;
s33: selecting a classification node, and selecting a corresponding classification in a system in the document management client for processing according to the classification of the single document to be processed;
s34: and after the classification node selection of the document data is completed, the document data is firstly stored to a corresponding position on the file server, and then the metadata of the document data is extracted and stored to the database server.
Further, in the step S4, the maintaining the classification structure of the document data includes the following steps:
s41: local literature classification, namely establishing a folder for identifying a classification structure on a literature data management client;
s42: uploading the classification structure, namely selecting a target classification, namely selecting which classification structure sub-level the uploaded classification structure belongs to, and then uploading a built folder for identifying the classification structure;
s43: the system of the document data management client establishes a classification record in a database server according to the uploaded structure of the folder of the identification classification structure and the name of the folder of the identification classification structure; ensuring that the hierarchical relation among the classifications is consistent with the structure of a local folder uploaded by a user and is positioned under a target classification for selection;
s44: adjusting the classification structure, namely adjusting all the sub-classifications of the same top-level classification in the database server, ensuring that the classification data of the same top-level classification is stored centrally in the database server.
Furthermore, the folders with the identification classification structures can be multi-level folders with hierarchical relationships or single folders, the name of each folder with the identification classification structures represents a file classification, and the folder with the identification classification structures does not need to have files.
Further, if the classification which is not existed on the data server exists in the local classification structure, the classification is newly added; and creating a new added classification on the database server, and writing the new added classification and hierarchical relation data into the database server.
Further, the step S55 specifically includes:
checking whether a new file exists in a directory of the document data management client;
if a file in the document data management client is consistent with a file on the database server, the version information of the two files is compared on the basis of file modification time: if the modification times are the same, the two are regarded as the same file; if the modification time of the local file is newer than that of the file on the database server, the versions are regarded as inconsistent and version control is required;
uploading the file in the document data management client to a file server, updating the version metadata of the file in a database server, and recording version change history information.
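The modification-time rule above can be sketched as a small decision function. This is an illustrative Python sketch under stated assumptions: the names `SyncAction` and `decide_action` are hypothetical, modification times are taken as POSIX timestamps, and the case in which the local file is older than the server copy is not covered by the text, so it is reported as unspecified rather than guessed.

```python
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    UPLOAD_NEW = "upload_new"      # file exists only on the client
    SAME = "same"                  # identical modification time: same file
    NEW_VERSION = "new_version"    # local copy is newer: version control needed
    UNSPECIFIED = "unspecified"    # local copy is older: not covered by the text


def decide_action(local_mtime: float, server_mtime: Optional[float]) -> SyncAction:
    """Apply the modification-time comparison described for step S55."""
    if server_mtime is None:           # no matching file on the server
        return SyncAction.UPLOAD_NEW
    if local_mtime == server_mtime:
        return SyncAction.SAME
    if local_mtime > server_mtime:
        return SyncAction.NEW_VERSION
    return SyncAction.UNSPECIFIED


print(decide_action(200.0, 100.0).value)
```

For `NEW_VERSION`, the described follow-up is to upload the file, update its version metadata, and record the version change history.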
Further, if there is a new file in a folder of the document management client, the steps S206 and S207 are performed.
Further, the database server is an electronic document management database.
Example two:
as shown in figure 1 of the drawings, in which,
1. The document resource management client refers to the local work computer of a document data manager. After collecting the document data, the manager arranges it on the local machine, which mainly involves local classification of the document data and normalized renaming of the files.
2. The processing server is the target server to which the document data manager uploads the arranged data; after the document data is uploaded, classified uploading of the documents and storage of their classification information are executed on this server. The file server and the processing server can be the same machine.
3. The database server is mainly used for storing the metadata and classification structures of the document data, and also hosts the data output function module of the WEB service and the like.
4. The document server stores the full text of the documents, and the document classification on this server should be consistent with the local document classification of the document administrator. For security reasons, the data of different administrators on the server is not mutually accessible, unless access is granted or multiple administrators are jointly responsible for the same classification.
5. The WEB server is a server that provides network access; it publishes the data uploaded by the document data manager and makes it available to clients.
On the system architecture, the software system of the invention mainly realizes the initialization and four functions of the system, namely batch document data fast uploading and publishing, single document data uploading and publishing, document data classification structure maintenance and local/server information synchronization, and the functional flow of each functional module is described as follows:
The method for automatic document classification and automatic maintenance comprises the following steps: step S1: initialization; step S2: processing and publishing batch document data; step S3: processing and publishing single document data; step S4: maintaining the classification structure of the literature data; step S5: synchronizing the local and database server classification structures;
in step S1, the initialization includes the steps of:
s101: first, a database server is established, wherein the fields of the electronic file registration table in the database server correspond essentially one-to-one to the bibliographic information of the literature data files, and some reserved fields are set aside for system expansion; the literature data refers to electronic literature data;
s102: establishing an electronic file classification table aiming at each type of the document data in a database server, wherein fields of the electronic file classification table comprise classification structure codes, father classification structure codes and classification names, establishing classified father-son relations through the classification structure codes and the father classification structure codes, and further expanding the classification structure trees; the classification structure codes adopt GUIDs or incremental numbers or self-defined coding rules;
s103: the classification structure codes correspond to the classification structure codes in the electronic document registration table, and the electronic documents corresponding to the document data classification can be screened through the classification structure codes in the electronic document registration table;
s104: initializing the electronic file classification table, establishing top classification, and freely establishing the classification of a user under the top classification;
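The two tables set up in steps S101 to S104 can be illustrated with an in-memory SQLite database. This is a minimal sketch under assumptions, not the schema of the invention: the table and column names (`classification`, `registration`, `reserved1`, and so on) are hypothetical, a single `title` column stands in for the bibliographic fields, and GUIDs are chosen from the three coding options the text allows.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classification (
    code        TEXT PRIMARY KEY,                      -- classification structure code
    parent_code TEXT REFERENCES classification(code),  -- NULL for a top-level class
    name        TEXT NOT NULL
);
CREATE TABLE registration (
    id                  INTEGER PRIMARY KEY,
    title               TEXT NOT NULL,                 -- stands in for bibliographic fields
    classification_code TEXT NOT NULL REFERENCES classification(code),
    reserved1           TEXT                           -- reserved field for later extension
);
""")


def add_class(name, parent_code=None):
    """Insert one classification and return its generated GUID code."""
    code = str(uuid.uuid4())
    conn.execute("INSERT INTO classification VALUES (?, ?, ?)", (code, parent_code, name))
    return code


top = add_class("papers")                 # S104: top-level classification
child = add_class("flight control", top)  # user-defined classification under it
conn.execute(
    "INSERT INTO registration (title, classification_code) VALUES (?, ?)",
    ("Sample paper", child),
)

# S103: screen documents by classification structure code
titles = [r[0] for r in conn.execute(
    "SELECT title FROM registration WHERE classification_code = ?", (child,))]

# S102: expand the classification tree from the top class via parent codes
tree = [r[0] for r in conn.execute("""
    WITH RECURSIVE sub(code, name) AS (
        SELECT code, name FROM classification WHERE code = ?
        UNION ALL
        SELECT c.code, c.name FROM classification c JOIN sub ON c.parent_code = sub.code
    )
    SELECT name FROM sub""", (top,))]
print(titles, tree)
```

Storing only the code and the parent code per row keeps the tree depth unrestricted; the recursive query is one way to expand it on demand.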
in step S2, batch document management is complemented by the synchronization of local and server information, which implements the automatic maintenance of the documents and their classification; apart from batch document management itself, it is the most important way of managing batch files and classification structures, and it extends the batch document management function. Typically, after performing batch document management, the document manager gradually adds documents to or deletes documents from the locally processed folders, or adds, deletes, or renames folders to restructure the classification. When such changes occur locally, the classification structure on the file server and the bibliographic information of the documents must be updated accordingly, which is the purpose of this function module.
As shown in FIG. 2, the processing and publishing of the batch of document data includes the following steps:
s201: document data arrangement, wherein a document data manager arranges the document data at a document resource management client, establishes a classification structure of the document data, and stores electronic files classified correspondingly in a folder mode; the names of the folders are named according to a classification structure defined by a user, the names of the folders are automatically converted into all classification names in the electronic file classification table in subsequent processing, the classification structure coding system is automatically generated, and a parent-child classification structure is automatically established in a database server according to the hierarchical relationship of the folders;
s202: uploading a user file, wherein the user uploads the file arranged by the document resource management client to a file server;
s203: selecting a data category, wherein the data category is initialized in a system of a document resource management client, the data category can be freely customized, and a user selects one of the data categories and can associate the electronic file with the category;
s204: selecting data management options, wherein a document manager can select whether the electronic files need to be uploaded, and the visibility and the confidentiality grade of the uploaded electronic files one by one; after the user selects the classification and management metadata, executing batch operation of the literature data, wherein the batch operation comprises two parts, namely, classifying, uploading and storing the literature data to a file server, and storing classification information of the literature data to a database server;
s205: when uploading the electronic file, the file server can automatically generate classification data of each document, and automatically register classification information of the document in the electronic file registration list;
s206: uploading the document data to a file server, automatically establishing a directory structure on the file server by a program according to a file storage structure of a document resource management client according to the category of the document data selected by an administrator, and storing the electronic files in respective directories;
s207: storing the classification information of the document data to a database server, automatically reading the management information set by the user of each electronic file and the classification information of the electronic file by a program when the electronic file is uploaded, and simultaneously writing the management information set by the user of each electronic file and the classification information of the document data into the database server;
s208: the classification information is checked, after the steps S206 and S207 are completed, the classification information of the document data can be checked, and errors can be found;
s209: the data classification changes, when the document manager finds that the classification information of the document is wrong, the document manager can reclassify the document;
s210: WEB parameter configuration, namely the operation performed before document publication, wherein a system administrator, after adjusting the classification information of the document data and its storage location information on the database server, creates a stored procedure for document output, thereby meeting the publication requirement of the document data;
s211: data publishing, wherein the system administrator establishes a data publishing page, invokes the stored procedure created in the preceding step, reads the user-set management information of each electronic file and the classification information of the document data output by the stored procedure, and displays them to users on the data publishing page through the WEB server, which completes the data publishing function;
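The automatic conversion of folder names into classification records described in step S201 can be sketched as a directory walk. The following Python sketch is illustrative only: the function name `folders_to_records` is hypothetical, GUIDs are assumed as classification structure codes, and the records are returned as tuples instead of being written to a database server.

```python
import os
import tempfile
import uuid


def folders_to_records(root):
    """Walk `root` and return one (code, parent_code, name) record per folder."""
    codes = {}    # folder path -> generated classification structure code
    records = []
    for dirpath, dirnames, _files in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        code = str(uuid.uuid4())
        codes[dirpath] = code
        parent = codes.get(os.path.dirname(dirpath))  # None for the root folder
        records.append((code, parent, os.path.basename(dirpath)))
    return records


# Build a small hypothetical folder tree and convert it
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "papers")
    os.makedirs(os.path.join(root, "control", "autopilot"))
    os.makedirs(os.path.join(root, "navigation"))
    records = folders_to_records(root)

names = {code: name for code, _parent, name in records}
for code, parent, name in records:
    print(name, "<-", names.get(parent))
```

Because the walk is top-down, a parent folder always receives its code before its children are visited, so the parent-child relation follows directly from the folder hierarchy. Empty folders produce records as well.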
in step S5, as shown in fig. 5, the step of synchronizing the local and database server classification structures includes the following steps:
s51: reading the local directory, wherein the literature data manager starts the literature data management client; based on the currently logged-in user's information, the system of the literature data management client first retrieves the local directory information from which data has previously been uploaded, traverses the local directory on the literature data management client according to that information, and temporarily stores first information in the system, wherein the first information comprises the folder structure information of the literature data management client and the file list information contained under each folder;
s52: reading a directory of a database server, wherein the system reads top-level classification of document data which is responsible for a user in the system according to information of the current login user, reads classification structure information successively according to the top-level classification, traverses the classification structure information, reads document data information under the classification structure information, and temporarily stores second information in the system, wherein the second information comprises a classification structure and file list information;
s53: comparing the first information and the second information which are temporarily stored, checking whether the classification which is not existed in the data server exists in a local classification structure in the document data management client, and simultaneously checking whether the existing local classification contains the document data which is not existed in the data server;
s54: the classification synchronization is carried out, and classifications which are not in a data server are created in a file server;
s55: and synchronizing the literature data, namely comparing a file list in each folder of the literature data management client with a file list under a classification structure on the data server one by one, uploading the literature which is not in the file server, and recording information such as metadata of the uploaded literature in the database server. Circularly processing the classification structure information and the documents under the classification structure information;
s56: after the classifications have been created and the documents uploaded, the WEB server automatically publishes the classification structure information and the document information under it online.
Further, the data categories include: patents, academic reports, achievements, papers, and journal documents.
Further, the step S203 specifically includes:
when a system in the database server operates for the first time, reading the top class classification of the electronic file classification table, and establishing a top class folder on the file server according to the classification name;
the user can establish own classification under the top folder, so that a corresponding directory is established on the file server;
the user selects a certain classification, electronic files can be uploaded to the classification in batch, a system in the document resource management client recursively reads folders of the document resource management client, a directory is established on a file server, and the classification is established in an electronic file classification table.
Further, in the step S3, as shown in fig. 3, the uploading process of the single document is a supplement to the batch document process described in fig. 2, and is suitable for daily data collection and real-time uploading process, and the processing and publishing of the single document includes the following steps:
s31: local literature data classification, wherein the literature data manager first determines the classification of the single literature data item to be processed by reference to the locally managed literature data classification structure in the database server;
s32: uploading the document data, and uploading the single document data to a file server;
s33: selecting a classification node, and selecting a corresponding classification in a system in the document management client for processing according to the classification of the single document to be processed;
s34: and after the classification node selection of the literature data is completed, the literature data is firstly stored to a corresponding position on the file server, and then the metadata is extracted and stored to the database server.
This process calls the same function modules as steps S206 and S207 in batch processing, which ensures that the classification information is consistent; the stored procedure for data output does not need to be changed again at publication, and the data is published directly.
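The order of operations in step S34, storing the file first and recording its metadata second, can be sketched as follows. This is a hedged Python illustration: the function name `store_then_register`, the metadata fields, and the use of a plain list in place of the database server are all assumptions, and a local directory stands in for the file server.

```python
import os
import shutil
import tempfile


def store_then_register(src, storage_root, class_path, registry):
    """Store the file under its classification, then record its metadata."""
    target_dir = os.path.join(storage_root, class_path)
    os.makedirs(target_dir, exist_ok=True)
    target = shutil.copy2(src, target_dir)  # copy2 also keeps the modification time
    stat = os.stat(target)
    record = {
        "name": os.path.basename(target),
        "classification": class_path,
        "size": stat.st_size,
        "modified": stat.st_mtime,
    }
    registry.append(record)  # stands in for the write to the database server
    return record


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "report.txt")
    with open(src, "w", encoding="utf-8") as fh:
        fh.write("sample")
    registry = []
    record = store_then_register(src, os.path.join(tmp, "server"), "academic reports", registry)
    stored = os.path.exists(os.path.join(tmp, "server", "academic reports", "report.txt"))

print(record["name"], record["size"], stored)
```

Extracting the metadata from the stored copy instead of the source means a failed copy never leaves a metadata record without a file behind it.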
Further, in step S4, as shown in fig. 4, the functional module implements a maintenance function for classifying the document, and the document administrator implements management of the classification structure on the server through the functional module, and the maintenance of the classification structure of the document includes the following steps:
s41: local literature classification, wherein a literature manager establishes a folder for identifying a classification structure on a literature management client;
s42: uploading a classification structure, wherein a document management administrator firstly selects a target classification, namely, the uploaded classification structure is under which existing classification structure, and then uploads a built folder for identifying the classification structure;
s43: the system of the document data management client establishes a classification record in a database server according to the uploaded structure of the folder of the identification classification structure and the name of the folder of the identification classification structure; ensuring that the hierarchical relation among the classifications is consistent with the structure of a local folder uploaded by a user and is positioned under a target classification for selection;
s44: adjusting the classification structure, namely adjusting all the sub-classifications of the same top-level classification in the database server to speed up data access, ensuring that the classification data of the same top-level classification is stored centrally in the database server.
Furthermore, the folders with the identification classification structures can be multi-level folders with hierarchical relationships or single folders, the name of each folder with the identification classification structures represents the classification of one file, and the folder with the identification classification structures does not need to have files.
Further, if the classification which is not existed on the data server exists in the local classification structure, the classification is newly added; and creating a new added classification on the database server, and writing the new added classification and hierarchical relation data into the database server.
Further, the step S55 specifically includes:
checking whether a new file exists in a directory of the document data management client;
if a file in the document data management client is consistent with a file on the database server, the version information of the two files is compared on the basis of file modification time: if the modification times are the same, the two are regarded as the same file; if the modification time of the local file is newer than that of the file on the database server, the versions are regarded as inconsistent and version control is required;
uploading the file in the document data management client to a file server, updating the version metadata of the file in a database server, and recording version change history information.
Further, if there is a new file in a folder of the document management client, the steps S206 and S207 are performed.
Further, the database server is an electronic document management database.
While the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Many other changes and modifications can be made without departing from the spirit and scope of the invention. It is to be understood that the invention is not to be limited to the specific embodiments, but only by the scope of the appended claims.

Claims (10)

1. A method for automatically classifying and maintaining literature data is characterized by comprising the following steps: step S1: initializing; step S2: processing and publishing the batch document data; step S3: processing and publishing single document data; step S4: maintaining the classification structure of the literature data; s5: synchronizing the local and database server classification structures;
in step S1, the initialization includes the steps of:
s101: firstly, establishing a database server, wherein fields contained in an electronic file registration table in the database server are basically in one-to-one correspondence with file bibliographic information of the literature data, and certain reserved fields are reserved for system expansion;
s102: establishing an electronic file classification table aiming at each type of the document data in a database server, wherein fields of the electronic file classification table comprise classification structure codes, father classification structure codes and classification names, and establishing classified parent-child relations through the classification structure codes and the father classification structure codes; the classification structure codes adopt GUIDs or incremental numbers or self-defined coding rules;
s103: the classification structure codes correspond to the classification structure codes in the electronic document registration table, and the electronic documents corresponding to the document data classification can be screened through the classification structure codes in the electronic document registration table;
s104: initializing the electronic file classification table, establishing top classification, and freely establishing the classification of a user under the top classification;
in step S2, the processing and publishing of the batch of document data includes the following steps:
s201: document data arrangement, namely arranging the document data at a document resource management client, establishing a classification structure of the document data, and storing electronic files classified correspondingly in a folder mode; the names of the folders are named according to a classification structure defined by a user, the names of the folders are automatically converted into all classification names in the electronic file classification table in subsequent processing, classification structure codes are automatically generated, and parent-child classification structures are automatically established in a database server according to the hierarchical relation of the folders;
s202: the method comprises the following steps that user file uploading preparation is carried out, a user selects a well-arranged folder in a document resource management client, and the folder is prepared to be uploaded to a file server corresponding to classification in an electronic file classification table;
s203: selecting a data category, wherein the data category is initialized in a system of a document resource management client, the data category can be freely customized, and a user selects one of the data categories and can associate the electronic file with the category;
s204: selecting data management options, selecting whether the electronic files need to be uploaded one by one, and selecting the visibility and the security level of the electronic files to be uploaded; after the user selects the classification and management metadata, executing the batch operation of the literature data, wherein the batch operation comprises two parts, namely, uploading and storing the classification of the literature data to a file server, and storing the classification information of the literature data to an electronic file classification table of a database server;
s205: when uploading the electronic file, the file server automatically generates classification data of each document material and automatically registers classification information of the document material in an electronic file registration table;
s206: uploading the document data to a file server, automatically establishing a directory structure on the file server by a program according to the file storage structure of the document resource management client by the program according to the selected document data category, and storing the electronic files in respective directories;
s207: storing the classification information of the document data to a database server, automatically reading the management information set by the user of each electronic file and the classification information of the electronic file by a program when the electronic file is uploaded, and simultaneously writing the management information set by the user of each electronic file and the classification information of the document data into the database server;
s208: the classification information is checked, after the steps S206 and S207 are completed, the classification information of the document data can be checked, and errors can be found;
s209: the data classification changes, when the classification information of the document data is found to be wrong, the document data can be reclassified;
s210: WEB parameter configuration, namely the operation performed before document publication, wherein, after the classification information of the document data and its storage location information on the database server are configured, a stored procedure for document output is created, thereby meeting the publication requirement of the document data;
s211: data publishing, namely establishing a data publishing page, invoking the stored procedure created in the preceding step, reading the user-set management information of each electronic file and the classification information of the document data output by the stored procedure, and displaying them to users on the data publishing page through a WEB server, which completes the data publishing function;
in step S5, the step of synchronizing the local and database server classification structures includes the following steps:
s51: reading a local directory, starting a management system of a document data management client, firstly retrieving local directory information which has uploaded data in the system according to current login user information, traversing the local directory on the document data management client according to the local directory information, and temporarily storing first information in the system, wherein the first information comprises folder structure information of the document data management client and file list information contained under each folder;
s52: reading a directory of a database server, wherein the system reads top-level classification of document data which is responsible for a user in the system according to information of the current login user, reads classification structure information successively according to the top-level classification, traverses the classification structure information, reads document data information under the classification structure information, and temporarily stores second information in the system, wherein the second information comprises a classification structure and file list information;
s53: comparing the first information and the second information which are temporarily stored, checking whether the classification which is not existed in the data server exists in a local classification structure in the document data management client, and simultaneously checking whether the existing local classification contains the document data which is not existed in the data server;
s54: the classification synchronization is carried out, and classifications which are not in a database server are created in a file server;
s55: synchronizing the literature data, comparing a file list in each folder of the literature data management client with a file list under a classification structure on the data server one by one, uploading the literature which is not in the file server, and recording information such as metadata of the uploaded literature in the database server; circularly processing the classification structure information and the document information under the classification structure information;
s56: and after finishing creating classification and uploading documents, automatically publishing the classification structure information and the document information under the classification structure information by the WEB server and online.
2. The method of claim 1, wherein the data category comprises: patents, academic reports, achievements, papers, and journal documents.
3. The method according to claim 1, wherein the step S203 specifically comprises:
when a system in the database server operates for the first time, reading the top class classification of the electronic file classification table, and establishing a top class folder on the file server according to the classification name;
the user can establish own classification under the top-level folder, so that a corresponding folder is established on the file server;
the user selects a certain classification, electronic files can be uploaded to the classification in batch, a system in the document resource management client recursively reads folders of the document resource management client, the folders are established on the file server, and meanwhile, the classification is established in the electronic file classification table.
4. The method of claim 1, wherein the step S3 of processing and publishing the single document includes the following steps:
s31: local document classification: first determining the classification of the single document to be processed by comparing it against the document classification structure managed locally in the database server;
s32: uploading the document data: uploading the data of the single document to the file server;
s33: selecting a classification node: according to the classification of the single document to be processed, selecting the corresponding classification in the system in the document management client for processing;
s34: after the classification node for the document data has been selected, the document data is first stored at the corresponding location on the file server, and the metadata of the document data is then extracted and stored in the database server.
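The ordering in step s34 (store the file first, then extract and record its metadata) can be sketched as below. This is a minimal sketch under stated assumptions: a local directory stands in for the file server, a dictionary stands in for the database server, and the function name `store_document` and the metadata fields are hypothetical.

```python
import os
import shutil
from typing import Dict


def store_document(
    src: str, storage_root: str, classification: str, db: Dict[str, dict]
) -> str:
    """Copy src into the classification folder, then record its metadata."""
    target_dir = os.path.join(storage_root, classification)
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, os.path.basename(src))

    # Step 1: store the document at its location on the (stand-in) file server.
    shutil.copy2(src, target)

    # Step 2: extract metadata from the stored copy and record it.
    stat = os.stat(target)
    db[target] = {
        "name": os.path.basename(target),
        "classification": classification,
        "size": stat.st_size,
        "modified": stat.st_mtime,
    }
    return target
```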
5. The method of claim 1, wherein the step S4 of maintaining the classification structure of the document comprises the steps of:
s41: local document classification: creating, on the document data management client, folders that identify the classification structure;
s42: uploading the classification structure: selecting a target classification, that is, selecting the classification level under which the uploaded classification structure is to be placed, and then uploading the created folders that identify the classification structure;
s43: the system of the document data management client creates classification records in the database server according to the structure and the names of the uploaded folders that identify the classification structure;
s44: adjusting the classification structure: adjusting all sub-classifications of the same top-level classification in the database server.
6. The method of claim 5, wherein the folders identifying the classification structure can be multi-level folders with a hierarchical relationship or single folders, the name of each such folder represents a file classification, and such a folder need not contain any files.
7. The method of claim 1, wherein if the local classification structure contains a classification that does not exist on the data server, that classification is treated as newly added: the new classification is created on the database server, and the new classification and its hierarchical relation data are written to the database server.
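Detecting the newly added classifications of this claim amounts to a set difference between the local and server-side structures. The sketch below assumes classifications are identified by "/"-separated paths; that representation and the function name `new_classifications` are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def new_classifications(
    local_paths: Iterable[str], server_paths: Set[str]
) -> List[Tuple[str, str]]:
    """Return (parent_path, name) rows for classifications missing on the server.

    Sorting the paths places a parent before its children, so the rows can be
    written in order without breaking the hierarchical relation.
    """
    rows: List[Tuple[str, str]] = []
    for path in sorted(set(local_paths) - server_paths):
        parent, _, name = path.rpartition("/")
        rows.append((parent, name))
    return rows
```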
8. The method of claim 1, wherein step S55 specifically comprises:
checking whether a new file exists in a directory of the document data management client;
if a file in the document data management client matches a file on the database server, the version information of the two files is compared on the basis of file modification time: if the modification times are the same, the two are regarded as the same file; if the modification time of the local file is later than that of the file on the database server, the versions are regarded as inconsistent and version control is required;
uploading the file from the document data management client to the file server, updating the version metadata of the file in the database server, and recording the version change history.
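The modification-time comparison in this claim can be sketched as a small decision function. The names `SyncAction` and `decide` are hypothetical. The claim does not say what happens when the local file is older than the server copy; the sketch assumes such a file is skipped.

```python
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    UPLOAD_NEW = "upload_new"    # file does not exist on the server yet
    SKIP = "skip"                # same modification time: same file
    NEW_VERSION = "new_version"  # local copy is newer: version control needed


def decide(local_mtime: float, server_mtime: Optional[float]) -> SyncAction:
    """Choose the sync action from modification times (seconds since epoch).

    server_mtime is None when the server has no file of that name.
    """
    if server_mtime is None:
        return SyncAction.UPLOAD_NEW
    if local_mtime > server_mtime:
        return SyncAction.NEW_VERSION
    # Equal times mean the same file; an older local copy is skipped too,
    # which is an assumption not stated in the claim.
    return SyncAction.SKIP
```

For a `NEW_VERSION` result, the file would be uploaded, its version metadata updated, and the change recorded in the version history, as the final step of the claim describes.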
9. The method of claim 7, wherein if there is a new file in a folder of the document management client, the method proceeds to step S206 and step S207.
10. The method of claim 1, wherein the database server is an electronic document management database.
CN202111577605.3A 2021-11-30 2021-12-22 Automatic document classification and automatic maintenance method Pending CN114491016A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111441501X 2021-11-30
CN202111441501 2021-11-30

Publications (1)

Publication Number Publication Date
CN114491016A true CN114491016A (en) 2022-05-13

Family

ID=81494620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111577605.3A Pending CN114491016A (en) 2021-11-30 2021-12-22 Automatic document classification and automatic maintenance method

Country Status (1)

Country Link
CN (1) CN114491016A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116166622A (en) * 2022-12-21 2023-05-26 惠州市金百泽电路科技有限公司 Automatic identification method and system for PCB production data

Similar Documents

Publication Publication Date Title
US7574413B2 (en) System and method of discovering information
US9009099B1 (en) Method and system for reconstruction of object model data in a relational database
US6389429B1 (en) System and method for generating a target database from one or more source databases
US8812439B2 (en) Folder structure and authorization mirroring from enterprise resource planning systems to document management systems
DE60019839T2 (en) A method for exchanging data between a Java system database and an LDAP directory
US8700581B2 (en) Systems and methods for providing a map of an enterprise system
US11341171B2 (en) Method and apparatus for implementing a set of integrated data systems
CN112364223B (en) Digital archive system
CN104462185B (en) A kind of digital library's cloud storage system based on mixed structure
US9971977B2 (en) Opus enterprise report system
US11556502B2 (en) Intelligent routing based on the data extraction from the document
CN112015412A (en) Device and method for generating business model based on form engine
AU2017243870A1 (en) "Methods and systems for database optimisation"
CN105468785A (en) Computer file management method
CN114491016A (en) Automatic document classification and automatic maintenance method
CN115905628A (en) Dynamic resource directory construction method, device, equipment and storage medium
EP1510935A1 (en) Mapping a data from a data warehouse to a data mart
US20050071740A1 (en) Task extraction and synchronization
CN112052222A (en) Heterogeneous object storage cluster access method, device, equipment and storage medium
WO2022118276A1 (en) System and method for facilitating flexible and hierarchical storage and management of knowledge
CN113033169A (en) Service data processing method and device
Chang A database management system for interlibrary loan
CN116266176A (en) Data quality control method and system based on result analysis
CN116594963A (en) File storage and reading method
CN113485693A (en) Interface configuration method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination