Method and system for four-property detection of electronic archives
Technical Field
The present invention relates to the field of data detection technologies, and more particularly to a method and a system for four-property detection of electronic archives, that is, detection of their authenticity, integrity, availability and security.
Background
The patent with application number 201711358945.0 provides a comprehensive management system that integrates paper archives and electronic archives. It mainly comprises three parts: a paper archive management system, a paper archive digital processing system and an electronic archive management system. The paper archive management system is mainly responsible for keeping the paper archives and monitoring their safety; the paper archive digital processing system mainly digitizes the paper archives; and the electronic archive management system provides management, query and protection functions for the electronic archives. The system covers the whole life cycle of an archive and provides an integrated service for it, which greatly improves the working speed and efficiency of an archive department, markedly improves flexibility and reliability, and addresses the management and storage of rapidly growing archive data.
The patent with application number 201911417062.1 provides a method, a device, a system and a storage medium for archive management. The method comprises: acquiring one or more archiving request files; determining a filing standard according to the service type, packaging the one or more archiving request files according to the filing standard, and determining the packaged file and statistical information; and performing four-property detection on the packaged file and recording the result of the four-property detection in the statistical information. This technical scheme automates archive management throughout the whole process, can interface with various business systems for filing, and guarantees the authenticity, integrity, availability and security of the filed files.
Both of the above patents concern systems and methods for electronic archive management. The first is an integrated comprehensive management system for paper archives and electronic archives, divided into three parts: a paper archive management system, a paper archive digital processing system and an electronic archive management system. The paper archive management system and the electronic archive management system are each connected to the paper archive digital processing system. If they are not connected to the paper archive digital processing system, the paper archives cannot be kept or safety-monitored, which creates a significant security risk and prevents subsequent operations.
The second patent is an archive management method, device, system and storage medium that acquires one or more archiving request files; determines a filing standard according to the service type, packages the one or more archiving request files according to the filing standard, and determines the packaged file and statistical information; performs four-property detection on the packaged file and records the result in the statistical information; and, in response to the four-property detection passing, files the statistical information and the packaged file. This method only performs four-property detection and verification on the filed file, which is relatively limited, and its four-property detection rules cannot be configured; for example, verification rules for the file header and the metadata cannot be set.
Disclosure of Invention
In order to solve the above problems, the present invention provides a method for four-property detection of electronic archives, comprising:
acquiring, from batches of electronic archives that have completed four-property detection, the file headers and metadata of the electronic files together with the detection modes applied to those file headers and metadata, and storing the file headers, the metadata and the detection modes in a database;
determining an electronic archive to be detected, checking whether the electronic archive to be detected meets the conditions for four-property detection, and if so, determining the file header and metadata of the electronic files in the electronic archive to be detected;
calling the database and querying it with the file header and metadata of the electronic files in the electronic archive to be detected as query data, determining the file headers and metadata in the database that approximate, within a preset range, those of the electronic archive to be detected, selecting the detection mode of the file headers and metadata so found, and performing four-property detection on the file header and metadata of the electronic files in the electronic archive to be detected.
Optionally, if the four-property detection fails, the detected error information is returned and used to correct the electronic archive to be detected.
Optionally, the four-property detection of the file header includes four-property detection of the file extension, the file description, and the file header data.
Optionally, the metadata four-property detection includes four-property detection of data name, data length, data format, data value range, data description, selected data classification, and selected data type.
Optionally, a preset four-property detection mode is stored in the database; when querying the database with the file header and metadata of the electronic files in the electronic archive to be detected as query data fails, the preset four-property detection mode is used to perform four-property detection on that file header and metadata.
The invention also provides a system for four-property detection of electronic archives, comprising:
an acquisition unit, used for acquiring, from batches of electronic archives that have completed four-property detection, the file headers and metadata of the electronic files together with the detection modes applied to those file headers and metadata, and storing the file headers, the metadata and the detection modes in a database;
a verification unit, used for determining the electronic archive to be detected, verifying whether the electronic archive to be detected meets the conditions for four-property detection, and if so, determining the file header and metadata of the electronic files in the electronic archive to be detected;
a query unit, used for calling the database and querying it with the file header and metadata of the electronic files in the electronic archive to be detected as query data, determining the file headers and metadata in the database that correspond to those of the electronic archive to be detected, selecting the detection mode of the queried file headers and metadata, and performing four-property detection on the file header and metadata of the electronic files in the electronic archive to be detected.
Optionally, if the four-property detection fails, the detected error information is returned and used to correct the electronic archive to be detected.
Optionally, the four-property detection of the file header includes four-property detection of the file extension, the file description, and the file header data.
Optionally, the metadata four-property detection includes four-property detection of data name, data length, data format, data value range, data description, selected data classification, and selected data type.
Optionally, a preset four-property detection mode is stored in the database; when querying the database with the file header and metadata of the electronic files in the electronic archive to be detected as query data fails, the preset four-property detection mode is used to perform four-property detection on that file header and metadata.
Compared with the prior art, the method of the invention is simpler and safer, and it improves the efficiency of four-property detection of electronic archives.
Drawings
FIG. 1 is a flow chart of a method for four-property detection of electronic archives according to the present invention;
FIG. 2 is a block diagram of a system for performing a four-property check of an electronic file according to the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The invention provides a method for four-property detection of electronic archives, as shown in fig. 1, comprising the following steps:
acquiring, from batches of electronic archives that have completed four-property detection, the file headers and metadata of the electronic files together with the detection modes applied to those file headers and metadata, and storing the file headers, the metadata and the detection modes in a database;
determining an electronic archive to be detected, checking whether the electronic archive to be detected meets the conditions for four-property detection, and if so, determining the file header and metadata of the electronic files in the electronic archive to be detected;
calling the database and querying it with the file header and metadata of the electronic files in the electronic archive to be detected as query data, determining the file headers and metadata in the database that approximate, within a preset range, those of the electronic archive to be detected, selecting the detection mode of the file headers and metadata so found, and performing four-property detection on the file header and metadata of the electronic files in the electronic archive to be detected.
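As a rough illustration of the query step, the following Python sketch selects a stored detection mode by similarity. The record layout, the equally weighted similarity score and the 0.8 threshold standing in for the "preset range" are all assumptions made for illustration; the description does not specify how the approximation is measured.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class ReferenceRecord:
    """Header and metadata of an already-checked file (hypothetical schema)."""
    extension: str
    header_hex: str            # leading bytes of the file, hex-encoded
    metadata_names: frozenset  # names of the metadata items
    detection_mode: str        # detection mode that was applied to that file


def similarity(record, extension, header_hex, metadata_names):
    """Score in [0, 1]; the equal weighting is an assumption, not from the source."""
    ext_score = 1.0 if record.extension == extension else 0.0
    header_score = SequenceMatcher(None, record.header_hex, header_hex).ratio()
    union = record.metadata_names | metadata_names
    meta_score = len(record.metadata_names & metadata_names) / len(union) if union else 1.0
    return (ext_score + header_score + meta_score) / 3


def select_detection_mode(database, extension, header_hex, metadata_names, threshold=0.8):
    """Return the mode of the closest record within the preset range, else None."""
    def score(record):
        return similarity(record, extension, header_hex, metadata_names)

    best = max(database, key=score, default=None)
    if best is None or score(best) < threshold:
        return None
    return best.detection_mode


database = [
    ReferenceRecord("pdf", "255044462d", frozenset({"title", "date"}), "pdf_mode"),
    ReferenceRecord("xml", "3c3f786d6c", frozenset({"title"}), "xml_mode"),
]
mode = select_detection_mode(database, "pdf", "255044462d", frozenset({"title", "date"}))
```

Returning None when nothing falls within the threshold leaves room for a caller to apply a preset detection mode instead.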
If the four-property detection fails, the detected error information is returned and used to correct the electronic archive to be detected.
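One way to picture the returned error information is a report that lists every failed check together with a correction hint. The sketch below is only an assumption about what such a report could look like; the function name `run_checks`, the report fields and the example checks are hypothetical and do not come from the description.

```python
def run_checks(target, checks):
    """Run each (name, predicate, hint) check and collect one entry per failure."""
    errors = [
        {"check": name, "hint": hint}
        for name, predicate, hint in checks
        if not predicate(target)
    ]
    return {"passed": not errors, "errors": errors}


# Hypothetical checks on a metadata dictionary.
checks = [
    ("has_title", lambda metadata: "title" in metadata, "add a 'title' field"),
    ("has_date", lambda metadata: "date" in metadata, "add a 'date' field"),
]
report = run_checks({"title": "Annual report"}, checks)
```

Because each entry names the failed check and carries a hint, the submitter can correct the archive and resubmit it without rerunning the analysis by hand.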
The file header four-property detection comprises four-property detection of file extension names, file descriptions and file header data.
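The three header checks can be sketched as follows. The signature table holds well-known leading bytes for PDF, PNG and ZIP files; the choice of these formats, the function name `check_file_header` and the rule that a description must be non-empty are assumptions for illustration, since the description does not state the concrete rules.

```python
# Well-known leading bytes ("magic numbers") for a few formats.
EXPECTED_SIGNATURES = {
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
    "zip": b"PK\x03\x04",
}


def check_file_header(file_name, description, leading_bytes):
    """Check the extension, description and header data; return a list of errors."""
    errors = []
    extension = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    signature = EXPECTED_SIGNATURES.get(extension)
    if signature is None:
        errors.append(f"unsupported or missing extension: {extension!r}")
    elif not leading_bytes.startswith(signature):
        errors.append(f"header data does not match the {extension} signature")
    if not description.strip():
        errors.append("file description is empty")
    return errors


ok = check_file_header("voucher.pdf", "payment voucher", b"%PDF-1.7\n")
bad = check_file_header("voucher.pdf", "", b"PK\x03\x04")
```

Comparing the header bytes against the extension catches files that were renamed or mislabelled.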
The four-property detection of the metadata comprises four-property detection of the data name, data length, data format, data value range, data description, selected data classification and selected data type.
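A rule-driven check of a single metadata item might look like the sketch below. The classification and data type values are those listed in the system embodiment of this description; the rule dictionary layout, the function names and the restriction of the range check to decimal values are assumptions, and the data description is not checked here.

```python
import re
from datetime import date
from decimal import Decimal, InvalidOperation

# Enumerations listed in the description.
CLASSIFICATIONS = {"certificate", "account book", "report", "other", "document", "invoice"}
DATA_TYPES = {"character string", "integer", "decimal", "date"}


def parses_as(data_type, value):
    """Return True if the string value can be read as the given data type."""
    try:
        if data_type == "integer":
            int(value)
        elif data_type == "decimal":
            if not Decimal(value).is_finite():
                return False
        elif data_type == "date":
            date.fromisoformat(value)
        return data_type in DATA_TYPES
    except (ValueError, InvalidOperation):
        return False


def check_metadata_item(rule, name, value, classification):
    """Check one metadata item against a rule (hypothetical layout); return errors."""
    errors = []
    if name != rule["name"]:
        errors.append("data name mismatch")
    if len(value) > rule["max_length"]:
        errors.append("data length exceeds the maximum")
    if not re.fullmatch(rule["pattern"], value):
        errors.append("data format mismatch")
    if not parses_as(rule["data_type"], value):
        errors.append("data type mismatch")
    elif rule["data_type"] == "decimal" and not (
        Decimal(rule["minimum"]) <= Decimal(value) <= Decimal(rule["maximum"])
    ):
        errors.append("data value out of range")
    if classification not in CLASSIFICATIONS:
        errors.append("unknown data classification")
    return errors


rule = {"name": "amount", "max_length": 12, "pattern": r"\d+\.\d{2}",
        "data_type": "decimal", "minimum": "0", "maximum": "1000000"}
result = check_metadata_item(rule, "amount", "128.50", "invoice")
```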
The database also stores a preset four-property detection mode. When querying the database with the file header and metadata of the electronic files in the electronic archive to be detected as query data fails, the preset four-property detection mode is used to perform four-property detection on that file header and metadata.
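The fallback can be sketched with an in-memory SQLite database. The table name `detection_modes`, its columns, the reserved key `__preset__` and keying the lookup on the file extension alone are all hypothetical simplifications; the description does not define the database schema.

```python
import sqlite3

PRESET_KEY = "__preset__"  # hypothetical reserved key for the preset mode


def open_database():
    """Create an in-memory database with one stored mode and the preset mode."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE detection_modes (extension TEXT PRIMARY KEY, mode TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO detection_modes VALUES (?, ?)",
        [("pdf", "pdf_rules"), (PRESET_KEY, "preset_rules")],
    )
    return conn


def lookup_mode(conn, extension):
    """Return the stored mode for the extension, or the preset mode if the query fails."""
    query = "SELECT mode FROM detection_modes WHERE extension = ?"
    row = conn.execute(query, (extension,)).fetchone()
    if row is None:
        row = conn.execute(query, (PRESET_KEY,)).fetchone()
    return row[0]


conn = open_database()
stored = lookup_mode(conn, "pdf")
fallback = lookup_mode(conn, "tiff")
```

Keeping the preset mode in the same table means a failed query still yields a usable detection mode, so detection can proceed for archive types not seen before.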
The present invention further provides a system 200 for four-property detection of electronic archives, as shown in fig. 2, comprising:
the acquisition unit 201, used for acquiring, from batches of electronic archives that have completed four-property detection, the file headers and metadata of the electronic files together with the detection modes applied to those file headers and metadata, and storing the file headers, the metadata and the detection modes in a database;
the verification unit 202, used for determining the electronic archive to be detected, verifying whether the electronic archive to be detected meets the conditions for four-property detection, and if so, determining the file header and metadata of the electronic files in the electronic archive to be detected;
the query unit 203, used for calling the database and querying it with the file header and metadata of the electronic files in the electronic archive to be detected as query data, determining the file headers and metadata in the database that correspond to those of the electronic archive to be detected, selecting the detection mode of the queried file headers and metadata, and performing four-property detection on the file header and metadata of the electronic files in the electronic archive to be detected.
If the four-property detection fails, the detected error information is returned and used to correct the electronic archive to be detected.
The file header four-property detection comprises four-property detection of file extension names, file descriptions and file header data.
The four-property detection of the metadata comprises four-property detection of the data name, data length, data format, data value range, data description, selected data classification (certificate/account book/report/other/document/invoice) and selected data type (character string/integer/decimal/date).
The database also stores a preset four-property detection mode. When querying the database with the file header and metadata of the electronic files in the electronic archive to be detected as query data fails, the preset four-property detection mode is used to perform four-property detection on that file header and metadata.
Compared with the prior art, the system of the invention is simpler and safer, and it improves the efficiency of four-property detection of electronic archives.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the application can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.