CN117763627A - File verification method and device, electronic equipment and storage medium - Google Patents

File verification method and device, electronic equipment and storage medium

Info

Publication number
CN117763627A
Authority
CN
China
Prior art keywords
file
content
item
content information
content item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311864503.9A
Other languages
Chinese (zh)
Inventor
任志栋
王超
崔雯雯
楼武良
仇建民
刘小欧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd

Abstract

The disclosure provides a file verification method, a file verification device, electronic equipment and a storage medium, and relates to the technical field of blockchain. The file verification method comprises the following steps: acquiring a file to be verified; extracting content information of each content item from the file to be verified based on at least one predefined content item; comparing, item by item, the content information of each content item extracted from the file to be verified with the content information of each content item extracted from a trusted file, so as to check whether the content information of each content item is consistent, wherein the content information of each content item extracted from the trusted file is acquired based on uplink information stored on the blockchain; and displaying, according to the item-by-item comparison result, the content information of the file to be verified that is inconsistent with the trusted file. Because inconsistent content information is displayed to the user based on the item-by-item comparison result, user experience is improved.

Description

File verification method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of blockchains, and in particular relates to a file verification method, a device, electronic equipment and a storage medium.
Background
Blockchain technology is a decentralized computing paradigm that verifies, stores and updates data by means of a consensus algorithm among distributed nodes. It is decentralized, traceable and tamper-resistant, and is regarded as a cornerstone of the next-generation value Internet and trust Internet. Because a blockchain is open, transparent, traceable and difficult to tamper with, the behavior of a given entity can be proved and confirmed directly by tracing messages back layer by layer, which addresses the problem of information authenticity; blockchain is therefore often used as the underlying technology for verifying files. However, the inventors found that, in the related art, verifying a file generally only determines whether the file is authentic, that is, whether it has been modified, so the user experience is poor.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The disclosure provides a file verification method, a device, an electronic device and a storage medium, which overcome, at least to a certain extent, the problem in the related art that verifying a file can only determine whether the file has been modified, resulting in poor user experience.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to one aspect of the present disclosure, there is provided a file verification method, including: acquiring a file to be verified; extracting content information of each content item from the file to be verified based on at least one predefined content item; comparing, item by item, the content information of each content item extracted from the file to be verified with the content information of each content item extracted from a trusted file, so as to check whether the content information of each content item is consistent, wherein the content information of each content item extracted from the trusted file is acquired based on uplink information stored on a blockchain; and displaying, according to the item-by-item comparison result, the content information of the file to be verified that is inconsistent with the trusted file.
In some embodiments, before extracting content information of each content item from the file to be verified based on the at least one predefined content item, the method further includes: generating a hash value of the file to be verified; comparing the hash value of the file to be verified with the hash value of the trusted file; and, if the two hash values are inconsistent, extracting the content information of each content item from the file to be verified based on the at least one predefined content item.
In some embodiments, before comparing the hash value of the file to be verified with the hash value of the trusted file, the method further includes: extracting the hash value of the trusted file from the uplink information.
In some embodiments, before comparing the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file item by item to verify whether the content information of each content item is consistent, the method further includes: and acquiring the content information of each content item extracted from the trusted file according to the uplink information.
In some embodiments, before comparing the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file item by item to verify whether the content information of each content item is consistent, the method further includes: acquiring a file identifier of a file to be checked; the file to be checked and the trusted file have the same file identification; and searching the uplink information corresponding to the trusted file in the uplink information according to the file identification.
In some embodiments, extracting content information of each content item from the file to be verified based on the predefined at least one content item includes: determining the service type of a file to be checked; determining at least one content extraction range predefined for the service type of the file to be checked; wherein each content extraction range corresponds to each content item one by one; and extracting the content information in each content extraction range of the file to be checked to obtain the content information of the content item corresponding to each content extraction range.
In some embodiments, the file to be verified is a file arranged in a preset layout, and the content extraction range corresponding to each content item is a preset area in the preset layout. Extracting the content information in each content extraction range of the file to be verified to obtain the content information of the content item corresponding to each content extraction range includes: recognizing the characters in each preset area to obtain the content information of each content item of the file to be verified.
In some embodiments, the file to be verified is a structured-data document in which content information of a plurality of preset categories is stored in a preset structure, and the content extraction range corresponding to each content item corresponds to at least one of the plurality of preset categories. Extracting the content information in each content extraction range of the file to be verified to obtain the content information of the content item corresponding to each content extraction range includes: parsing the file to be verified based on the preset structure to obtain the content information of the plurality of preset categories; and determining the content information of each content item of the file to be verified according to the correspondence between each preset category and the content items.
According to another aspect of the present disclosure, there is also provided a file verification apparatus, including: the acquisition module is used for acquiring the file to be checked; the extraction module is used for extracting content information of each content item from the file to be verified based on at least one content item defined in advance; the comparison module is used for comparing the content information of each content item extracted from the file to be checked with the content information of each content item extracted from the trusted file item by item so as to check whether the content information of each content item is consistent or not; the content information of each content item extracted from the trusted file is acquired based on the uplink information stored on the blockchain; and the display module is used for displaying the content information of the file to be checked inconsistent with the trusted file according to the item-by-item comparison result.
According to another aspect of the present disclosure, there is also provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the file verification method of any of the embodiments described above via execution of the executable instructions.
According to another aspect of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the file verification method of any of the above embodiments.
According to another aspect of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the file verification method of any of the above embodiments.
According to the file verification method, the device, the electronic equipment and the storage medium, the content information of each content item of the trusted file is provided through the uplink information of the blockchain, the content information of each content item extracted from the file to be verified is compared with the content information of each content item extracted from the trusted file item by item, whether the content information of each content item is consistent is verified, inconsistent content information is displayed to a user based on the comparison result by item, and therefore user experience is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 is a schematic diagram of a file verification system architecture in an embodiment of the present disclosure;
FIG. 2 is a flowchart showing a method for verifying a file according to an embodiment of the disclosure;
FIG. 3 illustrates a second flowchart of a method for verifying a file in an embodiment of the disclosure;
FIG. 4 illustrates a third flowchart of a method for verifying a file in an embodiment of the disclosure;
FIG. 5 illustrates a fourth flowchart of a method for verifying files in an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a file verification apparatus according to an embodiment of the disclosure; and
FIG. 7 shows a block diagram of an electronic device in an embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
The following detailed description of embodiments of the present disclosure refers to the accompanying drawings.
FIG. 1 illustrates an exemplary application system architecture diagram to which a file verification method may be applied in embodiments of the present disclosure. As shown in fig. 1, the system architecture may include a terminal device 101, a network 102, and a server 103.
The network 102 is the medium that provides a communication link between the terminal device 101 and the server 103, and may be a wired network or a wireless network.
Optionally, the wireless or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network, including, but not limited to, a LAN (Local Area Network), a MAN (Metropolitan Area Network), a WAN (Wide Area Network), a mobile, wired or wireless network, a private network, a virtual private network, or any combination of these. In some embodiments, the data exchanged over the network is represented using techniques and/or formats including HTML (HyperText Markup Language), XML (Extensible Markup Language), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as SSL (Secure Sockets Layer), TLS (Transport Layer Security), VPN (Virtual Private Network), IPSec (Internet Protocol Security), etc. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above.
The terminal device 101 may be a variety of electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, smart speakers, smart watches, wearable devices, augmented reality devices, virtual reality devices, and the like.
Optionally, the application clients installed on different terminal devices 101 are the same, or are clients of the same type of application for different operating systems. The specific form of the application client may also differ depending on the terminal platform; for example, the application client may be a mobile phone client, a PC (Personal Computer) client, etc.
The server 103 may be a server providing various services, such as a background management server that supports the operations a user performs with the terminal device 101. The background management server can analyze and process received data such as requests, and feed the processing result back to the terminal device.
Optionally, the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
Those skilled in the art will appreciate that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative, and that any number of terminal devices, networks, and servers may be provided as desired. The embodiments of the present disclosure are not limited in this regard.
Under the system architecture described above, the embodiments of the present disclosure provide a method for checking a file, which may be performed by any electronic device having computing processing capabilities.
In some embodiments, the file verification method provided in the embodiments of the present disclosure may be performed by a terminal device of the above system architecture; in other embodiments, the file verification method provided in the embodiments of the present disclosure may be performed by a server in the system architecture described above; in other embodiments, the method for checking a file provided in the embodiments of the present disclosure may be implemented by a terminal device and a server in the system architecture in an interactive manner.
Fig. 2 shows a flowchart of a file verification method in an embodiment of the present disclosure, and as shown in fig. 2, the file verification method provided in the embodiment of the present disclosure includes the following steps:
S201, obtaining a file to be verified.
The file to be verified is a file that needs to be checked for modification. In some embodiments, the file to be verified may be uploaded to the server 103 through the terminal device 101 shown in FIG. 1, so that the server 103 performs the verification.
The file to be verified may be a file of any service type and any file format type. In some embodiments, the file to be verified may be an electronic voucher file, for example a voucher of any of various service types, including invoices, travel itineraries, business slips, statements, academic credentials, and the like. The file format types of the file to be verified may include OFD (Open Fixed-layout Document), PDF (Portable Document Format), images, and the like.
S202, extracting content information of each content item from the file to be verified based on at least one content item defined in advance.
At least one content item is predefined for the file to be verified. In some embodiments, the content items may be predefined based on the service type and/or file format type of the file to be verified, which also determines how the extraction is implemented. The content information may be extracted from any kind of medium, for example from images, video, sound, or text. The content information may also be presented in any of a variety of media; in some embodiments, the content information may be characters, including letters, punctuation marks, digits, Chinese characters, and the like.
S203, comparing the content information of each content item extracted from the file to be checked with the content information of each content item extracted from the trusted file item by item so as to check whether the content information of each content item is consistent.
The file to be verified has a corresponding trusted file. In some embodiments, the trusted file may be an original file obtained from a trusted source, such as an invoice downloaded from the national tax website or a transaction statement printed by a bank. Alternatively, the trusted file may be a copy of the original file kept in a secure location; for example, the trusted file may be stored at an address, with the hash value and the address of the trusted file stored on a blockchain, or the trusted file itself may be packaged and stored on the blockchain.
The content information of each content item can be extracted from the trusted file in the same way as from the file to be verified. In some embodiments, this extraction is performed in S203; in other embodiments, it has already been performed before S203, for example so that the content information of each content item extracted from the trusted file is stored in advance.
In some embodiments, the content information of each content item extracted from the trusted file is obtained based on the uplink information stored on the blockchain. For example, the trusted file may be stored at an address, and the hash value and the storage address of the trusted file are stored in the uplink information of the blockchain, so that the decentralized, traceable and tamper-resistant nature of the blockchain ensures that the uplink information is authentic and has not been tampered with. Alternatively, in other embodiments, the content information of each content item of the trusted file may be provided by putting the trusted file itself on the chain, or by putting on the chain the content information of each content item extracted from the trusted file in advance.
S204, displaying, according to the item-by-item comparison result, the content information of the file to be verified that is inconsistent with the trusted file.
After the content information of the file to be verified is compared item by item with that of the trusted file, any inconsistent content information can be displayed. The display may take different forms, such as sound, images, video, and text. For example, inconsistent content information may be highlighted, inconsistent content items may be announced by voice, or an animation may prompt the inconsistent content information; the present disclosure does not specifically limit this.
In some embodiments, the comparison result for every content item may be presented, that is, for each content item, whether the content information of the file to be verified and that of the trusted file are consistent. In other embodiments, only the inconsistent content information may be presented. The embodiments of the present disclosure are not particularly limited in this regard.
According to the file verification method provided by the embodiment of the disclosure, the content information of each content item of the trusted file is provided through the uplink information of the blockchain, the content information of each content item extracted from the file to be verified is compared item by item with the content information of each content item extracted from the trusted file, so that whether the content information of each content item is consistent or not is verified, inconsistent content information is displayed to a user based on the comparison result by item, and therefore user experience is improved.
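The item-by-item comparison of S203 and the display of S204 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the disclosure: it assumes the content information of both files has already been extracted into dictionaries keyed by content item, and the names `ItemResult`, `compare_items` and `inconsistent_only`, as well as the sample invoice fields, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ItemResult:
    item: str                  # name of the predefined content item
    candidate: Optional[str]   # value extracted from the file to be verified
    trusted: Optional[str]     # value obtained for the trusted file
    consistent: bool           # True when both values are equal


def compare_items(candidate: Dict[str, str],
                  trusted: Dict[str, str],
                  items: List[str]) -> List[ItemResult]:
    """Compare the two files content item by content item (hypothetical helper)."""
    results = []
    for item in items:
        c = candidate.get(item)   # None if the item could not be extracted
        t = trusted.get(item)
        results.append(ItemResult(item, c, t, c == t))
    return results


def inconsistent_only(results: List[ItemResult]) -> List[ItemResult]:
    """Keep only the content items whose values differ."""
    return [r for r in results if not r.consistent]


# Hypothetical sample data: only the amount differs.
items = ["invoice_number", "amount", "buyer"]
candidate = {"invoice_number": "12345678", "amount": "98.00", "buyer": "ACME"}
trusted = {"invoice_number": "12345678", "amount": "88.00", "buyer": "ACME"}

for r in inconsistent_only(compare_items(candidate, trusted, items)):
    print(f"{r.item}: file to be verified = {r.candidate!r}, trusted file = {r.trusted!r}")
```

Returning the full result list and filtering it afterwards covers both display variants described above: presenting every content item with its consistency status, or presenting only the inconsistent ones.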
In some embodiments, S202 includes the following steps as shown in fig. 3:
S301, determining the service type of the file to be verified.
S302, determining at least one content extraction range predefined for the service type of the file to be checked.
In some embodiments, the different service types may include different content items, and the content items to be extracted, and the content extraction ranges corresponding to each content item, may be predefined for each service type. In the at least one content extraction range defined in advance, each content extraction range corresponds to each content item in the at least one content item defined in advance one by one.
Different forms of content extraction range may be specified for different file format types. In some embodiments, the file to be verified is a file arranged in a preset layout, for example a file in picture or PDF format. The content extraction range corresponding to each content item is then a preset area in the preset layout: for the preset layout of the file to be verified, a preset area is marked out for each content item to be extracted, and the content information extracted from each preset area is the content information of the content item corresponding to that area.
In other embodiments, the file to be verified is a document of structured data, e.g., an OFD document. Content information of a plurality of preset categories in a document of the structured data is stored in a preset structure, and a content extraction range corresponding to each content item corresponds to at least one of the plurality of preset categories. For example, the OFD document includes a plurality of nodes, which correspond to a plurality of preset categories one by one, and content information of the plurality of nodes is stored in a preset structure in the OFD document, each node corresponds to one content item, and each content item may correspond to at least one node.
S303, extracting content information in each content extraction range of the file to be checked, and obtaining content information of the content item corresponding to each content extraction range.
According to different file format types, different content resolvers can be selected to be used for resolving content information in the file to be checked. In some embodiments, the content parser may be directly configured in the server, or the functions of the content parser may be implemented by calling a third party service, or using a plug-in of a third party, or the like.
Taking as an example a file to be verified that is arranged in a preset layout, the content information of each content item of the file to be verified can be obtained by recognizing the characters in each preset area. The content parser may use OCR (Optical Character Recognition) to recognize characters in files of format types such as PDF and picture.
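One way to map recognized characters to content items by preset area is sketched below in Python. The sketch does not call any OCR engine: it assumes an OCR step has already produced words with bounding boxes, and it assigns each word to the preset area that contains the center of its box. The names `Word`, `assign_words`, the box convention (left, top, right, bottom) and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed convention


@dataclass(frozen=True)
class Word:
    text: str   # characters recognized by a (not shown) OCR step
    box: Box    # position of the word on the page


def _center(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def _contains(region: Box, point: Tuple[float, float]) -> bool:
    x, y = point
    return region[0] <= x <= region[2] and region[1] <= y <= region[3]


def assign_words(words: List[Word], regions: Dict[str, Box]) -> Dict[str, str]:
    """Group recognized words by the preset area of each content item."""
    collected: Dict[str, List[Word]] = {item: [] for item in regions}
    for word in words:
        point = _center(word.box)
        for item, region in regions.items():
            if _contains(region, point):
                collected[item].append(word)
                break  # a word belongs to at most one preset area
    # Join the words of each area in reading order (top to bottom, left to right).
    return {
        item: " ".join(w.text for w in sorted(ws, key=lambda w: (w.box[1], w.box[0])))
        for item, ws in collected.items()
    }


# Hypothetical preset areas for one service type, and hypothetical OCR output.
regions = {"invoice_number": (0, 0, 100, 20), "amount": (0, 30, 100, 50)}
words = [
    Word("12345678", (10, 5, 60, 15)),
    Word("88.00", (40, 35, 70, 45)),
    Word("CNY", (10, 35, 35, 45)),
    Word("stray", (200, 200, 220, 210)),  # outside every preset area, ignored
]
print(assign_words(words, regions))
```

Words that fall outside every preset area are dropped, which matches the idea that only the predefined content items take part in the comparison.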
If the file to be checked is a document with structured data, the file to be checked can be analyzed based on a preset structure of the document to obtain content information of a plurality of preset categories, and the content information of each content item of the file to be checked is determined according to the corresponding relation between each preset category and the content item.
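For the structured-data case, the parsing step and the mapping from preset categories to content items can be sketched as follows. This is an illustrative Python example only: a real OFD document is a packaged set of XML files with its own schema, whereas the sketch parses a single simplified XML string. The tag names, the `category_to_item` mapping and the function name `extract_structured` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def extract_structured(xml_text: str, category_to_item: Dict[str, str]) -> Dict[str, str]:
    """Parse a structured document and collect content information per content item.

    category_to_item maps a node tag (preset category) to a content item;
    several categories may map to the same content item.
    """
    root = ET.fromstring(xml_text)
    collected: Dict[str, List[str]] = {}
    for node in root.iter():  # walks all nodes in document order
        item = category_to_item.get(node.tag)
        if item is not None and node.text and node.text.strip():
            collected.setdefault(item, []).append(node.text.strip())
    return {item: " ".join(values) for item, values in collected.items()}


# Hypothetical simplified document and mapping.
sample = (
    "<Invoice>"
    "<InvoiceNo>12345678</InvoiceNo>"
    "<Buyer><Name>ACME</Name><TaxNo>91310000</TaxNo></Buyer>"
    "<Total>88.00</Total>"
    "</Invoice>"
)
mapping = {
    "InvoiceNo": "invoice_number",
    "Name": "buyer",
    "TaxNo": "buyer",   # two categories feed the same content item
    "Total": "amount",
}
print(extract_structured(sample, mapping))
```

The `buyer` entry shows a content item that corresponds to more than one node, as the text allows.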
In some embodiments, before S202 is executed to extract content information of each content item from the file to be verified based on the at least one predefined content item, whether the file to be verified is consistent with the trusted file may first be determined by comparing their hash values. After the hash value of the file to be verified is generated, it is compared with the hash value of the trusted file. If the two are inconsistent, the file to be verified is determined to be inconsistent with the trusted file, that is, it has been modified, and S202 and the subsequent steps are executed. Otherwise, if the two hash values are consistent, the file to be verified is considered unmodified.
In some embodiments, the hash value of the trusted file may be generated in advance and stored in the uplink information of the blockchain; the hash value of the trusted file is then extracted from the uplink information before it is compared with the hash value of the file to be verified. In other embodiments, the trusted file itself may be stored on the blockchain; in that case the trusted file is downloaded from the blockchain, its hash value is generated, and that hash value is compared with the hash value of the file to be verified.
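The hash pre-check can be sketched in a few lines of Python. The disclosure does not name a hash algorithm; SHA-256 is used here purely as an assumption, and the function names `file_hash` and `needs_item_comparison` are hypothetical. The trusted hash is assumed to have been read from the uplink information already.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 is an assumption, not from the source)."""
    return hashlib.sha256(data).hexdigest()


def needs_item_comparison(candidate_bytes: bytes, trusted_hash: str) -> bool:
    """Return True when the hashes differ, i.e. when S202 and later steps should run."""
    return file_hash(candidate_bytes) != trusted_hash.lower()


# Hypothetical file contents.
trusted_hash = file_hash(b"invoice 12345678 amount 88.00")

print(needs_item_comparison(b"invoice 12345678 amount 88.00", trusted_hash))  # unmodified file
print(needs_item_comparison(b"invoice 12345678 amount 98.00", trusted_hash))  # modified file
```

Running the cheap whole-file check first means the more expensive extraction and item-by-item comparison are only needed for files that are already known to differ.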
In some embodiments, before comparing the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file item by item to verify whether the content information of each content item is consistent, the content information of each content item extracted from the trusted file may also be obtained according to the uplink information.
In some embodiments, before the content information of each content item extracted from the file to be verified is compared item by item with the content information of each content item extracted from the trusted file, a file identifier of the file to be verified may also be obtained. The file to be verified and the trusted file have the same file identifier; for example, the file identifier may be a file name. The uplink information corresponding to the trusted file is then looked up in the uplink information according to the file identifier.
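The lookup by file identifier might look like the following Python sketch. No real blockchain client is used: the uplink information is assumed to have been read from the chain into a list of records beforehand, and the record layout (`UplinkRecord` with a hash, an optional storage address and optional pre-extracted content items) is a hypothetical combination of the variants described in this disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class UplinkRecord:
    file_id: str                           # shared by the trusted file and the file to be verified
    file_hash: str                         # hash value of the trusted file
    storage_address: Optional[str] = None  # where the trusted file is kept, if stored off-chain
    content_items: Dict[str, str] = field(default_factory=dict)  # pre-extracted items, if any


def find_record(records: List[UplinkRecord], file_id: str) -> Optional[UplinkRecord]:
    """Return the uplink record whose file identifier matches, or None."""
    for record in records:
        if record.file_id == file_id:
            return record
    return None


# Hypothetical records already read from the chain.
records = [
    UplinkRecord("invoice-0001.ofd", "ab12", storage_address="store/invoice-0001.ofd"),
    UplinkRecord("invoice-0002.pdf", "cd34", content_items={"amount": "88.00"}),
]

record = find_record(records, "invoice-0002.pdf")
print(record.content_items if record else "no trusted record found")
```

A missing record is returned as `None` so that the caller can decide how to handle a file for which no trusted counterpart exists.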
In the following, some specific embodiments of the file verification method provided by the embodiments of the present disclosure are described in some specific application scenarios.
As shown in FIG. 4, an embodiment of the present disclosure for verifying an electronic voucher file includes the following steps:
S401, managing a maintainable list of electronic voucher service types.
The electronic voucher service type is the service type of an electronic voucher file. Electronic voucher files can be divided into a plurality of service types, and files of the same service type have the same or a similar structure/layout. In some embodiments, the electronic voucher service type list is managed so that the service types in it can be edited (added, modified or deleted).
S402, managing content items based on the electronic voucher service type, and maintaining the specific content item list of each electronic voucher service type.
Different electronic voucher service types correspond to different content item lists. The content item lists can be edited through an editing interface to define the content items contained in each service type.
In some embodiments, an electronic voucher content item definition module can be provided for centrally defining and managing, by service type, the detailed content item lists of electronic vouchers of the various service types. For files in OFD format, a Chinese label name is defined for the content item corresponding to each node in the data file. For files in formats that require OCR, such as PDF and pictures, a Chinese label name and a recognition area are defined for each content item, so that when character recognition is performed later, the data recognized in each recognition area of the electronic voucher file can be matched to the corresponding content item. Defining a Chinese label for each content item makes the comparison prompt clearer and easier to understand.
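A content item definition of this kind could be represented roughly as below. The Python sketch is an assumption-laden illustration: the class `ContentItemDef`, the `DEFINITIONS` table, the node names, the area coordinates and the prompt wording are all hypothetical. It only shows how one definition can carry a label plus either a node (for OFD) or a recognition area (for formats that need OCR), and how the label can be used in the comparison prompt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed convention


@dataclass(frozen=True)
class ContentItemDef:
    key: str                      # internal name of the content item
    label: str                    # Chinese label name shown to the user
    node: Optional[str] = None    # node in an OFD data file, if applicable
    region: Optional[Box] = None  # recognition area for OCR formats, if applicable


# Hypothetical definitions, maintained per electronic voucher service type.
DEFINITIONS: Dict[str, List[ContentItemDef]] = {
    "invoice": [
        ContentItemDef("invoice_number", "发票号码", node="InvoiceNo", region=(400, 20, 580, 50)),
        ContentItemDef("amount", "金额", node="Total", region=(400, 300, 580, 330)),
    ],
}


def mismatch_prompt(service_type: str, key: str, candidate: str, trusted: str) -> str:
    """Build a comparison prompt that uses the label instead of the internal key."""
    label = next(d.label for d in DEFINITIONS[service_type] if d.key == key)
    return f"{label}: file to be verified has '{candidate}', trusted file has '{trusted}'"


print(mismatch_prompt("invoice", "amount", "98.00", "88.00"))
```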
S403, calculating a hash value of the local electronic certificate to be checked, reading the hash value of the electronic certificate on the blockchain, and comparing consistency of the hash values.
The local electronic certificate is the electronic certificate file to be checked. In some embodiments, the electronic certificate file may have been tampered with during circulation, propagation and the like. If a target user needs to check the authenticity of the electronic certificate file, the target user can upload the electronic certificate file to be checked to a server, where it is stored as the local electronic certificate.
The uplink information stored in the blockchain is tamper-proof and traceable. Thus, in some embodiments, the hash value of the original file of the electronic credential file may be uplinked, that is, stored on the blockchain. After the hash value of the original file is downloaded from the blockchain, the hash value of the local electronic certificate is compared with the hash value of the original file. If the two hash values are consistent, the local electronic certificate is determined to be authentic and unmodified.
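As a non-limiting illustration, the hash consistency check could be sketched as follows. The disclosure does not name a hash algorithm; SHA-256 is assumed here, and the function names are hypothetical.

```python
import hashlib
import hmac


def file_sha256(data: bytes) -> str:
    """Hex digest of the file bytes (SHA-256 is an assumption, not stated in the disclosure)."""
    return hashlib.sha256(data).hexdigest()


def matches_on_chain_hash(local_bytes: bytes, on_chain_hash: str) -> bool:
    """True if the local file hashes to the value read from the blockchain."""
    # compare_digest performs a constant-time comparison of the two hex strings
    return hmac.compare_digest(file_sha256(local_bytes), on_chain_hash.lower())
```

A mismatch shows only that the file differs from the original; it does not indicate which content was changed, which is why the item-by-item comparison is carried out afterwards.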
S404, if the hash values are inconsistent, the trusted voucher file in the blockchain is read.
In some embodiments, the address of the original file of the electronic voucher file may be uplinked together with the hash value. In other embodiments, the uplink information may store the hash value of the original file together with the original file itself, or together with the content information extracted from the original file in advance.
Based on the uplink information in the blockchain, the original file of the electronic voucher file can be obtained and used as the trusted voucher file. In embodiments other than that of fig. 4, the content information extracted from the original file may instead be stored in the uplink information.
Each item of content information is then extracted from both the local electronic certificate and the trusted certificate file obtained based on the blockchain. Specifically, this includes S405 to S407.
S405, identifying the format type of the electronic certificate file, and matching the file to the corresponding content parser according to its file format.
S406, for a national standard OFD format file, the corresponding content parser can combine ZIP decompression (an OFD file is a ZIP-compressed package), XML parsing and other techniques to parse the content of the national standard OFD layout file.
S407, for files of PDF format, picture format and other types, an OCR recognition content analyzer can be called to perform OCR recognition on the files and output characters obtained by recognition.
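As a non-limiting illustration, the format identification and parser matching of S405 to S407 could be sketched as follows. The format is detected from the leading signature bytes of the file (ZIP, PDF, PNG and JPEG signatures); treating every ZIP container as an OFD candidate is a simplifying assumption, and the parser names are hypothetical placeholders.

```python
def detect_format(data: bytes) -> str:
    """Classify a file by its leading signature bytes."""
    if data.startswith(b"PK\x03\x04"):
        return "ofd"    # assumption: a ZIP container is treated as an OFD package
    if data.startswith(b"%PDF"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n") or data.startswith(b"\xff\xd8\xff"):
        return "image"  # PNG or JPEG
    raise ValueError("unsupported file format")


# Hypothetical mapping from detected format to the name of a content parser.
PARSERS = {"ofd": "ofd_parser", "pdf": "ocr_parser", "image": "ocr_parser"}


def select_parser(data: bytes) -> str:
    """Return the name of the content parser matched to the file format."""
    return PARSERS[detect_format(data)]
```

A production system would likely also confirm that a ZIP container actually holds an OFD document before routing it to the OFD parser.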
S408, comparing the local electronic certificate to be checked with each item of content information extracted from the trusted certificate file item by item, and marking the content items with differences.
S409, feeding back all abnormal difference items; for example, all content items in which differences exist may be displayed.
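As a non-limiting illustration, the item-by-item comparison of S408 and the collection of all difference items for S409 could be sketched as follows, assuming that each file has already been reduced to a mapping from content item label to extracted value. The function name and the dictionary keys are hypothetical.

```python
from typing import Dict, List, Optional


def diff_content_items(local: Dict[str, str],
                       trusted: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Compare two label -> value maps item by item and return every differing item."""
    differences = []
    # Visit the union of labels so that missing or extra items are also reported.
    for label in sorted(set(local) | set(trusted)):
        local_value = local.get(label)
        trusted_value = trusted.get(label)
        if local_value != trusted_value:
            differences.append(
                {"item": label, "trusted": trusted_value, "local": local_value}
            )
    return differences
```

Because the loop does not stop at the first mismatch, every differing item is collected and can be displayed at once.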
In some embodiments, S401, S402, S409 may be implemented using a Web (website) service of a B/S (Browser/Server) architecture, providing a visualized Web interface.
As shown in fig. 5, an alternative embodiment of the file verification method provided in the embodiment of the present disclosure may include the following steps:
S501, electronic certificate service type management.
In one embodiment, various electronic credential service types may be maintained in advance. Each entry should include information such as a unique electronic certificate service type, a file format type and a service description. For example, a list interface for managing electronic certificate service types may be provided, offering corresponding operations such as adding, modifying, deleting and querying, where the query list includes columns such as service type, description and electronic certificate file format, as well as a content item editing button.
S502, managing the content item of the electronic certificate service type, and defining a certificate content item.
According to one embodiment of the present disclosure, a specific plurality of content items may be defined for each service type. In some embodiments, a content item editing interface may be provided, and different elements may be presented according to different file types.
For example, for an OFD format electronic certificate, input items such as a Chinese label name and the XML node key of the OFD data file are presented; for a PDF or picture format electronic certificate, input items such as a Chinese label name and an OCR recognition area are presented. The content items can be displayed in a list form, and the operations of adding, deleting and modifying content items are supported. When maintained, the content items may cover all of the content of the electronic voucher.
S503, the user uploads the local electronic certificate to the server as an attachment.
If the user has a verification requirement, the local electronic certificate can be uploaded to the server. The user uploads the local electronic voucher as an attachment and may also upload related business data, such as the user name, the unit to which the user belongs and the file identifier.
S504, receiving the uploaded electronic certificate file, and searching corresponding data on the blockchain.
The embodiment of the disclosure provides a checking interface for supporting a user to upload a local electronic voucher file, receives the uploaded electronic voucher file and related service data, and can search the corresponding data of the uploaded electronic voucher file on a blockchain through a file identifier.
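As a non-limiting illustration, the lookup of uplink information by file identifier could be sketched as follows. A real system would query a blockchain client; here an in-memory dictionary stands in for the ledger, and the names `UplinkRecord` and `InMemoryLedger` and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UplinkRecord:
    file_id: str           # file identifier shared by the file to be checked and the trusted file
    original_hash: str     # hash of the original file
    original_address: str  # where the original file can be fetched from


class InMemoryLedger:
    """Stand-in for a blockchain query interface (for illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, UplinkRecord] = {}

    def put(self, record: UplinkRecord) -> None:
        self._records[record.file_id] = record

    def find(self, file_id: str) -> Optional[UplinkRecord]:
        """Return the uplink record for a file identifier, or None if absent."""
        return self._records.get(file_id)
```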
S505, the hash value of the uploaded local file is calculated and compared with the certification hash on the blockchain.
The certification hash, namely the hash of the original file of the electronic certificate file, is extracted from the data found on the blockchain.
S506, judging whether the hash values are equal. If the hash values are equal, S513 is executed to present the verification result, namely that the verification has passed, and the process ends.
S507, if the hash values are not equal, the certified trusted electronic certificate file is read.
The trusted electronic voucher file is downloaded from the blockchain for subsequent content item parsing.
S508, judging the format type of the electronic voucher file, and matching the corresponding content analyzer according to the format type. The uploaded electronic certificate file to be checked and the stored trusted electronic certificate file are each parsed.
S5091, if the electronic certificate format is identified as OFD, the OFD file may be decompressed by using a decompression toolkit (e.g., java.util.zip) to obtain a plurality of Extensible Markup Language (XML) files.
S5092, parsing the plurality of XML files by invoking a third-party XML parsing service (e.g., dom4j) or by using a preset XML parser (e.g., an Apache XML framework). Finally, the parsing results are paired according to the content item definitions corresponding to the electronic voucher type: the parsed result data are mapped to and associated with the content items configured for the service type to form a set of key-value pairs (Map). Specifically, the extracted content information of each content item is stored in the form of a content item label paired with a content item value.
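As a non-limiting illustration, S5091 and S5092 could be sketched in Python with the standard `zipfile` and `xml.etree.ElementTree` modules instead of the Java tools named above. The function name, the node keys and the file names are hypothetical; the actual element names of the OFD schema are not reproduced here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict


def parse_package(data: bytes, key_to_label: Dict[str, str]) -> Dict[str, str]:
    """Unzip the package, parse each XML entry, and map configured node keys to labels."""
    result: Dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        for name in package.namelist():
            if not name.lower().endswith(".xml"):
                continue  # only XML entries carry structured content
            root = ET.fromstring(package.read(name))
            for node in root.iter():
                # Reduce a namespaced tag such as "{uri}Tag" to "Tag"
                tag = node.tag.rsplit("}", 1)[-1]
                if tag in key_to_label and node.text and node.text.strip():
                    result[key_to_label[tag]] = node.text.strip()
    return result
```

The resulting dictionary plays the role of the key-value pair set (Map), keyed by content item label.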
S510, if the electronic certificate format is identified as PDF or picture format, an OCR identification content analyzer is used for analyzing the PDF, picture and other format files.
Based on the content items configured for the electronic voucher service type, the result data of each content item are recognized item by item within the corresponding recognition area, finally yielding a Map of key-value pairs.
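As a non-limiting illustration, region-by-region recognition could be sketched as follows. The OCR engine is deliberately left as an injected callable, since the disclosure does not name one; `recognize`, `Region` and `extract_by_region` are hypothetical names.

```python
from typing import Callable, Dict, Tuple

Region = Tuple[int, int, int, int]  # (left, top, right, bottom), an assumed convention


def extract_by_region(page: object,
                      regions: Dict[str, Region],
                      recognize: Callable[[object, Region], str]) -> Dict[str, str]:
    """Run the supplied recognizer on each configured area and key the text by label."""
    return {label: recognize(page, box).strip() for label, box in regions.items()}
```

Any OCR library can be wrapped to match the `recognize` signature, so the mapping from recognition area to content item stays independent of the engine used.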
S511, performing item-by-item traversal comparison on the electronic certificate to be checked and the content item of the trusted electronic certificate, and marking the difference item.
S512, feeding back abnormal difference content items in full quantity.
The result is fed back to the checking request terminal through an interface with that terminal: the feedback indicates that the check has failed, and all marked difference content items (including the original value and the tampered value of each difference item) are screened out and returned to the checking request terminal for display.
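As a non-limiting illustration, the feedback returned to the checking request terminal could be serialized as follows. The JSON field names and the assumed shape of each input entry (keys `item`, `trusted` and `local`) are hypothetical; the disclosure does not define a wire format.

```python
import json
from typing import Dict, List, Optional


def build_verification_response(differences: List[Dict[str, Optional[str]]]) -> str:
    """Serialize the check result with the original and tampered value of every difference item."""
    payload = {
        "passed": not differences,
        "differences": [
            {
                "item": d["item"],
                "original_value": d["trusted"],  # value taken from the trusted file
                "tampered_value": d["local"],    # value taken from the uploaded file
            }
            for d in differences
        ],
    }
    # ensure_ascii=False keeps non-ASCII labels (e.g. Chinese label names) readable
    return json.dumps(payload, ensure_ascii=False)
```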
In S501 to S513 shown in fig. 5, all steps except S503 and S513 are performed by a blockchain certification system, which may be deployed on a server as shown in fig. 1. The blockchain certification system provides a functional module for defining electronic certificate types and their content items, and can maintain a corresponding detailed content item list for each service type. It receives an electronic certificate checking request from a user through an electronic certificate checking interface, and checks the authenticity of the electronic certificate by using a hash value so as to judge whether the electronic certificate has been tampered with. An OFD format file content analyzer parses the content information of an OFD file and pairs the content information with content items according to the content item definitions to obtain structured data. An OCR-based content analyzer supports configuring the file area corresponding to each defined content item, and recognizes each item of content information item by item to obtain structured data. Finally, all difference items are marked by comparing each item of content information item by item, and all difference items (including the original value and the tampered value of each difference item) are returned to the checking interface at one time.
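As a non-limiting illustration, the two-stage flow summarized above (a hash check first, content parsing only on mismatch) could be sketched as follows. SHA-256, the function name `verify`, and the injected callables `fetch_trusted` and `parse` are assumptions made for the sketch.

```python
import hashlib
from typing import Callable, Dict


def verify(local: bytes,
           on_chain_hash: str,
           fetch_trusted: Callable[[], bytes],
           parse: Callable[[bytes], Dict[str, str]]) -> dict:
    """Hash check first; fetch and parse the trusted file only when the hashes differ."""
    if hashlib.sha256(local).hexdigest() == on_chain_hash:
        return {"passed": True, "differences": []}
    local_items = parse(local)
    trusted_items = parse(fetch_trusted())  # trusted file is read only on mismatch
    differences = [
        {"item": key,
         "original_value": trusted_items.get(key),
         "tampered_value": local_items.get(key)}
        for key in sorted(set(local_items) | set(trusted_items))
        if local_items.get(key) != trusted_items.get(key)
    ]
    return {"passed": False, "differences": differences}
```

If the hashes differ but no defined content item differs (for example, only content outside the defined items was changed), the sketch reports a failed check with an empty difference list.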
Thus, through the embodiment of the disclosure, the authenticity of the electronic certificate can be checked, and the specific tampered content of the electronic certificate can be further determined, so that service personnel can find problems in an actual application scene. The safe and reliable electronic voucher file in the blockchain evidence storage system and the uploaded electronic voucher file to be checked are respectively analyzed, the two analysis results are compared item by item, the modified specific content item is finally obtained, the front end is fed back, the specific part of the tampered electronic voucher file can be more intuitively presented to a user, friendly and intuitive structured difference comparison data are provided for the user, and the abnormal perception capability and the user experience are greatly improved, so that the problem can be located in time.
It should be noted that, in the technical solution of the present disclosure, the acquiring, storing, using, processing, etc. of data all conform to relevant regulations of national laws and regulations, and various types of data such as personal identity data, operation data, behavior data, etc. relevant to individuals, clients, crowds, etc. acquired in the embodiments of the present disclosure have been authorized.
Based on the same inventive concept, the embodiments of the present disclosure further provide a file verification device, as described in the following embodiments. Since the principle of solving the problem of the embodiment of the device is similar to that of the embodiment of the method, the implementation of the embodiment of the device can be referred to the implementation of the embodiment of the method, and the repetition is omitted.
Fig. 6 is a schematic diagram of a file verification apparatus according to an embodiment of the disclosure, as shown in fig. 6, where the apparatus includes: the device comprises an acquisition module 601, an extraction module 602, a comparison module 603 and a display module 604.
The obtaining module 601 is configured to obtain a file to be verified. The extraction module 602 is configured to extract content information of each content item from the file to be verified based on at least one content item defined in advance. The comparison module 603 is configured to compare, item by item, content information of each content item extracted from the file to be verified with content information of each content item extracted from the trusted file, so as to verify whether the content information of each content item is consistent. Wherein the content information of each content item extracted from the trusted file is obtained based on the uplink information stored on the blockchain. The display module 604 is configured to display content information that the file to be checked is inconsistent with the trusted file according to the item-by-item comparison result.
In some embodiments, the apparatus further comprises a generation module. The generation module is used for generating a hash value of the file to be verified before content information of each content item is extracted from the file to be verified based on at least one predefined content item. The comparison module 603 is configured to compare the hash value of the file to be verified with the hash value of the trusted file. The extraction module 602 is further configured to extract content information of each content item from the file to be verified based on at least one predefined content item when the comparison by the comparison module 603 shows that the hash values are inconsistent.
In some embodiments, the extracting module 602 is further configured to extract the hash value of the trusted file from the uplink information before comparing the hash value of the file to be verified with the hash value of the trusted file.
In some embodiments, the obtaining module 601 is further configured to, before comparing, item by item, the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file, to verify whether the content information of each content item is consistent, obtain, according to the uplink information, the content information of each content item extracted from the trusted file.
In some embodiments, the obtaining module 601 is further configured to obtain, before comparing, item by item, content information of each content item extracted from the file to be verified with content information of each content item extracted from the trusted file, so as to verify whether the content information of each content item is consistent, a file identifier of the file to be verified; the file to be checked and the trusted file have the same file identification. The device also comprises a searching module which is used for searching the uplink information corresponding to the trusted file in the uplink information according to the file identification.
In some embodiments, the extraction module 602 is further to: determining the service type of a file to be checked; determining at least one content extraction range predefined for the service type of the file to be checked; wherein each content extraction range corresponds to each content item one by one; and extracting the content information in each content extraction range of the file to be checked to obtain the content information of the content item corresponding to each content extraction range.
In some embodiments, the files to be verified are files arranged in a preset layout; the content extraction range corresponding to each content item is a preset area in a preset layout; the extraction module 602 is further configured to: and identifying the characters in each preset area to obtain the content information of each content item of the file to be checked.
In some embodiments, the file to be verified is a document of structured data, and content information of a plurality of preset categories in the document of structured data is stored in a preset structure; the content extraction range corresponding to each content item corresponds to at least one of a plurality of preset categories; the extraction module 602 is further configured to: analyzing the file to be checked based on a preset structure to obtain content information of a plurality of preset categories; and determining the content information of each content item of the file to be checked according to the corresponding relation between each preset category and the content item.
The file verification device provided by the embodiment of the disclosure provides the content information of each content item of the trusted file through the uplink information of the blockchain, and performs item-by-item comparison on the content information of each content item extracted from the file to be verified and the content information of each content item extracted from the trusted file so as to verify whether the content information of each content item is consistent or not, and displays inconsistent content information to a user based on the item-by-item comparison result, thereby improving user experience.
Here, the obtaining module 601, the extracting module 602, the comparing module 603 and the displaying module 604 correspond to S201 to S204 in the method embodiment. The examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the method embodiment. It should be noted that the modules described above may be implemented as part of an apparatus in a computer system, such as a set of computer-executable instructions.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
An electronic device 700 according to such an embodiment of the present disclosure is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. Components of electronic device 700 may include, but are not limited to: the at least one processing unit 710, the at least one memory unit 720, and a bus 730 connecting the different system components, including the memory unit 720 and the processing unit 710.
Wherein the storage unit stores program code that is executable by the processing unit 710 such that the processing unit 710 performs steps according to various exemplary embodiments of the present disclosure described in the above-described "exemplary methods" section of the present specification. For example, the processing unit 710 may perform the following steps of the method embodiment described above: acquiring a file to be checked; extracting content information of each content item from the file to be verified based on at least one content item defined in advance; the content information of each content item extracted from the file to be checked is compared with the content information of each content item extracted from the trusted file item by item so as to check whether the content information of each content item is consistent; the content information of the content item extracted from the trusted file is acquired based on the uplink information stored on the blockchain; and displaying the content information of the file to be checked inconsistent with the trusted file according to the item-by-item comparison result.
The memory unit 720 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 7201 and/or cache memory 7202, and may further include Read Only Memory (ROM) 7203.
The storage unit 720 may also include a program/utility 7204 having a set (at least one) of program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 730 may be a bus representing one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 700 may also communicate with one or more external devices 740 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 700, and/or any device (e.g., router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 750. Also, electronic device 700 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 760. As shown, network adapter 760 communicates with other modules of electronic device 700 over bus 730. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 700, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer program product comprising a computer program which, when executed by a processor, implements the above-described file verification method.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium, which may be a readable signal medium or a readable storage medium, is also provided. The computer readable storage medium has stored thereon a program product capable of implementing the above-described method of the present disclosure. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
More specific examples of the computer readable storage medium in the present disclosure may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In this disclosure, a computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Alternatively, the program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In particular implementations, the program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the description of the above embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (11)

1. A method for verifying a file, comprising:
acquiring a file to be checked;
extracting content information of each content item from the file to be verified based on at least one content item defined in advance;
the content information of each content item extracted from the file to be checked is compared with the content information of each content item extracted from the trusted file item by item so as to check whether the content information of each content item is consistent; the content information of each content item extracted from the trusted file is acquired based on the uplink information stored on the blockchain;
and displaying the content information of the file to be checked inconsistent with the trusted file according to the item-by-item comparison result.
2. The file verification method according to claim 1, wherein before extracting content information of each content item from the file to be verified based on at least one content item defined in advance, the method further comprises:
generating a hash value of the file to be checked;
comparing the hash value of the file to be checked with the hash value of the trusted file; and if the comparison results are inconsistent, extracting the content information of each content item from the file to be checked based on at least one content item defined in advance.
3. The file verification method according to claim 2, further comprising, before comparing the hash value of the file to be verified with the hash value of the trusted file:
and extracting the hash value of the trusted file from the uplink information.
4. The file verification method according to claim 1, wherein before comparing the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file item by item to verify whether the content information of each content item is identical, further comprising:
and acquiring the content information of each content item extracted from the trusted file according to the uplink information.
5. The file verification method according to any one of claims 1 to 4, wherein before comparing the content information of each content item extracted from the file to be verified with the content information of each content item extracted from the trusted file item by item to verify whether the content information of each content item is identical, further comprising:
acquiring a file identifier of the file to be checked; the file to be checked and the trusted file have the same file identification;
and searching the uplink information corresponding to the trusted file in the uplink information according to the file identification.
6. The file verification method according to claim 1, wherein the extracting content information of each content item from the file to be verified based on at least one content item defined in advance includes:
determining the service type of the file to be checked;
determining at least one content extraction range predefined for the service type of the file to be checked; wherein each content extraction range corresponds to each content item one by one;
and extracting the content information in each content extraction range of the file to be checked to obtain the content information of the content item corresponding to each content extraction range.
7. The file verification method according to claim 6, wherein the file to be verified is a file arranged in a preset layout; the content extraction range corresponding to each content item is a preset area in the preset layout; and the extracting the content information in each content extraction range of the file to be checked to obtain the content information of the content item corresponding to each content extraction range comprises:
and identifying characters in each preset area to obtain content information of each content item of the file to be checked.
8. The file verification method according to claim 6, wherein the file to be verified is a structured-data document in which content information of a plurality of preset categories is stored in a preset structure; the content extraction range corresponding to each content item corresponds to at least one of the plurality of preset categories; and extracting the content information within each content extraction range of the file to be verified to obtain the content information of the content item corresponding to each content extraction range comprises:
parsing the file to be verified based on the preset structure to obtain the content information of the plurality of preset categories; and
determining the content information of each content item of the file to be verified according to the correspondence between the preset categories and the content items.
9. A file verification device, comprising:
an acquisition module, configured to acquire a file to be verified;
an extraction module, configured to extract content information of each content item from the file to be verified based on at least one predefined content item;
a comparison module, configured to compare, item by item, the content information of each content item extracted from the file to be verified with the content information of each content item extracted from a trusted file, so as to verify whether the content information of each content item is consistent, wherein the content information of each content item extracted from the trusted file is acquired based on uplink information stored on a blockchain; and
a display module, configured to display, according to the item-by-item comparison result, the content information of the file to be verified that is inconsistent with the trusted file.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the file verification method of any one of claims 1 to 8 via execution of the executable instructions.
11. A computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the file verification method of any one of claims 1 to 8.
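As an illustrative, non-limiting sketch of the flow recited in claims 5, 6 and 8, the following Python example looks up the trusted record by file identifier, extracts content items from a structured document according to ranges predefined for its service type, and compares them item by item. Everything named here is an assumption made for illustration: the in-memory `UPLINK_INFO` dictionary stands in for uplink information stored on a blockchain, the JSON document format and the `EXTRACTION_RANGES` mapping are hypothetical, and none of it is taken from the disclosure itself.

```python
import json
from dataclasses import dataclass

# Hypothetical stand-in for the uplink (on-chain) information:
# file identifier -> content information of each content item of the trusted file.
# A real deployment would query a blockchain node instead of a dict.
UPLINK_INFO = {
    "contract-001": {"party_a": "Alice Ltd", "party_b": "Bob Inc", "amount": "1000"},
}

# Hypothetical extraction ranges per service type:
# content item -> preset category (here, a top-level JSON key) of the document.
EXTRACTION_RANGES = {
    "contract": {"party_a": "partyA", "party_b": "partyB", "amount": "totalAmount"},
}


@dataclass(frozen=True)
class Mismatch:
    item: str
    checked_value: object
    trusted_value: object


def extract_items(raw_document: str, service_type: str) -> dict:
    """Parse the structured document and map preset categories to content items."""
    categories = json.loads(raw_document)
    ranges = EXTRACTION_RANGES[service_type]
    return {item: categories.get(category) for item, category in ranges.items()}


def verify(raw_document: str, file_id: str, service_type: str) -> list:
    """Compare item by item and return the content items that are inconsistent."""
    checked = extract_items(raw_document, service_type)
    trusted = UPLINK_INFO[file_id]  # lookup by the shared file identifier
    return [
        Mismatch(item, checked[item], trusted.get(item))
        for item in EXTRACTION_RANGES[service_type]
        if checked[item] != trusted.get(item)
    ]


if __name__ == "__main__":
    # A document whose amount was altered relative to the trusted record.
    tampered = json.dumps(
        {"partyA": "Alice Ltd", "partyB": "Bob Inc", "totalAmount": "9000"}
    )
    for m in verify(tampered, "contract-001", "contract"):
        print(f"{m.item}: file has {m.checked_value!r}, trusted record has {m.trusted_value!r}")
```

In this sketch only the inconsistent items are returned, so a display step can show the user exactly which content items differ rather than a single pass/fail result. For a file in a preset layout, as in claim 7, `extract_items` would instead run character recognition over each preset area; that variant is not shown.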
CN202311864503.9A 2023-12-29 2023-12-29 File verification method and device, electronic equipment and storage medium Pending CN117763627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311864503.9A CN117763627A (en) 2023-12-29 2023-12-29 File verification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311864503.9A CN117763627A (en) 2023-12-29 2023-12-29 File verification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117763627A true CN117763627A (en) 2024-03-26

Family

ID=90318262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311864503.9A Pending CN117763627A (en) 2023-12-29 2023-12-29 File verification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117763627A (en)

Similar Documents

Publication Publication Date Title
US10621381B2 (en) Event log tamper detection
US10754634B1 (en) Customized application package with context specific token
US9104528B2 (en) Controlling the release of private information using static flow analysis
US11163906B2 (en) Adaptive redaction and data releasability systems using dynamic parameters and user defined rule sets
KR20080083300A (en) Remote module incorporation into a container document
US10261772B2 (en) Method and device for generating image file
US9954900B2 (en) Automating the creation and maintenance of policy compliant environments
CN105141632B (en) Method and apparatus for checking the page
WO2022111591A1 (en) Page generation method and apparatus, storage medium, and electronic device
CN113382083B (en) Webpage screenshot method and device
CN113627145A (en) Method, device, equipment and medium for generating file of parameterized configuration
CN114139503A (en) Document content processing method, device, equipment and storage medium
CN114139502A (en) Document content processing method, device, equipment and storage medium
CN108052842B (en) Signature data storage and verification method and device
CN117763627A (en) File verification method and device, electronic equipment and storage medium
CN111078569B (en) Method and device for testing optical character recognition application and storage medium
CN111367898A (en) Data processing method, device, system, electronic equipment and storage medium
CN114121049B (en) Data processing method, device and storage medium
CN115757191B (en) Data processing method and device
US20230344650A1 (en) Validation of images via digitally signed tokens
CN113221157B (en) Equipment upgrading method and device
CN115712905A (en) Anti-counterfeiting code generation method, electronic device and computer readable medium
Studiawan Forensic analysis of iOS binary cookie files
CN117034363A (en) Electronic signature method, electronic signature device, electronic equipment and storage medium
CN116310423A (en) Image recognition method, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination