CN115048665A - Excel file-based information hiding method, device, equipment and storage medium - Google Patents


Info

Publication number
CN115048665A
Authority
CN
China
Prior art keywords
watermark
file
picture
information
markup language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210749023.7A
Other languages
Chinese (zh)
Inventor
杨文秀
吴建荣
常潇
史小松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Longzhi Digital Technology Service Co Ltd
Original Assignee
Beijing Longzhi Digital Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Longzhi Digital Technology Service Co Ltd
Priority to CN202210749023.7A
Publication of CN115048665A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107 File encryption

Abstract

The disclosure provides an Excel file-based information hiding method, device, equipment and storage medium. The method comprises the following steps: reading the Excel file as a binary stream file, and parsing the binary stream file based on a compound document structure to obtain a compound document containing a plurality of universal markup language texts; performing an encryption operation on original information to obtain a watermark code, and processing the watermark code based on the coding rules of a picture file in a preset format to obtain a watermark picture; modifying a plurality of universal markup language texts in the compound document based on the watermark picture and its picture information so as to add the watermark picture to a cell; and modifying the attribute file of the compound document based on the watermark code so as to inject the watermark code into at least one tag in the attribute file, thereby obtaining an Excel file to which the watermark picture has been added and into which the watermark code has been injected. Hidden information added by the method is hard to perceive, is not removed by optimization, is hard to crack, and is more robust.

Description

Excel file-based information hiding method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an information hiding method, apparatus, device, and storage medium based on an Excel file.
Background
At present there are many protection schemes for Excel files; for example, an Excel file can be protected by adding a visible text watermark or a background picture watermark to it. As a novel information hiding technology, digital watermarking offers a solution to a series of problems on open networks, such as copyright protection, source authentication, file leakage, user tracking and identity authentication.
In the prior art, an Excel file is protected mainly in one of two ways. The first is to add a visible text watermark or background picture watermark to the Excel file; however, this protection can be defeated by damaging the file, and a visible watermark is highly perceptible and therefore easy to identify. The second is to modify XML tag attributes in the Excel file structure; however, the modified attributes are optimized away when the file is saved with other software (such as WPS), so the watermark information is lost.
Therefore, existing methods that protect an Excel file by adding information to it suffer from highly perceptible watermark information, information that is easy to identify and to optimize away, information that is easy to crack, and poor robustness.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide an information hiding method, apparatus, device and storage medium based on an Excel file, so as to solve the problems in the prior art that watermark information is highly perceptible, that the information is easy to identify and to optimize away, that the information is easy to crack, and that robustness is poor.
In a first aspect of the embodiments of the present disclosure, an information hiding method based on an Excel file is provided, including: reading an Excel file to be processed as a binary stream file, and parsing the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts; acquiring original information to be hidden, performing an encryption operation on the original information to obtain a watermark code corresponding to the original information, and processing the watermark code based on the coding rules of a picture file in a preset format to obtain a watermark picture; modifying a plurality of universal markup language texts in the compound document based on the watermark picture and the picture information corresponding to the watermark picture, so as to add the watermark picture to a cell in the Excel file; and modifying the attribute file of the compound document based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file of the compound document, thereby obtaining an Excel file to which the watermark picture has been added and into which the watermark code has been injected.
In a second aspect of the embodiments of the present disclosure, an information hiding device based on an Excel file is provided, including: a reading module configured to read the Excel file to be processed as a binary stream file, and to parse the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts; an encoding module configured to acquire original information to be hidden, perform an encryption operation on the original information to obtain a watermark code corresponding to the original information, and process the watermark code based on the coding rules of a picture file in a predetermined format to obtain a watermark picture; an adding module configured to modify a plurality of universal markup language texts in the compound document based on the watermark picture and the picture information corresponding to the watermark picture, so as to add the watermark picture to a cell in the Excel file; and an injection module configured to modify the attribute file of the compound document based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file of the compound document, thereby obtaining an Excel file to which the watermark picture has been added and into which the watermark code has been injected.
In a third aspect of the embodiments of the present disclosure, an electronic device is provided, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the method when executing the program.
In a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
The embodiments of the present disclosure adopt at least one technical solution that can achieve the following beneficial effects:
An Excel file to be processed is read as a binary stream file, and the binary stream file is parsed based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts; original information to be hidden is acquired, an encryption operation is performed on it to obtain the corresponding watermark code, and the watermark code is processed based on the coding rules of a picture file in a preset format to obtain a watermark picture; a plurality of universal markup language texts in the compound document are modified based on the watermark picture and its corresponding picture information, so as to add the watermark picture to a cell in the Excel file; and the attribute file of the compound document is modified based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file, thereby obtaining an Excel file to which the watermark picture has been added and into which the watermark code has been injected. Hidden information added by the method is hard to perceive, is not removed by optimization, is hard to crack, and is more robust.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed for the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive efforts.
Fig. 1 is a schematic flowchart of an information hiding method based on an Excel file according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of an information hiding device based on an Excel file according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
As described in the background art, existing protection schemes for Excel files either add a visible text watermark or background picture watermark to the Excel file, or add a watermark code to XML tag attributes of the Excel file structure. However, a visible text watermark or background picture watermark is highly perceptible and easy to identify, and can also be defeated by damaging the file; modified XML tag attributes in the Excel file structure are optimized away when the file is saved with WPS software, so the watermark information is lost and the purpose of hiding information in the Excel file cannot be achieved.
In view of the problems in the prior art, research underlying the present disclosure found that, in an Excel document with an OOXML structure, a picture carrying user watermark information can be added to a cell of the document by modifying the OOXML structure so as to insert a cell picture, and that setting both the visible width and the visible height of the watermark picture to 0 makes the picture imperceptible. Because inserting pictures into cells is a fully implemented function of Excel documents, an Excel document to which such a picture has been added is recognized as having a normal structure and is not optimized. In addition, study of the OOXML structure shows that a specific XML file can be modified in order to change the file attributes.
The Excel file-based information hiding method of the present disclosure therefore parses the OOXML structure of the Excel file and hides watermark information in a table cell and in the file attributes, so that the hidden information in the processed Excel file is hard to perceive, is not removed by optimization, is hard to crack, and is highly robust.
The technical solution of the information hiding method provided by the embodiments of the present disclosure is described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic flowchart of an information hiding method based on an Excel file according to an embodiment of the present disclosure. The Excel file-based information hiding method of fig. 1 may be performed by a system server. As shown in fig. 1, the information hiding method based on the Excel file may specifically include:
S101, reading an Excel file to be processed as a binary stream file, and parsing the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts;
S102, acquiring original information to be hidden, performing an encryption operation on the original information to obtain a watermark code corresponding to the original information, and processing the watermark code based on the coding rules of a picture file in a preset format to obtain a watermark picture;
S103, modifying a plurality of universal markup language texts in the compound document based on the watermark picture and the picture information corresponding to the watermark picture, so as to add the watermark picture to a cell in the Excel file;
S104, modifying the attribute file of the compound document based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file of the compound document, thereby obtaining an Excel file to which the watermark picture has been added and into which the watermark code has been injected.
Specifically, the Excel file in the embodiments of the present disclosure is a spreadsheet file containing a plurality of cells. The predetermined compound document structure is the OOXML structure corresponding to the Excel file. OOXML stands for Office Open XML, also called OpenXML; it is an XML-based office document format that covers word processing documents, spreadsheets, presentations, charts, diagrams, shapes and other graphical material. It is a technical specification developed by Microsoft for the Office 2007 product that has since become an international document format standard, and it is compatible with the earlier international standard ODF (Open Document Format) and the Chinese document standard UOF (Unified Office Document Format).
Further, a universal markup language text is an XML document that forms part of the OOXML structure. XML is the abbreviation of Extensible Markup Language, a markup language used to give electronic documents structure; it can be used to mark up data and define data types, and it is a source language that allows users to define their own markup languages. The compound document is the OOXML-structured document, which contains a plurality of XML files, each storing different information.
According to the technical solution provided by the embodiments of the present disclosure, the Excel file is read and parsed according to the OOXML structure to obtain an OOXML-structured document containing a plurality of XML files. The original information to be hidden (for example, the user's account identification and access time) is acquired, and the MD5 algorithm is applied to it to obtain a 32-character hexadecimal code, which serves as the complete watermark code. Based on the coding rules of a picture file in a preset format, the watermark code is used as the color and transparency information of a watermark picture and is processed to obtain the watermark picture. Finally, the watermark picture is added to a cell of the Excel file by modifying the OOXML structure of the Excel file, and the watermark code is injected into specific tags of the attribute file in the OOXML structure, thereby modifying the attributes of the Excel file. The modified Excel file thus contains both a cell with the inserted watermark picture and watermark code information in its file attributes. The information hidden in the Excel file is hard for a user to perceive, is not removed by optimization, is hard to crack, and is highly robust.
In some embodiments, before the Excel file to be processed is read as a binary stream file, the method further comprises: acquiring an Excel file downloaded by a user from a system, taking the downloaded Excel file as the Excel file to be processed, and acquiring the user's account identification and access time at the moment of download.
Specifically, personal information needs to be embedded in an Excel file that a user downloads from an enterprise internal platform system, which effectively deters users from leaking internal enterprise files. If a file is nevertheless leaked, the watermark information carried in it can be extracted by watermark tracing, and the corresponding user information can be obtained by querying a database with that watermark information, so that the user who leaked the file can be traced.
Further, the Excel file downloaded by the user is taken as the Excel file to be processed. When the user downloads the file through the enterprise internal platform, the user's unique account identification (that is, the account ID) and access time (that is, the recorded time at which the user accessed the enterprise internal platform) are acquired, and this information serves as the basic information for generating the watermark code.
In some embodiments, the compound document structure employs an OOXML structure; reading an Excel file to be processed into a binary stream file, analyzing the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts, wherein the method comprises the following steps: reading an Excel file to be processed so as to read the Excel file into a binary stream file, analyzing the binary stream file based on an OOXML structure of the Excel file, and obtaining an OOXML document corresponding to the Excel file, wherein the OOXML document comprises a plurality of universal markup language texts, and the universal markup language texts are XML files.
Specifically, when an Excel file is read, the Excel file is read as a binary stream file, and then the read binary stream file is analyzed based on an OOXML structure of the Excel file to obtain an OOXML document corresponding to the Excel file, wherein the OOXML document comprises a plurality of XML files. In practical applications, each XML file corresponds to a different composition structure of the OOXML document, such as an xl/media structure.
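The reading and parsing step above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: it relies on the fact that an .xlsx file is a ZIP package of OOXML parts, and the function names `read_ooxml_parts` and `xml_part_names` are hypothetical.

```python
import io
import zipfile


def read_ooxml_parts(xlsx_bytes: bytes) -> dict[str, bytes]:
    """Map each part name in the .xlsx package to its raw bytes."""
    # The binary stream is wrapped in a file-like object so that no
    # temporary file is needed.
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        return {name: package.read(name) for name in package.namelist()}


def xml_part_names(parts: dict[str, bytes]) -> list[str]:
    """Names of the XML parts, including .rels relationship parts."""
    return sorted(n for n in parts if n.endswith((".xml", ".rels")))
```

In use, the bytes of the downloaded file would be passed to `read_ooxml_parts`, and parts such as `xl/workbook.xml` or the contents of `xl/media/` could then be looked up by name.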
In some embodiments, acquiring the original information to be hidden and performing an encryption operation on it to obtain the corresponding watermark code comprises: acquiring the account identification, the user access time and a pre-generated random value, combining them into a character string, and applying an encryption algorithm to the character string to obtain the watermark code; the encryption algorithm is the MD5 algorithm (strictly speaking a hash function, referred to in this disclosure as an encryption algorithm), and the watermark code is a 32-character hexadecimal code.
Specifically, before the watermark information is generated, the user's account identification (that is, the unique account ID), the user access time and a pre-generated random value are acquired; from this basic information, the watermark picture is then generated using the MD5 algorithm and the coding rules of a PNG picture. In practical applications, the picture file in the predetermined format may be in any picture format supported by Excel files, including but not limited to PNG and JPG.
Further, the account identification, the user access time and the pre-generated random value are combined into a character string made up of the characters 0-9 and a-z. The MD5 algorithm is applied to this character string to obtain a hexadecimal code 32 characters long (that is, 128 bits), which is taken as the complete watermark code and denoted, for example, M. It should be noted that the embodiments of the present disclosure are not limited to generating the watermark code as a 32-character hexadecimal code with the MD5 algorithm; other generation rules are equally applicable, for example an asymmetric encryption algorithm may be used instead, and the watermark code may be of any length (for example, a 16-character code).
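As a rough sketch of this step, the watermark code M might be derived as below. The concatenation order, the timestamp format and the helper names `make_nonce` and `make_watermark_code` are assumptions made for illustration; the disclosure only states that the three values are combined into one string and passed through MD5.

```python
import hashlib
import secrets


def make_nonce(n_bytes: int = 8) -> str:
    """Pre-generated random value, as lowercase hex (characters 0-9, a-f)."""
    return secrets.token_hex(n_bytes)


def make_watermark_code(account_id: str, access_time: str, nonce: str) -> str:
    """Combine the inputs into one string and return its MD5 hex digest."""
    # Assumption: plain concatenation in this order; any fixed rule works
    # as long as the resulting code is stored alongside the user record.
    combined = f"{account_id}{access_time}{nonce}"
    return hashlib.md5(combined.encode("utf-8")).hexdigest()
```

The digest is always 32 hexadecimal characters (128 bits). Because MD5 is one-way, tracing depends on storing M with the user record in the database rather than on decoding M.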
In some embodiments, processing the watermark encoding to obtain the watermark picture based on the encoding rule of the picture file with the predetermined format includes: analyzing a binary file structure of the picture file with the preset format to obtain a coding rule of the picture file with the preset format; taking the watermark code as the color and transparency information of the picture file with the preset format, and processing the watermark code by using a coding rule to obtain a watermark picture; the picture file with the preset format adopts a picture file with a PNG format.
Specifically, after generating the watermark code, analyzing the binary file structure of the PNG picture to obtain the coding rule of the PNG picture file, using the watermark code M as the color and transparency information of the watermark picture, processing the watermark code M by using the coding rule of the PNG picture file to generate a PNG picture of 2px by 2px, using the generated PNG picture as the watermark picture, and recording the watermark picture as P.
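One possible reading of this step is sketched below: a 32-character hexadecimal code is 16 bytes, which is exactly the pixel data of a 2px by 2px image with 8-bit red, green, blue and alpha channels. The byte-to-channel mapping and the function name `watermark_png` are assumptions; the disclosure does not specify how M is laid out in the color and transparency values.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 of type+data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def watermark_png(code_hex: str) -> bytes:
    """Encode a 32-digit hex code as a 2x2 8-bit RGBA PNG."""
    raw = bytes.fromhex(code_hex)
    if len(raw) != 16:
        raise ValueError("expected 32 hexadecimal digits (16 bytes)")
    # IHDR: width 2, height 2, bit depth 8, colour type 6 (RGBA),
    # compression 0, filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", 2, 2, 8, 6, 0, 0, 0)
    # Each scanline is a filter-type byte (0 = None) followed by
    # 2 pixels x 4 bytes, so the 16 code bytes fill the image exactly.
    scanlines = b"\x00" + raw[:8] + b"\x00" + raw[8:]
    return (PNG_SIGNATURE
            + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(scanlines))
            + _chunk(b"IEND", b""))
```

The returned bytes form the watermark picture P and can be written directly into the package without an imaging library.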
Further, after generating the watermark code and the watermark picture, the embodiments of the disclosure transmit the user-related information, the watermark information and so on to the server over the HyperText Transfer Protocol (HTTP) or HTTP Secure (HTTPS) and store them in the database. In practical applications, the user-related information includes not only the user's unique account ID but also basic information about the Excel file, information generated when the user accesses the Excel file, and the like.
After the watermark code and the watermark picture are generated, the OOXML structure of the Excel file is modified so that the watermark picture is added to a cell and the watermark code is added to the file attributes, which reduces perceptibility to the user and achieves the purpose of hiding information. The manner in which these two pieces of information are added is described in detail below with reference to specific embodiments.
In some embodiments, modifying the plurality of universal markup language texts in the composite document based on the watermark picture and the picture information corresponding to the watermark picture includes: modifying a plurality of general markup language texts in an OOXML document based on a watermark picture and picture information of the watermark picture, inserting the watermark picture into a first general markup language text, inserting a label of the watermark picture into a second general markup language text, inserting position and size information of the watermark picture into a third general markup language text, inserting an associated label of the watermark picture into a fourth general markup language text and a fifth general markup language text, and inserting a Default label and an Override label into a sixth general markup language text.
Specifically, the image information of the watermark image comprises information such as a label, a position, a size and an associated label of the watermark image, and the PNG image carrying the watermark coding information is inserted into a cell of the Excel file by modifying an OOXML structure of the Excel file, so that the watermark image is hidden in the cell of the Excel file, and the cell for adding the watermark image can be a preset cell or a randomly selected cell.
Furthermore, a plurality of universal markup language texts in the OOXML document are modified as follows: the PNG picture (that is, the watermark picture) is first inserted into the xl/media structure; a watermark picture tag is inserted into xl/worksheets/sheet1.xml; the position and size information of the watermark picture is inserted into xl/drawings/drawing1.xml; the associated (relationship) tags of the file are inserted into xl/worksheets/_rels/sheet1.xml.rels and xl/drawings/_rels/drawing1.xml.rels; and Default and Override tags are inserted into [Content_Types].xml, in which the added XML files and types must be declared.
Here, xl/media corresponds to the first universal markup language text and stores the pictures of the Excel table; xl/worksheets/sheet1.xml corresponds to the second universal markup language text; xl/drawings/drawing1.xml corresponds to the third universal markup language text; xl/worksheets/_rels/sheet1.xml.rels and xl/drawings/_rels/drawing1.xml.rels correspond to the fourth and fifth universal markup language texts, respectively; and [Content_Types].xml corresponds to the sixth universal markup language text.
Furthermore, by modifying the OOXML structure of the Excel file, a PNG picture (namely a watermark picture) is inserted into the cell of the Excel file, the visible size of the picture can be set to be 0, and the position of the picture can be any cell position.
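Two of the modifications listed above, storing the picture under xl/media and declaring its type in [Content_Types].xml, can be sketched as follows. This is deliberately partial: the sheet, drawing and relationship parts that anchor the picture to a cell with zero visible size are not shown. The function name `add_media_part` and the default part name `xl/media/wm.png` are hypothetical, and the sketch assumes that no part with that name already exists in the package.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def add_media_part(xlsx_bytes: bytes, png_bytes: bytes,
                   part_name: str = "xl/media/wm.png") -> bytes:
    """Copy the package, add a PNG part and declare the png extension."""
    ET.register_namespace("", CT_NS)  # keep the default namespace on output
    src = zipfile.ZipFile(io.BytesIO(xlsx_bytes))
    out = io.BytesIO()
    with src, zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "[Content_Types].xml":
                root = ET.fromstring(data)
                declared = {d.get("Extension", "").lower()
                            for d in root.findall(f"{{{CT_NS}}}Default")}
                if "png" not in declared:
                    # Declare image/png as the type of every *.png part.
                    root.insert(0, ET.Element(
                        f"{{{CT_NS}}}Default",
                        {"Extension": "png", "ContentType": "image/png"}))
                data = ET.tostring(root, encoding="UTF-8",
                                   xml_declaration=True)
            dst.writestr(item.filename, data)
        dst.writestr(part_name, png_bytes)
    return out.getvalue()
```

A complete implementation would additionally edit the five remaining parts named in the text so that the picture is referenced from a cell rather than merely stored in the package.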
In some embodiments, modifying the attribute file of the compound document based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file of the compound document, comprises: acquiring the attribute file of the OOXML document, injecting the watermark code into a first tag of the attribute file so as to modify the "last saved by" attribute, and injecting the watermark code into a second tag of the attribute file so as to modify the description (remarks) attribute.
Specifically, in addition to adding a watermark picture to a cell of the Excel file, the watermark code also needs to be injected into the attribute information of the Excel file in order to make the watermark injection robust. This injection is likewise realized by modifying the OOXML structure, for example by modifying a specific XML file in the OOXML structure in order to change the file attributes.
Furthermore, the "last saved by" file attribute is modified by injecting the watermark code into the cp:lastModifiedBy tag of docProps/core.xml in the OOXML structure, and the description file attribute is modified by injecting the watermark code into the dc:description tag of docProps/core.xml. In practical applications, the "last saved by" attribute keeps only the watermark code of the most recent operating user, whereas the description can keep the watermark codes of several operating users by appending them one after another.
Further, in the embodiments of the disclosure, watermark code information is added only to the "last saved by" and description (remarks) attributes of the file when the watermark is injected; it should be understood that other attribute information of the file may also be modified in practical applications.
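The injection into docProps/core.xml might look like the sketch below, which overwrites cp:lastModifiedBy and appends to dc:description. The separator character and the names `inject_code` and `_get_or_create` are assumptions; the namespace URIs are the standard ones for OOXML core properties and Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/"
          "metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
# Register prefixes so that serialization keeps cp:, dc: and dcterms:
# instead of generated names such as ns0:.
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)


def _get_or_create(root: ET.Element, prefix: str, tag: str) -> ET.Element:
    """Return the child element, creating it when the file lacks it."""
    qname = f"{{{NS[prefix]}}}{tag}"
    elem = root.find(qname)
    return elem if elem is not None else ET.SubElement(root, qname)


def inject_code(core_xml: bytes, code: str, sep: str = ";") -> bytes:
    """Write the code into lastModifiedBy and append it to description."""
    root = ET.fromstring(core_xml)
    # "Last saved by" keeps only the most recent user's code.
    _get_or_create(root, "cp", "lastModifiedBy").text = code
    # The description accumulates codes one after another.
    desc = _get_or_create(root, "dc", "description")
    desc.text = f"{desc.text}{sep}{code}" if desc.text else code
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

Calling `inject_code` once per download builds up the chain of operating users in the description while the "last saved by" field always reflects the latest one.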
It should be noted that, in the embodiments of the present disclosure, the OOXML structure of the Excel file is modified so that the watermark picture is added to a cell of the Excel file and the watermark code is injected into the attribute file of the Excel file, thereby hiding the watermark information and protecting the Excel file against leakage. It should be understood, however, that of the two kinds of watermark injection provided in the above embodiments, the successful injection of either one means that watermark information has been successfully added to the Excel file; as long as either one succeeds, the Excel file is protected against leakage and its watermark information can be traced. In practical applications, the Excel file to which the watermark information has been added can be denoted file E.
The above embodiment describes the generation of the watermark information and the adding process of the watermark information in detail, and a tracing method of the Excel file after adding the watermark information is described below with reference to a specific embodiment.
The embodiment of the disclosure also provides a technical scheme for tracing the watermark of the Excel file, wherein the tracing of the watermark information mainly comprises the extraction and query of the watermark information. Corresponding to the two ways of adding watermark information provided in the above embodiments, the watermark tracing of the embodiment of the present disclosure also includes two methods of extracting watermark information.
First, to extract the PNG pictures (namely the watermark pictures) from an Excel file, all PNG pictures stored under xl/media in the OOXML structure of the Excel file are extracted and screened to keep only those with a size of 2 px by 2 px. Each such PNG picture is then decoded by parsing its binary file structure, and its color and transparency information is read out to recover the watermark code M.
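The picture-based extraction can be sketched as follows, using only the standard library. Several points are assumptions made for illustration, since the text does not fix them: the function names are hypothetical, the watermark picture is taken to be an 8-bit RGBA, non-interlaced PNG whose scanlines use filter type 0, and the 16 bytes of the four pixels, read row by row, are taken to be the watermark code.

```python
import io
import struct
import zipfile
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def png_chunks(data: bytes):
    """Yield (type, body) for each chunk after the 8-byte PNG signature."""
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + body + CRC


def decode_watermark(data: bytes):
    """Return the hex code hidden in a 2x2 RGBA PNG, or None if not one."""
    if not data.startswith(PNG_SIG):
        return None
    chunks = list(png_chunks(data))
    ihdr = next((body for ctype, body in chunks if ctype == b"IHDR"), None)
    if ihdr is None or len(ihdr) != 13:
        return None
    width, height, depth, color, _, _, interlace = struct.unpack(">IIBBBBB", ihdr)
    if (width, height, depth, color, interlace) != (2, 2, 8, 6, 0):
        return None  # screen out everything except 2 px by 2 px RGBA
    raw = zlib.decompress(b"".join(b for t, b in chunks if t == b"IDAT"))
    stride = 1 + width * 4  # one filter byte plus RGBA bytes per row
    rows = [raw[i * stride:(i + 1) * stride] for i in range(height)]
    if any(len(row) != stride or row[0] != 0 for row in rows):
        return None  # assumption: unfiltered scanlines only
    return b"".join(row[1:] for row in rows).hex()


def extract_watermarks(xlsx_bytes: bytes) -> list:
    """Collect the codes of all candidate watermark pictures in xl/media."""
    found = []
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        for name in package.namelist():
            if name.startswith("xl/media/") and name.lower().endswith(".png"):
                code = decode_watermark(package.read(name))
                if code:
                    found.append(code)
    return found
```

Pictures that fail any of the checks are skipped rather than treated as errors, so ordinary images in the workbook do not interfere with the search.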
Second, to extract the watermark code injected into the Excel file, the code can be read directly from the file attributes: the file is opened with Microsoft Excel or WPS, the file attribute details are located, and the watermark code M is obtained from the information displayed there. Then, by matching the obtained watermark code against the user information uploaded to the database of the background system server, the user who leaked the file can be traced.
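The same attribute values can also be read programmatically instead of through the Excel or WPS dialog. The sketch below is illustrative only: read_watermark_codes and trace are hypothetical names, the records dictionary stands in for the background system's database, and codes in dc:description are assumed to be separated by whitespace.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_watermark_codes(xlsx_bytes: bytes) -> list:
    """Return the codes found in core.xml, most recent operator first."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    last = root.findtext("cp:lastModifiedBy", default="", namespaces=NS)
    desc = root.findtext("dc:description", default="", namespaces=NS)
    codes = []
    for code in [last, *desc.split()]:
        if code and code not in codes:
            codes.append(code)
    return codes


def trace(codes: list, records: dict) -> list:
    """Map codes to user records; unknown codes are ignored."""
    return [records[code] for code in codes if code in records]
```

In this sketch the first entry of the result corresponds to the cp:lastModifiedBy value, that is, the most recent operating user.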
According to the technical scheme provided by the embodiment of the disclosure, the OOXML structure of the Excel file is parsed and modified, the unique identification information of a user is encoded, and the encoded watermark information is hidden in the cells and the file attributes of the Excel file; the cell pictures are added and the file attributes are modified through OOXML operations; an enterprise's internal system can thus record the chain of users who have operated on the Excel document and store this data in the cells and the file attributes of the Excel file.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 2 is a schematic structural diagram of an information hiding device based on an Excel file according to an embodiment of the present disclosure. As shown in fig. 2, the information hiding apparatus based on an Excel file includes:
the parsing module 201 is configured to read the Excel file to be processed into a binary stream file, and parse the binary stream file based on a predetermined compound document structure to obtain a compound document containing a plurality of universal markup language texts;
the encoding module 202 is configured to acquire original information for hiding, perform an encryption operation on the original information to obtain a watermark code corresponding to the original information, and process the watermark code based on an encoding rule of a picture file with a predetermined format to obtain a watermark picture;
the modification module 203 is configured to modify the plurality of universal markup language texts in the compound document based on the watermark picture and the picture information corresponding to the watermark picture, so as to add the watermark picture to a cell in the Excel file;
the injection module 204 is configured to modify the attribute file of the compound document based on the watermark code, so as to inject the watermark code into at least one tag in the attribute file of the compound document, and obtain the Excel file with the watermark picture added and the watermark code injected.
In some embodiments, before reading the Excel file to be processed as the binary stream file, the parsing module 201 in fig. 2 acquires the Excel file downloaded by the user from the system, takes the downloaded Excel file as the Excel file to be processed, and acquires the account id and the user access time of the user when the user downloads the Excel file.
In some embodiments, the compound document structure adopts an OOXML structure; the parsing module 201 in fig. 2 reads the Excel file to be processed into a binary stream file, and parses the binary stream file based on the OOXML structure of the Excel file to obtain an OOXML document corresponding to the Excel file, where the OOXML document includes a plurality of universal markup language texts, and the universal markup language texts are XML files.
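Because an .xlsx file is a ZIP package of XML parts, the reading and parsing step can be sketched with the standard library alone. The function names parse_ooxml and xml_parts are hypothetical and only illustrate the idea.

```python
import io
import zipfile


def parse_ooxml(stream: bytes) -> dict:
    """Return {part name: raw bytes} for every part in the package."""
    with zipfile.ZipFile(io.BytesIO(stream)) as package:
        return {name: package.read(name) for name in package.namelist()}


def xml_parts(parts: dict) -> list:
    """List the markup parts (XML files and relationship files)."""
    return sorted(name for name in parts if name.endswith((".xml", ".rels")))
```

A caller would typically obtain the binary stream with open(path, "rb").read() and then inspect parts such as xl/workbook.xml or docProps/core.xml.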
In some embodiments, the encoding module 202 in fig. 2 obtains an account id, a user access time, and a pre-generated random value, combines them into a character string, and computes the watermark code from the character string by using an encryption algorithm; the encryption algorithm is the MD5 algorithm (strictly speaking, a hash function rather than a cipher), and the watermark code is a hexadecimal code 32 characters long (128 bits).
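The code generation can be sketched as follows. The function name make_watermark_code is hypothetical, and the concatenation order, the absence of a separator, and the way the random value is produced are assumptions, since the text only states that the three values form one character string.

```python
import hashlib
import secrets


def make_watermark_code(account_id: str, access_time: str, nonce: str = None) -> str:
    """Return the MD5 digest of the combined string as 32 hex characters."""
    if nonce is None:
        nonce = secrets.token_hex(8)  # stand-in for the pre-generated random value
    combined = f"{account_id}{access_time}{nonce}"  # assumed order, no separator
    return hashlib.md5(combined.encode("utf-8")).hexdigest()
```

For tracing to work, the system would need to store the resulting code together with the user information, because an MD5 digest cannot be reversed to recover the account id.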
In some embodiments, the encoding module 202 in fig. 2 analyzes the binary file structure of the picture file with the predetermined format to obtain the encoding rule of the picture file with the predetermined format; the watermark code is then taken as the color and transparency information of the picture file with the predetermined format and processed by using the encoding rule to obtain the watermark picture; the picture file with the predetermined format is a picture file in the PNG format.
In some embodiments, the modification module 203 in fig. 2 modifies a plurality of universal markup language texts in an OOXML document based on a watermark picture and picture information of the watermark picture, inserts the watermark picture in a first universal markup language text, inserts a tag of the watermark picture in a second universal markup language text, inserts position and size information of the watermark picture in a third universal markup language text, inserts an associated tag of the watermark picture in a fourth universal markup language text and a fifth universal markup language text, and inserts a Default tag and an Override tag in a sixth universal markup language text.
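Of the edits listed above, the Default and Override tags are the elements used by the package's [Content_Types].xml part, so the last step can be sketched as follows. The function name, the drawing part path /xl/drawings/drawing1.xml, and the choice to register a drawing part at all are assumptions for illustration; the text does not name the parts involved.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
ET.register_namespace("", CT_NS)  # serialize without a prefix, as Office does

DRAWING_TYPE = "application/vnd.openxmlformats-officedocument.drawing+xml"


def register_png_and_drawing(content_types_xml: bytes,
                             drawing_part: str = "/xl/drawings/drawing1.xml") -> bytes:
    """Ensure a Default entry for .png and an Override for the drawing part."""
    root = ET.fromstring(content_types_xml)

    extensions = {el.get("Extension", "").lower()
                  for el in root.findall(f"{{{CT_NS}}}Default")}
    if "png" not in extensions:
        default = ET.Element(f"{{{CT_NS}}}Default",
                             Extension="png", ContentType="image/png")
        root.insert(0, default)

    overrides = {el.get("PartName") for el in root.findall(f"{{{CT_NS}}}Override")}
    if drawing_part not in overrides:
        ET.SubElement(root, f"{{{CT_NS}}}Override",
                      PartName=drawing_part, ContentType=DRAWING_TYPE)

    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

The function is idempotent, so running it on a workbook that already contains pictures leaves the existing entries untouched.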
In some embodiments, the injection module 204 of fig. 2 obtains the attribute file of the OOXML document, injects the watermark code into a first tag of the attribute file to modify the "last saved by" property (keeper) in the attribute file, and injects the watermark code into a second tag of the attribute file to modify the description in the attribute file.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
Fig. 3 is a schematic structural diagram of an electronic device 3 provided in the embodiment of the present disclosure. As shown in fig. 3, the electronic apparatus 3 of this embodiment includes: a processor 301, a memory 302, and a computer program 303 stored in the memory 302 and operable on the processor 301. The steps in the various method embodiments described above are implemented when the processor 301 executes the computer program 303. Alternatively, the processor 301 implements the functions of the modules/units in the above-described device embodiments when executing the computer program 303.
Illustratively, the computer program 303 may be partitioned into one or more modules/units, which are stored in the memory 302 and executed by the processor 301 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 303 in the electronic device 3.
The electronic device 3 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another electronic device. The electronic device 3 may include, but is not limited to, a processor 301 and a memory 302. Those skilled in the art will appreciate that fig. 3 is merely an example of the electronic device 3 and does not limit it; the electronic device 3 may include more or fewer components than those shown, may combine certain components, or may use different components. For example, the electronic device may also include input-output devices, network access devices, buses, and the like.
The Processor 301 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 302 may be an internal storage unit of the electronic device 3, for example, a hard disk or an internal memory of the electronic device 3. The memory 302 may also be an external storage device of the electronic device 3, such as a plug-in hard disk provided on the electronic device 3, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 302 may also include both an internal storage unit of the electronic device 3 and an external storage device. The memory 302 is used for storing computer programs and other programs and data required by the electronic device. The memory 302 may also be used to temporarily store data that has been output or is to be output.
It should be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional units and modules is only used for illustration, and in practical applications, the above function distribution may be performed by different functional units and modules as needed, that is, the internal structure of the device is divided into different functional units or modules, so as to perform all or part of the above described functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/computer device and method may be implemented in other ways. For example, the above-described apparatus/computer device embodiments are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods in the above embodiments of the present disclosure may also be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of the above method embodiments. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer readable medium may be appropriately expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer readable media may not include electrical carrier signals or telecommunications signals in accordance with legislation and patent practice.
The above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (10)

1. An information hiding method based on an Excel file is characterized by comprising the following steps:
reading an Excel file to be processed into a binary stream file, and analyzing the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts;
acquiring original information for hiding, performing encryption operation on the original information to obtain a watermark code corresponding to the original information, and processing the watermark code to obtain a watermark picture based on a coding rule of a picture file with a preset format;
modifying the plurality of universal markup language texts in the compound document based on the watermark picture and the picture information corresponding to the watermark picture so as to add the watermark picture to a cell in the Excel file;
and modifying the attribute file of the compound document based on the watermark code so as to inject the watermark code into at least one label in the attribute file of the compound document, thereby obtaining the Excel file added with the watermark picture and injected with the watermark code.
2. The method according to claim 1, wherein before the reading of the Excel file to be processed into a binary stream file, the method further comprises:
obtaining an Excel file downloaded by a user from a system, taking the downloaded Excel file as the Excel file to be processed, and obtaining an account identification of the user and a user access time at which the user downloads the Excel file.
3. The method of claim 1, wherein the compound document structure adopts an OOXML structure; and the reading the Excel file to be processed into a binary stream file, and analyzing the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts comprises:
reading the Excel file to be processed into a binary stream file, and analyzing the binary stream file based on an OOXML structure of the Excel file to obtain an OOXML document corresponding to the Excel file, wherein the OOXML document comprises a plurality of universal markup language texts, and the universal markup language texts are XML files.
4. The method according to claim 2, wherein the obtaining original information for hiding, and performing an encryption operation on the original information to obtain a watermark code corresponding to the original information, comprises:
acquiring the account identification, the user access time and a pre-generated random value, forming a character string from the account identification, the user access time and the random value, and calculating the character string by using an encryption algorithm to obtain the watermark code; the encryption algorithm adopts an MD5 encryption algorithm, and the watermark code is a hexadecimal code with a length of 32 characters.
5. The method according to claim 1, wherein the processing the watermark code based on the coding rule of the picture file with the preset format to obtain the watermark picture comprises:
analyzing the binary file structure of the picture file with the preset format to obtain the coding rule of the picture file with the preset format; taking the watermark code as the color and transparency information of the picture file with the preset format, and processing the watermark code by utilizing the coding rule to obtain the watermark picture; wherein the picture file with the preset format is a picture file in the PNG format.
6. The method according to claim 1, wherein the modifying the plurality of universal markup language texts in the compound document based on the watermark picture and picture information corresponding to the watermark picture comprises:
modifying a plurality of universal markup language texts in an OOXML document based on the watermark picture and the picture information of the watermark picture, inserting the watermark picture into a first universal markup language text, inserting a label of the watermark picture into a second universal markup language text, inserting position and size information of the watermark picture into a third universal markup language text, inserting an associated label of the watermark picture into a fourth universal markup language text and a fifth universal markup language text, and inserting a Default label and an Override label into a sixth universal markup language text.
7. The method of claim 1, wherein the modifying the attribute file of the compound document based on the watermark code to inject the watermark code into at least one label in the attribute file of the compound document comprises:
acquiring an attribute file of an OOXML document, injecting the watermark code into a first label of the attribute file so as to modify the "last saved by" property (keeper) in the attribute file, and injecting the watermark code into a second label of the attribute file so as to modify the description in the attribute file.
8. An information hiding device based on an Excel file is characterized by comprising:
the analysis module is configured to read the Excel file to be processed into a binary stream file, and analyze the binary stream file based on a preset compound document structure to obtain a compound document containing a plurality of universal markup language texts;
the encoding module is configured to acquire original information for hiding, perform an encryption operation on the original information to obtain a watermark code corresponding to the original information, and process the watermark code based on a coding rule of a picture file with a preset format to obtain a watermark picture;
a modification module configured to modify the plurality of universal markup language texts in the compound document based on the watermark picture and picture information corresponding to the watermark picture so as to add the watermark picture to a cell in the Excel file;
and the injection module is configured to modify the attribute file of the compound document based on the watermark code so as to inject the watermark code into at least one label in the attribute file of the compound document, and obtain the Excel file added with the watermark picture and injected with the watermark code.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 7 when executing the program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202210749023.7A 2022-06-28 2022-06-28 Excel file-based information hiding method, device, equipment and storage medium Pending CN115048665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210749023.7A CN115048665A (en) 2022-06-28 2022-06-28 Excel file-based information hiding method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210749023.7A CN115048665A (en) 2022-06-28 2022-06-28 Excel file-based information hiding method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115048665A true CN115048665A (en) 2022-09-13

Family

ID=83166085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210749023.7A Pending CN115048665A (en) 2022-06-28 2022-06-28 Excel file-based information hiding method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115048665A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115795417A (en) * 2023-01-09 2023-03-14 北京亿赛通科技发展有限责任公司 OOXML document tracing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination