CN117828683A - Layout file digital signature method and system - Google Patents

Info

Publication number
CN117828683A
Authority
CN
China
Prior art keywords
target
bwt
window size
data
window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410251017.8A
Other languages
Chinese (zh)
Other versions
CN117828683B (en)
Inventor
沙伏生
孙肖辉
范红达
郭尚
杨瑞钦
陆猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dianju Information Technology Co ltd
Original Assignee
Beijing Dianju Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dianju Information Technology Co ltd filed Critical Beijing Dianju Information Technology Co ltd
Priority to CN202410251017.8A priority Critical patent/CN117828683B/en
Publication of CN117828683A publication Critical patent/CN117828683A/en
Application granted granted Critical
Publication of CN117828683B publication Critical patent/CN117828683B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1873 Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the technical field of digital signatures, and in particular to a layout file digital signature method and system. The method comprises: acquiring layout file character data; performing multiple iterations according to a preset initial BWT window size and a preset change step to obtain a first window size sequence; for the BWT window size at each iteration, obtaining a plurality of divided windows; obtaining the complexity and the necessity of the BWT window size from the character data in the divided windows; continuing or stopping the iteration according to the necessity of the BWT window size; obtaining target window sizes; performing BWT encoding according to each target window size, followed by run-length compression, to obtain target data; obtaining the optimality of each target window size from the distribution of the target data; taking the target window size with the maximum optimality as the optimal target window size; and obtaining the digital signature of the layout file from the run-length-compressed character data for the optimal target window size, thereby improving the consistency of the digital signature of the layout file character data.

Description

Layout file digital signature method and system
Technical Field
The invention relates to the technical field of digital signatures, and in particular to a layout file digital signature method and a layout file digital signature system.
Background
Digital signatures play an important role in the field of layout files, in particular in ensuring the traceability of a file while protecting it from tampering: a valid signature means that the document has not been tampered with and does come from a specific sender. However, in many cases the consistency of the digital signature of a layout file is low because the layout file contains a large amount of repeated information. Therefore, to sign the file more effectively, the consistency of the digital signature of the layout file character data is improved by reducing the repetitiveness of the data in the layout file.
To improve the consistency of the digital signature by compressing the layout file, the layout file is encoded with the BWT algorithm, which makes the data to be compressed more regular. A BWT window of a certain size is selected, and the BWT string is obtained by sorting the cyclic shifts of the character string, so that the data can then be compressed accurately. However, because the character distribution in a layout file is highly repetitive, an unsuitable BWT window size degrades the local correlation of the encoded BWT string, so that the compressed string still contains much repeated information, which affects the consistency of the digital signature.
Disclosure of Invention
To solve the above problems, the invention provides a layout file digital signature method and a layout file digital signature system.
The layout file digital signature method and system of the invention adopt the following technical scheme:
an embodiment of the invention provides a layout file digital signature method, comprising the following steps:
acquiring layout file character data;
performing multiple iterations according to a preset initial BWT window size and a preset change step, one BWT window size being obtained at each iteration, so that a plurality of BWT window sizes have been obtained when the iteration stops, the plurality of BWT window sizes forming a first window size sequence;
at each iteration, dividing the layout file character data according to the BWT window size to obtain a plurality of divided windows; for any divided window, obtaining the numbers of consecutive occurrences of the same character data and the number of character types in the divided window; the numbers of consecutive occurrences of the same character data in the divided window constitute a first repeated set of the divided window; obtaining the complexity of the BWT window size according to the first repeated sets of the divided windows and the numbers of character types; obtaining the necessity of the BWT window size at the current iteration according to the complexity difference between the BWT window size at the current iteration and the previous BWT window size in the first window size sequence; if the necessity of the BWT window size at the current iteration is greater than or equal to a preset necessity threshold, performing the next iteration, otherwise stopping the iteration;
obtaining a target window size set according to the complexity difference between each two adjacent window sizes in the first window size sequence;
for any target window size in the target window size set, performing BWT encoding on the layout file character data with that target window size to obtain a first coding set; performing run-length compression on the first coding set to obtain the run-length-compressed target data of that target window size;
obtaining the optimality of each target window size according to the distribution of its target data; obtaining the maximum optimality among all target window sizes in the target window size set, and recording the target window size having that maximum optimality as the optimal target window size;
and obtaining the digital signature of the layout file according to the run-length-compressed character data for the optimal target window size.
Further, obtaining the complexity of the BWT window size according to the first repeated sets of the divided windows and the numbers of character types comprises the following specific steps:
the BWT window size at the current iteration is denoted the $i$-th BWT window size $L_i$ in the first window size sequence, and the complexity $C_i$ of the $i$-th BWT window size is calculated as follows:
wherein $L_i$ denotes the $i$-th BWT window size, and $n_i$ denotes the number of divided windows when the BWT window size is $L_i$; $m_{i,j}$ denotes the maximum value in the first repeated set of the $j$-th divided window; $k_{i,j}$ denotes the number of character types of the character data in the $j$-th divided window; $\exp(\cdot)$ denotes the exponential function with the natural constant as its base.
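As a sketch only: one form consistent with the definitions above, and with the description of a per-window term that falls as the repetition ratio rises, grows with the number of character types, and is averaged over the divided windows, is given below. The product form and the symbol names ($C_i$ for the complexity, $L_i$ for the $i$-th BWT window size, $n_i$ for the number of divided windows, $m_{i,j}$ and $k_{i,j}$ for the maximum run length and the number of character types of the $j$-th divided window) are assumptions, not confirmed notation.

$$C_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \exp\!\left(-\frac{m_{i,j}}{L_i}\right) k_{i,j}$$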
Further, obtaining the necessity of the BWT window size at the current iteration according to the complexity difference between the BWT window size at the current iteration and the previous BWT window size in the first window size sequence comprises the following specific steps:
the BWT window size at the current iteration is denoted the $i$-th BWT window size $L_i$ in the first window size sequence, and the necessity $B_i$ of the $i$-th BWT window size is calculated as follows:
wherein $L_i$ denotes the $i$-th BWT window size; $L_{i-1}$ denotes the $(i-1)$-th BWT window size; $C_i$ denotes the complexity of the $i$-th BWT window size; $\exp(\cdot)$ denotes the exponential function with the natural constant as its base.
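As a sketch only: the definitions name the two adjacent window sizes, the complexity of the current window size, and a negative exponential. One form that uses exactly these quantities, and that makes the necessity fall as the window growth or the complexity rises, is given below. Taking the product of the growth $L_i - L_{i-1}$ and the complexity $C_i$ as the input of the exponential is an assumption, as are the symbol names.

$$B_i = \exp\!\bigl(-(L_i - L_{i-1})\, C_i\bigr)$$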
Further, obtaining the target window size set according to the complexity difference between each two adjacent window sizes in the first window size sequence comprises the following specific steps:
for each BWT window size in the first window size sequence, obtaining the complexity of the BWT window size;
each two adjacent BWT window sizes in the first window size sequence form one BWT window size combination, giving a plurality of BWT window size combinations;
for any BWT window size combination, if the complexity of the former BWT window size is smaller than the complexity of the latter BWT window size, the latter BWT window size is taken as a target window size;
the target window sizes obtained from all BWT window size combinations constitute the target window size set.
Further, performing BWT encoding on the layout file character data with any target window size in the target window size set to obtain the first coding set comprises the following specific steps:
for any target window size in the target window size set, dividing all character data according to the target window size to obtain a plurality of first divided windows;
performing BWT encoding on the character data in any first divided window, and recording the BWT-encoded character data as a first code;
the first codes obtained by BWT encoding all first divided windows form the first coding set.
Further, obtaining the optimality of the target window size according to the distribution of the target data comprises the following specific steps:
obtaining a target count data sequence $T$ and a target character data sequence $S$ from the target data, wherein the target count data sequence $T$ comprises a plurality of target count data and the target character data sequence $S$ comprises a plurality of target character data;
for any target character datum in $S$, all other target character data in $S$ that are identical to it are recorded as the target character data to be calculated of that target character datum; the target count data in $T$ at the same positions as the target character data to be calculated are recorded as the target count data to be calculated;
and obtaining the optimality of the target window size according to the target character data to be calculated and the target count data to be calculated of the target data.
Further, obtaining the target count data sequence $T$ and the target character data sequence $S$ from the target data comprises the following specific steps:
the target data comprises a plurality of 2-tuples; in each 2-tuple the second datum is character data and the first datum is the number of occurrences of that character data; the first datum of the $v$-th 2-tuple in the target data is recorded as the $v$-th target count datum, and the second datum of the $v$-th 2-tuple is recorded as the $v$-th target character datum;
all target count data form the target count data sequence $T$; all target character data form the target character data sequence $S$.
Further, obtaining the optimality of the target window size according to the target character data to be calculated and the target count data to be calculated of the target data comprises the following specific steps:
obtaining a distance distribution characteristic value of the target window size according to the target character data to be calculated and the target count data to be calculated;
for the target data of the $u$-th target window size, the optimality $Y_u$ of the target window size is calculated as follows:
wherein $T_u$ denotes the target count data sequence of the $u$-th target window size; $D_u$ denotes the distance distribution characteristic value of the $u$-th target window size; $\max(\cdot)$ denotes the function that takes the maximum value.
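As a sketch only: the definitions name the maximum of the target count data and the distance distribution characteristic value, and the description states that optimality rises with the former and falls with the latter. A ratio is one form with that behaviour. The ratio form and the symbols $Y_u$, $T_u$, $D_u$ are assumptions, and the expression is undefined when $D_u = 0$.

$$Y_u = \frac{\max(T_u)}{D_u}$$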
Further, obtaining the distance distribution characteristic value of the target window size according to the target character data to be calculated and the target count data to be calculated comprises the following specific steps:
within the target character data sequence $S$, the target count data to be calculated are grouped so that identical data form one class, giving several classes of target count data to be calculated;
within each class, the target count data to be calculated are sorted by position number in ascending order, and the absolute value of the difference between the position numbers of each two adjacent target count data is obtained and recorded as a target count difference; each class of target count data thus contains several target count differences, and the variance of all its target count differences is recorded as the target count distribution value;
the target count distribution values of all target character data in the target character data sequence $S$ are obtained, and their mean is recorded as the distance distribution characteristic value of the target window.
The invention also provides a layout file digital signature system, which comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the above layout file digital signature methods when executing the computer program.
The technical scheme of the invention has the following beneficial effects: the invention obtains the digital signature of the layout file by performing adaptive BWT encoding on the collected layout file character data and run-length compressing the encoded data. Different BWT window sizes are obtained by traversal, and the complexity and necessity of each BWT window size are obtained from the data distribution characteristics in the divided windows under that BWT window size, from which the target window sizes are obtained. BWT encoding is performed on the layout file character data according to each target window size, giving the optimality of the different target window sizes. The optimal target window size is obtained from these optimality values; BWT encoding is then performed with the optimal target window size, the encoded data are run-length compressed, and the digital signature of the layout file is obtained. In this way the consistency of the digital signature of the layout file character data is improved by reducing the repetitiveness of the data in the layout file, and the resulting digital signature is more accurate.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart showing the steps of a digital signature method for layout files according to the present invention.
Detailed Description
To further describe the technical means adopted by the invention to achieve its intended purpose, and their effects, the specific implementation, structure, features, and effects of the layout file digital signature method and system according to the invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a method and a system for digitally signing layout files, which are specifically described below with reference to the accompanying drawings.
Referring to FIG. 1, a flowchart of the steps of a layout file digital signature method according to an embodiment of the invention is shown. The method comprises the following steps:
S001, acquiring layout file character data through scanning and recognition technology.
The purpose of this embodiment is to obtain a digital signature for a collected layout file: the layout file is BWT-encoded and the encoded data are run-length compressed, so that the consistency of the digital signature of the layout file character data is improved by reducing the repetitiveness of the data in the layout file. The printed layout file therefore first needs to be scanned with scanning and recognition technology to obtain editable layout file character data.
Specifically, this embodiment processes a printed English layout file. The file is scanned with optical character recognition (OCR) technology, which converts the printed text into editable layout file character data. The layout file character data are subsequently BWT-encoded, the encoded data are run-length compressed, and the compressed layout file character data are used for the digital signature.
S002, obtaining different BWT window sizes by traversal, and obtaining the complexity of each BWT window size according to the data distribution characteristics in the divided windows under that BWT window size; obtaining the necessity of each BWT window size according to its complexity and the change in BWT window size; determining the traversal stopping condition and the target window sizes according to the necessity and complexity of the different BWT window sizes, and obtaining the target window size set.
It should be noted that the collected English layout file contains much repeated information, which lowers the consistency of the digital signature of the layout file. Therefore, to sign the file more effectively, the layout file character data need to be encoded and the encoded character data run-length compressed, so that the repetitiveness of the layout file character data is reduced and the consistency of the digital signature is improved. In compressing the layout file character data, the BWT encoding algorithm is used to encode the layout file character data into character data with a high character repetition rate in some local regions, so that repetitiveness is reduced in the run-length-compressed character data and the consistency of the digital signature of the layout file character data can be improved.
It should be further noted that, for local regions of the BWT-encoded layout file character data to have a high character repetition rate, a suitable BWT window must be determined. The conventional BWT encoding algorithm selects a BWT window of fixed size to obtain a character string of the character data, and sorts the cyclic shifts of that string to obtain the BWT string, which is then compressed. However, because the character distribution of an English layout file is highly repetitive, a fixed-size BWT window lowers the repetitiveness of the encoded data and thereby degrades the subsequent compression. To ensure that the optimal BWT window size is obtained, the optimality of different BWT window sizes must be quantified, and the quantification is done by traversal. During the traversal the window grows with every change of the BWT window size, which makes the computation increasingly costly. Therefore, in each traversal step, the complexity of the characters under the current BWT window size is obtained from the distribution characteristics of the character data, and the necessity of that BWT window size is then obtained from the complexity. Each expansion introduces new character data; if the newly introduced character data increase the repetitiveness of the character data, the necessity of the expansion is higher and the expansion should continue, after which the optimality of the target window sizes is quantified.
Specifically, this embodiment obtains the candidate BWT window sizes by traversal, so that the optimality of the target window sizes can be quantified subsequently. An initial BWT window size $L_0$ is preset (the value of $L_0$ may be set by the practitioner according to the specific implementation, and this embodiment gives an empirical value; the window size is the window side length), and a change step $s$ is preset (the value of $s$ may likewise be set by the practitioner according to the specific implementation, and this embodiment gives an empirical value). The traversal starts from the initial BWT window size, and the sum of the current BWT window size and the change step is taken as the BWT window size of the next traversal step. Several iterations are performed, one BWT window size being obtained at each iteration, so that a plurality of BWT window sizes have been obtained when the iteration stops; these BWT window sizes form the first window size sequence.
Further, the BWT window size at each iteration is denoted the $i$-th BWT window size $L_i$. According to the BWT window size, all character data are divided in left-to-right order, every $L_i$ character data forming one window. The number of divided windows for the BWT window size, $n_i$, is calculated by the formula $n_i = \lceil N / L_i \rceil$, where $N$ denotes the number of all character data, $L_i$ denotes the BWT window size, and $\lceil \cdot \rceil$ denotes the round-up function. It should be noted that if the last window contains fewer than $L_i$ data, the number of data actually contained in that window is used in the subsequent calculation. The numbers of consecutive occurrences of the same character data in a divided window constitute the first repeated set of that divided window. The BWT window size at the current iteration is denoted the $i$-th BWT window size, and the complexity $C_i$ of the $i$-th BWT window size is calculated as follows:
wherein $L_i$ denotes the $i$-th BWT window size, and $n_i$ denotes the number of divided windows when the BWT window size is $L_i$; $m_{i,j}$ denotes the maximum value in the first repeated set of the $j$-th divided window; $k_{i,j}$ denotes the number of character types of the character data in the $j$-th divided window; $\exp(\cdot)$ denotes the exponential function with the natural constant as its base. It should be noted that the $\exp(-x)$ model used in this embodiment serves only to express a negative correlation and to constrain the model output to the interval $(0, 1]$; other models serving the same purpose may be substituted in a specific implementation, and this embodiment takes the $\exp(-x)$ model only as an example without specific limitation, where $x$ denotes the input of the model. Here $m_{i,j} / L_i$ is the ratio between the number of repeated occurrences of character data in each divided window and the number of data in the divided window: the larger the ratio, the more repeated character data the divided window contains, and the lower the complexity of the divided window. $k_{i,j}$ is the data type distribution feature of the character data in each divided window: the fewer the data types, the lower the complexity of the divided window. Averaging the complexity over all divided windows gives the complexity of the window size $L_i$.
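The complexity computation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's confirmed formula: the per-window term is assumed to be the product `exp(-m / L) * k`, and the function names `split_windows`, `window_complexity`, and `complexity` are hypothetical.

```python
import math
from itertools import groupby


def split_windows(text, size):
    """Divide text left to right into windows of `size` characters.

    The last window may be shorter; there are ceil(len(text) / size) windows.
    """
    return [text[p:p + size] for p in range(0, len(text), size)]


def window_complexity(window):
    """Per-window term, ASSUMED to be exp(-m / L) * k.

    m: longest run of identical consecutive characters (max of the first
       repeated set); k: number of distinct characters; L: number of
       characters actually contained in the window.
    """
    runs = [len(list(group)) for _, group in groupby(window)]
    m = max(runs)
    k = len(set(window))
    return math.exp(-m / len(window)) * k


def complexity(text, size):
    """Average the per-window terms over all divided windows."""
    windows = split_windows(text, size)
    return sum(window_complexity(w) for w in windows) / len(windows)
```

Under this assumed form, a window made of one repeated character scores lower than a window of all-distinct characters, which matches the direction of the relationships stated in the text.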
Further, the necessity of the BWT window size at the current iteration is obtained according to the complexity difference between the BWT window size at the current iteration and the previous BWT window size in the first window size sequence. The necessity $B_i$ of the $i$-th BWT window size is calculated as follows:
wherein $L_i$ denotes the $i$-th BWT window size; $L_{i-1}$ denotes the $(i-1)$-th BWT window size; $C_i$ denotes the complexity of the $i$-th BWT window size; $\exp(\cdot)$ denotes the exponential function with the natural constant as its base. It should be noted that the $\exp(-x)$ model used in this embodiment serves only to express a negative correlation and to constrain the model output to the interval $(0, 1]$; other models serving the same purpose may be substituted in a specific implementation, and this embodiment takes the $\exp(-x)$ model only as an example without specific limitation, where $x$ denotes the input of the model. The increase of the $i$-th BWT window size relative to the adjacent previous iteration, $L_i - L_{i-1}$, is used to express the necessity: the larger the BWT window size, the greater the cost of the subsequent quantification of the optimality of the target window sizes, and the smaller the necessity of the corresponding BWT window size. The complexity also changes when the BWT window size changes: the smaller the complexity of the corresponding BWT window size, the greater the necessity.
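A minimal Python sketch of the necessity follows. The exact formula is not available here, so the sketch assumes the form `exp(-(L_i - L_{i-1}) * C_i)`, chosen only because it uses the listed quantities and decreases with both window growth and complexity; the function name `necessity` is hypothetical.

```python
import math


def necessity(size, prev_size, complexity_value):
    """ASSUMED form: exp(-(L_i - L_{i-1}) * C_i), with output in (0, 1].

    size, prev_size: the i-th and (i-1)-th BWT window sizes.
    complexity_value: complexity of the i-th BWT window size (non-negative).
    """
    return math.exp(-(size - prev_size) * complexity_value)
```

With this form, a larger step between adjacent window sizes or a larger complexity both push the necessity towards zero, so the iteration stops sooner for any fixed threshold.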
Further, a necessity threshold is preset. If the necessity of the BWT window size at the current iteration is greater than or equal to the preset necessity threshold, the next iteration is performed; otherwise the iteration stops, and the first window size sequence is thereby obtained.
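The stopping rule can be sketched as a loop in Python. The sketch below makes several assumptions that the text does not settle: the initial size is the first element of the sequence, the size at which the iteration stops is still kept in the sequence, and a hypothetical `max_size` cap is added so that the loop always terminates. `necessity_of` is a caller-supplied function of the current and previous window sizes.

```python
def window_size_sequence(initial, step, necessity_of, threshold, max_size):
    """Build the first window size sequence.

    initial, step: preset initial BWT window size and change step.
    necessity_of(current, previous): necessity of the current window size.
    threshold: preset necessity threshold.
    max_size: hypothetical upper bound (not in the source), for termination.
    """
    sizes = [initial]
    while sizes[-1] + step <= max_size:
        candidate = sizes[-1] + step
        sizes.append(candidate)  # assumption: the stopping size is kept
        if necessity_of(candidate, sizes[-2]) < threshold:
            break  # necessity below threshold: stop iterating
    return sizes
```

For example, with a toy necessity of `1 / current`, an initial size of 2, a step of 1, and a threshold of 0.22, the loop keeps sizes 2, 3, 4 and stops at 5.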
Further, for each BWT window size in the first window size sequence, the complexity of the BWT window size is obtained. Each two adjacent BWT window sizes in the first window size sequence form one BWT window size combination, giving a plurality of BWT window size combinations. For any BWT window size combination, if the complexity of the former BWT window size is smaller than the complexity of the latter BWT window size, the latter BWT window size is taken as a target window size. The target window sizes obtained from all BWT window size combinations constitute the target window size set.
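The selection rule for target window sizes maps directly to a short Python function. The function name `target_window_sizes` and the parallel-list representation of sizes and complexities are illustrative choices, not taken from the source.

```python
def target_window_sizes(sizes, complexities):
    """Keep the later size of each adjacent pair whose complexity increased.

    sizes: the first window size sequence.
    complexities: complexity of each window size, in the same order.
    """
    return [
        sizes[i]
        for i in range(1, len(sizes))
        if complexities[i - 1] < complexities[i]
    ]
```

For sizes `[4, 5, 6, 7]` with complexities `[1.0, 1.2, 0.9, 1.1]`, the pairs (4, 5) and (6, 7) show an increase, so the target window size set is `[5, 7]`.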
S003, performing BWT encoding on the layout file character data according to each target window size in the obtained target window size set, and calculating the optimality of the different target window sizes; obtaining the optimal target window size according to the optimality of the different target window sizes; performing BWT encoding according to the optimal target window size, and run-length compressing the encoding result to obtain the compressed layout file character data.
For the BWT-encoded layout file character data to be highly repetitive in local regions, the optimal BWT window size must be determined, so that the encoded character data are as repetitive as possible. The repetitiveness of the data distribution is quantified for every target window size in the obtained target window size set: the data are divided into windows according to each target window size, each divided window is BWT-encoded, and the encoding result is run-length compressed to obtain the compressed character data. The optimality of each target window size is determined from the repetition distribution characteristics of the compressed character data, and the optimal BWT window size is obtained from the optimality of the target window sizes.
Further, for any target window size, all character data are divided according to the target window size to obtain a plurality of divided windows. BWT encoding is performed on the character data in any divided window, and the BWT-encoded character data are recorded as a first code; the BWT-encoded character data of the other divided windows are obtained in the same way, giving the first coding set. The data of the first coding set are run-length compressed to obtain run-length-compressed string data, which are recorded as the target data under that target window size. The target data comprises a plurality of 2-tuples; in each 2-tuple the second datum is character data and the first datum is the number of occurrences of that character data. The first datum of the $v$-th 2-tuple in the target data is recorded as the $v$-th target count datum, and the second datum of the $v$-th 2-tuple is recorded as the $v$-th target character datum. All target count data form the target count data sequence $T$; all target character data form the target character data sequence $S$. It should be noted in particular that the two data at the same position in $T$ and $S$ belong to the same 2-tuple, where the position is given by a position number.
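The encoding step can be sketched in Python as below. The BWT here is the plain "sort all cyclic shifts, take the last column" form described in the text, without an end marker or index, so it is not invertible on its own. Applying run-length compression to the concatenation of the first codes, rather than to each code separately, is an assumption. The names `bwt`, `run_length`, and `encode` are hypothetical.

```python
from itertools import groupby


def bwt(block):
    """Sort all cyclic shifts of the block and read off the last column."""
    rotations = sorted(block[i:] + block[:i] for i in range(len(block)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length(text):
    """Turn each run of identical characters into a (count, character) 2-tuple."""
    return [(len(list(group)), ch) for ch, group in groupby(text)]


def encode(text, size):
    """BWT-encode each divided window, then run-length compress the result.

    Returns the target data plus the count sequence T and character sequence S.
    """
    windows = [text[p:p + size] for p in range(0, len(text), size)]
    first_coding_set = [bwt(w) for w in windows]
    # Assumption: compress the concatenated first codes as one string.
    target_data = run_length("".join(first_coding_set))
    counts = [count for count, _ in target_data]
    chars = [ch for _, ch in target_data]
    return target_data, counts, chars
```

For instance, `bwt("banana")` gives `"nnbaaa"`, which run-length compresses to `[(2, "n"), (1, "b"), (3, "a")]`.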
Further, for any target character data in the target character data sequence, all other target character data in the sequence that are identical to it are recorded as the target character data to be calculated of that target character data; in the target frequency data sequence, the target frequency data at the same positions as the target character data to be calculated are recorded as the target frequency data to be calculated. The target frequency data to be calculated are grouped so that identical values fall into the same class, yielding several classes of target frequency data to be calculated. Within each class, the target frequency data to be calculated are sorted by position number in ascending order, and the absolute value of the difference between the position numbers of adjacent target frequency data is recorded as a target frequency difference value; each class thus contains several target frequency difference values. The variance of all target frequency difference values is recorded as the target frequency distribution value of the target character data. The target frequency distribution values of all target character data in the target character data sequence are obtained in the same way, and their mean is recorded as the distance distribution characteristic value of the target window size.
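The grouping and variance steps above can be sketched in Python as follows. This is one reading of the text, not a definitive implementation: the function names are hypothetical, the mean is taken over distinct characters, the variance is the population variance, and a character with no adjacent pair in any class is assumed to contribute a distribution value of 0.

```python
from collections import defaultdict


def _population_variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def distance_distribution_value(pairs):
    """Distance distribution characteristic value of run-length two-tuples.

    pairs is a list of (count, character) two-tuples; the list index is the
    position number.
    """
    # positions[character][count] lists the position numbers of one class,
    # already in ascending order because enumerate() visits them in order.
    positions = defaultdict(lambda: defaultdict(list))
    for position, (count, ch) in enumerate(pairs):
        positions[ch][count].append(position)

    distribution_values = []
    for classes in positions.values():
        differences = []
        for class_positions in classes.values():
            # Differences of position numbers between adjacent class members.
            differences.extend(
                later - earlier
                for earlier, later in zip(class_positions, class_positions[1:])
            )
        # ASSUMPTION: no adjacent pair means a distribution value of 0.
        distribution_values.append(
            _population_variance(differences) if differences else 0.0
        )

    if not distribution_values:
        return 0.0
    return sum(distribution_values) / len(distribution_values)
```

Evenly spaced repetitions give a value of 0, while irregular spacing of equal runs of the same character increases the value, which matches the text's use of this quantity as a measure of disorder.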
For the target character string data of any target window size, the optimality of the target window size is calculated as follows:
[formula not reproduced in this text]
wherein the terms of the formula are the target frequency data set of the target window size, the distance distribution characteristic value of the target window size, and the maximum-value function. The larger the maximum value among all target frequency data, the higher the repeatability of the encoded and compressed dividing windows obtained under the target window size, and the higher the optimality of the corresponding target window size. The larger the distance distribution characteristic value of the target window size, the more disordered the distribution of characters in the run-length compression result, so the data repeatability is lower and the optimality of the corresponding target window size is lower.
Further, the optimality of the target character string data of all target window sizes is obtained, forming an optimality set. The target window size corresponding to the maximum optimality in the optimality set is recorded as the optimal target window size. BWT encoding is performed with the optimal target window size, and the encoding result is run-length compressed to obtain the compressed layout file character data.
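The selection step can be sketched as an argmax over the candidate target window sizes. The optimality formula itself is not available in this text, so the score below is only a placeholder that respects the stated behavior (it grows with the maximum target frequency data and shrinks as the distance distribution characteristic value grows); `optimality` and `best_window_size` are hypothetical names.

```python
def optimality(counts, distance_value):
    """Placeholder optimality score.

    ASSUMPTION: this is NOT the patent's formula; it only increases with
    max(counts) and decreases with distance_value, as the text describes.
    """
    return max(counts) / (1.0 + distance_value)


def best_window_size(candidates):
    """Return the target window size with the largest optimality.

    candidates maps each target window size to a tuple of
    (target frequency data sequence, distance distribution characteristic value).
    """
    return max(candidates, key=lambda size: optimality(*candidates[size]))
```

With `{8: ([2, 1, 3], 0.5), 16: ([5, 1], 0.0), 32: ([5, 2], 4.0)}` the placeholder scores are 2.0, 5.0, and 1.0, so window size 16 is selected.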
S004, obtaining the digital signature of the layout file according to the obtained compressed layout file character data.
The compressed layout file character data is converted by a hash function into a hash value, that is, a fixed-size character string. A private key is generated by an encryption algorithm, and the fixed-size character string is encrypted with the private key to obtain the digital signature of the corresponding layout file.
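The hash-then-sign flow of step S004 can be illustrated with the Python sketch below. The names of the hash function and the encryption algorithm are not preserved in this text, so SHA-256 and textbook RSA are assumptions chosen only for illustration. The key is built from two small Mersenne primes and uses no padding, so it is a toy and must not be used for real signatures; `digest_int`, `sign`, and `verify` are hypothetical names.

```python
import hashlib

# Toy RSA key (ASSUMPTION: illustration only, far too small to be secure).
P = 2**61 - 1          # Mersenne prime
Q = 2**89 - 1          # Mersenne prime
N = P * Q              # public modulus
E = 65537              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)


def digest_int(data):
    """Hash the compressed character data to a fixed-size value, reduced mod N."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % N


def sign(data):
    """Encrypt the hash value with the private key to form the signature."""
    return pow(digest_int(data), D, N)


def verify(data, signature):
    """Check a signature by recovering the hash value with the public key."""
    return pow(signature, E, N) == digest_int(data)
```

A signature produced by `sign` verifies only against the same bytes, so any change to the compressed character data after signing causes `verify` to fail.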
The invention also provides a layout file digital signature system, which comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein steps S001-S004 are implemented when the processor executes the computer program.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the invention, but any modifications, equivalent substitutions, improvements, etc. within the principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A layout file digital signature method, characterized by comprising the following steps:
acquiring layout file character data;
performing multiple iterations according to the preset initial BWT window size and the change step length, obtaining a BWT window size during each iteration, obtaining a plurality of BWT window sizes during iteration stopping, and forming a first window size sequence by the plurality of BWT window sizes;
dividing layout file character data according to the BWT window size when iterating each time to obtain a plurality of dividing windows; for any one of the divided windows, the number of continuous occurrence of the same character data and the number of character types in the divided window are obtained; the number of consecutive occurrences of the same character data in the dividing window constitutes a first repeated set of the dividing window; acquiring the complexity of the BWT window size according to the first repeated set of the dividing window and the number of character types; acquiring the necessity of the BWT window size in the iteration according to the complexity difference between the BWT window size in the iteration and the previous BWT window size in the first window size sequence; if the necessity of the BWT window size in the iteration is larger than or equal to a preset necessity threshold, performing the next iteration, otherwise stopping the iteration;
acquiring a target window size set according to the complexity difference of two adjacent window sizes in the first window size sequence;
BWT encoding is carried out on layout file character data by using any one target window size in the target window size set to obtain a first encoding set; performing run-length compression on the first coding set to obtain target data of the target window after the run-length compression;
acquiring the optimality of the target window size according to the distribution of the target data; obtaining the maximum value among the optimality of all target window sizes in the target window size set, and recording the target window size corresponding to the maximum optimality as the optimal target window size;
and acquiring the digital signature of the layout file according to the character data after run-length compression with the optimal target window size.
2. The method for digitally signing a layout file according to claim 1, wherein acquiring the complexity of the BWT window size according to the first repeated set of the dividing window and the number of character types comprises the following specific steps:
the BWT window size at the time of the iteration is recorded by its index in the first window size sequence, and the complexity of that BWT window size is calculated as follows:
[formula not reproduced in this text]
wherein the terms of the formula are: the BWT window size at the time of the iteration; the number of dividing windows obtained with that BWT window size; the maximum value in the first repeated set of each dividing window; the number of character types of the character data in each dividing window; and an exponential function with the natural constant as its base.
3. The method for digitally signing a layout file according to claim 1, wherein acquiring the necessity of the BWT window size at the time of the iteration according to the complexity difference between the BWT window size at the time of the iteration and the previous BWT window size in the first window size sequence comprises the following specific steps:
the BWT window size at the time of the iteration is recorded by its index in the first window size sequence, and the necessity of that BWT window size is calculated as follows:
[formula not reproduced in this text]
wherein the terms of the formula are: two BWT window sizes of the first window size sequence, which per the preamble of this claim are the BWT window size at the time of the iteration and the previous BWT window size; the complexity of a BWT window size; and an exponential function with the natural constant as its base.
4. The method for digitally signing a layout file according to claim 1, wherein the step of obtaining the target window size set according to the complexity difference of two adjacent window sizes in the first window size sequence comprises the following specific steps:
for any BWT window size in a first window size sequence, acquiring complexity of the BWT window size;
the BWT window sizes adjacent to each other in the first window size sequence are combined to form a plurality of BWT window size combinations;
for any BWT window size combination, if the complexity of the former BWT window size is smaller than the complexity of the latter BWT window size, the latter BWT window size is taken as the target window size;
the target window sizes obtained by combining all BWT window sizes constitute a target window size set.
5. The method for digitally signing a layout file according to claim 1, wherein the step of performing BWT encoding on the character data of the layout file to obtain the first encoded set by using any one of the target window sizes in the set of target window sizes comprises the specific steps of:
for any one target window size in a target window size set, dividing all character data according to the target window size to obtain a plurality of first dividing windows;
BWT coding is carried out on character data in any one of the first dividing windows, and character data after BWT coding is recorded as first coding;
and all the first codes obtained by BWT encoding the first dividing windows form the first encoding set.
6. The digital signature method of a layout file according to claim 1, wherein the obtaining of the optimality of the target window size according to the distribution of the target data comprises the following specific steps:
acquiring a target frequency data sequence and a target character data sequence from the target data, wherein the target frequency data sequence comprises a plurality of target frequency data and the target character data sequence comprises a plurality of target character data;
for any target character data in the target character data sequence, recording all other target character data in the sequence that are identical to it as the target character data to be calculated of that target character data; in the target frequency data sequence, recording the target frequency data at the same positions as the target character data to be calculated as the target frequency data to be calculated;
and obtaining the optimality of the size of the target window according to the target character data to be calculated and the target frequency data to be calculated of the character data of the target data.
7. The layout file digital signature method according to claim 6, wherein acquiring the target frequency data sequence and the target character data sequence from the target data comprises the following specific steps:
the target data comprises a plurality of two-tuples; in each two-tuple the second datum is character data and the first datum is the number of consecutive occurrences of that character data; the first datum of each two-tuple is recorded as target frequency data and the second datum as target character data, each carrying the index of its two-tuple;
all the target frequency data form the target frequency data sequence; all the target character data form the target character data sequence.
8. The layout file digital signature method according to claim 6, wherein obtaining the optimality of the target window size according to the target character data to be calculated and the target frequency data to be calculated of the target data comprises the following specific steps:
obtaining a distance distribution characteristic value of the size of a target window according to target character data to be calculated and target frequency data to be calculated;
for the target character string data of any target window size, the optimality of the target window size is calculated as follows:
[formula not reproduced in this text]
wherein the terms of the formula are the target frequency data set of the target window size, the distance distribution characteristic value of the target window size, and the maximum-value function.
9. The method for digitally signing a layout file according to claim 8, wherein obtaining the distance distribution characteristic value of the target window size according to the target character data to be calculated and the target frequency data to be calculated comprises the following specific steps:
grouping the target frequency data to be calculated so that identical values fall into the same class, thereby obtaining several classes of target frequency data to be calculated;
within each class, sorting the target frequency data to be calculated by position number in ascending order, and recording the absolute value of the difference between the position numbers of adjacent target frequency data as a target frequency difference value, wherein each class contains several target frequency difference values; recording the variance of all target frequency difference values as the target frequency distribution value of the target character data;
acquiring the target frequency distribution values of all target character data in the target character data sequence, and recording their mean as the distance distribution characteristic value of the target window size.
10. A digital signature system for a layout file, comprising a memory, a processor and a computer program stored in said memory and running on said processor, characterized in that said processor implements the steps of a digital signature method for a layout file according to any one of claims 1-9 when said computer program is executed by said processor.
CN202410251017.8A 2024-03-06 2024-03-06 Layout file digital signature method and system Active CN117828683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410251017.8A CN117828683B (en) 2024-03-06 2024-03-06 Layout file digital signature method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410251017.8A CN117828683B (en) 2024-03-06 2024-03-06 Layout file digital signature method and system

Publications (2)

Publication Number Publication Date
CN117828683A true CN117828683A (en) 2024-04-05
CN117828683B CN117828683B (en) 2024-04-30

Family

ID=90506112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410251017.8A Active CN117828683B (en) 2024-03-06 2024-03-06 Layout file digital signature method and system

Country Status (1)

Country Link
CN (1) CN117828683B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10116281A (en) * 1996-10-11 1998-05-06 Fuji Xerox Co Ltd Method and device for document processing
US20080050047A1 (en) * 2006-08-24 2008-02-28 Ocarina Networks Inc. Methods and Apparatus for Reducing Storage Size
CN101803203A (en) * 2007-09-14 2010-08-11 微软公司 Optimized data stream compression using data-dependent chunking
CN116522300A (en) * 2023-07-04 2023-08-01 北京点聚信息技术有限公司 Intelligent management system for electronic seal
CN116721472A (en) * 2023-04-27 2023-09-08 杭州电力设备制造有限公司 Screen handwriting signature recognition method for paperless office work
CN116916047A (en) * 2023-09-12 2023-10-20 北京点聚信息技术有限公司 Intelligent storage method for layout file identification data
CN117648680A (en) * 2024-01-30 2024-03-05 北京点聚信息技术有限公司 Electronic signature format file generation method and system

Also Published As

Publication number Publication date
CN117828683B (en) 2024-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant