CN108829872B - Method, device, system and storage medium for rapidly processing lossless compressed file - Google Patents


Info

Publication number
CN108829872B
Authority
CN
China
Prior art keywords
compressed
processed
character
characters
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810657224.8A
Other languages
Chinese (zh)
Other versions
CN108829872A (en)
Inventor
王防修
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Polytechnic University
Original Assignee
Wuhan Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Polytechnic University filed Critical Wuhan Polytechnic University
Priority to CN201810657224.8A priority Critical patent/CN108829872B/en
Publication of CN108829872A publication Critical patent/CN108829872A/en
Application granted granted Critical
Publication of CN108829872B publication Critical patent/CN108829872B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method, a device, a system and a storage medium for rapidly processing a lossless compressed file. The processing device obtains all characters to be processed in a source file to be compressed, together with the code corresponding to each character, and establishes a mapping relation between each character and its code. Using this mapping relation, each character in the source file is replaced with its corresponding code, which completes the encoding of the source file. Because a one-to-one mapping is established between characters and codes, each character is replaced by its code directly.

Description

Method, device, system and storage medium for rapidly processing lossless compressed file
Technical Field
The present invention relates to the field of file compression technologies, and in particular, to a method, an apparatus, a system, and a storage medium for fast processing a lossless compressed file.
Background
In order to improve the utilization efficiency of external memory, stored data files often need to be compressed. With lossy compression, the complete information that existed before compression cannot be restored after decompression. For some important information, however, lossless compression must be used so that the decompressed information is identical to the information before compression. First, only files that contain redundancy can be losslessly compressed. Second, when the same source file is compressed, different encoding methods yield different compression ratios. However, if encoding during compression is too slow, the user waits too long for the file to be compressed. Likewise, if decompressing the compressed file is too slow, the user also waits too long. Therefore, it is very important to research methods for increasing the compression and decompression speed of files.
Under the condition that the software and hardware environments of a computer are not changed, a fast code word query method is required for improving the coding speed in the compression process, and a faster character query method is required to be designed for improving the decompression speed of a compressed file.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a method, equipment, a system and a storage medium for rapidly processing a lossless compression file, and aims to solve the problem that in the prior art, the encoding and decoding speed is low in the file compression and decompression processes.
In order to achieve the above object, the present invention provides a method for rapidly processing a lossless compressed file, the method comprising the steps of:
acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
and respectively replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation, and completing the coding of the source file to be compressed.
Preferably, after the characters to be processed are replaced by the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relationship, respectively, and the codes of the source file to be compressed are completed, the method further includes:
acquiring a compressed file of the source file to be compressed;
establishing a binary tree based on all codes to be processed of the compressed file and a plurality of characters corresponding to the codes to be processed;
and traversing the binary tree, respectively acquiring characters corresponding to the codes to be processed of the compressed file, and completing the decoding of the compressed file.
Preferably, the acquiring all characters to be processed of the source file to be compressed and the codes corresponding to the characters to be processed specifically includes:
acquiring all characters to be processed of the source file to be compressed, codes corresponding to the characters to be processed and positions of the characters to be processed in the source file to be compressed;
correspondingly, the establishing a mapping relationship between the character to be processed and the code corresponding to the character to be processed specifically includes:
and according to the position of the character to be processed in the source file to be compressed, establishing a mapping relation between the character to be processed and a code corresponding to the character to be processed.
Preferably, the respectively replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relationship to complete the coding of the source file to be compressed, specifically including:
reading a current character from the source file to be compressed;
replacing the current character with the code corresponding to the current character in the mapping relation, and judging whether the current character is the last character in the source file to be compressed;
when the current character is the last character in the source file to be compressed, finishing the encoding of the source file to be compressed;
when the current character is not the last character in the source file to be compressed, reading the next character from the source file;
and repeating the steps of replacing the current character by the code corresponding to the current character in the mapping relation and judging whether the current character is the last character in the source file to be compressed until the code of the source file to be compressed is finished.
Preferably, after the mapping relationship is established between the character to be processed and the code corresponding to the character to be processed, the method further includes:
and storing the mapping relation in a memory.
Preferably, after the obtaining of the compressed file of the source file to be compressed, the method further includes:
acquiring characters corresponding to the codes to be processed based on the codes to be processed in the compressed file;
establishing a code table according to the code to be processed and the character corresponding to the code to be processed;
correspondingly, the establishing of the binary tree based on the characters corresponding to the codes to be processed in the compressed file specifically includes:
and establishing a binary tree based on the coding table.
Preferably, traversing the binary tree, respectively obtaining characters corresponding to the to-be-processed code of the compressed file, specifically includes:
reading a current code from the compressed file;
traversing the binary tree, searching for a character corresponding to the current code, and judging whether the current code is the last code in the compressed file;
when the current code is the last code in the compressed file, finishing decoding the compressed file;
reading a next code from the compressed file when the current code is not the last code in the compressed file;
and repeatedly executing the steps of traversing the binary tree, searching for the character corresponding to the current code, and judging whether the current code is the last code in the compressed file until the characters corresponding to all the codes in the compressed file are searched.
Further, to achieve the above object, the present invention also provides a fast processing apparatus for lossless compression files, comprising: a memory, a processor and a fast processing program of lossless compressed files stored on said memory and executable on said processor, said fast processing program of lossless compressed files being configured to implement the steps of the method of fast processing of lossless compressed files as described above.
In addition, to achieve the above object, the present invention provides a system for rapidly processing a lossless compressed file, including: the system comprises an acquisition module, an establishment module and a replacement module;
the acquisition module is used for acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
the establishing module is used for establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
and the replacing module is used for replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation respectively to finish the coding of the source file to be compressed.
In addition, to achieve the above object, the present invention further provides a storage medium having stored thereon a fast processing program of a lossless compression file, the fast processing program of the lossless compression file, when executed by a processor, implementing the steps of the fast processing method of the lossless compression file as described above.
In the invention, a processing device acquires all characters to be processed of a source file to be compressed and the codes corresponding to those characters, establishes a mapping relation between each character and its code, and replaces each character in the source file with its code according to the mapping relation, thereby completing the encoding of the source file. Because a one-to-one mapping is established between characters and codes, each character is replaced by its code directly, and the code corresponding to a character does not need to be searched for during compression. This saves a large amount of character comparison time and effectively improves the processing speed during file processing.
Drawings
FIG. 1 is a schematic diagram of a fast processing device for lossless compression of files in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a method for fast processing a lossless compressed file according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for fast processing of lossless compressed files according to the present invention;
FIG. 4 is a functional block diagram of a first embodiment of a system for fast processing of losslessly compressed files according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a fast processing device for lossless compression of files in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the apparatus for rapidly processing a lossless compression file may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a Display screen (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the architecture shown in FIG. 1 does not constitute a limitation of a fast processing apparatus for lossless compression of files, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a storage medium, may include therein an operating system, a network communication module, a user interface module, and a fast handler for lossless compression of files.
In the apparatus for rapidly processing a lossless compression file shown in fig. 1, the network interface 1004 is mainly used for data communication with an external network; the user interface 1003 is mainly used for receiving input instructions of a user; the apparatus for fast processing of lossless compression files calls a fast processing program of lossless compression files stored in the memory 1005 through the processor 1001, and performs the following operations:
acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
and respectively replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation, and completing the coding of the source file to be compressed.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
acquiring a compressed file of the source file to be compressed;
establishing a binary tree based on all codes to be processed of the compressed file and a plurality of characters corresponding to the codes to be processed;
and traversing the binary tree, respectively acquiring characters corresponding to the codes to be processed of the compressed file, and completing the decoding of the compressed file.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
acquiring all characters to be processed of the source file to be compressed, codes corresponding to the characters to be processed and positions of the characters to be processed in the source file to be compressed;
correspondingly, the establishing a mapping relationship between the character to be processed and the code corresponding to the character to be processed specifically includes:
and according to the position of the character to be processed in the source file to be compressed, establishing a mapping relation between the character to be processed and a code corresponding to the character to be processed.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
reading a current character from the source file to be compressed;
replacing the current character with the code corresponding to the current character in the mapping relation, and judging whether the current character is the last character in the source file to be compressed;
when the current character is the last character in the source file to be compressed, finishing the encoding of the source file to be compressed;
when the current character is not the last character in the source file to be compressed, reading the next character from the source file;
and repeating the steps of replacing the current character by the code corresponding to the current character in the mapping relation and judging whether the current character is the last character in the source file to be compressed until the code of the source file to be compressed is finished.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
and storing the mapping relation in a memory.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
acquiring characters corresponding to the codes to be processed based on the codes to be processed in the compressed file;
establishing a code table according to the code to be processed and the character corresponding to the code to be processed;
correspondingly, the establishing of the binary tree based on the characters corresponding to the codes to be processed in the compressed file specifically includes:
and establishing a binary tree based on the coding table.
Further, the processor 1001 may call a fast handler of the lossless compression file stored in the memory 1005, and also perform the following operations:
reading a current code from the compressed file;
traversing the binary tree, searching for a character corresponding to the current code, and judging whether the current code is the last code in the compressed file;
when the current code is the last code in the compressed file, finishing decoding the compressed file;
reading a next code from the compressed file when the current code is not the last code in the compressed file;
and repeatedly executing the steps of traversing the binary tree, searching for the character corresponding to the current code, and judging whether the current code is the last code in the compressed file until the characters corresponding to all the codes in the compressed file are searched.
According to the above scheme, the processing device acquires all characters to be processed of the source file to be compressed and the codes corresponding to those characters, establishes a mapping relation between each character and its code, and replaces each character in the source file with its code according to the mapping relation, thereby encoding the source file. Because a one-to-one mapping is established between characters and codes, each character is replaced by its code directly, and the code corresponding to a character does not need to be searched for during compression. This saves a large amount of character comparison time and effectively improves the processing speed during file processing.
Based on the hardware structure, the embodiment of the method for rapidly processing the lossless compression file is provided.
Referring to FIG. 2, FIG. 2 is a flow chart of a first embodiment of the method for fast processing a lossless compressed file according to the present invention.
In a first embodiment, the method for fast processing of lossless compressed files comprises the following steps:
S10: Acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed.
It can be understood that the source file can be compressed as long as all characters contained in the source file and the number of times that various characters appear in the source file are counted.
For example, suppose the source file contains n different characters C_i, where W_i is the number of times character C_i occurs in the file. After the characters contained in the source file have been counted, the prefix codeword b_i corresponding to each character C_i can also be obtained.
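As a rough illustration of the counting step, the following C sketch tallies the occurrences W_i of every byte value in a buffer. Treating each character as a single byte, and the names `count_chars` and `distinct_chars`, are assumptions made for this example and do not come from the patent text.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many times each byte value occurs in buf.
 * counts[c] plays the role of W_i for the character C_i whose byte value is c. */
void count_chars(const unsigned char *buf, size_t len, unsigned long counts[256])
{
    assert(buf != NULL || len == 0);
    for (int c = 0; c < 256; c++)
        counts[c] = 0;
    for (size_t k = 0; k < len; k++)
        counts[buf[k]]++;
}

/* Number of distinct characters n: the entries with a nonzero count. */
int distinct_chars(const unsigned long counts[256])
{
    int n = 0;
    for (int c = 0; c < 256; c++)
        if (counts[c] != 0)
            n++;
    return n;
}
```

The resulting counts could then be fed to any prefix-code construction (for example Huffman coding) to obtain the codewords b_i; that construction is outside the scope of this sketch.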
S20: Establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed.
In specific implementation, when all characters to be processed of the source file to be compressed and codes corresponding to the characters to be processed are obtained, the positions of the characters to be processed in the source file to be compressed can also be obtained at the same time.
The mapping relation between the character to be processed and its corresponding code is then established according to the position of the character to be processed in the source file to be compressed.
It is understood that after the mapping relationship is established, the mapping relationship may be stored in the memory, so that although a certain storage space is occupied, the encoding speed is increased.
S30: Replacing the characters to be processed in the source file to be compressed with their corresponding codes in the mapping relation, thereby completing the encoding of the source file to be compressed.
In a specific implementation, the first character is read from the source file to be compressed and replaced with the code corresponding to it in the mapping relation. It is then judged whether the current character is the last character in the source file. If it is, the encoding of the source file, that is, the compression process, is complete. If it is not, the next character is read from the source file, and the steps of replacing the current character with its code and judging whether it is the last character are repeated until the encoding of the source file is finished.
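One way to realize such a mapping relation is an array indexed by the character's byte value, so that fetching a code is a single array access rather than a search. The C sketch below assumes single-byte characters and, for readability, represents each codeword and the output as a string of '0' and '1' characters; a real compressor would pack the bits into bytes. The name `encode_bits` and the table layout are hypothetical and are not taken from the patent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode len bytes of src using a direct lookup table.
 * table[c] is the codeword for byte value c as a '0'/'1' string,
 * or NULL if c has no codeword (assumed layout, for illustration only).
 * Writes a NUL-terminated bit string to out and returns the number of bits,
 * or (size_t)-1 if a character has no code or out is too small. */
size_t encode_bits(const unsigned char *src, size_t len,
                   const char *const table[256],
                   char *out, size_t out_cap)
{
    size_t used = 0;

    assert(table != NULL && out != NULL);
    if (out_cap == 0)
        return (size_t)-1;

    for (size_t k = 0; k < len; k++) {
        const char *code = table[src[k]];   /* one array index, no search */
        if (code == NULL)
            return (size_t)-1;
        size_t n = strlen(code);
        if (used + n + 1 > out_cap)         /* keep room for the terminator */
            return (size_t)-1;
        memcpy(out + used, code, n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```

The table occupies 256 pointers regardless of how many distinct characters occur, which reflects the trade-off the text describes: some extra memory in exchange for constant-time code lookup.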
In this embodiment, a processing device obtains all characters to be processed of a source file to be compressed and the codes corresponding to those characters, establishes a mapping relation between each character and its code, and replaces each character in the source file with its code according to the mapping relation, thereby encoding the source file. Because a one-to-one mapping is established between characters and codes, each character is replaced by its code directly, and the code corresponding to a character does not need to be searched for during compression. This saves a large amount of character comparison time and effectively improves the processing speed during file processing.
Further, as shown in fig. 3, a second embodiment of the method for fast processing a lossless compression file according to the present invention is proposed based on the first embodiment, and in this embodiment, after step S30, the method further includes:
S40: Acquiring a compressed file of the source file to be compressed.
S50: Establishing a binary tree based on all codes to be processed of the compressed file and the characters corresponding to those codes.
It is understood that after the compressed file is obtained, a code table can be established from the codes to be processed in the compressed file and the characters corresponding to those codes.
Because the codes of the lossless compression file are all prefix codes, namely any one code in the code table cannot be the prefix of other codes, a binary tree can be established according to the codes in the code table.
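The prefix property stated above can be checked mechanically. The following C sketch tests whether any codeword in a table is a prefix of another; representing codewords as '0'/'1' strings and the name `is_prefix_free` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if no codeword in codes[0..n-1] is a prefix of (or equal to)
 * another codeword, and 0 otherwise. Codewords are '0'/'1' strings. */
int is_prefix_free(const char *const codes[], size_t n)
{
    assert(codes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t li = strlen(codes[i]);
        for (size_t j = 0; j < n; j++) {
            if (i == j)
                continue;
            /* codes[i] is a prefix of codes[j] if the first li characters match */
            if (strncmp(codes[i], codes[j], li) == 0)
                return 0;
        }
    }
    return 1;
}
```

This quadratic check is only meant to make the property concrete; inserting the codewords into a binary tree detects the same violations, since a prefix would end at an internal node.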
In a specific implementation, the binary tree is established as follows:
(1) Define the structure of the binary tree: typedef struct node1 *bintree; struct node1 { unsigned char ch; bintree lchild, rchild; };
(2) Request a root node t from the system such that t->lchild = t->rchild = NULL;
(3) Let i = 1, pointing to the first codeword in the coding table;
(4) Let j = 1, pointing to the first binary digit of codeword b_i;
(5) Let p = t, which means the search starts from the root node;
(6) If b_ij = 0 and p->lchild <> NULL, then p = p->lchild;
(7) If b_ij = 0 and p->lchild = NULL, request a new node q from the system and perform the following operations:
q->lchild = q->rchild = NULL; p->lchild = q; p = q;
(8) If b_ij = 1 and p->rchild <> NULL, then p = p->rchild;
(9) If b_ij = 1 and p->rchild = NULL, request a new node q from the system and perform the following operations:
q->lchild = q->rchild = NULL; p->rchild = q; p = q;
(10) If j < l_i, where l_i is the length of codeword b_i, execute j = j + 1 and return to step (6);
(11) Execute p->ch = c_i;
(12) If i < n, execute i = i + 1 and return to step (4);
(13) The binary tree construction is finished.
Here b_ij denotes the j-th binary digit of codeword b_i, and exactly one of steps (6) to (9) applies to each digit.
It will be appreciated that the binary tree is built using the recursive idea: a pointer pointing to a root node is given, then a create function is called recursively, a binary tree is automatically generated, and of course, before establishment, a node structure is defined first. The traversal of the binary tree also adopts the recursive idea that if the nodes have data, the root nodes and the child nodes are traversed according to the traversal rule, if no data exists, the data is returned until all the data are traversed, and the recursion is ended.
Here lchild and rchild denote the left and right branches of the binary tree, ch is the character data field that holds the character stored in a node, and -> denotes access to a field through a pointer. After the root node is established, p = t assigns t to p so that the search starts from the root node. During the traversal, if p->lchild <> NULL, the node already exists, and p = p->lchild moves p to that node; if p->lchild = NULL, the node does not exist, so a new node q is requested from the system, initialized, linked in, and assigned to p. Node creation is repeated in this way, and p->ch = c_i stores the character in the node reached at the end of the codeword. Through the rules "if b_ij = 0 then p = p->lchild" and "if b_ij = 1 then p = p->rchild", every 0 digit of a codeword leads into a left subtree and every 1 digit leads into a right subtree.
As can be seen from the above binary tree establishment procedure, all characters are located in the leaf nodes of the binary tree.
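The construction procedure can be sketched in C roughly as follows, keeping the node structure given in the text. Passing codewords as '0'/'1' strings, the helper names `new_node`, `build_tree` and `free_tree`, and the error handling are assumptions added for this example; the loop is an iterative rendering of the numbered steps and is not the patent's own source code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node1 *bintree;
struct node1 {
    unsigned char ch;        /* character stored at a leaf */
    bintree lchild, rchild;  /* 0 branch and 1 branch */
};

/* Allocate a node with both children NULL; returns NULL if malloc fails. */
static bintree new_node(void)
{
    bintree q = malloc(sizeof *q);
    if (q != NULL) {
        q->ch = 0;
        q->lchild = q->rchild = NULL;
    }
    return q;
}

/* Release every node of the tree rooted at t. */
void free_tree(bintree t)
{
    if (t == NULL)
        return;
    free_tree(t->lchild);
    free_tree(t->rchild);
    free(t);
}

/* Build the decoding tree from n codewords, where codes[i] is the '0'/'1'
 * string for chars[i]. Returns the root, or NULL on allocation failure or
 * if a codeword contains a character other than '0' or '1'. */
bintree build_tree(const char *const codes[], const unsigned char chars[], size_t n)
{
    assert(n == 0 || (codes != NULL && chars != NULL));
    bintree t = new_node();                      /* root node */
    if (t == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        bintree p = t;                           /* start each codeword at the root */
        for (const char *b = codes[i]; *b != '\0'; b++) {
            bintree *next;
            if (*b == '0')
                next = &p->lchild;
            else if (*b == '1')
                next = &p->rchild;
            else {
                free_tree(t);
                return NULL;
            }
            if (*next == NULL) {                 /* branch missing: create it */
                *next = new_node();
                if (*next == NULL) {
                    free_tree(t);
                    return NULL;
                }
            }
            p = *next;
        }
        p->ch = chars[i];                        /* end of codeword: store character */
    }
    return t;
}
```

With a prefix code, every codeword ends at a leaf, so the characters end up only in leaf nodes, as the text notes.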
S60: Traversing the binary tree, respectively acquiring the characters corresponding to the codes to be processed of the compressed file, and completing the decoding of the compressed file.
In a specific implementation, a current code is read from the compressed file; traversing the binary tree, searching for a character corresponding to the current code, and judging whether the current code is the last code in the compressed file; when the current code is the last code in the compressed file, finishing decoding the compressed file; reading a next code from the compressed file when the current code is not the last code in the compressed file; and repeatedly executing the steps of traversing the binary tree, searching for the character corresponding to the current code, and judging whether the current code is the last code in the compressed file until the characters corresponding to all the codes in the compressed file are searched.
The process of implementing the fast decoding of the compressed file by using the established binary tree is as follows:
(1) Read one binary digit b_t from the compressed file;
(2) If the end of the compressed file has been reached, jump to step (9);
(3) Let p = t;
(4) If b_t = 0, then p = p->lchild;
(5) If b_t = 1, then p = p->rchild;
(6) While p is not a leaf node, repeat steps (1), (2), (4), (5) until p->lchild = p->rchild = 0;
(7) Write the character p->ch into the decompressed file;
(8) Return to step (1);
(9) The decompression process ends.
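The bit-by-bit decoding walk described above can be sketched in C as follows. The input is assumed to be a string of '0' and '1' characters rather than packed bits, and the function name `decode_bits`, the output buffer interface and the error return value are hypothetical choices for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node1 *bintree;
struct node1 {
    unsigned char ch;        /* character stored at a leaf */
    bintree lchild, rchild;  /* 0 branch and 1 branch */
};

/* Decode the '0'/'1' string bits using the tree rooted at t.
 * Each codeword is resolved by walking from the root to a leaf, one step
 * per bit. Returns the number of characters written to out, or (size_t)-1
 * if the input is malformed or out is too small. */
size_t decode_bits(const struct node1 *t, const char *bits,
                   unsigned char *out, size_t out_cap)
{
    assert(bits != NULL && out != NULL);
    if (t == NULL || (t->lchild == NULL && t->rchild == NULL))
        return (size_t)-1;                 /* a usable tree needs at least one branch */

    size_t written = 0;
    const struct node1 *p = t;             /* start at the root */

    for (; *bits != '\0'; bits++) {
        if (*bits == '0')
            p = p->lchild;
        else if (*bits == '1')
            p = p->rchild;
        else
            return (size_t)-1;
        if (p == NULL)
            return (size_t)-1;             /* bit sequence not in the code table */

        if (p->lchild == NULL && p->rchild == NULL) {   /* leaf reached */
            if (written == out_cap)
                return (size_t)-1;
            out[written++] = p->ch;
            p = t;                          /* restart at the root for the next codeword */
        }
    }
    /* The input must end exactly on a codeword boundary. */
    return (p == t) ? written : (size_t)-1;
}
```

Each output character costs exactly as many pointer steps as its codeword has bits, which is the property the decoding procedure relies on.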
As can be seen from the binary tree building process, a 0 digit in a codeword always leads into a left subtree and a 1 digit always leads into a right subtree. Therefore, when searching for a character, the first binary digit is read from the compressed file; if it is 0, the search moves to the left child of the current node, and if it is 1, the search moves to the right child.
Specifically, one binary digit b_t is read from the compressed file and the root node t is assigned to p, so that the search starts from the root. If b_t is 0, the left child is assigned to p; if b_t is 1, the right child is assigned to p. Further digits are read and followed in the same way until both the left child and the right child of p are 0, that is, until p is a leaf node. The character p->ch is then written into the decompressed file, and the search for that character ends.
As can be seen from the decoding process, the number of comparisons needed to find a character equals the length of its codeword, which reduces the search time.
In this embodiment, a compressed file decoding method based on a binary tree is provided, where a binary tree corresponding to an encoding table is established, and a query of a character corresponding to a codeword in a compressed file is performed by traversing nodes of the binary tree, so that the number of comparison times during search is reduced, and thus the time spent during decompression is reduced.
In order to verify the effect of the encoding and decoding method provided by the present patent, the following description will use the compression and decompression process of a specific file as verification.
The test environment for this lossless compression and decompression is: (1) software development environment: Windows 7, Microsoft Visual Studio 2008; (2) software operating environment: Windows 7; (3) hardware development environment: Dell Vostro 220 PC, Dual-Core CPU 2.70 GHz (processor brand given only as an image in the original, Figure BDA0001704665530000122), 2 GB DDR3 SDRAM memory, 320 GB SATA (7200 RPM) hard disk; (4) hardware operating environment: Dell Vostro 220 PC; (5) programming language and version number: Microsoft Visual C++ 2008.
The source file to be compressed is a Word file of 162304 bytes.
The characters contained in the file and the number of occurrences W_i of each character C_i in the file are shown in Table 1:
Table 1. Each character C_i in the file and its number of occurrences W_i
[Table 1 is available only as images: Figure BDA0001704665530000121, Figure BDA0001704665530000131]
Taking the compression of the same Word file with Huffman coding, Shannon coding and Fano coding as an example, the encoding process using the code word lookup provided by the invention takes 5296.532475 microseconds, 5317.198312 microseconds and 5344.752762 microseconds respectively.
Similarly, when the resulting compressed files are decompressed with Huffman coding, Shannon coding and Fano coding, the decoding process using the binary-tree-based character lookup provided by the invention takes 9332.854102 microseconds, 9083.64842 microseconds and 9265.994041 microseconds respectively. Decompression was also carried out with other methods, and the results show that they require more time.
Therefore, the one-to-one character-to-code-word mapping lookup effectively improves the compression speed of the file, and the binary tree character lookup likewise improves the decompression speed.
Referring to fig. 4, fig. 4 is a functional block diagram of a first embodiment of a system for rapidly processing a lossless compressed file according to the present invention, which is proposed based on a method for rapidly processing a lossless compressed file.
In this embodiment, the system for rapidly processing a lossless compressed file includes an acquisition module 10, an establishing module 20 and a replacing module 30.
The acquisition module 10 is configured to obtain all characters to be processed of a source file to be compressed and the codes corresponding to the characters to be processed.
It can be understood that the source file can be compressed as long as all characters contained in the source file and the number of times that various characters appear in the source file are counted.
For example, suppose the source file contains n different characters C_i, where W_i is the number of occurrences of character C_i in the file. After the characters contained in the source file have been counted, the prefix code word b_i corresponding to each character C_i can also be obtained.
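As an illustration of how the counts W_i can lead to prefix code words b_i, the following Python sketch counts the characters and derives Huffman code words, Huffman coding being one of the codings used in the test described above. The function name `huffman_codes` and the sample string are hypothetical, and the exact code words depend on how ties between equal weights are broken.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return {symbol: code word} for the symbols in data (str or bytes)."""
    weights = Counter(data)              # W_i for each character C_i
    if len(weights) == 1:                # degenerate case: one distinct symbol
        return {next(iter(weights)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code word}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)    # two lightest subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}         # left branch: 0
        merged.update({s: "1" + c for s, c in right.items()})  # right branch: 1
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


sample = "abracadabra"                   # hypothetical input
codes = huffman_codes(sample)            # b_i for each C_i
encoded_length = sum(len(codes[ch]) for ch in sample)
```

The most frequent character receives the shortest code word, and no code word is a prefix of another, which is what later allows decoding without separators.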
The establishing module 20 is configured to establish a mapping relationship between the character to be processed and the code corresponding to the character to be processed.
In specific implementation, when all characters to be processed of the source file to be compressed and codes corresponding to the characters to be processed are obtained, the positions of the characters to be processed in the source file to be compressed can also be obtained at the same time.
And according to the position of the character to be processed in the source file to be compressed, establishing a mapping relation between the character to be processed and a code corresponding to the character to be processed.
It is understood that after the mapping relationship is established, the mapping relationship may be stored in the memory, so that although a certain storage space is occupied, the encoding speed is increased.
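The description does not spell out the data structure used for the mapping. One plausible reading, shown here as an assumption, is an in-memory array indexed by the byte value of the character, so that a code word is fetched by position instead of being found by comparison. The function names and the three-entry code table below are hypothetical.

```python
def build_direct_table(code_table):
    """Return a 256-entry list where table[byte_value] is that byte's code word."""
    table = [None] * 256
    for byte_value, codeword in code_table.items():
        table[byte_value] = codeword
    return table


def lookup_by_search(pairs, byte_value):
    """Baseline for comparison: scan (byte, code word) pairs one by one."""
    for candidate, codeword in pairs:
        if candidate == byte_value:
            return codeword
    return None


# Hypothetical code table keyed by byte value.
code_table = {ord("a"): "0", ord("b"): "10", ord("c"): "11"}
table = build_direct_table(code_table)
pairs = list(code_table.items())

direct = table[ord("b")]                      # one indexing operation
searched = lookup_by_search(pairs, ord("b"))  # up to len(pairs) comparisons
```

The table always occupies 256 slots regardless of how many characters occur, which corresponds to the trade of storage space for encoding speed mentioned above.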
The replacing module 30 is configured to replace the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relationship, respectively, so as to complete the coding of the source file to be compressed.
In a specific implementation, the first character is read from the source file to be compressed and replaced with the code corresponding to it in the mapping relationship. It is then judged whether the current character is the last character in the source file to be compressed. If it is, the encoding of the source file, that is, the compression process, is complete. If it is not, the next character is read from the source file, and the steps of replacing the current character with its code in the mapping relationship and judging whether it is the last character are repeated until the encoding of the source file to be compressed is complete.
In this embodiment, the processing device obtains all characters to be processed of the source file to be compressed and the codes corresponding to them, establishes a mapping relationship between the characters and their codes, and replaces each character in the source file with its code according to the mapping relationship, thereby completing the encoding of the source file. Because a one-to-one mapping between characters and codes is established and each character is replaced directly by its code, the code for a character does not have to be searched for during compression. This saves a large amount of character comparison time and effectively improves the processing speed of the file.
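The replacement loop can be sketched as follows. This is a simplified illustration under stated assumptions: the 256-entry `table` indexed by byte value, the function name `encode` and the returned bit count (used to mark where padding begins in the last byte) are hypothetical details that the description does not specify.

```python
def encode(data, table):
    """Replace each byte of data with its code word and pack the bits into bytes.

    Returns (packed_bytes, bit_count). bit_count is kept because the last
    byte may be padded with zero bits.
    """
    out = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of bits currently held in acc
    total = 0    # total number of code bits written
    for byte_value in data:              # read the current character
        for bit in table[byte_value]:    # direct lookup, no searching
            acc = (acc << 1) | (bit == "1")
            nbits += 1
            total += 1
            if nbits == 8:               # a full byte is ready
                out.append(acc)
                acc, nbits = 0, 0
    if nbits:                            # flush the final partial byte
        out.append(acc << (8 - nbits))
    return bytes(out), total


# Hypothetical mapping: table[byte_value] -> code word.
table = [None] * 256
table[ord("a")], table[ord("b")], table[ord("c")] = "0", "10", "11"
packed, bit_count = encode(b"abcab", table)
```

Iterating over `data` ends after the last character, which plays the role of the "is this the last character" check in the description.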
In addition, an embodiment of the present invention further provides a storage medium, where the storage medium stores a fast processing program for a lossless compressed file, and when executed by a processor, the fast processing program for the lossless compressed file implements the following operations:
acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
and respectively replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation, and completing the coding of the source file to be compressed.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
acquiring a compressed file of the source file to be compressed;
establishing a binary tree based on all codes to be processed of the compressed file and a plurality of characters corresponding to the codes to be processed;
and traversing the binary tree, respectively acquiring characters corresponding to the codes to be processed of the compressed file, and completing the decoding of the compressed file.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
acquiring all characters to be processed of the source file to be compressed, codes corresponding to the characters to be processed and positions of the characters to be processed in the source file to be compressed;
correspondingly, the establishing a mapping relationship between the character to be processed and the code corresponding to the character to be processed specifically includes:
and according to the position of the character to be processed in the source file to be compressed, establishing a mapping relation between the character to be processed and a code corresponding to the character to be processed.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
reading a current character from the source file to be compressed;
replacing the current character with the code corresponding to the current character in the mapping relation, and judging whether the current character is the last character in the source file to be compressed;
when the current character is the last character in the source file to be compressed, finishing the encoding of the source file to be compressed;
when the current character is not the last character in the source file to be compressed, reading the next character from the source file;
and repeating the steps of replacing the current character by the code corresponding to the current character in the mapping relation and judging whether the current character is the last character in the source file to be compressed until the code of the source file to be compressed is finished.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
and storing the mapping relation in a memory.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
acquiring characters corresponding to the codes to be processed based on the codes to be processed in the compressed file;
establishing a code table according to the code to be processed and the character corresponding to the code to be processed;
correspondingly, the establishing of the binary tree based on the characters corresponding to the codes to be processed in the compressed file specifically includes:
and establishing a binary tree based on the coding table.
Further, the fast processing program for the lossless compressed file, when executed by a processor, further performs the following operations:
reading a current code from the compressed file;
traversing the binary tree, searching for a character corresponding to the current code, and judging whether the current code is the last code in the compressed file;
when the current code is the last code in the compressed file, finishing decoding the compressed file;
reading a next code from the compressed file when the current code is not the last code in the compressed file;
and repeatedly executing the steps of traversing the binary tree, searching for the character corresponding to the current code, and judging whether the current code is the last code in the compressed file until the characters corresponding to all the codes in the compressed file are searched.
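The decoding loop listed above has to know when the last code has been read, since the final byte of a compressed file may contain padding bits. The following sketch assumes that the number of valid bits is stored alongside the compressed data; that assumption, the nested-dictionary tree and all names are hypothetical and not taken from the description.

```python
def build_tree(code_table):
    """Nested-dict tree: inner nodes map '0'/'1' to children, leaves are characters."""
    root = {}
    for ch, codeword in code_table.items():
        node = root
        for bit in codeword[:-1]:
            node = node.setdefault(bit, {})
        node[codeword[-1]] = ch
    return root


def decode_packed(packed, bit_count, root):
    """Decode the first bit_count bits of packed, ignoring any padding bits."""
    out = []
    node = root
    for i in range(bit_count):
        byte = packed[i // 8]
        bit = "1" if (byte >> (7 - i % 8)) & 1 else "0"  # most significant bit first
        node = node[bit]
        if not isinstance(node, dict):   # leaf reached: one code fully read
            out.append(node)
            node = root                  # continue with the next code
    return "".join(out)


# Hypothetical code table and compressed data.
code_table = {"a": "0", "b": "10", "c": "11"}
root = build_tree(code_table)
text = decode_packed(bytes([0x5A]), 8, root)
```

The loop stops after `bit_count` bits, so zero bits added as padding are never mistaken for code words.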
According to this scheme, the processing device obtains all characters to be processed of the source file to be compressed and the codes corresponding to them, establishes a mapping relationship between the characters and their codes, and replaces each character in the source file with its code according to the mapping relationship, thereby completing the encoding of the source file. Because a one-to-one mapping between characters and codes is established and each character is replaced directly by its code, the code for a character does not have to be searched for during compression. This saves a large amount of character comparison time and effectively improves the processing speed of the file.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (7)

1. A method for fast processing of lossless compressed files, the method comprising the steps of:
acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation respectively;
acquiring a compressed file of the source file to be compressed;
acquiring characters corresponding to the codes to be processed based on the codes to be processed in the compressed file; establishing an encoding table according to the code to be processed and the character corresponding to the code to be processed, and establishing a binary tree based on the encoding table;
reading a current code from the compressed file;
traversing the binary tree, searching for a character corresponding to the current code, and judging whether the current code is the last code in the compressed file;
when the current code is the last code in the compressed file, finishing decoding the compressed file;
reading a next code from the compressed file when the current code is not the last code in the compressed file;
and repeatedly executing the steps of traversing the binary tree, searching for the character corresponding to the current code, and judging whether the current code is the last code in the compressed file or not until the characters corresponding to all codes in the compressed file are searched, and decoding the compressed file.
2. The method according to claim 1, wherein the obtaining of all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed specifically comprises:
acquiring all characters to be processed of the source file to be compressed, codes corresponding to the characters to be processed and positions of the characters to be processed in the source file to be compressed;
correspondingly, the establishing a mapping relationship between the character to be processed and the code corresponding to the character to be processed specifically includes:
and according to the position of the character to be processed in the source file to be compressed, establishing a mapping relation between the character to be processed and a code corresponding to the character to be processed.
3. The method according to claim 1, wherein the step of replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relationship to complete the encoding of the source file to be compressed includes:
reading a current character from the source file to be compressed;
replacing the current character with the code corresponding to the current character in the mapping relation, and judging whether the current character is the last character in the source file to be compressed;
when the current character is the last character in the source file to be compressed, finishing the encoding of the source file to be compressed;
when the current character is not the last character in the source file to be compressed, reading the next character from the source file;
and repeating the steps of replacing the current character by the code corresponding to the current character in the mapping relation and judging whether the current character is the last character in the source file to be compressed until the code of the source file to be compressed is finished.
4. The method of claim 1, wherein after the mapping relationship between the character to be processed and the code corresponding to the character to be processed is established, the method further comprises:
and storing the mapping relation in a memory.
5. A fast processing apparatus for lossless compression of a file, comprising: memory, a processor and a fast processing program of lossless compressed files stored on the memory and executable on the processor, the fast processing program of lossless compressed files being configured to implement the steps of the method of fast processing of lossless compressed files according to any of claims 1 to 4.
6. A system for fast processing of lossless compressed files, the system comprising: the system comprises an acquisition module, an establishment module and a replacement module;
the acquisition module is used for acquiring all characters to be processed of a source file to be compressed and codes corresponding to the characters to be processed;
the establishing module is used for establishing a mapping relation between the character to be processed and the code corresponding to the character to be processed;
the replacing module is used for replacing the characters to be processed with the codes corresponding to the characters to be processed in the source file to be compressed in the mapping relation respectively to finish the coding of the source file to be compressed;
the system for fast processing of lossless compressed files is configured to implement the steps of the method for fast processing of lossless compressed files according to any one of claims 1 to 4.
7. A storage medium, characterized in that the storage medium has stored thereon a fast processing program of lossless compression file, which when executed by a processor implements the steps of the method of fast processing of lossless compression file according to any one of claims 1 to 4.
CN201810657224.8A 2018-06-22 2018-06-22 Method, device, system and storage medium for rapidly processing lossless compressed file Active CN108829872B (en)


Publications (2)

Publication Number Publication Date
CN108829872A CN108829872A (en) 2018-11-16
CN108829872B true CN108829872B (en) 2021-03-09

Family

ID=64138277


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant