CN101783788B

CN101783788B - File compression method, file compression device, file decompression method, file decompression device, compressed file searching method and compressed file searching device

Info

Publication number: CN101783788B
Application number: CN200910076795.3A
Authority: CN
Inventors: 范昂
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2009-01-21
Filing date: 2009-01-21
Publication date: 2014-09-03
Anticipated expiration: 2029-01-21
Also published as: CN101783788A

Abstract

The embodiments of the invention provide a file compression method and device, a file decompression method and device, and a compressed file searching method and device. The file compression device comprises a first storage module, a first acquisition module, a first word segmentation module, and a first coding module. The first storage module stores a coding table that records the correspondence between standard word strings and code identifiers, each standard word string having a unique code identifier. The first acquisition module obtains part or all of the text in a file to be compressed, forming a text to be encoded. The first word segmentation module segments the text to be encoded according to the standard word strings, decomposing it into at least one word string to be encoded. The first coding module obtains a first coded sequence corresponding to the text to be encoded by replacing each word string to be encoded with the code identifier of the corresponding standard word string, according to the correspondence recorded in the coding table. The invention improves the compression ratio of text compression and makes compressed files easier to search.

Description

File compression and decompression methods and devices, and compressed file searching method and device

Technical field

The present invention relates to the technical field of file compression, and in particular to file compression and decompression methods and devices, and a compressed file searching method and device.

Background technology

With the continuous advance of computer technology, data files of all types keep growing, so storing them takes more and more storage space and transmitting them takes more and more bandwidth. Compressing data files is therefore becoming increasingly important in computer technology.

At present, data file compression is divided into lossy compression and lossless compression. The commonly used WinRAR and WinZip perform lossless compression. Their basic principle is the same: briefly, repeated data in a file is represented in a more concise way, that is, data redundancy is removed.

Existing text compression algorithms include a class of statistical compression algorithms, such as the Huffman algorithm, described below.

The Huffman algorithm is a statistics-based compression method. In essence it re-encodes the characters in the text so that the more frequently a character is used, the shorter its code.

The encoded text mainly comprises two parts: the Huffman code table and the compressed content. When decompressing, the Huffman code table is extracted first, and then the compressed content is decoded character by character to reconstruct the source file.

As can be seen, the key to using the Huffman algorithm is forming the Huffman code table, which relies on the Huffman tree data structure. Once a Huffman tree has been generated, the code table has been generated as well.

As an example, suppose the original text is "abcbbcccc".

Generating the Huffman tree comprises the following steps:

Step A1: scan the source file and count the character frequencies.

For the sample, the statistics are: a occurs 1 time, b occurs 3 times, and c occurs 5 times, recorded as the queue shown in Fig. 1: a:1 b:3 c:5.

Step A2: take the 2 nodes with the lowest frequency out of the queue and merge them into a branch node X whose frequency is the sum of the 2 node frequencies; add X to the original queue, keeping the queue sorted in ascending order of frequency.

For the sample, this yields the queue shown in Fig. 2.

Step A3: repeat step A2 until only one node remains in the queue.

Step A4: the above steps yield the Huffman tree shown in Fig. 3. The leaf nodes are characters, and the path from the root node to a leaf node is the Huffman code of that character. Moving from a node to its left child contributes a 0 to the path; moving to its right child contributes a 1.

As shown in Fig. 3, the code of character a is 00, the code of character b is 01, and the code of character c is 1. Once the Huffman code table has been generated, the original text "abcbbcccc" becomes the bit string 0001101011111. Counting 2 bytes per character, the size drops from the original 18 bytes (9*2), or 144 bits, to 13 bits, which fit in 2 bytes. The purpose of compression is thus achieved.
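Steps A1 to A4 and the resulting code table can be sketched in Python as follows. This is a minimal illustration rather than part of the patent text: the function names `build_huffman_codes` and `huffman_encode` are hypothetical, and the tie-breaking counter is an implementation detail assumed here so that heap entries always stay comparable. Empty input is not handled.

```python
import heapq
from collections import Counter


def build_huffman_codes(text):
    """Return a {character: bit string} code table for text (steps A1-A4)."""
    # Step A1: count character frequencies.
    freq = Counter(text)
    # Heap entries are (frequency, tie_breaker, subtree). A subtree is either
    # a single character (leaf) or a (left, right) tuple (branch node).
    heap = [(n, i, ch) for i, (ch, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    # Steps A2-A3: merge the two lowest-frequency nodes until one remains.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    # Step A4: walk the tree; a left edge is 0, a right edge is 1.
    codes = {}

    def walk(node, path):
        if isinstance(node, str):
            codes[node] = path or "0"  # a one-symbol text still needs one bit
        else:
            walk(node[0], path + "0")
            walk(node[1], path + "1")

    walk(heap[0][2], "")
    return codes


def huffman_encode(text, codes):
    """Concatenate the code of every character in text."""
    return "".join(codes[ch] for ch in text)


codes = build_huffman_codes("abcbbcccc")
bits = huffman_encode("abcbbcccc", codes)
print(codes)
print(bits, len(bits))
```

For the sample text this reproduces the code table a=00, b=01, c=1 and the 13-bit string given in the description.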

The decompression process is as follows: first a Huffman tree is generated from the Huffman code table, and then the compressed content is decompressed according to the Huffman tree.

For example, suppose the compressed content is the bit string 0001101011111 and the tree is the one shown in Fig. 3. Decoding starts from the root node. The first bit is 0, so go to the left subtree; the second bit is 0, so go to the left subtree again, arriving at leaf node a. The first decoded character is therefore a. Each character is decoded starting from the root node, turning left or right according to the bit stream until a leaf node is reached, which gives the decoded character. This process is repeated until all characters have been decompressed.
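The decoding walk described above can be sketched as follows. This is an illustrative sketch only: the helper names `build_tree` and `huffman_decode` are hypothetical, and the tree is represented as nested dictionaries keyed by "0" and "1", which is an assumption made here for brevity.

```python
def build_tree(code_table):
    """Rebuild the Huffman tree from a {character: bit string} code table."""
    root = {}
    for ch, code in code_table.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})  # descend, creating branch nodes
        node[code[-1]] = ch                  # the last bit leads to the leaf
    return root


def huffman_decode(bits, root):
    """Follow each bit left ('0') or right ('1') until a leaf is reached."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # leaf: emit the character, restart at root
            out.append(node)
            node = root
    return "".join(out)


tree = build_tree({"a": "00", "b": "01", "c": "1"})
print(huffman_decode("0001101011111", tree))
```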

However, in the course of realizing the present invention, the inventor found that the prior art has at least the following shortcoming:

In the prior art, each compressed text document must contain two parts: the code table used for encoding, and the coded sequence produced by compressing the text. Because both parts are stored in the same compressed document, the compression ratio is not ideal. It is therefore necessary to propose a new compression scheme to further improve the compression ratio of text compression algorithms.

Summary of the invention

The object of the embodiments of the present invention is to provide file compression and decompression methods and devices, and a compressed file searching method and device, so as to improve the compression ratio of text compression algorithms.

To achieve the above object, an embodiment of the present invention provides a file compression device, comprising:

a first storage module, configured to store a coding table, wherein the coding table records the correspondence between standard word strings and code identifiers, and each standard word string has a unique code identifier;

a first acquisition module, configured to obtain part or all of the text in a file to be compressed, forming a text to be encoded;

a first word segmentation module, configured to segment the text to be encoded according to the standard word strings, decomposing the text to be encoded into at least one word string to be encoded;

a first coding module, configured to replace, according to the correspondence between the standard word strings and the code identifiers recorded in the coding table, each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded.

In the above file compression device, the code identifier is represented by a number, and the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier.

The above file compression device further comprises:

a statistical module, configured to perform word frequency statistics on the texts forming the corpus, obtaining the frequency with which each standard word string occurs in those texts.

In the above file compression device, the coding table provides a search field for each standard word string; the search field records file identifiers, and each file indicated by a file identifier recorded in a search field contains the standard word string to which that search field corresponds. The file compression device further comprises:

a modification module, configured to add the file identifier of the file to be compressed to the search field corresponding to each of the at least one word string to be encoded.

To achieve the above object, an embodiment of the present invention also provides a file compression method, comprising:

obtaining part or all of the text in a file to be compressed, forming a text to be encoded;

segmenting the text to be encoded according to standard word strings, decomposing the text to be encoded into at least one word string to be encoded;

replacing, according to the correspondence between the standard word strings and code identifiers recorded in a coding table stored in advance, each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded, wherein each standard word string has a unique code identifier.

In the above method, the code identifier is represented by a number, and the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier.

In the above method, the coding table provides a search field for each standard word string; the search field records file identifiers, and each file indicated by a file identifier recorded in a search field contains the standard word string to which that search field corresponds. The method further comprises:

adding the file identifier of the file to be compressed to the search field corresponding to each of the at least one word string to be encoded.

To achieve the above object, an embodiment of the present invention also provides a file decompression device, comprising:

a third acquisition module, configured to obtain a first sequence to be decoded;

a first decoding module, configured to replace, according to the correspondence between standard word strings and code identifiers recorded in a coding table stored in advance, the code identifiers in the first sequence to be decoded with the corresponding standard word strings, obtaining the text corresponding to the first sequence to be decoded, wherein each standard word string has a unique code identifier.

In the above device, the code identifier is represented by a number, and the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier.

The above device further comprises:

a second decoding module, configured to decompress a second sequence to be decoded using a preset numeric decompression algorithm, obtaining the first sequence to be decoded.

To achieve the above object, an embodiment of the present invention also provides a file decompression method, comprising:

obtaining a first sequence to be decoded;

replacing, according to the correspondence between standard word strings and code identifiers recorded in a coding table stored in advance, the code identifiers in the first sequence to be decoded with the corresponding standard word strings, obtaining the text corresponding to the first sequence to be decoded, wherein each standard word string has a unique code identifier.

The above method further comprises:

decompressing a second sequence to be decoded using a preset numeric decompression algorithm, obtaining the first sequence to be decoded.

To achieve the above object, an embodiment of the present invention also provides a compressed file searching device, comprising:

a first storage module, configured to store a coding table in advance, wherein the coding table records the correspondence between standard word strings and code identifiers represented by numbers, each standard word string has a unique code identifier, and the coding table provides a search field for each standard word string; the search field records file identifiers, and each file indicated by a file identifier contains the standard word string to which the search field corresponds;

a second acquisition module, configured to obtain a search string entered by a user;

a second word segmentation module, configured to segment the search string according to the standard word strings, obtaining at least one word string to be searched;

a file identifier extraction module, configured to obtain from the coding table the file identifier set corresponding to each of the at least one word string to be searched;

a search result output module, configured to output the intersection of the file identifier sets as the search result.

To achieve the above object, an embodiment of the present invention also provides a compressed file searching method, comprising:

obtaining a search string entered by a user;

segmenting the search string according to standard word strings, obtaining at least one word string to be searched;

obtaining, from a coding table stored in advance, the file identifier set corresponding to each of the at least one word string to be searched, wherein the coding table records the correspondence between the standard word strings and code identifiers represented by numbers, each standard word string has a unique code identifier, and the coding table provides a search field for each standard word string; the search field records file identifiers, and each file indicated by a file identifier contains the standard word string to which the search field corresponds;

outputting the intersection of the file identifier sets as the search result.

To achieve the above object, an embodiment of the present invention also provides a file compression and transmission method, comprising:

segmenting a text to be encoded according to standard word strings, decomposing the text to be encoded into at least one word string to be encoded;

replacing, according to the correspondence between the standard word strings and code identifiers recorded in a coding table stored in advance, each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded, wherein each standard word string has a unique code identifier;

sending the first coded sequence to a network storage server.

In the above method, when only part of the text in the file to be compressed is obtained, the method further comprises:

repeating the steps from obtaining text to sending the coded sequence until all of the text in the file to be compressed has been compressed and transmitted.

To achieve the above object, an embodiment of the present invention also provides a file compression and transmission device, comprising:

a first coding module, configured to replace, according to the correspondence between the standard word strings and code identifiers recorded in the coding table, each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded;

a transmission module, configured to send the first coded sequence to a network storage server.

The embodiments of the present invention have the following beneficial effects:

First, a single coding table covering all text compression is stored in advance, so no code table is included in each compressed file. This greatly reduces the data volume of the compressed text and improves the compression ratio.

Second, the coding table is global: it holds the code identifiers of word strings obtained from a large corpus, and can therefore provide a higher compression ratio.

Third, compared with prior-art schemes that transmit to a network storage server after compression, the same coding table is stored on the network storage server in advance, so the compressed coded sequence does not include the coding table, which reduces the network load. Moreover, this coding table applies to all compressed texts, so storage space is saved when the network stores many texts.

Finally, because a coding table obtained in advance is used, the text to be compressed can be divided into several parts at the sending end and processed separately, with each part transmitted as soon as it has been processed, which reduces the demand for temporary storage.

Brief description of the drawings

Fig. 1 to Fig. 3 are schematic diagrams of the text compression process of the Huffman algorithm;

Fig. 4 is a structural diagram of the file compression device of the embodiment of the present invention;

Fig. 5 is a flowchart of the file compression method of the embodiment of the present invention;

Fig. 6 is a flowchart of the compressed file searching method of the embodiment of the present invention.

Embodiment

In the methods and devices of the embodiments of the present invention, a database is stored in advance. The database records numerically represented codes for the characters or words used to form text. Text compression uses this database for encoding, which improves the compression ratio. In addition, a search field is added to the coding table so that the coding table can be used for searching, which reduces the resource consumption of search.

As shown in Fig. 4, the file compression device of the embodiment of the present invention comprises:

a first storage module, configured to store a coding table, wherein the coding table records the code identifier corresponding to each standard word string; the code identifier is represented by a number, and each standard word string has a unique code identifier (that is, the code identifiers of different standard word strings differ, and standard word strings and code identifiers are in one-to-one correspondence); the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier;

a first word segmentation module, configured to segment the text to be encoded according to the standard word strings in the coding table, decomposing the text to be encoded into at least one word string to be encoded;

a first coding module, configured to replace each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded.

As can be seen from the above, the code identifier of a standard word string is related to its frequency of occurrence. Therefore, the file compression device of the embodiment of the present invention further comprises:

a statistical module, configured to perform word frequency statistics on the texts forming the corpus, obtaining the frequency with which the standard word strings making up those texts occur in them.

Existing word segmentation algorithms fall into three major categories: segmentation based on string matching, segmentation based on understanding, and segmentation based on statistics. The specific embodiments of the present invention do not restrict which is used.
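As one possible instance of the string-matching category, the sketch below uses greedy forward maximum matching. The patent does not prescribe this algorithm; the function name `segment`, the sample vocabulary, and the single-character fallback for unmatched text are all assumptions made for illustration.

```python
def segment(text, vocabulary):
    """Greedy forward maximum matching.

    At each position take the longest vocabulary entry that matches;
    if none matches, fall back to a single character.
    """
    max_len = max((len(w) for w in vocabulary), default=1)
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, shrinking one character at a time.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                pos += size
                break
    return tokens


vocab = {"file", "compress", "compression", "ion"}
print(segment("filecompression", vocab))
```

Longest-match-first is what makes "compression" win over the shorter entry "compress" in this example; other segmentation strategies may split the same text differently.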

The word strings and code identifiers recorded in the coding table satisfy the following conditions:

1. each code identifier is unique;

2. standard word strings and code identifiers are in one-to-one correspondence;

3. the more times a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier.
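A coding table meeting these three conditions can be sketched by ranking word strings by corpus frequency and using the rank as the numeric identifier. This is only an illustration under stated assumptions: the function name `build_coding_table` is hypothetical, the corpus is assumed to be already segmented into tokens, and the alphabetical tie-break is a choice made here for determinism, not something the patent specifies.

```python
from collections import Counter


def build_coding_table(corpus_tokens):
    """Map each word string to a unique number; more frequent means smaller."""
    counts = Counter(corpus_tokens)
    # Sort by descending frequency, then alphabetically to break ties.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


table = build_coding_table(["the", "file", "the", "table", "the", "file"])
print(table)
```

Because every word string receives a distinct rank, uniqueness and one-to-one correspondence follow directly from the construction.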

The embodiment of the present invention is described in detail below with a concrete example.

Suppose that, after word frequency statistics have been performed on a number of texts, the coding table stores the correspondence shown in the table below. It should be understood that this is only an illustration, and the code identifiers do not represent an actual situation:

Standard word string	Code identifier
……	……
's	ID1
……	……
Word	ID2
Carry out	ID3
……	……
Suitably	ID4
Describe	ID5
……	……
Adopt	ID6
……	……

Suppose the text to be encoded obtained by the acquisition module is "adopting suitable word". The word segmentation module yields the following word strings to be encoded: adopt, suitably, 's, word.

Looking these up in the coding table gives the coded sequence of the text to be encoded: ID6 ID4 ID1 ID2.

Compared with existing statistics-based compression methods, the embodiment of the present invention has the following beneficial effects:

A single coding table covering all text compression is stored in advance, so no code table is included in each compressed file. This greatly reduces the data volume of the compressed text and improves the compression ratio.

The coding table is global: it holds the code identifiers of word strings obtained from a large corpus, and can therefore provide a higher compression ratio.

Meanwhile, in the prior art, compressed text must be decompressed before a search service can be provided. To further provide a search service, in the embodiment of the present invention the coding table also provides a search field for each standard word string; this search field records which files the corresponding standard word string appears in. Therefore, the file compression device further comprises:

a database update module, configured to add the file identifier of the file to be compressed to the search field corresponding to each of the at least one word string to be encoded;
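The update performed by this module amounts to maintaining an inverted index alongside the coding table. A minimal sketch, assuming the search fields are held as a dictionary from word string to a set of file identifiers (the name `add_file_to_search_fields` and this representation are hypothetical):

```python
def add_file_to_search_fields(search_fields, tokens, file_id):
    """Record file_id in the search field of every word string in the file."""
    for token in set(tokens):  # each distinct word string is updated once
        search_fields.setdefault(token, set()).add(file_id)


search_fields = {}
add_file_to_search_fields(search_fields, ["adopt", "suitably", "'s", "word"], "file-1")
add_file_to_search_fields(search_fields, ["describe", "word", "word"], "file-2")
print(search_fields["word"])
```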

The compressed file searching device comprises:

a second acquisition module, configured to obtain a search string entered by a user;

a search result output module, configured to output, as the search result, the intersection of the file identifier sets obtained by the file identifier extraction module.

Through the above processing, the compression device of the embodiment of the present invention allows a search service to be provided using the coding table alone, without decompressing the compressed files, which saves system resources.

Meanwhile, the output of the first coding module is a sequence of numbers. Therefore, to further improve the compression ratio, the file compression device of the embodiment of the present invention further comprises:

a second compression module, configured to apply a preset numeric coding compression algorithm to the code identifiers, corresponding to the at least one word string to be encoded, in the coded sequence obtained by the first coding module, obtaining a second coded sequence corresponding to the text to be encoded.

The preset numeric coding compression algorithm may be a numeric coding compression algorithm such as run-length block coding or run-length variable-length coding.
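The patent names run-length based algorithms for this second stage but gives no detail. As a stand-in, the sketch below uses a different and simpler technique, a variable-length byte encoding of non-negative integers (7 data bits per byte, high bit as a continuation flag), purely to illustrate why assigning small numbers to frequent word strings shortens the output. The function names are hypothetical and this is not the algorithm the patent specifies.

```python
def varint_encode(numbers):
    """Encode non-negative integers, 7 bits per byte, low bits first."""
    out = bytearray()
    for n in numbers:
        if n < 0:
            raise ValueError("code identifiers must be non-negative")
        while True:
            byte = n & 0x7F
            n >>= 7
            if n:
                out.append(byte | 0x80)  # high bit set: more bytes follow
            else:
                out.append(byte)
                break
    return bytes(out)


def varint_decode(data):
    """Inverse of varint_encode."""
    numbers, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            numbers.append(current)
            current, shift = 0, 0
    return numbers


packed = varint_encode([6, 4, 1, 2, 300])
print(len(packed), varint_decode(packed))
```

Under this stand-in scheme, identifiers below 128 take one byte each, while rarer word strings with larger identifiers take two or more.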

Meanwhile, because the embodiment of the present invention uses a coding table stored in advance rather than deriving code identifiers from the text of the file to be compressed, when the file compression device is used for network transmission the text in a file can be divided into several parts and processed serially, without waiting for the whole file to be read, which saves processing time.

The text compression method of the embodiment of the present invention, as shown in Fig. 5, comprises:

Step 51: obtain part or all of the text in a file to be compressed, forming a text to be encoded;

Step 52: segment the text to be encoded according to the standard word strings in the coding table, decomposing the text to be encoded into at least one word string to be encoded; the coding table records the code identifier corresponding to each standard word string, the code identifier is represented by a number, each standard word string has a unique code identifier, and the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier;

Step 53: replace each of the at least one word string to be encoded with the code identifier of the corresponding standard word string, obtaining a first coded sequence corresponding to the text to be encoded;

Step 54: apply a preset numeric coding compression algorithm to the code identifiers, corresponding to the at least one word string to be encoded, in the first coded sequence, obtaining a second coded sequence corresponding to the text to be encoded.

The embodiment of the present invention also provides a method of searching the compressed files obtained by the compression method shown in Fig. 5. As shown in Fig. 6, the method comprises:

Step 61: obtain a search string entered by a user;

Step 62: segment the search string according to the standard word strings, obtaining at least one word string to be searched;

Step 63: obtain from the coding table the file identifier set corresponding to each of the at least one word string to be searched;

Step 64: output the intersection of the file identifier sets as the search result.
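Steps 63 and 64 can be sketched as a set intersection over the search fields. The sketch assumes the search string has already been segmented (step 62) and that the search fields are held as a dictionary from word string to a set of file identifiers; the name `search` and the sample data are hypothetical.

```python
def search(query_tokens, search_fields):
    """Return the files that contain every word string in the query."""
    # Step 63: one file identifier set per word string (empty if unknown).
    id_sets = [search_fields.get(token, set()) for token in query_tokens]
    if not id_sets:
        return set()
    # Step 64: the intersection of those sets is the search result.
    return set.intersection(*id_sets)


fields = {
    "adopt": {"file-1", "file-3"},
    "word": {"file-1", "file-2", "file-3"},
    "describe": {"file-2"},
}
print(search(["adopt", "word"], fields))
```

No compressed file is opened at any point; the result comes from the coding table's search fields alone.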

The file decompression device of the embodiment of the present invention comprises:

a first storage module, configured to store a coding table, wherein the coding table records the code identifier corresponding to each standard word string; the code identifier is represented by a number, each standard word string has a unique code identifier, and the more frequently a standard word string occurs in the texts forming the corpus, the smaller the number representing its code identifier;

a third acquisition module, configured to obtain a first sequence to be decoded;

a first decoding module, configured to replace, according to the correspondence between the standard word strings and the code identifiers recorded in the coding table, the code identifiers in the first sequence to be decoded with the corresponding standard word strings, obtaining the text corresponding to the sequence to be decoded.

Of course, if the number sequence was compressed during compression, the file decompression device of the embodiment of the present invention further comprises:

a second decoding module, configured to decompress a second sequence to be decoded using a preset numeric decompression algorithm, obtaining the first sequence to be decoded.

The processing procedure comprises the following steps:

decompress the second sequence to be decoded using the preset numeric decompression algorithm, obtaining the first sequence to be decoded;

replace, according to the correspondence between the standard word strings and the code identifiers recorded in the coding table, the code identifiers in the first sequence to be decoded with the corresponding standard word strings, obtaining the text corresponding to the first sequence to be decoded.
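The final replacement step can be sketched by inverting the coding table, which is possible because word strings and code identifiers are in one-to-one correspondence. The name `decode` and the integer identifiers are assumptions; the sketch returns the word strings as a list and leaves open how they are joined back into text, since the description does not specify this.

```python
def decode(sequence, coding_table):
    """Replace each code identifier with its standard word string."""
    # The table is one-to-one, so it can be inverted without collisions.
    inverse = {code: word for word, code in coding_table.items()}
    return [inverse[code] for code in sequence]


table = {"'s": 1, "word": 2, "carry out": 3, "suitably": 4, "describe": 5, "adopt": 6}
print(decode([6, 4, 1, 2], table))
```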

The embodiment of the present invention also provides a file compression and transmission method, comprising:

obtaining all or part of the text in a file to be compressed, forming a text to be encoded;

sending the first coded sequence to a network storage server.

When only part of the text in the file to be compressed is obtained, the above steps are of course repeated until all of the text in the file to be compressed has been processed.
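The repeat-until-done behaviour can be sketched as a loop that encodes and hands off each part as soon as it is available. Everything named here is hypothetical: `transmit_in_parts`, the `segment` and `send` callables, and the whitespace-based segmentation and in-memory outbox used in the example in place of a real segmenter and network layer.

```python
def transmit_in_parts(parts, coding_table, segment, send):
    """Encode and send each part of the text without buffering the whole file."""
    sent = 0
    for part in parts:
        tokens = segment(part)                        # word segmentation
        sequence = [coding_table[t] for t in tokens]  # first coded sequence
        send(sequence)                                # hand off for transmission
        sent += 1
    return sent


table = {"adopt": 6, "suitably": 4, "word": 2}
outbox = []  # stands in for the network storage server
count = transmit_in_parts(iter(["adopt suitably", "word"]), table, str.split, outbox.append)
print(count, outbox)
```

Because the coding table exists before any text is read, each part can be encoded independently, which is what allows the parts to be consumed from an iterator rather than a fully loaded file.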

The corresponding file compression and transmission device comprises:

Compared with prior-art schemes that transmit to a network storage server after compression, the same coding table is stored on the network storage server in advance, so the compressed coded sequence does not include the coding table, which reduces the network load. Moreover, this coding table applies to all compressed texts, so storage space is saved when the network stores many texts.

Meanwhile, because a coding table obtained in advance is used, the text to be compressed can be divided into several parts at the sending end and processed separately, with each part transmitted as soon as it has been processed, which reduces the demand for temporary storage.

The above is only the preferred embodiment of the present invention. It should be pointed out that those skilled in the art can make further improvements and modifications without departing from the principles of the invention, and these improvements and modifications shall also be considered to fall within the protection scope of the present invention.

Claims

1. a compressing file device, is characterized in that, comprising:

First preserves module, and for preserving a coding schedule, described coding schedule has recorded the corresponding relation between standard word string and code identification, and described in each, standard word string has unique described code identification; In described coding schedule, corresponding to standard word string described in each, be provided with search field, described search field is for log file sign, and the indicated file of file identification recording in described search field comprises the described standard word string that described search field is corresponding;

2. compressing file device according to claim 1, is characterized in that, described code identification is with numeral, and the frequency that described standard word string occurs in forming the text of corpus is higher, less for representing the numeral of code identification of described word string.

3. The file compression device according to claim 2, characterized by further comprising:

4. The file compression device according to claim 1, 2 or 3, characterized by further comprising:

5. The file compression device according to claim 1, 2 or 3, characterized by further comprising:

6. A file compression method, characterized by comprising:

according to the correspondence between standard character strings and coding identifiers recorded in a pre-stored coding table, replacing the corresponding at least one character string to be coded with the coding identifier of the standard character string, to obtain a first coding sequence corresponding to the text to be coded, wherein each standard character string has a unique coding identifier; in the coding table, a search field is provided for each standard character string, the search field is used to record file identifiers, and the file indicated by a file identifier recorded in a search field contains the standard character string corresponding to that search field.
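As a non-limiting illustration, the replacement step and the search field recited above can be sketched as follows. The class `Entry`, the functions `encode_file` and `files_containing`, and the file identifier `"doc-1"` are hypothetical names chosen for this example, and the text is assumed to have already been segmented into character strings to be coded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entry:
    """One row of the coding table: a coding identifier plus its search field."""
    code_id: int
    search_field: Set[str] = field(default_factory=set)  # file identifiers


def encode_file(file_id: str, strings: List[str], table: Dict[str, Entry]) -> List[int]:
    """Replace each string with its coding identifier and record the file identifier."""
    sequence: List[int] = []
    for s in strings:
        entry = table[s]
        sequence.append(entry.code_id)
        # The search field now indicates that this file contains the string.
        entry.search_field.add(file_id)
    return sequence


def files_containing(s: str, table: Dict[str, Entry]) -> Set[str]:
    """Look up which files contain a standard string without decompressing them."""
    return set(table[s].search_field)


table = {"file": Entry(0), "server": Entry(1), "table": Entry(2)}
first_coding_sequence = encode_file("doc-1", ["file", "server", "file"], table)
print(first_coding_sequence, files_containing("file", table))
```

Keeping the file identifiers alongside each standard character string lets a search consult the coding table directly, so no compressed file needs to be decoded in order to find the files that contain a given string.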

7. The method according to claim 6, characterized in that the coding identifier is represented by a number, and the more frequently a standard character string occurs in the texts forming a corpus, the smaller the number representing the coding identifier of that character string.

8. The method according to claim 6 or 7, characterized in that, in the coding table, a search field is provided for each standard character string, the search field is used to record file identifiers, the file indicated by a file identifier recorded in a search field contains the standard character string corresponding to that search field, and the method further comprises:

9. The method according to claim 6 or 7, characterized by further comprising:

using a preset numeric coding compression algorithm, compression-encoding each of the coding identifiers in the first coding sequence that correspond to the at least one character string to be coded, to obtain a second coding sequence corresponding to the text to be coded.
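The claim above does not name a particular numeric coding compression algorithm. Purely as an assumed example, the following sketch uses a base-128 variable-length byte code, in which small identifiers occupy a single byte; the function names `varint_encode` and `varint_decode` are hypothetical.

```python
from typing import Iterable, List


def varint_encode(ids: Iterable[int]) -> bytes:
    """Encode non-negative integers, 7 payload bits per byte, low bits first."""
    out = bytearray()
    for n in ids:
        if n < 0:
            raise ValueError("coding identifiers must be non-negative")
        while True:
            byte = n & 0x7F
            n >>= 7
            if n:
                out.append(byte | 0x80)  # high bit set: more bytes follow
            else:
                out.append(byte)         # high bit clear: last byte of this number
                break
    return bytes(out)


def varint_decode(data: bytes) -> List[int]:
    """Inverse of varint_encode."""
    ids: List[int] = []
    current, shift = 0, 0
    for b in data:
        current |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            ids.append(current)
            current, shift = 0, 0
    return ids


second_coding_sequence = varint_encode([0, 1, 300])
print(second_coding_sequence.hex())
```

With such a code, identifiers below 128 take one byte each, so assigning the smallest numbers to the most frequent standard character strings directly shortens the second coding sequence.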

10. A file decompression device, characterized by comprising:

a third acquiring module, configured to acquire a first sequence to be decoded;

a first decoding module, configured to, according to the correspondence between standard character strings and coding identifiers recorded in a pre-stored coding table, replace the corresponding coding identifiers in the first sequence to be decoded with the standard character strings, to obtain the text corresponding to the first sequence to be decoded, wherein each standard character string has a unique coding identifier; in the coding table, a search field is provided for each standard character string, the search field is used to record file identifiers, and the file indicated by a file identifier recorded in a search field contains the standard character string corresponding to that search field.

11. The file decompression device according to claim 10, characterized in that the coding identifier is represented by a number, and the more frequently a standard character string occurs in the texts forming a corpus, the smaller the number representing the coding identifier of that character string.

12. The file decompression device according to claim 11, characterized by further comprising:

13. The file decompression device according to claim 10, 11 or 12, characterized by further comprising:

14. A file decompression method, characterized by comprising:

acquiring a first sequence to be decoded;

according to the correspondence between standard character strings and coding identifiers recorded in a pre-stored coding table, replacing the corresponding coding identifiers in the first sequence to be decoded with the standard character strings, to obtain the text corresponding to the first sequence to be decoded, wherein each standard character string has a unique coding identifier; in the coding table, a search field is provided for each standard character string, the search field is used to record file identifiers, and the file indicated by a file identifier recorded in a search field contains the standard character string corresponding to that search field.
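By way of example only, the decoding step recited above amounts to inverting the coding table and substituting each identifier in turn. The function name `decode_sequence` and the sample table are assumptions made for this sketch.

```python
from typing import Dict, List


def decode_sequence(sequence: List[int], table: Dict[str, int]) -> str:
    """Replace each coding identifier with its standard character string."""
    inverse = {code_id: s for s, code_id in table.items()}
    # Decoding is only unambiguous if every identifier is unique.
    if len(inverse) != len(table):
        raise ValueError("coding identifiers must be unique")
    return "".join(inverse[code_id] for code_id in sequence)


# Hypothetical pre-stored coding table shared by the compressor and the decompressor.
coding_table = {"the": 0, "file": 1, "server": 2, " ": 3}
text = decode_sequence([0, 3, 1], coding_table)
print(text)
```

The uniqueness check reflects the requirement that each standard character string has a unique coding identifier; without it, two strings could map to the same number and the original text could not be recovered.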

15. The method according to claim 14, characterized in that the coding identifier is represented by a number, and the more frequently a standard character string occurs in the texts forming a corpus, the smaller the number representing the coding identifier of that character string.

16. The method according to claim 14 or 15, characterized by further comprising:

17. A compressed file transmission method, characterized by comprising:

according to the correspondence between standard character strings and coding identifiers recorded in a pre-stored coding table, replacing the corresponding at least one character string to be coded with the coding identifier of the standard character string, to obtain a first coding sequence corresponding to the text to be coded, wherein each standard character string has a unique coding identifier; in the coding table, a search field is provided for each standard character string, the search field is used to record file identifiers, and the file indicated by a file identifier recorded in a search field contains the standard character string corresponding to that search field;

sending the first coding sequence to a network storage server.

18. The method according to claim 17, characterized in that, when part of the text in the file to be compressed is acquired, the method further comprises:

19. A compressed file transmission device, characterized by comprising: