CN115801902A - Compression method of network access request data - Google Patents

Info

Publication number
CN115801902A
Authority
CN
China
Prior art keywords
target
character
compression
dictionary
length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310084879.1A
Other languages
Chinese (zh)
Other versions
CN115801902B (en
Inventor
米存照
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Telixin Electronics Technology Co ltd
Original Assignee
Beijing Telixin Electronics Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Telixin Electronics Technology Co ltd filed Critical Beijing Telixin Electronics Technology Co ltd
Priority to CN202310084879.1A priority Critical patent/CN115801902B/en
Publication of CN115801902A publication Critical patent/CN115801902A/en
Application granted granted Critical
Publication of CN115801902B publication Critical patent/CN115801902B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the field of data compression processing, and in particular to a compression method of network access request data, which comprises the following steps: acquiring each high-frequency character in the network access request data; obtaining the compressibility degree and the compressed data amount of a target character under a target entry length according to the dictionary entries corresponding to the target character under that length and the occurrence probability of each dictionary entry, and from these obtaining the corresponding compression benefit; obtaining the compression loss amount of the target character under the target entry length according to the compression benefits corresponding to the adjacent dictionary entry lengths of the target character; obtaining the optimal entry length of the target character according to its compression benefit and compression loss amount under the target entry length, and in the same way obtaining the optimal entry length of each high-frequency character; obtaining a compression dictionary according to the optimal entry length of each high-frequency character; and compressing the network access request data according to the compression dictionary. The invention maintains a good compression effect in the initial stage of establishing the compression dictionary.

Description

Compression method of network access request data
Technical Field
The invention relates to the field of data compression processing, in particular to a compression method of network access request data.
Background
With the development of science and technology, demand for network data exchange across industries keeps growing, and every data exchange requires the transmission of network access request data. To improve the efficiency of data exchange, network access request data generally needs to be compressed. Network access request data is text data with a small data volume and must be compressed losslessly; among existing compression algorithms, the LZW compression algorithm is the one most frequently used for lossless compression of text data.
When the LZW compression algorithm is used to compress network access request data, the compression dictionary is built up starting from the single characters of the data to be compressed. Because the overall data volume of network access request data is small, many characters cannot be compressed while the compression dictionary is still incomplete, so the data cannot reach the highest achievable amount of compression. The compressed network access request data therefore remains too large, and the cost of transmitting it is higher.
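The early-stage behaviour described above can be reproduced with a minimal sketch of classic LZW. The function name `lzw_compress` is illustrative, and seeding the dictionary with only the characters that occur in the input (rather than a full byte alphabet) is an assumption made to keep the example small.

```python
def lzw_compress(text):
    """Classic LZW: the dictionary initially holds single characters only."""
    # Assumption: the initial alphabet is the set of characters in the input.
    dictionary = {ch: code for code, ch in enumerate(sorted(set(text)))}
    codes, current = [], ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the current match while it is a known entry.
            current = candidate
        else:
            # Emit the longest known match and learn one longer entry.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, dictionary


codes, dictionary = lzw_compress("ababab")
print(codes)             # [0, 1, 2, 2]
print(dictionary["ab"])  # 2
```

The first `ab` costs two codes because the entry `ab` does not exist until that occurrence has been read; on short inputs a large share of the data is consumed this way.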
Disclosure of Invention
The invention provides a compression method of network access request data, which aims to solve the existing problems.
The compression method of the network access request data adopts the following technical scheme:
one embodiment of the invention provides a compression method of network access request data, which comprises the following steps:
acquiring network access request data, and acquiring each high-frequency character in the acquired network access request data;
taking any high-frequency character as a target character and any dictionary entry length as a target entry length, and acquiring all character combinations corresponding to the target character under the target entry length to obtain the dictionary entries and the occurrence probability of each dictionary entry, wherein each dictionary entry corresponds to one or more identical character combinations; obtaining the compressibility degree of the target character under the target entry length according to the number of dictionary entries obtained and the occurrence probability of each dictionary entry; obtaining the compressed data amount of the target character under the target entry length according to all the character combinations corresponding to the target character under the target entry length; obtaining the compression benefit of the target character under the target entry length according to the compressibility degree and the compressed data amount obtained;
obtaining the compression loss amount of the target character under the target entry length according to the differences in compression benefit between the adjacent dictionary entry lengths of the target character; obtaining the optimal entry length of the target character according to its compression benefit and compression loss amount under the target entry length; obtaining the optimal entry length of each high-frequency character by taking each high-frequency character in turn as the target character;
obtaining a compression dictionary according to the optimal entry length of each high-frequency character; and compressing the network access request data according to the compression dictionary.
Preferably, the method for acquiring each high-frequency character comprises:
acquiring the occurrence probability of each single character in the network access request data, taking the mean of the occurrence probabilities obtained as the average occurrence probability, and marking each single character whose occurrence probability is greater than the average occurrence probability as a high-frequency character.
Preferably, the method for acquiring the dictionary entries and the occurrence probability of each dictionary entry comprises:
taking each position of the target character in the network access request data as a starting position, acquiring all character combinations formed by the consecutive characters of the target entry length starting from each starting position, treating identical character combinations as one dictionary entry so that the distinct character combinations give the dictionary entries, and taking the proportion of all the character combinations that correspond to each dictionary entry as the occurrence probability of that dictionary entry.
Preferably, the method for acquiring the compressibility degree of the target character under the target entry length comprises:
obtaining the compression difficulty according to the occurrence probabilities of the dictionary entries obtained for the target character under the target entry length; and obtaining the compressibility degree of the target character under the target entry length according to the number of dictionary entries and the compression difficulty obtained.
Preferably, the step of obtaining the compressed data amount of the target character under the target entry length comprises:
calculating the accumulated sum of the numbers of character combinations corresponding to the different dictionary entries of the target character under the target entry length, and taking the product of the accumulated sum and the dictionary entry length as the compressed data amount of the target character under the target entry length.
Preferably, the method for obtaining the compression benefit of the target character under the target entry length comprises: taking the product of the compressibility degree and the compressed data amount of the target character under the target entry length as the compression benefit of the target character under the target entry length.
Preferably, the method for obtaining the compression loss amount of each high-frequency character under each dictionary entry length comprises:
for each dictionary entry length smaller than the target entry length, calculating the difference between its compression benefit and the compression benefit corresponding to the adjacent dictionary entry length, and taking the accumulated sum of the differences obtained as the compression loss amount of the target character when the dictionary entry length is the target entry length.
Preferably, the method for obtaining the optimal entry length of the target character comprises:
when the compression benefit of the target character under the target entry length is greater than the compression loss amount, taking the target entry length as the optimal entry length of the target character; otherwise, adding one to the target entry length to obtain a new target entry length and repeating the comparison with the compression benefit and the compression loss amount of the target character under the new target entry length, until the optimal entry length of the target character is obtained.
The invention has the following beneficial effects. The compressibility degree and the compressed data amount of the high-frequency characters in the network access request data are calculated under different dictionary entry lengths to obtain the compression benefit of each high-frequency character under each dictionary entry length. Given the limitation that a shorter character combination cannot be compressed by a longer character combination, each dictionary entry length of the same high-frequency character is further evaluated according to its compression loss amount under the different dictionary entry lengths, which gives the optimal entry length of each high-frequency character. The compression dictionary for the network access request data is then established by combining the optimal entry lengths of the high-frequency characters with a traditional compression algorithm. As a result, while the compression dictionary is still incomplete, i.e., in the initial stage of its establishment, longer byte data based on the high-frequency characters can already be compressed. Compared with the existing compression algorithm, a better compression effect is achieved in the initial stage of establishing the compression dictionary, which makes the method more suitable for compressing text data with a small data volume such as network access request data.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating steps of a method for compressing network access request data according to the present invention.
Detailed Description
To further illustrate the technical means and effects of the present invention for achieving the predetermined objects, the following detailed description of a method for compressing network access request data according to the present invention, its specific implementation, structure, features and effects will be given in conjunction with the accompanying drawings and the preferred embodiments. In the following description, the different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the method for compressing network access request data according to the present invention in detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart illustrating steps of a method for compressing network access request data according to an embodiment of the present invention is shown, where the method includes the following steps:
Step S001: acquire the network access request data, and acquire each high-frequency character in the acquired network access request data.
The network access request data to be compressed is acquired. Network access request data refers to the string of text data that a client sends to a target terminal when it needs to exchange access data with that terminal. For example, when a user needs to access a website, the user's client sends an HTTP request to the server; the request comprises a request line, request headers, a blank line and request data, and all the data contained in the request is the network access request data to be collected in this embodiment.
In the LZW compression algorithm, the compression dictionary is always built up starting from entries one character long, so while the dictionary is incomplete, part of the data, such as long-byte data, cannot be compressed. This embodiment therefore screens all the network access request data for characters that occur at high frequency and calculates a dictionary entry length for each high-frequency character. When the compression dictionary is built, the long-byte character data corresponding to each high-frequency character is entered according to that dictionary entry length, so that more long-byte data can be compressed even before the compression dictionary is complete.
Because the text of the network access request data to be compressed covers a large space of possible character data, and different text character data are not highly repetitive, calculating dictionary entry lengths for all the text character data would be computationally heavy and time-consuming. To reduce the amount of calculation and the overall compression time, this embodiment screens the high-frequency characters in the network access request data to be compressed and then compresses the data according to the screened high-frequency characters. Note that a repeated entry always occurs on the basis of some single character; for example, the repeated entry abc occurs on the basis of the single character a. If the single character a is a high-frequency character in the network access request data, the probability that repeated entries based on a occur is also high. In this embodiment, high-frequency character screening is therefore performed on single characters only, that is, each high-frequency character is a single character. For example, the repeated entry abca is composed of two characters a, one character b and one character c, and a, b and c are each single characters in this embodiment.
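As a small illustration of the kind of text being compressed, the sketch below assembles an HTTP/1.1 request from a request line, headers, a blank line and a body. The helper name `build_http_request`, the path and the host `example.com` are hypothetical values chosen for the example, not taken from the embodiment.

```python
def build_http_request(method, path, headers, body=""):
    """Join request line, headers, blank line and body into one text string."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The empty string between headers and body produces the blank line.
    return "\r\n".join([request_line, *header_lines, "", body])


# Hypothetical request; every character of this string would be compressed.
request_text = build_http_request(
    "GET", "/index.html", {"Host": "example.com", "Accept": "text/html"}
)
print(repr(request_text))
```

Requests like this are short and dominated by a few recurring characters and tokens, which is the situation the method targets.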
First, the average occurrence frequency $\bar{P}$ of the single characters in the acquired network access request data is calculated as follows:

$$\bar{P} = \frac{1}{M}\sum_{m=1}^{M} P_m$$

where $M$ is the total number of distinct single characters in the network access request data, and $P_m$ is the occurrence frequency of the $m$-th single character in all the network access request data, i.e., the ratio of the number of occurrences of the $m$-th single character to the total number of characters in the network access request data.
The high-frequency characters in the network access request data are then screened according to the average occurrence frequency. The occurrence frequency of each distinct single character in the network access request data is compared with the average occurrence frequency: if the occurrence frequency of the m-th single character is greater than the average occurrence frequency, the m-th single character is regarded as a high-frequency character; otherwise it is not. Each single character in the network access request data is judged in turn to obtain the high-frequency characters.
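The screening rule above can be sketched in a few lines. The function name `high_frequency_chars` is illustrative, and the sample string is the fragment abcabacad used later in the description.

```python
from collections import Counter


def high_frequency_chars(text):
    """Return the single characters whose frequency exceeds the mean frequency."""
    if not text:
        return []
    counts = Counter(text)
    total = len(text)
    # Occurrence frequency of each distinct single character.
    freqs = {ch: n / total for ch, n in counts.items()}
    # Mean over the M distinct characters (this always equals 1 / M).
    mean_freq = sum(freqs.values()) / len(freqs)
    return sorted(ch for ch, f in freqs.items() if f > mean_freq)


print(high_frequency_chars("abcabacad"))  # ['a']
```

In abcabacad the frequencies are 4/9, 2/9, 2/9 and 1/9 for a, b, c and d, the mean is 1/4, and only a exceeds it.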
Step S002: acquire the dictionary entries corresponding to the target character under the target entry length and the occurrence probability of each dictionary entry, obtain the compressibility degree and the compressed data amount of the target character under the target entry length, and from these obtain the compression benefit of the target character under the target entry length.
Because the compression dictionary of the existing LZW compression algorithm is incomplete in the early stage, a large amount of data in the network access request data cannot be compressed. For example, suppose a data fragment in the network access request data is ababab. In the compression dictionary of the LZW compression algorithm, a is numbered, b is numbered, and ab is numbered only after its first occurrence has been read, so ab can be replaced by its number only in later occurrences; the earliest repetitions serve only to establish the compression dictionary.
Therefore, in this embodiment the compression benefit and the compression loss amount are calculated for the different dictionary entry lengths of the screened high-frequency characters, and the dictionary entry length of each high-frequency character is then determined from its compression benefit and compression loss amount. In this way the dictionary entry lengths of the high-frequency characters are extended in the initial stage of establishing the compression dictionary, which improves the compression ratio of the network access request data.
In this embodiment the n-th high-frequency character is taken as the target character and the dictionary entry length i as the target entry length. The compression benefit of the n-th high-frequency character at dictionary entry length i is defined as follows: when the n-th high-frequency character is used to establish the compression dictionary, each position of the character in the network access request data is taken as a starting position, and all character combinations formed by the i consecutive characters starting from each starting position are obtained; the benefit obtained when the network access request data is compressed according to the distinct character combinations is the compression benefit. For example, when the network access request data is abcabacad, the high-frequency character is a, and the dictionary entry length is i = 2, the character combinations of 2 consecutive characters starting with a are ab, ab, ac and ad. In this embodiment identical character combinations are grouped into one kind of character combination, and each kind is called a dictionary entry, so that one dictionary entry corresponds to one or more identical character combinations. Here the two occurrences of ab form one kind, the single ac forms one kind, and the single ad forms one kind, giving three dictionary entries, J = 3. The total number of character combinations corresponding to the first dictionary entry ab is 2, that of the second dictionary entry ac is 1, and that of the third dictionary entry ad is 1.
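The abcabacad example can be checked with a short sketch. The function name `dictionary_entries` is illustrative, and skipping starting positions with fewer than `length` characters remaining is an assumption, since the description does not say how the end of the data is handled.

```python
from collections import Counter


def dictionary_entries(text, target, length):
    """Count the character combinations of `length` that start at `target`."""
    combos = [
        text[pos:pos + length]
        for pos, ch in enumerate(text)
        # Assumption: positions too close to the end of the text are skipped.
        if ch == target and pos + length <= len(text)
    ]
    counts = Counter(combos)  # one dictionary entry per distinct combination
    total = len(combos)
    probs = {entry: n / total for entry, n in counts.items()}
    return counts, probs


counts, probs = dictionary_entries("abcabacad", "a", 2)
print(dict(counts))  # {'ab': 2, 'ac': 1, 'ad': 1}
print(probs["ab"])   # 0.5
```

The number of keys in `counts` is the J of the description, and the values are the per-entry totals of character combinations.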
Taking the $n$-th high-frequency character as the target character and the dictionary entry length $i$ as the target entry length, the corresponding compression benefit $Y_{n,i}$ is calculated as follows:

$$Y_{n,i} = D_{n,i} \times i \sum_{j=1}^{J} c_{n,i,j}$$

$$H_{n,i} = -\sum_{j=1}^{J} p_{n,i,j} \log_2 p_{n,i,j}$$

In the formulas, $D_{n,i}$ is the compressibility degree, formed from the natural constant $e$, the number of dictionary entries $J$ and the information entropy $H_{n,i}$ so that it decreases as $J$ or $H_{n,i}$ increases; $\log_2(\cdot)$ is the base-2 logarithm; $H_{n,i}$ is the information entropy of the $n$-th high-frequency character at dictionary entry length $i$; $p_{n,i,j}$ is the occurrence probability of the $j$-th dictionary entry of the $n$-th high-frequency character at dictionary entry length $i$; $J$ is the number of dictionary entries of the $n$-th high-frequency character at dictionary entry length $i$; and $c_{n,i,j}$ is the total number of character combinations corresponding to the $j$-th dictionary entry of the $n$-th high-frequency character at dictionary entry length $i$.

When $i$ is used as the dictionary entry length of the $n$-th high-frequency character, the more kinds of character combinations of length $i$ there are, i.e., the larger the number $J$ of dictionary entries, the more complex the character combinations are and the more dictionary entries must be established for the character, so the overall amount of calculation grows and a good compression effect is harder to obtain. When a character combination is repeated many times, the information entropy of the character combinations is small and the network access request data is less difficult to compress. This embodiment therefore uses the information entropy $H_{n,i}$ of the dictionary entries to represent the compression difficulty when $i$ is the dictionary entry length of the $n$-th high-frequency character: the larger the information entropy, the greater the compression difficulty, and vice versa.

Therefore, to reduce the calculation needed for dictionary creation while ensuring compression efficiency, the compressibility degree is largest when both the number $J$ of dictionary entries obtained with $i$ as the dictionary entry length and their information entropy are small; that is, the fewer the dictionary entries and the lower the complexity of their character combinations, the larger the compressibility degree. This embodiment therefore uses $D_{n,i}$ to represent the compressibility degree of the network access request data after the $n$-th high-frequency character is entered with $i$ as the dictionary entry length.

The compressibility degree evaluates the network access request data as a whole, but the compression effect also depends on the length of the character combinations behind each dictionary entry. For example, replacing a dictionary entry made of many characters with one index compresses more data than replacing a dictionary entry made of few characters with one index, and so gives a better compression effect. The compressed data amount at different dictionary entry lengths therefore has to be evaluated as well to obtain a more accurate compression benefit. The term $i \sum_{j=1}^{J} c_{n,i,j}$ represents the compressed data amount, i.e., the amount of data in the network access request data that can be compressed after the $n$-th high-frequency character is entered with $i$ as the dictionary entry length; the larger its value, the more data can be compressed with $i$ as the dictionary entry length of the $n$-th high-frequency character. The compression benefit of the $n$-th high-frequency character at dictionary entry length $i$ is obtained from the compressibility degree and the compressed data amount: in this embodiment, the product of the compressibility degree and the compressed data amount of the target character under the target entry length is taken as the compression benefit of the target character under the target entry length.
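The quantities in this step can be sketched numerically for the fragment abcabacad. The entropy and the compressed data amount follow the definitions in the text. The exact expression for the compressibility degree is not recoverable from this text, so the form `exp(-J * H)` used below is purely an assumption that satisfies the stated properties (it uses the natural constant and decreases as J or the entropy grows). The function name is illustrative.

```python
import math
from collections import Counter


def compression_benefit(text, target, length):
    """Return (entropy, degree, amount, benefit) for one character and length."""
    combos = [
        text[pos:pos + length]
        for pos, ch in enumerate(text)
        if ch == target and pos + length <= len(text)
    ]
    counts = Counter(combos)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0, 0, 0.0
    # Information entropy of the dictionary entries (base-2 logarithm).
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # ASSUMPTION: compressibility degree modelled as exp(-J * H).
    degree = math.exp(-len(counts) * entropy)
    # Compressed data amount: entry length times total number of combinations.
    amount = length * total
    return entropy, degree, amount, degree * amount


entropy, degree, amount, benefit = compression_benefit("abcabacad", "a", 2)
print(entropy, amount)  # 1.5 8
print(benefit)
```

For the target a at entry length 2, the probabilities 0.5, 0.25 and 0.25 give an entropy of 1.5 bits, and the four combinations of length 2 give a compressed data amount of 8; the benefit value itself depends on the assumed form of the compressibility degree.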
Step S003: obtain the compression loss amount of the target character under the target entry length according to the compression benefits corresponding to the adjacent dictionary entry lengths of the target character; obtain the optimal entry length of the target character according to its compression benefit and compression loss amount under the target entry length, and in the same way obtain the optimal entry length of each high-frequency character.
In the compression dictionary of the LZW algorithm, a longer character combination cannot be used to compress a shorter one. That is, when the dictionary entry length of the $n$-th high-frequency character is $i$, the character combinations of the $n$-th high-frequency character whose length is smaller than $i$ cannot be compressed, and dictionary entries for character combinations of those lengths would still have to be established. For example, ab cannot be compressed by the character combination abc. The compression loss amount therefore needs to be calculated in order to obtain the optimal entry length of the $n$-th high-frequency character.

Because the compression loss is relative, and the compression benefits obtained for the $n$-th high-frequency character differ between dictionary entry lengths, this embodiment analyses the differences among the compression benefits of all dictionary entry lengths smaller than $i$. For each such dictionary entry length $k$, the compression loss is obtained from the difference between its compression benefit and that of the adjacent dictionary entry length $k'$. All the compression losses obtained are then summed, and the sum represents the overall compression loss amount of the $n$-th high-frequency character with $i$ as the dictionary entry length. When the dictionary entry length of the $n$-th high-frequency character is $i$, the corresponding compression loss amount $L_{n,i}$ is calculated as follows:

$$L_{n,i} = \sum_{k < i} \Delta_{n,k}$$

where $\Delta_{n,k}$ is the difference between $Y_{n,k}$ and $Y_{n,k'}$; $Y_{n,k}$ is the compression benefit of the $n$-th high-frequency character with $k$ as the dictionary entry length, and $Y_{n,k'}$ is the compression benefit of the $n$-th high-frequency character with the adjacent length $k'$ as the dictionary entry length.
Next, the optimal entry length is determined from the compression benefit and the compression loss amount of the n-th high-frequency character. Starting with the dictionary entry length i equal to the base length, which is set to 3 in this embodiment, the compression benefit is compared with the compression loss amount. If the compression benefit is greater than the compression loss amount, the current dictionary entry length is the optimal entry length of the n-th high-frequency character. If the compression benefit is less than or equal to the compression loss amount, i is increased by one to obtain a new dictionary entry length and the comparison is repeated, until the compression benefit is greater than the compression loss amount; the resulting length is the optimal entry length of the target character, i.e., of the n-th high-frequency character. The method is repeated with each high-frequency character as the target character to obtain the optimal entry length of each high-frequency character.
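The loss calculation and the search for the optimal entry length can be sketched as follows. Several points are assumptions rather than details confirmed by the text: the per-length difference is taken as the absolute difference between the benefits at lengths k and k - 1 (a signed sum would telescope to the difference of the two end values), the search stops at the largest length for which a benefit is supplied, and the benefit values in the example are made-up numbers for illustration only.

```python
def compression_loss(benefits, length):
    """Sum of benefit differences between adjacent entry lengths below `length`.

    ASSUMPTION: each term is |Y_k - Y_(k-1)| for 2 <= k < length.
    """
    return sum(abs(benefits[k] - benefits[k - 1]) for k in range(2, length))


def optimal_entry_length(benefits, base_length=3):
    """Smallest length >= base_length whose benefit exceeds its loss."""
    # ASSUMPTION: the search is capped at the largest length with a known benefit.
    for length in range(base_length, max(benefits) + 1):
        if benefits[length] > compression_loss(benefits, length):
            return length
    return None  # no candidate length qualified within the cap


# Hypothetical benefits per entry length for one high-frequency character.
benefits = {1: 4.0, 2: 1.0, 3: 2.5, 4: 6.0}
print(compression_loss(benefits, 3))   # 3.0
print(compression_loss(benefits, 4))   # 4.5
print(optimal_entry_length(benefits))  # 4
```

With these numbers, length 3 is rejected because its benefit 2.5 does not exceed the loss 3.0, and length 4 is accepted because 6.0 exceeds 4.5.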
Step S004: obtaining a compression dictionary according to the optimal input length of each high-frequency character; and compressing the network access request data according to the compression dictionary.
Since the compression dictionary of the LZW algorithm is established for the entire network access request data, there are two cases: establishing a compression dictionary of the high-frequency characters and establishing a compression dictionary of the rest non-high-frequency characters, wherein the establishment processes of the compression dictionaries corresponding to two different conditions are respectively as follows: for the rest non-high frequency characters, the establishment is carried out by utilizing the establishment mode of the traditional LZW algorithm, namely, the establishment of a gradually increased compression dictionary is carried out by starting with a single character; and for the high-frequency characters, when the establishment of the compression dictionary is carried out, the traditional establishment of the compression dictionary is carried out, and the establishment of the compression dictionary based on the optimal input length is carried out for each character.
For example, consider the network access request data fragment abcabcabcdabcabbc, in which the character a is a high-frequency character, and assume that the optimal entry length of the high-frequency character a is 3. When this fragment is compressed with the LZW algorithm, the traditional establishment method builds the dictionary for a by starting from the single character a and growing gradually. In the present method this gradual growth from single characters is retained, and at the same time a dictionary entry of length 3 that begins with the high-frequency character a, namely abc, is established.
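The effect of such a pre-established entry can be sketched with a small LZW-style coder in Python. This is an illustrative sketch under assumptions, not the patented implementation: the function names, the `seeds` parameter, and the way the alphabet is passed in are hypothetical. It also assumes a longest-match lookup at each position, because a seeded entry such as abc has no prefix ab in the initial dictionary and would not be reached by extending a match one character at a time. The decoder must be given the same alphabet and seeds as the encoder.

```python
def _initial_entries(alphabet, seeds):
    """Single characters first, then seeded multi-character entries."""
    entries = []
    for s in sorted(set(alphabet)) + list(seeds):
        if s not in entries:
            entries.append(s)
    return entries


def lzw_encode(data, alphabet, seeds=()):
    table = {s: i for i, s in enumerate(_initial_entries(alphabet, seeds))}
    max_len = max(map(len, table), default=1)
    codes, pos = [], 0
    while pos < len(data):
        # Longest dictionary entry that matches the input at this position.
        for length in range(min(max_len, len(data) - pos), 0, -1):
            if data[pos:pos + length] in table:
                break
        else:
            raise ValueError(f"character {data[pos]!r} not in alphabet")
        match = data[pos:pos + length]
        codes.append(table[match])
        pos += length
        if pos < len(data):
            # Traditional growth: matched string plus the next character.
            new_entry = match + data[pos]
            table[new_entry] = len(table)
            max_len = max(max_len, len(new_entry))
    return codes


def lzw_decode(codes, alphabet, seeds=()):
    table = _initial_entries(alphabet, seeds)  # index = code
    if not codes:
        return ""
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        elif code == len(table):
            # Entry the encoder created in the step just before using it.
            entry = prev + prev[0]
        else:
            raise ValueError("invalid code")
        table.append(prev + entry[0])
        out.append(entry)
        prev = entry
    return "".join(out)


fragment = "abcabcabcdabcabbc"
plain_codes = lzw_encode(fragment, "abcd")
seeded_codes = lzw_encode(fragment, "abcd", seeds=["abc"])
```

On this fragment the seeded encoder emits 7 codes against 10 for the unseeded one, since abc can be matched from the first position instead of only after the dictionary has grown to contain it.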
And compressing the network access request data according to the obtained compression dictionary.
Through the steps, the compression of the network access request data is completed.
In this embodiment, the compression benefit of each high-frequency character at different dictionary entry lengths is obtained by calculating the compressibility degree and the compressed data amount of that character in the network access request data at each dictionary entry length. Because a shorter character combination cannot be compressed by a longer character combination, each dictionary entry length of the same high-frequency character is further evaluated by its compression loss amount, which gives the optimal entry length of each high-frequency character. The compression dictionary for the network access request data is then established by combining the optimal entry lengths of the high-frequency characters with the traditional compression algorithm. As a result, even when the compression dictionary is not yet completely established, that is, in the initial stage of its establishment, longer byte sequences can still be compressed for the high-frequency characters in the network access request data. Compared with existing compression algorithms, a better compression effect is achieved in the initial stage of establishing the compression dictionary, which makes the method better suited to compressing small-volume text data such as network access request data.
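The first two stages of this pipeline can be sketched in Python using the rules stated in the method: a character is high-frequency when its occurrence probability exceeds the average occurrence probability of all single characters, and the dictionary entries of a target character are the distinct character combinations of the given length that start at each of its positions. The function names are hypothetical, and skipping start positions too close to the end of the data to yield a full-length combination is an assumption of this sketch.

```python
from collections import Counter


def high_frequency_chars(data):
    """Characters whose occurrence probability exceeds the average."""
    if not data:
        return set()
    counts = Counter(data)
    probs = {ch: n / len(data) for ch, n in counts.items()}
    average = sum(probs.values()) / len(probs)
    return {ch for ch, p in probs.items() if p > average}


def entry_probabilities(data, target, length):
    """Dictionary entries for `target` at `length`, with probabilities.

    Each occurrence of `target` is a start position; the `length`
    consecutive characters from there form one character combination.
    Identical combinations count as a single dictionary entry.
    """
    combos = [
        data[i:i + length]
        for i, ch in enumerate(data)
        if ch == target and i + length <= len(data)  # assumed: skip short tails
    ]
    counts = Counter(combos)
    return {combo: n / len(combos) for combo, n in counts.items()}


fragment = "abcabcabcdabcabbc"
frequent = high_frequency_chars(fragment)
entries_for_a = entry_probabilities(fragment, "a", 3)
```

For the fragment used in the earlier example, a, b and c come out as high-frequency characters, and the target character a at entry length 3 yields two dictionary entries, abc and abb.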
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (8)

1. A method for compressing network access request data, the method comprising the steps of:
acquiring network access request data, and acquiring each high-frequency character in the acquired network access request data;
taking any high-frequency character as a target character and any dictionary entry length as a target entry length, and acquiring all character combinations corresponding to the target character under the target entry length to obtain the dictionary entries and the occurrence probability of each dictionary entry, wherein each dictionary entry corresponds to one or more identical character combinations; obtaining the compressibility degree of the target character under the target entry length according to the number of dictionary entries obtained and the occurrence probability of each dictionary entry; obtaining the compressed data amount of the target character under the target entry length according to all the character combinations corresponding to the target character under the target entry length; obtaining the compression benefit of the target character under the target entry length according to the obtained compressibility degree and compressed data amount;
obtaining the compression loss amount of the target character under the target entry length according to the differences in compression benefit between adjacent dictionary entry lengths of the target character; obtaining the optimal entry length of the target character according to the compression benefit and the compression loss amount of the target character under the target entry length; obtaining the optimal entry length of each high-frequency character by taking each high-frequency character as the target character in turn;
obtaining a compression dictionary according to the optimal entry length of each high-frequency character; and compressing the network access request data according to the compression dictionary.
2. The method as claimed in claim 1, wherein the method for obtaining each high frequency character comprises:
acquiring the occurrence probability of each single character in the network access request data, taking the average of the obtained occurrence probabilities as the average occurrence probability, and marking each single character whose occurrence probability is greater than the average occurrence probability as a high-frequency character.
3. The method according to claim 1, wherein the dictionary entries and the probability of occurrence of the dictionary entries are obtained by:
taking each position of the target character in the network access request data as a starting position, acquiring from each starting position the character combination formed by a number of consecutive characters equal to the target entry length, treating identical character combinations as one dictionary entry so that the distinct character combinations give the dictionary entries, and taking the probability with which the character combination of each dictionary entry appears among all the character combinations as the occurrence probability of that dictionary entry.
4. The method for compressing network access request data according to claim 1, wherein the method for obtaining the corresponding compressibility degree of the target character under the target entry length comprises:
obtaining the compression difficulty according to the occurrence probability of each dictionary entry obtained for the target character under the target entry length; and obtaining the compressibility degree of the target character under the target entry length according to the number of all dictionary entries and the obtained compression difficulty.
5. The method for compressing network access request data according to claim 1, wherein the step of obtaining the corresponding compressed data amount of the target character under the target entry length comprises:
and calculating the accumulated sum of the numbers of different character combinations corresponding to the target character under the target entry length, and taking the product of the accumulated sum and the entry length of each dictionary as the corresponding compressed data amount of the target character under the target entry length.
6. The method for compressing network access request data according to claim 1, wherein the method for obtaining the compression benefit corresponding to the target character under the target entry length comprises: taking the product of the compressibility degree and the compressed data amount of the target character under the target entry length as the compression benefit of the target character under the target entry length.
7. The method according to claim 1, wherein the method for obtaining the compression loss amount corresponding to each high-frequency character under each dictionary entry length comprises:
for each dictionary entry length smaller than the target entry length, calculating the difference between the compression benefit corresponding to that dictionary entry length and the compression benefit corresponding to the adjacent dictionary entry length, and taking the accumulated sum of the obtained differences as the compression loss amount of the target character when the dictionary entry length is the target entry length.
8. The method for compressing network access request data according to claim 1, wherein the method for obtaining the optimal entry length of the target character comprises:
when the compression benefit corresponding to the target character under the target entry length is greater than the compression loss amount, taking the target entry length as the optimal entry length of the target character; otherwise, adding one to the target entry length to obtain a new target entry length, and repeating the comparison with the compression benefit and the compression loss amount corresponding to the target character under the new target entry length, until the optimal entry length of the target character is obtained.
CN202310084879.1A 2023-02-09 2023-02-09 Compression method of network access request data Active CN115801902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310084879.1A CN115801902B (en) 2023-02-09 2023-02-09 Compression method of network access request data

Publications (2)

Publication Number Publication Date
CN115801902A true CN115801902A (en) 2023-03-14
CN115801902B CN115801902B (en) 2023-04-11

Family

ID=85430579

Also Published As

Publication number Publication date
CN115801902B (en) 2023-04-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant