WO2017097071A1 - Method and apparatus for data compression and decompression - Google Patents

Method and apparatus for data compression and decompression

Info

Publication number
WO2017097071A1
WO2017097071A1 PCT/CN2016/104567 CN2016104567W WO2017097071A1 WO 2017097071 A1 WO2017097071 A1 WO 2017097071A1 CN 2016104567 W CN2016104567 W CN 2016104567W WO 2017097071 A1 WO2017097071 A1 WO 2017097071A1
Authority
WO
WIPO (PCT)
Prior art keywords
key
data
transmission data
compression
compressed
Prior art date
Application number
PCT/CN2016/104567
Other languages
English (en)
French (fr)
Inventor
梁敬彪
任建峰
李勇智
刘文娇
Original Assignee
北京奇虎科技有限公司
奇智软件(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京奇虎科技有限公司, 奇智软件(北京)有限公司
Publication of WO2017097071A1

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • The present invention relates to the field of computer technology, and in particular to a method and apparatus for data compression and decompression.
  • Embodiments of the present invention provide a data compression method, including: analyzing original transmission data and determining data feature information that includes the data structure and the data size of the original transmission data; determining, according to the data feature information, whether to compress the original transmission data; when the determination result indicates that compression is to be performed, converting the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in a key compression matching list, to generate corresponding compression keys; and generating, based on the compression keys, compressed transmission data that includes the corresponding compressed key-value pairs.
  • Another embodiment of the present invention provides a data decompression method, including: determining whether received transmission data is compressed transmission data; when the transmission data is determined to be compressed transmission data, parsing it and extracting the compression keys of the compressed key-value pairs in it; and decompressing the compression keys, based on the predetermined key decompression mode in a pre-configured key compression matching list, to obtain the corresponding original key-value pairs.
  • An embodiment of the present invention provides a data compression apparatus, including: a first determining module, configured to analyze original transmission data and determine data feature information that includes the data structure and the data size of the original transmission data; a first judging module, configured to determine, according to the data feature information, whether to compress the original transmission data; a conversion module, configured to convert, when the determination result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys; and a first generation module, configured to generate, based on the compression keys, compressed transmission data that includes the corresponding compressed key-value pairs.
  • Another embodiment of the present invention provides a data decompression apparatus, including: a sixth judging module, configured to determine whether received transmission data is compressed transmission data; an analysis extraction module, configured to parse the compressed transmission data and extract the compression keys of the compressed key-value pairs in it when the transmission data is determined to be compressed transmission data; and a decompression processing module, configured to decompress the compression keys, based on the predetermined key decompression mode in the pre-configured key compression matching list, to obtain the corresponding original key-value pairs.
  • Another embodiment of the present invention is directed to a computer program comprising computer readable code that, when executed on a computing device, causes the computing device to perform the method as described above.
  • Another embodiment of the present invention is directed to a computer readable medium storing a computer program as described above.
  • A data compression and decompression scheme is proposed. On the server side, the original transmission data is analyzed to determine whether to compress it; when the determination result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. If the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unpredictable data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing.
  • In the client of the terminal device, when the received transmission data is determined to be compressed transmission data, the compression keys of the compressed key-value pairs in it are parsed and extracted, and the compression keys are then decompressed based on the predetermined key decompression mode in the pre-configured key compression matching list to obtain the corresponding original key-value pairs. The original keys can thus be recovered quickly and accurately, which makes the acquisition of transmission data efficient, provides a guarantee for the receiver's subsequent processing of the data, and improves the user experience.
  • FIG. 1 is a schematic flow chart of a method for data compression according to an embodiment of the present invention.
  • FIG. 2 is a schematic flow chart of a method for data compression according to a preferred embodiment of the present invention.
  • FIG. 3 is a schematic flow chart of a method for data compression according to a preferred embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of a method for data compression according to a preferred embodiment of the present invention.
  • FIG. 5 is a schematic flowchart diagram of a method for data decompression according to another embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an apparatus for data compression according to another embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an apparatus for data compression according to another preferred embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an apparatus for data compression according to another preferred embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an apparatus for data compression according to another preferred embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an apparatus for decompressing data according to another embodiment of the present invention.
  • FIG. 11 is a block diagram of a computing device for performing the method according to the present invention.
  • FIG. 12 is a schematic diagram of a storage unit for holding or carrying program code that implements the method according to the present invention.
  • FIG. 1 is a schematic flow chart of a method for data compression according to an embodiment of the present invention.
  • Step S110: analyze the original transmission data and determine data feature information that includes the data structure and the data size of the original transmission data; step S120: determine, according to the data feature information, whether to compress the original transmission data; step S130: when the determination result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys; step S140: generate, based on the compression keys, compressed transmission data that includes the corresponding compressed key-value pairs.
  • In this embodiment, a data compression and decompression scheme is proposed. On the server side, the original transmission data is analyzed to determine whether to compress it; when the determination result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. If the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unpredictable data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing.
  • In the client of the terminal device, when the received transmission data is determined to be compressed transmission data, the compression keys of the compressed key-value pairs in it are parsed and extracted, and the compression keys are then decompressed based on the predetermined key decompression mode in the pre-configured key compression matching list to obtain the corresponding original key-value pairs. The original keys can thus be recovered quickly and accurately, which makes the acquisition of transmission data efficient, provides a guarantee for the receiver's subsequent processing of the data, and improves the user experience.
  • the data structure of the data in the transmission data is exemplified by a key-value structure.
  • The compression keys in the key compression dictionary file can be formed by combining the 26 uppercase and 26 lowercase English letters with the 10 digits 0 to 9. These 62 characters yield 62 different single-character compression keys, 3,844 different two-character compression keys, and 238,328 different three-character compression keys, which is more than enough for the keys used in the key-value structures of existing applications.
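The key counts above follow from powers of 62. A minimal Python sketch of that arithmetic is given here; the names `ALPHABET`, `key_space`, and `short_keys`, and the ordering of the alphabet, are illustrative assumptions rather than anything the patent specifies.

```python
from itertools import count, product
from string import ascii_letters, digits

# Assumed alphabet order: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = digits + ascii_letters


def key_space(length: int) -> int:
    """Number of distinct compression keys of exactly `length` characters."""
    return len(ALPHABET) ** length


def short_keys():
    """Yield candidate compression keys from shortest to longest."""
    for length in count(1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)


print([key_space(n) for n in (1, 2, 3)])  # [62, 3844, 238328]
```

Handing out keys in this order would give the shortest keys to the first entries of the matching list; which key receives which short form is a design choice the text leaves open.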
  • Step S110: analyze the original transmission data and determine data feature information that includes the data structure and the data size of the original transmission data.
  • For example, the original transmission data "userinfo" contains three sets of key-value pairs: "user_id:22", "user_id:23", and "user_id:24". In the pair "user_id:22", the Key part is "user_id" and the Value part is "22". The data structure of the original transmission data "userinfo" is determined to be a key-value pair structure, and the total size of the key-value pairs in "userinfo" is determined to be 30 bytes.
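Step S110 for this example can be sketched as follows. This is only an illustration: the function name `analyze`, the `key:value` string form of each pair, and the returned dictionary layout are assumptions, since the text does not fix a serialization.

```python
def analyze(pairs: list[str]) -> dict:
    """Return data feature information: structure, size in bytes, parsed pairs.

    Each element of `pairs` is assumed to look like "key:value".
    """
    parsed = [tuple(p.split(":", 1)) for p in pairs if ":" in p]
    is_key_value = bool(pairs) and len(parsed) == len(pairs)
    return {
        "structure": "key-value" if is_key_value else "other",
        # Size is taken as the sum of the encoded pair lengths.
        "size": sum(len(p.encode("utf-8")) for p in pairs),
        "pairs": parsed,
    }


userinfo = ["user_id:22", "user_id:23", "user_id:24"]
features = analyze(userinfo)
print(features["structure"], features["size"])  # key-value 30
```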
  • Step S120: determine, according to the data feature information, whether to compress the original transmission data.
  • Given that the data structure of the original transmission data "userinfo" is a key-value pair structure and its data size is 30 bytes, it is determined whether to compress the Key part "user_id" of the original transmission data "userinfo"; the specific manner of determination is described in the embodiments below and their figures.
  • Step S130: when the determination result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys.
  • For example, under the predetermined key compression mode in the key compression matching list, the compressed value corresponding to "user_id" is "uid"; the key "user_id" of the three sets of key-value pairs in the original transmission data "userinfo" is converted to generate the corresponding compression key "uid".
  • Step S140 Generate compressed transmission data including a corresponding compressed key value pair based on the compression key.
  • compressed transmission data “uid:22”, “uid:23”, and “uid:24” including the corresponding three sets of compressed key-value pairs are generated.
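Steps S130 and S140 for this example amount to a lookup in the key compression matching list. The sketch below assumes the list is a plain dictionary from original key to compression key and that keys missing from the list are passed through unchanged; `MATCHING_LIST` and `compress_pairs` are hypothetical names.

```python
# Assumed representation of the key compression matching list.
MATCHING_LIST = {"user_id": "uid"}


def compress_pairs(pairs, matching_list):
    """Replace each pair's key with its compression key, keeping the value."""
    compressed = []
    for pair in pairs:
        key, value = pair.split(":", 1)
        # Assumption: a key without an entry in the list is left as it is.
        compressed.append(f"{matching_list.get(key, key)}:{value}")
    return compressed


compressed = compress_pairs(["user_id:22", "user_id:23", "user_id:24"], MATCHING_LIST)
print(compressed)  # ['uid:22', 'uid:23', 'uid:24']
```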
  • the step of determining whether to perform compression processing on the original transmission data according to the data feature information specifically includes step S221 and step S222.
  • Step S221: determine the relationship between the data size of the original transmission data and a first predetermined data size threshold.
  • Step S222: if the data size of the original transmission data is greater than the first predetermined data size threshold, determine that the original transmission data is to be compressed.
  • For example, the first predetermined data size threshold is 20 bytes. Since the data size of the original transmission data "userinfo" is 30 bytes, it is greater than the first predetermined data size threshold, so it is determined that the Key part "user_id" of the three sets of key-value pairs in the original transmission data "userinfo" is to be compressed; the compressed Key part is the data "uid" corresponding to "user_id" in the key compression matching list.
  • the step of determining whether to perform compression processing on the original transmission data according to the data feature information specifically includes step S323 and step S324.
  • Step S323: determine the relationship between the data size of the original transmission data and each of a first predetermined data size threshold and a second predetermined data size threshold, where the first predetermined data size threshold is greater than the second; step S324: if the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold, determine according to the data structure whether to compress the original transmission data.
  • For example, the first predetermined data size threshold is 50 bytes and the second predetermined data size threshold is 20 bytes. The original transmission data, used to transmit customer information, contains key-value pairs in the following format: "username:tracy;age:18;username:tom;age:32;", where the Key parts are "username", "age", "username", and "age", and the Value parts are "tracy", "18", "tom", and "32", respectively. The data structure of the original transmission data is determined to be a key-value pair structure, and its data size is 42 bytes.
  • Because the data size of 42 bytes is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes, whether to compress the original transmission data is determined according to the data structure.
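The size checks of steps S221/S222 and S323/S324 can be sketched as one small function. The return labels, the name `size_decision`, and the handling of sizes at or below the second threshold (and exactly equal to the first) are assumptions; the text only states the "greater than first" and "between the two" cases.

```python
def size_decision(size: int, first_threshold: int = 50, second_threshold: int = 20) -> str:
    """Classify a payload by size against the two predetermined thresholds."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than the second")
    if size > first_threshold:
        return "compress"          # S222: above the first threshold
    if size > second_threshold:
        return "check-structure"   # S324: decide from the data structure
    return "skip"                  # assumption: small payloads are left alone


print(size_decision(42))         # check-structure
print(size_decision(30, 20, 0))  # compress (single-threshold example, threshold 20)
```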
  • step S324 includes step S3241 (not shown) and step S3242 (not shown).
  • Step S3241: compute a first ratio of the number of key-value pairs having the same first key in the original transmission data to the total number of key-value pairs in the original transmission data.
  • Step S3242: when the first ratio is greater than a first predetermined ratio threshold and the number of characters of the first key is greater than a first predetermined number-of-characters threshold, determine that the original transmission data needs to be compressed.
  • For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the first predetermined ratio threshold is 40%, the first predetermined number-of-characters threshold is 15, and the key that occurs most frequently in the original transmission data is the first key. The original transmission data "username:tracy;age:18;username:tom;age:32;" contains multiple sets of key-value pairs and has a data size of 42 bytes, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. The number of key-value pairs having the same first key "username" is then counted as 2; dividing 2 by the total number of key-value pairs in the original transmission data, 4, gives a first ratio of 50%. The first ratio of 50% is greater than the first predetermined ratio threshold of 40%, and the number of characters of the first key "username" in the original transmission data, 16 (two occurrences of 8 characters each), is greater than the first predetermined number-of-characters threshold of 15, so it is determined that the first key "username" needs to be compressed. The "username" in the key-value pairs whose Key part is "username" is then compressed based on the predetermined key compression mode in the key compression matching list, and the compression key corresponding to "username" is "uid".
  • The number of key-value pairs having the same key "age" in the original transmission data is also 2, and dividing 2 by the total number of key-value pairs in the original transmission data gives a first ratio of 50%.
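The first-ratio test of steps S3241 and S3242 can be sketched as below. Two points are assumptions made for illustration: the character count is taken over all occurrences of a key (which matches the figure of 16 for "username"), and the test is applied to every distinct key rather than only to the most frequent one. The function names are hypothetical.

```python
from collections import Counter


def parse(data: str):
    """Split 'k:v;k:v;' into (key, value) tuples, ignoring empty segments."""
    return [tuple(item.split(":", 1)) for item in data.split(";") if ":" in item]


def keys_to_compress_by_frequency(data, ratio_threshold=0.40, char_threshold=15):
    """Return keys whose pair ratio and total character count exceed the thresholds."""
    pairs = parse(data)
    counts = Counter(key for key, _ in pairs)
    selected = []
    for key, occurrences in counts.items():
        first_ratio = occurrences / len(pairs)
        total_chars = occurrences * len(key)  # assumed: summed over occurrences
        if first_ratio > ratio_threshold and total_chars > char_threshold:
            selected.append(key)
    return selected


data = "username:tracy;age:18;username:tom;age:32;"
print(keys_to_compress_by_frequency(data))  # ['username']
```

Under these assumptions "age" also reaches a first ratio of 50%, but its 6 characters in total do not exceed the threshold of 15, so the sketch does not select it.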
  • step S324 includes step S3243 (not shown), step S3244 (not shown), and step S3245 (not shown).
  • Step S3243: determine whether the number of characters of the second key, which has the longest character length in the original transmission data, is greater than a second predetermined number-of-characters threshold.
  • Step S3244: if the number of characters of the second key is greater than the second predetermined number-of-characters threshold, compute a second ratio of the number of characters of all occurrences of the second key in the original transmission data to the total number of characters of the original transmission data; step S3245: when the second ratio is greater than a second predetermined ratio threshold, determine that the original transmission data needs to be compressed.
  • For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the second predetermined number-of-characters threshold is 2, the second predetermined ratio threshold is 20%, and the key with the longest character length in the original transmission data is the second key. The original transmission data contains multiple sets of key-value pairs and has a data size of 42 bytes, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. It is then determined that the number of characters of the second key "username", which has the longest character length in the original transmission data, is 8, which is greater than the second predetermined number-of-characters threshold of 2. The number of characters of the two occurrences of the second key "username", 16, is divided by the total number of characters of the original transmission data, 42, to obtain a second ratio of 38%. Since the second ratio of 38% is greater than the second predetermined ratio threshold of 20%, it is determined that the "username" in the key-value pairs whose Key part is "username" needs to be compressed; based on the predetermined key compression mode in the key compression matching list, the compression key corresponding to "username" is "uid".
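The second-ratio test of steps S3243 to S3245 can be sketched in the same style. The function name, the tuple it returns, and the `key:value;` serialization are assumptions for illustration.

```python
def longest_key_decision(data, char_threshold=2, ratio_threshold=0.20):
    """Return (second_key, second_ratio, compress?) for a 'k:v;k:v;' payload."""
    keys = [item.split(":", 1)[0] for item in data.split(";") if ":" in item]
    if not keys:
        return None, 0.0, False
    second_key = max(keys, key=len)  # the key with the longest character length
    if len(second_key) <= char_threshold:
        return second_key, 0.0, False
    # Characters used by all occurrences of that key, over all characters sent.
    key_chars = sum(len(k) for k in keys if k == second_key)
    second_ratio = key_chars / len(data)
    return second_key, second_ratio, second_ratio > ratio_threshold


data = "username:tracy;age:18;username:tom;age:32;"
key, ratio, compress = longest_key_decision(data)
print(key, f"{ratio:.0%}", compress)  # username 38% True
```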
  • the method includes step S410, step S420, step S430, step S440, and step S450.
  • Step S410: analyze the original transmission data and determine data feature information that includes the data structure and the data size of the original transmission data; step S420: determine the historical frequency of occurrence, within a predetermined time period, of the key of each key-value pair in the original transmission data.
  • Step S430: determine, according to the data feature information in combination with the historical frequency of occurrence, whether to compress the original transmission data.
  • Step S440: when the determination result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys.
  • Step S450: generate, based on the compression keys, compressed transmission data that includes the corresponding compressed key-value pairs.
  • the content executed in the step S410, the step S440, and the step S450 in the preferred embodiment is the same as or similar to the content executed in the step S110, the step S130, and the step S140 shown in FIG. 1, and details are not described herein again.
  • For example, the historical frequencies of occurrence of the keys in the original transmission data over the past month are as follows: "username" appeared 800 times in the month, and "age" appeared 900 times in the month.
  • Based on the data feature information alone, the "username" in the key-value pairs whose Key part is "username" is compressed, while the "age" in the key-value pairs whose Key part is "age" is not. However, because the frequency of occurrence of "age" is greater than that of "username", it is determined, in combination with the historical frequency of occurrence, that the "age" in the key-value pairs whose Key part is "age" also needs to be compressed.
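The text does not spell out how the historical frequency is combined with the data feature information. One possible reading of the example, sketched below purely as an assumption, is that a key not selected by the feature-based test is still compressed when its historical frequency exceeds that of some key that was selected. The name `combine_with_history` is hypothetical.

```python
def combine_with_history(all_keys, feature_selected, history):
    """Extend the feature-based selection using historical key frequencies.

    Assumed rule: add any key seen more often than the least frequent
    key that was already selected by the data feature information.
    """
    selected = list(feature_selected)
    if not selected:
        return selected
    baseline = min(history.get(key, 0) for key in selected)
    for key in all_keys:
        if key not in selected and history.get(key, 0) > baseline:
            selected.append(key)
    return selected


history = {"username": 800, "age": 900}  # occurrences in the past month
print(combine_with_history(["username", "age"], ["username"], history))
# ['username', 'age']
```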
  • step S140 further includes step S141 (not shown) and step S142 (not shown).
  • Step S141: generate the corresponding compressed key-value pairs based on the compression keys, and set an associated compression state identifier.
  • Step S142: combine the compressed key-value pairs and the associated compression state identifier to generate the compressed transmission data.
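Steps S141 and S142 can be sketched as follows. The form of the compression state identifier is not given in the text, so the one-character prefix `~` used here is purely hypothetical, as are the function name and the `key:value;` serialization.

```python
COMPRESSED_FLAG = "~"  # hypothetical compression state identifier


def build_compressed_data(data: str, matching_list: dict) -> str:
    """Compress matching keys and tag each compressed pair with the identifier.

    Assumes well-formed 'key:value;' input.
    """
    out = []
    for item in data.split(";"):
        if not item:
            continue
        key, value = item.split(":", 1)
        if key in matching_list:
            out.append(f"{COMPRESSED_FLAG}{matching_list[key]}:{value};")
        else:
            out.append(f"{key}:{value};")  # untouched pairs carry no identifier
    return "".join(out)


payload = build_compressed_data(
    "username:tracy;age:18;username:tom;age:32;", {"username": "un"}
)
print(payload)  # ~un:tracy;age:18;~un:tom;age:32;
```

Tagging individual pairs lets a payload mix compressed and uncompressed keys; a single flag for the whole payload would be an equally plausible design.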
  • the method further includes a step S150 (not shown) and a step S160 (not shown).
  • Step S150: generate a corresponding key compression dictionary file based on the key compression matching list.
  • Step S160: send the key compression dictionary file as a configuration file to the requesting party according to a received application acquisition request.
  • For example, a corresponding key compression dictionary file is generated based on the predetermined key compression matching list; the file name is CompFile and the file content is "username:un;age:ag;". According to the application acquisition request received from the client, the key compression dictionary file CompFile is sent as a configuration file to the corresponding requester.
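Steps S150 and S160 can be sketched as serializing the matching list into the "username:un;age:ag;" form and returning it on request. The request handler and its return shape are hypothetical; the text names only the file CompFile and its content.

```python
def dump_dictionary(matching_list: dict) -> str:
    """Serialize the matching list as 'original:compressed;' entries."""
    return "".join(f"{key}:{short};" for key, short in matching_list.items())


def handle_acquisition_request(matching_list: dict) -> dict:
    """Hypothetical handler: return the dictionary file as a configuration file."""
    return {"name": "CompFile", "content": dump_dictionary(matching_list)}


config = handle_acquisition_request({"username": "un", "age": "ag"})
print(config["content"])  # username:un;age:ag;
```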
  • FIG. 5 is a schematic flow chart of a method for data decompression according to an embodiment of the present invention.
  • Step S510: determine whether the received transmission data is compressed transmission data; step S520: when the transmission data is determined to be compressed transmission data, parse it and extract the compression keys of the compressed key-value pairs in it; step S530: decompress the compression keys, based on the predetermined key decompression mode in the pre-configured key compression matching list, to obtain the corresponding original key-value pairs.
  • For example, the terminal device receives the transmission data returned by the server. When the client application (APP) determines that the received transmission data is compressed transmission data, it parses the compressed transmission data and extracts the compression keys of the compressed key-value pairs in it; then, based on the predetermined key decompression mode in the key compression matching list pre-configured on the client, it decompresses the compression keys in the transmission data returned by the server to obtain the original key-value pairs corresponding to the compressed key-value pairs in the transmission data.
  • step S510 further includes step S511 (not shown).
  • Step S511: determine whether the key-value pairs in the transmission data include an associated compression state identifier.
  • When the transmission data is determined to be compressed transmission data, step S520 further includes step S521 (not shown). Step S521: when the key-value pairs in the transmission data include the compression state identifier, determine that the transmission data is compressed transmission data, and extract the compression keys of the compressed key-value pairs associated with the compression state identifier in the compressed transmission data.
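Steps S511 and S521 can be sketched as a scan for the compression state identifier. As the identifier's format is not specified, the `~` prefix below is a hypothetical stand-in, and both function names are illustrative.

```python
COMPRESSED_FLAG = "~"  # hypothetical compression state identifier


def extract_compression_keys(data: str) -> list:
    """Return the compression keys of the pairs tagged with the identifier."""
    keys = []
    for item in data.split(";"):
        if item.startswith(COMPRESSED_FLAG):
            keys.append(item[len(COMPRESSED_FLAG):].split(":", 1)[0])
    return keys


def is_compressed(data: str) -> bool:
    """Treat data as compressed when at least one pair carries the identifier."""
    return bool(extract_compression_keys(data))


received = "~un:tracy;age:18;~un:tom;age:32;"
print(is_compressed(received), extract_compression_keys(received))
# True ['un', 'un']
```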
  • the method further includes a step S531 (not shown) and a step S532 (not shown).
  • Step S531: receive a configuration file, where the configuration file includes the key compression dictionary file; step S532: generate a local key compression matching list according to the configuration in the key compression dictionary file.
  • For example, the client of the application receives the configuration file returned by the server, which includes the key compression dictionary file CompFile with the content "username:un;age:ag;", and generates a local key compression matching list from the data in CompFile, with the content "un:username;ag:age;". Subsequently, based on the predetermined key decompression mode in the generated key compression matching list, the extracted compression keys "un;ag;un;ag;" are decompressed.
  • The original keys of the compression keys "un;ag;un;ag;" are "username;age;username;age;", so the corresponding original key-value pairs are obtained as "username:tracy;age:18;username:tom;age:32;".
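The client-side steps S531, S532, and S530 for this example can be sketched as inverting the dictionary file and mapping each compression key back. The function names and the assumption that unknown keys pass through unchanged are illustrative.

```python
def load_matching_list(dictionary_file: str) -> dict:
    """Invert 'username:un;age:ag;' into {'un': 'username', 'ag': 'age'}."""
    local = {}
    for item in dictionary_file.split(";"):
        if item:
            original, short = item.split(":", 1)
            local[short] = original
    return local


def decompress(data: str, local_list: dict) -> str:
    """Map each compression key back to its original key, keeping the values."""
    out = []
    for item in data.split(";"):
        if item:
            key, value = item.split(":", 1)
            # Assumption: keys absent from the local list are left unchanged.
            out.append(f"{local_list.get(key, key)}:{value};")
    return "".join(out)


local_list = load_matching_list("username:un;age:ag;")
restored = decompress("un:tracy;ag:18;un:tom;ag:32;", local_list)
print(restored)  # username:tracy;age:18;username:tom;age:32;
```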
  • FIG. 6 is a schematic structural diagram of an apparatus for data compression according to another embodiment of the present invention.
  • The first determining module 610 analyzes the original transmission data and determines data feature information that includes the data structure and the data size of the original transmission data; according to the data feature information, the first judging module 620 determines whether to compress the original transmission data; when the determination result indicates that compression is to be performed, the conversion module 630 converts the keys of the original key-value pairs in the original transmission data, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys; the first generation module 640 generates, based on the compression keys, compressed transmission data that includes the corresponding compressed key-value pairs.
  • In this embodiment, a data compression and decompression scheme is proposed. On the server side, the original transmission data is analyzed to determine whether to compress it; when the determination result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted, based on the predetermined key compression mode in the key compression matching list, to generate corresponding compression keys, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. If the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unpredictable data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing.
  • In the client of the terminal device, when the received transmission data is determined to be compressed transmission data, the compression keys of the compressed key-value pairs in it are parsed and extracted, and the compression keys are then decompressed based on the predetermined key decompression mode in the pre-configured key compression matching list to obtain the corresponding original key-value pairs. The original keys can thus be recovered quickly and accurately, which makes the acquisition of transmission data efficient, provides a guarantee for the receiver's subsequent processing of the data, and improves the user experience.
  • the data structure of the data in the transmission data is exemplified by a key-value structure.
  • The compression keys in the key compression dictionary file can be formed by combining the 26 uppercase and 26 lowercase English letters with the 10 digits 0 to 9. These 62 characters yield 62 different single-character compression keys, 3,844 different two-character compression keys, and 238,328 different three-character compression keys, which is more than enough for the keys used in the key-value structures of existing applications.
  • the first determining module 610 analyzes the original transmission data, and determines data feature information including the data structure of the original transmission data and the data size.
  • For example, the original transmission data "userinfo" contains three sets of key-value pairs: "user_id:22", "user_id:23", and "user_id:24". In the pair "user_id:22", the Key part is "user_id" and the Value part is "22". The data structure of the original transmission data "userinfo" is determined to be a key-value pair structure, and the total size of the key-value pairs in "userinfo" is determined to be 30 bytes.
  • The first judging module 620 determines whether to compress the original transmission data.
  • the conversion module 630 converts the key of the original key value pair in the original transmission data to generate a corresponding compression key.
  • For example, under the predetermined key compression mode in the key compression matching list, the compressed value corresponding to "user_id" is "uid"; the key "user_id" of the three sets of key-value pairs in the original transmission data "userinfo" is converted to generate the corresponding compression key "uid".
  • the first generation module 640 generates compressed transmission data including the corresponding compressed key value pairs based on the compression keys.
  • compressed transmission data “uid:22”, “uid:23”, and “uid:24” including the corresponding three sets of compressed key-value pairs are generated.
  • the first judging module specifically includes a second judging submodule 721 and a second determining submodule 722.
  • the second judging submodule 721 judges the relationship between the data size of the original transmission data and the first predetermined data size threshold; if the data size of the original transmission data is greater than the first predetermined data size threshold, the second determining submodule 722 determines that the original transmission data is to be compressed.
  • for example, the first predetermined data size threshold is 20 bytes; since the data size of the original transmission data “userinfo” is 30 bytes, it is judged to be greater than the first predetermined data size threshold, so the Key part “user_id” of the three key-value pairs in “userinfo” is compressed, and the compressed Key part is “uid”, the entry corresponding to “user_id” in the key compression matching list.
  • the first judging module specifically includes a third judging submodule 823 and a fourth judging submodule 824.
  • the third judging submodule 823 judges the relationship between the data size of the original transmission data and each of the first and second predetermined data size thresholds, the first threshold being greater than the second; if the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold, the fourth judging submodule 824 judges whether to compress the original transmission data according to the data structure.
  • for example, the first predetermined data size threshold is 50 bytes and the second predetermined data size threshold is 20 bytes.
  • the original transmission data, which is used to pass customer information and contains multiple key-value pairs, includes key-value pair data in the following format: “username:tracy;age:18;username:tom;age:32;”, where the Key parts are “username”, “age”, “username”, and “age” and the Value parts are “tracy”, “18”, “tom”, and “32”.
  • the data structure of the original transmission data is determined to be a key-value pair structure and its data size is 42 bytes; since 42 bytes is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes, whether to compress the original transmission data is further judged according to the data structure.
  • the fourth judging submodule includes a first statistics unit (not shown) and a first judging unit (not shown).
  • the first statistics unit computes a first ratio of the number of key-value pairs sharing the same first key in the original transmission data to the total number of key-value pairs in the original transmission data; when the first ratio is greater than the first predetermined ratio threshold and the character count of the first key is greater than the first predetermined character count threshold, the first judging unit judges that the original transmission data needs to be compressed.
  • for example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the first predetermined ratio threshold is 40%, the first predetermined character count threshold is 15, and the key that occurs most frequently in the original transmission data is the first key.
  • the original transmission data “username:tracy;age:18;username:tom;age:32;”, which contains multiple key-value pairs, is 42 bytes in size, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes; two key-value pairs share the first key “username”, and dividing 2 by the total of 4 key-value pairs gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%; the character count of the first key “username” in the original transmission data is 16 (two occurrences of 8 characters each), which is greater than the first predetermined character count threshold of 15, so the first key “username” needs to be compressed and, based on the predetermined key compression method in the key compression matching list, “username” in the key-value pairs whose Key part is “username” is compressed to the corresponding compression key “uid”.
  • two key-value pairs share the key “age”, and dividing 2 by the total of 4 key-value pairs again gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%; however, the character count of the key “age” is 6 (two occurrences of 3 characters each), which is less than the first predetermined character count threshold of 15, so “age” in the key-value pairs whose Key part is “age” is not compressed.
  • the fourth judging submodule includes a second judging unit (not shown), a second statistics unit (not shown), and a third judging unit (not shown).
  • the second judging unit judges whether the character count of the second key, the key with the longest character length in the original transmission data, is greater than the second predetermined character count threshold; if so, the second statistics unit computes a second ratio of the character count of all occurrences of the second key in the original transmission data to the total character count of the original transmission data; when the second ratio is greater than the second predetermined ratio threshold, the third judging unit judges that the original transmission data needs to be compressed.
  • for example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the second predetermined character count threshold is 2, the second predetermined ratio threshold is 20%, and the key with the longest character length in the original transmission data is the second key.
  • the original transmission data, which contains multiple key-value pairs, is 42 bytes in size, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes; the second key “username”, which has the longest character length, has 8 characters, which is greater than the second predetermined character count threshold of 2.
  • dividing the 16 characters of the two occurrences of the second key “username” by the total character count of 42 gives a second ratio of 38%, which is greater than the second predetermined ratio threshold of 20%; it is therefore judged that “username” in the key-value pairs whose Key part is “username” needs to be compressed and, based on the predetermined key compression method in the key compression matching list, its corresponding compression key is “uid”.
  • the apparatus includes a first determining module 910, a third determining module 920, a first judging module 930, a conversion module 940, and a first generation module 950.
  • the first determining module 910 analyzes the original transmission data, and determines data feature information including the data structure and the data size of the original transmission data.
  • the third determining module 920 determines the historical occurrence frequency, within a predetermined time period, of the key of each key-value pair in the original transmission data.
  • according to the data feature information, combined with the historical occurrence frequency, the first judging module 930 judges whether to compress the original transmission data.
  • when the judgment result indicates that compression is to be performed, the conversion module 940 converts the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list; the first generation module 950 generates compressed transmission data including the corresponding compressed key-value pairs based on the compression keys.
  • the operations performed by the first determining module 910, the conversion module 940, and the first generation module 950 in this preferred embodiment are the same as or similar to those performed by the first determining module 610, the conversion module 630, and the first generation module 640 shown in FIG. 6, and are not repeated here.
  • for example, according to the history records of the last month, the historical occurrence frequencies of the keys of the key-value pairs in the original transmission data over that period are as follows: “username” occurs 800 times per month and “age” occurs 900 times per month.
  • the judging apparatus of FIG. 8 decides to compress “username” in the key-value pairs whose Key part is “username” and not to compress “age” in the key-value pairs whose Key part is “age”; however, because “age” occurs more frequently than “username”, combining the historical occurrence frequency leads to the judgment that “age” in the key-value pairs whose Key part is “age” also needs to be compressed.
  • the first generation module further includes an identifier setting unit (not shown) and a combination generation unit (not shown); the identifier setting unit generates the corresponding compressed key-value pairs based on the compression keys and sets an associated compression state identifier, and the combination generation unit combines the compressed key-value pairs with the associated compression state identifier to generate the compressed transmission data.
  • the apparatus further includes a second generation module (not shown) and a transmission module (not shown).
  • the second generation module generates a corresponding key compression dictionary file based on the key compression matching list; according to the received application acquisition request, the sending module sends the key compression dictionary file as a configuration file to the requesting party.
  • for example, a corresponding key compression dictionary file is generated based on the predetermined key compression matching list, with the file name CompFile and the content “username:un;age:ag;”; according to the received acquisition request from a client application, the key compression dictionary file CompFile is sent to the requesting party as a configuration file.
  • FIG. 10 is a schematic structural diagram of an apparatus for decompressing data according to another preferred embodiment of the present invention.
  • the sixth judging module 1010 judges whether the received transmission data is compressed transmission data; when the transmission data is determined to be compressed transmission data, the parsing and extraction module 1020 parses the compressed transmission data and extracts the compression keys of its compressed key-value pairs; based on the predetermined key decompression method in the preconfigured key compression matching list, the decompression processing module 1030 decompresses the compression keys to obtain the corresponding original key-value pairs.
  • for example, after the client of an application (APP) on the terminal device sends the server a request for APP-related data, the terminal device receives the transmission data returned by the server; when the client judges that the received transmission data is compressed transmission data, it parses the data and extracts the compression keys of the compressed key-value pairs, then decompresses those compression keys based on the predetermined key decompression method in the key compression matching list preconfigured on the client, to obtain the original key-value pairs corresponding to the compressed key-value pairs in the APP transmission data.
  • the sixth judging module 1010 is further configured to judge whether the key-value pairs in the transmission data include an associated compression state identifier.
  • when the key-value pairs in the transmission data include the associated compression state identifier, the parsing and extraction module 1020 determines that the transmission data is compressed transmission data, and extracts the compression keys of the compressed key-value pairs associated with the compression state identifier.
  • the apparatus further includes a receiving module 1031 (not shown) and a third generating module 1032 (not shown).
  • the receiving module 1031 receives the configuration file, and the configuration file includes a key compression dictionary file.
  • the third generation module 1032 generates a local key compression matching list according to the key compression dictionary file configuration.
  • for example, the client of the application (APP) receives the configuration file returned by the server, which includes the key compression dictionary file CompFile with the content “username:un;age:ag;”, and generates from it a local key compression matching list with the content “un:username;ag:age;”.
  • then, based on the predetermined key decompression method in the generated key compression matching list, the extracted compression keys “un;ag;un;ag;” are decompressed to the original keys “username;age;username;age;”, so the corresponding original key-value pairs are obtained as “username:tracy;age:18;username:tom;age:32;”.
  • the present invention includes apparatus that is directed to performing one or more of the operations in this application.
  • These devices may be specially designed and manufactured for the required purposes, or may also include known devices in a general purpose computer. These devices have computer programs stored therein that are selectively activated or reconfigured.
  • Such computer programs may be stored in a device-readable (e.g., computer-readable) medium or in any type of medium suitable for storing electronic instructions and coupled to a bus, including but not limited to any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards.
  • a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
  • each block of the structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented by computer program instructions.
  • these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, so that they are executed by that processor.
  • the various component embodiments of the present invention may be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof.
  • a microprocessor or digital signal processor may be used in practice to implement some or all of the functionality of some or all of the components of the apparatus in accordance with embodiments of the present invention.
  • the invention can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein.
  • a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
  • Figure 11 shows a block diagram of a computing device in which a data compression/decompression method in accordance with the present invention can be implemented.
  • the computing device conventionally includes a processor 710 and a computer program product or computer readable medium in the form of a memory 720.
  • Memory 720 can be an electronic memory such as a flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • Memory 720 has a storage space 730 that stores program code 731 for performing any of the method steps described above.
  • storage space 730 storing program code may store respective program code 731 for implementing various steps in the above methods, respectively.
  • These program codes can be read from, or written to, one or more computer program products.
  • Such computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit such as that shown in FIG. 12.
  • the storage unit may have storage segments, storage space, and the like arranged similarly to the memory 720 in the computing device of FIG. 11.
  • the program code can be compressed, for example, in an appropriate form.
  • the storage unit stores program code 731' for performing the steps of the method according to the invention, i.e., code that can be read by a processor such as 710; when run by the computing device, this code causes the computing device to perform the steps of the methods described above.
  • steps, measures, and solutions in the various operations, methods, and processes that have been discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes of the various operations, methods, and processes that have been discussed in the present invention may be alternated, modified, rearranged, decomposed, combined, or deleted. Further, the steps, measures, and solutions in the prior art having various operations, methods, and processes disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

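A method for data compression and decompression: original transmission data is analyzed to determine data feature information that includes the data structure and the data size of the original transmission data (S110); whether to compress the original transmission data is then judged according to the data feature information (S120); when the judgment result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted into corresponding compression keys based on the predetermined key compression method in a key compression matching list (S130); and compressed transmission data including the corresponding compressed key-value pairs is generated based on the compression keys (S140). The method saves network bandwidth during data transmission, avoids unexpected data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing, which supports efficient subsequent processing of the received transmission data and improves the user experience.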

Description

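Method and Apparatus for Data Compression and Decompression
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for data compression and decompression.
Background
With the continuous development of computer technology, the SDKs (Software Development Kits) and development methods used in software development are also constantly being updated. Software applications commonly need to pass data of various types through interfaces. When the amount of transmitted data is large, for example when the transmission data contains a large number of long data names together with their corresponding values, two problems arise. On the one hand, once the data has been passed, a large amount of the system's CPU (Central Processing Unit) resources is spent parsing the data names and their values, which slows down the terminal, and the many long data names waste a large amount of the terminal's storage space when stored. On the other hand, if the data is passed over a network, it consumes a large amount of network bandwidth and may even be truncated, resulting in data loss.
A scheme for compressing long parameter names in transmission data is therefore needed, so that transmission data containing many long data names and their corresponding values can be passed through interfaces efficiently, saving network bandwidth and system CPU resources, allowing the transmission data to be read from and written to a database efficiently, and thereby further improving the user experience.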
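Summary of the Invention
To overcome the above technical problems, or at least partially solve them, the following technical solutions are proposed:
An embodiment of the present invention provides a data compression method, including: analyzing original transmission data, and determining data feature information that includes the data structure and the data size of the original transmission data; judging, according to the data feature information, whether to compress the original transmission data; when the judgment result indicates that compression is to be performed, converting the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in a key compression matching list; and generating, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
Another embodiment of the present invention provides a data decompression method, including: judging whether received transmission data is compressed transmission data; when the transmission data is determined to be compressed transmission data, parsing the compressed transmission data and extracting the compression keys of its compressed key-value pairs; and decompressing the compression keys based on the predetermined key decompression method in a preconfigured key compression matching list, to obtain the corresponding original key-value pairs.
An embodiment of the present invention provides a data compression apparatus, including: a first determining module, configured to analyze original transmission data and determine data feature information that includes the data structure and the data size of the original transmission data; a first judging module, configured to judge, according to the data feature information, whether to compress the original transmission data; a conversion module, configured to convert, when the judgment result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in a key compression matching list; and a first generation module, configured to generate, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
Another embodiment of the present invention provides a data decompression apparatus, including: a sixth judging module, configured to judge whether received transmission data is compressed transmission data; a parsing and extraction module, configured to parse the compressed transmission data and extract the compression keys of its compressed key-value pairs when the transmission data is determined to be compressed transmission data; and a decompression processing module, configured to decompress the compression keys based on the predetermined key decompression method in a preconfigured key compression matching list, to obtain the corresponding original key-value pairs.
Another embodiment of the present invention provides a computer program including computer-readable code which, when run on a computing device, causes the computing device to perform the method described above.
Another embodiment of the present invention provides a computer-readable medium storing the computer program described above.
The embodiments of the present invention propose a scheme for data compression and decompression. On the server side, the original transmission data is analyzed to judge whether it should be compressed; when the judgment result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted into corresponding compression keys based on the predetermined key compression method in the key compression matching list, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. When the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unexpected data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing. In the client on the terminal device, when the received transmission data is judged to be compressed transmission data, the compression keys of the compressed key-value pairs are parsed and extracted from it and then decompressed based on the predetermined key decompression method in the preconfigured key compression matching list, to obtain the corresponding original key-value pairs. The original values of the compression keys can thus be recovered accurately and quickly, so that the transmission data is obtained efficiently, which supports efficient subsequent processing of the received transmission data and improves the user experience.
Additional aspects and advantages of the invention will be set forth in part in the description that follows; they will become apparent from that description or be learned through practice of the invention.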
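Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a data compression method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a data compression method according to a preferred embodiment of the present invention;
FIG. 3 is a schematic flowchart of a data compression method according to a preferred embodiment of the present invention;
FIG. 4 is a schematic flowchart of a data compression method according to a preferred embodiment of the present invention;
FIG. 5 is a schematic flowchart of a data decompression method according to another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data compression apparatus according to another embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data compression apparatus according to another preferred embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a data compression apparatus according to another preferred embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a data compression apparatus according to another preferred embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a data decompression apparatus according to another embodiment of the present invention;
FIG. 11 is a block diagram of a computing device for performing the method according to the present invention; and
FIG. 12 is a schematic diagram of a storage unit for holding or carrying program code that implements the method according to the present invention.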
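Detailed Description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms “a”, “an”, and “the” used herein may also include the plural forms. It should be further understood that the word “comprising” used in this specification indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In addition, “connected” or “coupled” as used herein may include a wireless connection or wireless coupling. The phrase “and/or” as used herein includes all or any one of, and all combinations of, one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It should also be understood that terms such as those defined in general-purpose dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as they are here, are not to be interpreted in an idealized or overly formal sense.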
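FIG. 1 is a schematic flowchart of a data compression method according to an embodiment of the present invention.
Step S110: analyze the original transmission data, and determine data feature information that includes the data structure and the data size of the original transmission data. Step S120: judge, according to the data feature information, whether to compress the original transmission data. Step S130: when the judgment result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list. Step S140: generate, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
The embodiments of the present invention propose a scheme for data compression and decompression. On the server side, the original transmission data is analyzed to judge whether it should be compressed; when the judgment result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted into corresponding compression keys based on the predetermined key compression method in the key compression matching list, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. When the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unexpected data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing. In the client on the terminal device, when the received transmission data is judged to be compressed transmission data, the compression keys of the compressed key-value pairs are parsed and extracted from it and then decompressed based on the predetermined key decompression method in the preconfigured key compression matching list, to obtain the corresponding original key-value pairs. The original values of the compression keys can thus be recovered accurately and quickly, so that the transmission data is obtained efficiently, which supports efficient subsequent processing of the received transmission data and improves the user experience.
In the embodiments of the present invention, a key-value structure is taken as the example data structure of the data in the transmission data. The key compression content in the key compression dictionary file can, for example, be represented using the 26 English letters in upper and lower case together with the 10 digits 0 to 9. This yields 62 distinct single-character compression keys, 3,844 distinct two-character compression keys, and 238,328 distinct three-character compression keys, which is ample for compressing the keys of key-value structures in existing applications.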
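The key counts quoted above follow directly from the 62-character alphabet (62, 62², and 62³). A minimal Python sketch of that arithmetic is given here; the names `ALPHABET` and `key_space` are illustrative and do not come from the source.

```python
import string

# Alphabet described in the text: 26 lowercase + 26 uppercase letters + 10 digits.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def key_space(length: int) -> int:
    """Number of distinct compression keys of exactly `length` characters."""
    return len(ALPHABET) ** length


for n in (1, 2, 3):
    print(n, key_space(n))  # 62, 3844, 238328
```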
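Step S110: analyze the original transmission data, and determine data feature information that includes the data structure and the data size of the original transmission data.
For example, in the original transmission data “userinfo”, which contains three key-value pairs, the pairs are “user_id:22”, “user_id:23”, and “user_id:24”. For the key-value pair “user_id:22”, the Key part is “user_id” and the Value part is “22”. The data structure of the original transmission data “userinfo” is determined to be a key-value pair structure, and the total size of its key-value pairs is calculated to be 30 bytes.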
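The analysis in step S110 can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the structure is detected or how the size is measured, so the sketch assumes that a pair counts as key-value when it contains a colon and that size is the UTF-8 byte length of the pairs without separators. The function name `analyze` and the result keys are hypothetical.

```python
def analyze(pairs: list[str]) -> dict:
    """Return assumed data feature information: structure and size in bytes."""
    # Assumption: an item is a key-value pair if it contains a ':' separator.
    is_key_value = all(":" in pair for pair in pairs)
    # Assumption: size is the UTF-8 byte length of the pairs themselves.
    size_bytes = sum(len(pair.encode("utf-8")) for pair in pairs)
    return {
        "structure": "key-value" if is_key_value else "unknown",
        "size_bytes": size_bytes,
    }


userinfo = ["user_id:22", "user_id:23", "user_id:24"]
features = analyze(userinfo)
print(features)  # structure is key-value, size is 30 bytes
```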
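Step S120: judge, according to the data feature information, whether to compress the original transmission data.
For example, given that the data structure of the original transmission data “userinfo” is a key-value pair structure and that its data size is 30 bytes, it is judged whether to compress the Key part “user_id” of the original transmission data “userinfo”. The judgment methods are described in detail in the embodiments shown in FIGS. 2 to 4 below.
Step S130: when the judgment result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list.
For example, when the judgment result indicates that the Key part “user_id” of the three key-value pairs in the original transmission data “userinfo” is to be compressed, then, based on the predetermined key compression method in the key compression matching list, under which “user_id” is compressed to “uid”, the key “user_id” of the three key-value pairs is converted into the corresponding compression key “uid”.
Step S140: generate, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
For example, based on the compression key “uid”, compressed transmission data including the corresponding three compressed key-value pairs “uid:22”, “uid:23”, and “uid:24” is generated.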
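Steps S130 and S140 amount to a lookup-and-replace on the key of each pair. The sketch below assumes the key compression matching list can be modeled as a plain dictionary; `KEY_MATCH_LIST` and `compress_keys` are illustrative names, not identifiers from the source.

```python
# Assumed in-memory form of the key compression matching list.
KEY_MATCH_LIST = {"user_id": "uid"}


def compress_keys(pairs: list[str], match_list: dict[str, str]) -> list[str]:
    """Replace each pair's key with its compression key; values are untouched."""
    compressed = []
    for pair in pairs:
        key, sep, value = pair.partition(":")
        # Keys without an entry in the matching list are passed through unchanged.
        compressed.append(match_list.get(key, key) + sep + value)
    return compressed


result = compress_keys(["user_id:22", "user_id:23", "user_id:24"], KEY_MATCH_LIST)
print(result)  # ['uid:22', 'uid:23', 'uid:24']
```

The 30 bytes of the original pairs shrink to 18 bytes here, since each key loses four characters.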
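In a preferred embodiment, as shown in FIG. 2, the step of judging whether to compress the original transmission data according to the data feature information specifically includes steps S221 and S222. Step S221: judge the relationship between the data size of the original transmission data and the first predetermined data size threshold. Step S222: if the data size of the original transmission data is greater than the first predetermined data size threshold, determine that the original transmission data is to be compressed.
For example, the first predetermined data size threshold is 20 bytes. Since the data size of the original transmission data “userinfo” is 30 bytes, it is judged to be greater than the first predetermined data size threshold, so the Key part “user_id” of the three key-value pairs in “userinfo” is compressed, and the compressed Key part is “uid”, the entry corresponding to “user_id” in the key compression matching list.
In another preferred embodiment, as shown in FIG. 3, the step of judging whether to compress the original transmission data according to the data feature information specifically includes steps S323 and S324. Step S323: judge the relationship between the data size of the original transmission data and each of the first and second predetermined data size thresholds, the first threshold being greater than the second. Step S324: if the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold, judge whether to compress the original transmission data according to the data structure.
For example, the first predetermined data size threshold is 50 bytes and the second predetermined data size threshold is 20 bytes. The original transmission data, which is used to pass customer information and contains multiple key-value pairs, includes key-value pair data in the following format: “username:tracy;age:18;username:tom;age:32;”, where the Key parts are “username”, “age”, “username”, and “age” and the Value parts are “tracy”, “18”, “tom”, and “32”. The data structure of the original transmission data is determined to be a key-value pair structure and its data size is 42 bytes. Since 42 bytes is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes, whether to compress the original transmission data is further judged according to the data structure.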
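The size checks of FIGS. 2 and 3 can be combined into one hypothetical three-way decision. The source only specifies two outcomes: compress when the size exceeds the first threshold (FIG. 2), and inspect the structure when the size lies strictly between the two thresholds (FIG. 3). Treating everything else as "skip", and treating a size exactly equal to the first threshold as "compress", are assumptions of this sketch, as is the name `size_decision`.

```python
def size_decision(size: int, first_threshold: int = 50, second_threshold: int = 20) -> str:
    """Classify a payload by size, using the FIG. 3 example thresholds as defaults."""
    if second_threshold < size < first_threshold:
        # FIG. 3: sizes between the thresholds need a structure-based check.
        return "inspect-structure"
    if size >= first_threshold:
        # FIG. 2 compresses above the first threshold; equality is an assumption.
        return "compress"
    # Assumption: payloads at or below the second threshold are left as-is.
    return "skip"


data = "username:tracy;age:18;username:tom;age:32;"
print(len(data), size_decision(len(data)))  # 42 inspect-structure
```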
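Preferably (with reference to FIG. 3), step S324 includes step S3241 (not shown) and step S3242 (not shown). Step S3241: compute a first ratio of the number of key-value pairs sharing the same first key in the original transmission data to the total number of key-value pairs in the original transmission data. Step S3242: when the first ratio is greater than the first predetermined ratio threshold and the character count of the first key is greater than the first predetermined character count threshold, judge that the original transmission data needs to be compressed.
For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the first predetermined ratio threshold is 40%, the first predetermined character count threshold is 15, and the key that occurs most frequently in the original transmission data is the first key. The original transmission data “username:tracy;age:18;username:tom;age:32;”, which contains multiple key-value pairs, is 42 bytes in size, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. Two key-value pairs share the first key “username”; dividing 2 by the total of 4 key-value pairs gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%. The character count of the first key “username” in the original transmission data is 16 (two occurrences of 8 characters each), which is greater than the first predetermined character count threshold of 15, so the first key “username” needs to be compressed. Then, based on the predetermined key compression method in the key compression matching list, “username” in the key-value pairs whose Key part is “username” is compressed to the corresponding compression key “uid”. Likewise, two key-value pairs share the key “age”; dividing 2 by the total of 4 key-value pairs gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%. However, the character count of the key “age” is 6 (two occurrences of 3 characters each), which is less than the first predetermined character count threshold of 15, so “age” in the key-value pairs whose Key part is “age” is not compressed.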
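The first-ratio check of steps S3241 and S3242 can be sketched as below. One assumption deserves emphasis: the source gives 16 as the character count of “username”, which matches the total over both occurrences (2 × 8) rather than the length of a single key, so the sketch counts characters over all occurrences of a key. The helper names `parse_pairs` and `keys_by_first_ratio` are illustrative.

```python
from collections import Counter


def parse_pairs(data: str) -> list[tuple[str, str]]:
    """Split 'k:v;k:v;' into (key, value) tuples."""
    pairs = []
    for item in data.split(";"):
        if item:
            key, _, value = item.partition(":")
            pairs.append((key, value))
    return pairs


def keys_by_first_ratio(data: str, ratio_threshold: float = 0.40,
                        char_threshold: int = 15) -> list[str]:
    """Return the keys that pass both the first-ratio and the character-count test."""
    pairs = parse_pairs(data)
    counts = Counter(key for key, _ in pairs)
    selected = []
    for key, occurrences in counts.items():
        first_ratio = occurrences / len(pairs)
        # Assumption: character count is summed over all occurrences of the key.
        key_chars = occurrences * len(key)
        if first_ratio > ratio_threshold and key_chars > char_threshold:
            selected.append(key)
    return selected


data = "username:tracy;age:18;username:tom;age:32;"
print(keys_by_first_ratio(data))  # ['username']
```

Both keys reach a first ratio of 50%, but only “username” clears the character count threshold, which reproduces the outcome of the example.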
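Preferably (with reference to FIG. 3), step S324 includes step S3243 (not shown), step S3244 (not shown), and step S3245 (not shown). Step S3243: judge whether the character count of the second key, the key with the longest character length in the original transmission data, is greater than the second predetermined character count threshold. Step S3244: if the character count of the second key is greater than the second predetermined character count threshold, compute a second ratio of the character count of all occurrences of the second key in the original transmission data to the total character count of the original transmission data. Step S3245: when the second ratio is greater than the second predetermined ratio threshold, judge that the original transmission data needs to be compressed.
For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the second predetermined character count threshold is 2, the second predetermined ratio threshold is 20%, and the key with the longest character length in the original transmission data is the second key. The original transmission data, which contains multiple key-value pairs, is 42 bytes in size, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. The second key “username”, which has the longest character length, has 8 characters, which is greater than the second predetermined character count threshold of 2. Dividing the 16 characters of the two occurrences of the second key “username” by the total character count of 42 gives a second ratio of 38%, which is greater than the second predetermined ratio threshold of 20%. It is therefore judged that “username” in the key-value pairs whose Key part is “username” needs to be compressed and, based on the predetermined key compression method in the key compression matching list, its corresponding compression key is “uid”.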
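The longest-key check of steps S3243 to S3245 can be sketched in the same style. The function name `longest_key_to_compress` is illustrative, and returning the key (or `None`) instead of a plain yes/no is a choice made here for readability rather than something the source prescribes.

```python
from typing import Optional


def longest_key_to_compress(data: str, char_threshold: int = 2,
                            ratio_threshold: float = 0.20) -> Optional[str]:
    """Return the longest key if it passes both tests, otherwise None."""
    keys = [item.partition(":")[0] for item in data.split(";") if item]
    if not keys:
        return None
    second_key = max(keys, key=len)  # key with the longest character length
    if len(second_key) <= char_threshold:
        return None
    # Characters taken up by all occurrences of that key, relative to the payload.
    key_chars = sum(len(key) for key in keys if key == second_key)
    second_ratio = key_chars / len(data)
    return second_key if second_ratio > ratio_threshold else None


data = "username:tracy;age:18;username:tom;age:32;"
print(longest_key_to_compress(data))  # username (16 / 42 is about 38%)
```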
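In a preferred embodiment, as shown in FIG. 4, the method includes steps S410, S420, S430, S440, and S450. Step S410: analyze the original transmission data, and determine data feature information that includes the data structure and the data size of the original transmission data. Step S420: determine the historical occurrence frequency, within a predetermined time period, of the key of each key-value pair in the original transmission data. Step S430: judge whether to compress the original transmission data according to the data feature information, combined with the historical occurrence frequency. Step S440: when the judgment result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list. Step S450: generate, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
The operations performed in steps S410, S440, and S450 of this preferred embodiment are the same as or similar to those performed in steps S110, S130, and S140 shown in FIG. 1, and are not repeated here.
For example, the original transmission data, which contains multiple key-value pairs, includes key-value pair data in the following format: “username:tracy;age:18;username:tom;age:32;”. From the last month of history records in the database, the historical occurrence frequencies of the keys of the key-value pairs in the original transmission data over that period are obtained: “username” occurs 800 times per month and “age” occurs 900 times per month. The judgment method of FIG. 3 decides to compress “username” in the key-value pairs whose Key part is “username” and not to compress “age” in the key-value pairs whose Key part is “age”. However, because “age” occurs more frequently than “username”, combining the historical occurrence frequency leads to the judgment that “age” in the key-value pairs whose Key part is “age” also needs to be compressed.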
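The source gives only this one example of combining the structural judgment with historical frequency and does not state a general rule. The sketch below is therefore one possible reading, not the patent's method: a key is added when its historical frequency exceeds that of the least frequent key already selected by the structural check. The name `extend_with_history` and the rule itself are assumptions.

```python
def extend_with_history(selected: list[str], history: dict[str, int]) -> list[str]:
    """Add keys that historically occur more often than an already selected key."""
    if not selected:
        return []
    # Assumed rule: compare against the least frequent key selected so far.
    floor = min(history.get(key, 0) for key in selected)
    extra = [key for key, freq in history.items()
             if key not in selected and freq > floor]
    return selected + extra


history = {"username": 800, "age": 900}  # occurrences over the last month
print(extend_with_history(["username"], history))  # ['username', 'age']
```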
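In a preferred embodiment, step S140 further includes step S141 (not shown) and step S142 (not shown). Step S141: generate the corresponding compressed key-value pairs based on the compression keys, and set an associated compression state identifier. Step S142: combine the compressed key-value pairs with the associated compression state identifier to generate the compressed transmission data.
For example, corresponding compressed key-value pairs are generated from the original transmission data and an associated compression state identifier is set. The compression state can be set through a compression identifier such as “IsCompressed”: “IsCompressed=true;” when the data is compressed and “IsCompressed=false;” when it is not. Compressing “username” and “age” in the key-value pairs whose Key parts are “username” and “age” yields the compression keys “un” and “ag”, and the compressed key-value pairs are combined with the associated compression state identifier to generate the compressed transmission data “IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;”.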
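Steps S141 and S142 can be sketched as a function that rewrites the keys and prefixes the state identifier. The flag name “IsCompressed” and the wire format come from the example above; the function name `compress_with_flag` and the decision to emit “IsCompressed=false;” when no key was actually replaced are assumptions of this sketch.

```python
def compress_with_flag(data: str, match_list: dict[str, str]) -> str:
    """Compress keys and prepend the IsCompressed state identifier."""
    items = []
    changed = False
    for item in data.split(";"):
        if not item:
            continue
        key, _, value = item.partition(":")
        new_key = match_list.get(key, key)
        changed = changed or new_key != key
        items.append(f"{new_key}:{value};")
    # Assumption: the flag is false when no key had a matching-list entry.
    flag = "true" if changed else "false"
    return f"IsCompressed={flag};" + "".join(items)


original = "username:tracy;age:18;username:tom;age:32;"
packed = compress_with_flag(original, {"username": "un", "age": "ag"})
print(packed)  # IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;
```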
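In a preferred embodiment, the method further includes step S150 (not shown) and step S160 (not shown). Step S150: generate a corresponding key compression dictionary file based on the key compression matching list. Step S160: according to a received application acquisition request, send the key compression dictionary file to the requesting party as a configuration file.
For example, a corresponding key compression dictionary file is generated based on the predetermined key compression matching list, with the file name CompFile and the content “username:un;age:ag;”. According to the received acquisition request from a client application, the key compression dictionary file CompFile is sent to the requesting party as a configuration file.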
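Step S150 can be illustrated by serializing the matching list into the “original:compressed;” format shown for CompFile. The sketch covers only the serialization; how the file is stored or delivered in step S160 is not specified in the source and is left out. `to_dictionary_file` is an illustrative name.

```python
def to_dictionary_file(match_list: dict[str, str]) -> str:
    """Serialize the matching list as 'original:compressed;' entries."""
    return "".join(f"{original}:{compressed};"
                   for original, compressed in match_list.items())


comp_file = to_dictionary_file({"username": "un", "age": "ag"})
print(comp_file)  # username:un;age:ag;
```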
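FIG. 5 is a schematic flowchart of a data decompression method according to an embodiment of the present invention.
Step S510: judge whether the received transmission data is compressed transmission data. Step S520: when the transmission data is determined to be compressed transmission data, parse the compressed transmission data and extract the compression keys of its compressed key-value pairs. Step S530: decompress the compression keys based on the predetermined key decompression method in the preconfigured key compression matching list, to obtain the corresponding original key-value pairs.
For example, after the client of an application (APP) on the terminal device sends the server a request for APP-related data, the terminal device receives the transmission data returned by the server. When the client judges that the received transmission data is compressed transmission data, it parses the data and extracts the compression keys of the compressed key-value pairs, then decompresses those compression keys based on the predetermined key decompression method in the key compression matching list preconfigured on the client, to obtain the original key-value pairs corresponding to the compressed key-value pairs in the APP transmission data.
In a preferred embodiment, step S510 further includes step S511 (not shown). Step S511: judge whether the key-value pairs in the transmission data include an associated compression state identifier.
When the transmission data is determined to be compressed transmission data, step S520 further includes step S521 (not shown). Step S521: when the key-value pairs in the transmission data include the associated compression state identifier, determine that the transmission data is compressed transmission data, and extract the compression keys of the compressed key-value pairs associated with the compression state identifier.
For example, after the client of the application (APP) on the terminal device sends the server a request for APP-related data, the terminal device receives the transmission data “IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;” returned by the server. The client parses the transmission data and finds that its key-value pairs include the associated compression state identifier “IsCompressed”; from “IsCompressed=true;” it judges that the transmission data is compressed transmission data, and then extracts the compression keys of the compressed key-value pairs associated with the compression state identifier, obtaining “un;ag;un;ag;”.
In a preferred embodiment, the method further includes step S531 (not shown) and step S532 (not shown). Step S531: receive a configuration file that includes the key compression dictionary file. Step S532: generate a local key compression matching list according to the key compression dictionary file.
For example, the client of the application (APP) receives the configuration file returned by the server, which includes the key compression dictionary file CompFile with the content “username:un;age:ag;”, and generates from it a local key compression matching list with the content “un:username;ag:age;”. Then, based on the predetermined key decompression method in the generated key compression matching list, the extracted compression keys “un;ag;un;ag;” are decompressed to the original keys “username;age;username;age;”, so the corresponding original key-value pairs are obtained as “username:tracy;age:18;username:tom;age:32;”.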
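The client-side flow of steps S510 to S532 can be sketched end to end: invert the dictionary file into a local matching list, check the state identifier, and restore the original keys. The function names `load_local_match_list` and `decompress` are illustrative, and passing unflagged data through unchanged is an assumption, since the source does not describe that branch.

```python
def load_local_match_list(dictionary_file: str) -> dict[str, str]:
    """Invert 'original:compressed;' entries into {compressed: original}."""
    table = {}
    for entry in dictionary_file.split(";"):
        if entry:
            original, _, compressed = entry.partition(":")
            table[compressed] = original
    return table


def decompress(transmission: str, dictionary_file: str) -> str:
    """Restore original keys when the payload carries IsCompressed=true."""
    head, _, body = transmission.partition(";")
    if head != "IsCompressed=true":
        # Assumption: data not flagged as compressed is returned unchanged.
        return transmission
    table = load_local_match_list(dictionary_file)
    restored = []
    for item in body.split(";"):
        if item:
            key, _, value = item.partition(":")
            restored.append(f"{table.get(key, key)}:{value};")
    return "".join(restored)


comp_file = "username:un;age:ag;"
received = "IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;"
print(load_local_match_list(comp_file))  # {'un': 'username', 'ag': 'age'}
print(decompress(received, comp_file))   # username:tracy;age:18;username:tom;age:32;
```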
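FIG. 6 is a schematic structural diagram of a data compression apparatus according to another embodiment of the present invention.
The first determining module 610 analyzes the original transmission data and determines data feature information that includes the data structure and the data size of the original transmission data; according to the data feature information, the first judging module 620 judges whether to compress the original transmission data; when the judgment result indicates that compression is to be performed, the conversion module 630 converts the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list; the first generation module 640 generates, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
The embodiments of the present invention propose a scheme for data compression and decompression. On the server side, the original transmission data is analyzed to judge whether it should be compressed; when the judgment result indicates that compression is to be performed, the keys of the original key-value pairs in the original transmission data are converted into corresponding compression keys based on the predetermined key compression method in the key compression matching list, and compressed transmission data including the corresponding compressed key-value pairs is then generated based on the compression keys. When the transmission data is sent over a network, converting the keys of the original key-value pairs into compression keys saves network bandwidth during transmission, avoids unexpected data loss when the data volume is too large, and achieves efficient data transmission; it also improves the response time of computer data processing. In the client on the terminal device, when the received transmission data is judged to be compressed transmission data, the compression keys of the compressed key-value pairs are parsed and extracted from it and then decompressed based on the predetermined key decompression method in the preconfigured key compression matching list, to obtain the corresponding original key-value pairs. The original values of the compression keys can thus be recovered accurately and quickly, so that the transmission data is obtained efficiently, which supports efficient subsequent processing of the received transmission data and improves the user experience.
In the embodiments of the present invention, a key-value structure is taken as the example data structure of the data in the transmission data. The key compression content in the key compression dictionary file can, for example, be represented using the 26 English letters in upper and lower case together with the 10 digits 0 to 9. This yields 62 distinct single-character compression keys, 3,844 distinct two-character compression keys, and 238,328 distinct three-character compression keys, which is ample for compressing the keys of key-value structures in existing applications.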
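The first determining module 610 analyzes the original transmission data and determines data feature information that includes the data structure and the data size of the original transmission data.
For example, in the original transmission data “userinfo”, which contains three key-value pairs, the pairs are “user_id:22”, “user_id:23”, and “user_id:24”. For the key-value pair “user_id:22”, the Key part is “user_id” and the Value part is “22”. The data structure of the original transmission data “userinfo” is determined to be a key-value pair structure, and the total size of its key-value pairs is calculated to be 30 bytes.
According to the data feature information, the first judging module 620 judges whether to compress the original transmission data.
For example, given that the data structure of the original transmission data “userinfo” is a key-value pair structure and that its data size is 30 bytes, it is judged whether to compress the Key part “user_id” of the original transmission data “userinfo”. The judgment methods are described in detail in the embodiments shown in FIGS. 2 to 4.
When the judgment result indicates that compression is to be performed, the conversion module 630 converts the keys of the original key-value pairs in the original transmission data into corresponding compression keys based on the predetermined key compression method in the key compression matching list.
For example, when the judgment result indicates that the Key part “user_id” of the three key-value pairs in the original transmission data “userinfo” is to be compressed, then, based on the predetermined key compression method in the key compression matching list, under which “user_id” is compressed to “uid”, the key “user_id” of the three key-value pairs is converted into the corresponding compression key “uid”.
The first generation module 640 generates, based on the compression keys, compressed transmission data including the corresponding compressed key-value pairs.
For example, based on the compression key “uid”, compressed transmission data including the corresponding three compressed key-value pairs “uid:22”, “uid:23”, and “uid:24” is generated.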
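In a preferred embodiment, as shown in FIG. 7, the first judging module specifically includes a second judging submodule 721 and a second determining submodule 722. The second judging submodule 721 judges the relationship between the data size of the original transmission data and the first predetermined data size threshold; if the data size of the original transmission data is greater than the first predetermined data size threshold, the second determining submodule 722 determines that the original transmission data is to be compressed.
For example, the first predetermined data size threshold is 20 bytes. Since the data size of the original transmission data “userinfo” is 30 bytes, it is judged to be greater than the first predetermined data size threshold, so the Key part “user_id” of the three key-value pairs in “userinfo” is compressed, and the compressed Key part is “uid”, the entry corresponding to “user_id” in the key compression matching list.
In another preferred embodiment, as shown in FIG. 8, the first judging module specifically includes a third judging submodule 823 and a fourth judging submodule 824. The third judging submodule 823 judges the relationship between the data size of the original transmission data and each of the first and second predetermined data size thresholds, the first threshold being greater than the second; if the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold, the fourth judging submodule 824 judges whether to compress the original transmission data according to the data structure.
For example, the first predetermined data size threshold is 50 bytes and the second predetermined data size threshold is 20 bytes. The original transmission data, which is used to pass customer information and contains multiple key-value pairs, includes key-value pair data in the following format: “username:tracy;age:18;username:tom;age:32;”, where the Key parts are “username”, “age”, “username”, and “age” and the Value parts are “tracy”, “18”, “tom”, and “32”. The data structure of the original transmission data is determined to be a key-value pair structure and its data size is 42 bytes. Since 42 bytes is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes, whether to compress the original transmission data is further judged according to the data structure.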
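Preferably (see FIG. 8), the fourth judging submodule includes a first statistics unit (not shown) and a first judging unit (not shown). The first statistics unit computes a first ratio of the number of key-value pairs that share the same first key to the total number of key-value pairs in the original transmission data; when the first ratio is greater than a first predetermined ratio threshold and the character count of the first key is greater than a first predetermined character count threshold, the first judging unit judges that the original transmission data needs to be compressed.
For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the first predetermined ratio threshold is 40%, and the first predetermined character count threshold is 15; the most frequently occurring key in the original transmission data is taken as the first key. In the original transmission data "username:tracy;age:18;username:tom;age:32;", which contains multiple key-value pairs, the data size is 42 bytes, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. The number of key-value pairs sharing the first key "username" is then counted as 2; dividing 2 by the total number of key-value pairs in the original transmission data, 4, gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%. The character count of the first key "username" in the original transmission data is 16 (8 characters in each of its 2 occurrences), which is greater than the first predetermined character count threshold of 15, so it is judged that the first key "username" needs to be compressed. Then, based on the predetermined key compression method in the key compression matching list, "username" in the key-value pairs whose Key part is "username" is compressed, and the corresponding compressed key is "uid". Likewise, the number of key-value pairs sharing the key "age" is 2; dividing 2 by the total number of key-value pairs, 4, gives a first ratio of 50%, which is greater than the first predetermined ratio threshold of 40%, but the character count of the key "age" is 6 (3 characters in each of its 2 occurrences), which is less than the first predetermined character count threshold of 15, so "age" in the key-value pairs whose Key part is "age" is not compressed.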
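The first structure-based rule can be sketched as follows. The sketch assumes that the "character count" of a key is summed over all of its occurrences, which is the reading consistent with the figure of 16 given for "username"; the function name `keys_to_compress` and the tuple representation are illustrative assumptions.

```python
from collections import Counter


def keys_to_compress(pairs, ratio_threshold, char_threshold):
    """Return keys whose share of pairs and total key characters both exceed the thresholds."""
    counts = Counter(key for key, _ in pairs)
    total_pairs = len(pairs)
    selected = []
    for key, occurrences in counts.items():
        ratio = occurrences / total_pairs        # the "first ratio"
        total_chars = len(key) * occurrences     # assumed counting convention
        if ratio > ratio_threshold and total_chars > char_threshold:
            selected.append(key)
    return selected


pairs = [("username", "tracy"), ("age", "18"), ("username", "tom"), ("age", "32")]

if __name__ == "__main__":
    print(keys_to_compress(pairs, ratio_threshold=0.40, char_threshold=15))
```

Both keys reach a 50% ratio, but only "username" also clears the character count threshold of 15, so only it is selected.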
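Preferably (see FIG. 8), the fourth judging submodule includes a second judging unit (not shown), a second statistics unit (not shown), and a third judging unit (not shown). The second judging unit judges whether the character count of the second key, the key with the longest character length in the original transmission data, is greater than a second predetermined character count threshold; if the character count of the second key is greater than the second predetermined character count threshold, the second statistics unit computes a second ratio of the character count of all occurrences of the second key to the total character count of the original transmission data; when the second ratio is greater than a second predetermined ratio threshold, the third judging unit judges that the original transmission data needs to be compressed.
For example, the first predetermined data size threshold is 50 bytes, the second predetermined data size threshold is 20 bytes, the second predetermined character count threshold is 2, and the second predetermined ratio threshold is 20%; the key with the longest character length in the original transmission data is the second key. In original transmission data containing multiple key-value pairs, the data size is 42 bytes, which is greater than the second predetermined data size threshold of 20 bytes and less than the first predetermined data size threshold of 50 bytes. The second key "username", which has the longest character length, has a character count of 8, which is greater than the second predetermined character count threshold of 2. Dividing the character count of the two occurrences of "username", 16, by the total character count of the original transmission data, 42, gives a second ratio of 38%, which is greater than the second predetermined ratio threshold of 20%. It is therefore judged that "username" in the key-value pairs whose Key part is "username" needs to be compressed; based on the predetermined key compression method in the key compression matching list, the corresponding compressed key obtained is "uid".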
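The second structure-based rule can be sketched in the same style. The sketch assumes the payload is a "key:value;" string and that the total character count is the length of that string (42 in the example); `longest_key_rule` is a hypothetical name.

```python
def longest_key_rule(payload, char_threshold, ratio_threshold):
    """Return the longest key if it passes both checks, otherwise None."""
    keys = [item.split(":", 1)[0] for item in payload.split(";") if item]
    if not keys:
        return None
    longest = max(keys, key=len)                       # the "second key"
    if len(longest) <= char_threshold:
        return None
    key_chars = len(longest) * keys.count(longest)     # characters over all occurrences
    ratio = key_chars / len(payload)                   # the "second ratio"
    return longest if ratio > ratio_threshold else None


payload = "username:tracy;age:18;username:tom;age:32;"

if __name__ == "__main__":
    print(longest_key_rule(payload, char_threshold=2, ratio_threshold=0.20))
```

For the example payload the ratio is 16/42, roughly 38%, so "username" is selected under the 20% threshold.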
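In a preferred embodiment, as shown in FIG. 9, the apparatus includes a first determining module 910, a third determining module 920, a first judging module 930, a conversion module 940, and a first generating module 950. The first determining module 910 analyzes the original transmission data and determines data feature information that includes the data structure and data size of the original transmission data; the third determining module 920 determines the historical occurrence frequency, within a predetermined time period, of the key of each key-value pair in the original transmission data; the first judging module 930 judges whether to compress the original transmission data according to the data feature information in combination with the historical occurrence frequency; when the judgment result indicates that compression is to be performed, the conversion module 940 converts the keys of the original key-value pairs in the original transmission data into corresponding compressed keys based on the predetermined key compression method in the key compression matching list; the first generating module 950 generates, based on the compressed keys, compressed transmission data that includes the corresponding compressed key-value pairs.
The operations performed by the first determining module 910, the conversion module 940, and the first generating module 950 in this preferred embodiment are the same as or similar to those performed by the first determining module 610, the conversion module 630, and the first generating module 640 shown in FIG. 6, and are not repeated here.
For example, original transmission data containing multiple key-value pairs includes data in the following format: "username:tracy;age:18;username:tom;age:32;". From the records in the database for the past month, the historical occurrence frequency of each Key in the original transmission data over that period is obtained as follows: "username" occurs 800 times per month and "age" occurs 900 times per month. The judging apparatus of FIG. 8 judges that "username" in the key-value pairs whose Key part is "username" is to be compressed and that "age" in the key-value pairs whose Key part is "age" is not. However, because "age" occurs more frequently than "username", it is judged, in combination with the historical occurrence frequency, that "age" in the key-value pairs whose Key part is "age" also needs to be compressed.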
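One possible reading of the history-based adjustment is sketched below: a key rejected by the structure rules is still compressed if its historical frequency exceeds that of some key that was already selected. The text gives only the single example, so this generalization, along with the name `apply_history`, is an assumption.

```python
def apply_history(selected, history, all_keys):
    """Extend the selected keys with any key that historically occurs more often.

    selected: keys chosen by the structure-based rules
    history:  key -> occurrence count within the predetermined period
    all_keys: every distinct key in the payload, in order of appearance
    """
    if not selected:
        return []
    baseline = min(history.get(key, 0) for key in selected)
    result = list(selected)
    for key in all_keys:
        if key not in result and history.get(key, 0) > baseline:
            result.append(key)
    return result


history = {"username": 800, "age": 900}   # occurrences over the past month

if __name__ == "__main__":
    print(apply_history(["username"], history, ["username", "age"]))
```

With the example frequencies, "age" (900) exceeds "username" (800) and is therefore added to the keys to compress.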
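In a preferred embodiment, the first generating module further includes a flag setting unit (not shown) and a combination generating unit (not shown). The flag setting unit generates the corresponding compressed key-value pairs based on the compressed keys and sets an associated compression status flag; the combination generating unit combines the compressed key-value pairs with the associated compression status flag to generate the compressed transmission data.
For example, the corresponding compressed key-value pairs are generated from the original transmission data and an associated compression status flag is set. The status can be set through a flag such as "IsCompressed": when the data is compressed the flag can be set to "IsCompressed=true;", and when it is not compressed the flag can be set to "IsCompressed=false;". Compressing "username" and "age" in the key-value pairs whose Key parts are "username" and "age" yields the compressed keys "un" and "ag", and the compressed key-value pairs are combined with the associated compression status flag to generate the compressed transmission data "IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;".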
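Combining the status flag with the compressed pairs can be sketched as below. The sketch assumes the flag is simply prepended to a "key:value;" payload, as in the example string; `parse_pairs` and `build_transmission` are hypothetical helpers.

```python
def parse_pairs(payload):
    """Split 'k:v;k:v;' into a list of (key, value) tuples."""
    pairs = []
    for item in payload.split(";"):
        if item:
            key, _, value = item.partition(":")
            pairs.append((key, value))
    return pairs


def build_transmission(payload, matching_list, compress, flag="IsCompressed"):
    """Prefix the payload with the status flag, compressing keys when requested."""
    if not compress:
        return f"{flag}=false;{payload}"
    body = "".join(
        f"{matching_list.get(key, key)}:{value};" for key, value in parse_pairs(payload)
    )
    return f"{flag}=true;{body}"


MATCHING_LIST = {"username": "un", "age": "ag"}
original = "username:tracy;age:18;username:tom;age:32;"

if __name__ == "__main__":
    print(build_transmission(original, MATCHING_LIST, compress=True))
```

A receiver can then inspect only the leading flag to decide whether key decompression is needed.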
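In a preferred embodiment, the apparatus further includes a second generating module (not shown) and a sending module (not shown). The second generating module generates a corresponding key compression dictionary file based on the key compression matching list; in response to a received application acquisition request, the sending module sends the key compression dictionary file to the requester as a configuration file.
For example, a corresponding key compression dictionary file is generated from the predetermined key compression matching list, with the file name CompFile and the content "username:un;age:ag;". In response to an application acquisition request received from a client, the key compression dictionary file CompFile is sent to the corresponding requester as a configuration file.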
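Generating the dictionary file content from the matching list can be sketched as a simple serialization. The "original:compressed;" layout follows the example content "username:un;age:ag;"; the function names and the idea of returning a (file name, content) tuple are assumptions made for illustration.

```python
def dump_dictionary(matching_list):
    """Serialize the matching list as 'original:compressed;' entries."""
    return "".join(f"{original}:{short};" for original, short in matching_list.items())


def handle_config_request(matching_list, file_name="CompFile"):
    """Return (file name, file content) to send to the requester as configuration."""
    return file_name, dump_dictionary(matching_list)


if __name__ == "__main__":
    print(handle_config_request({"username": "un", "age": "ag"}))
```

How the file is transported to the client (for example, which protocol or endpoint) is not described in the text and is left out of the sketch.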
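FIG. 10 is a schematic structural diagram of a data decompression apparatus according to another preferred embodiment of the present invention.
The sixth judging module 1010 judges whether the received transmission data is compressed transmission data; when it is judged that the transmission data is compressed transmission data, the parsing and extraction module 1020 parses the compressed transmission data and extracts the compressed keys of its compressed key-value pairs; the decompression processing module 1030 decompresses the compressed keys based on the predetermined key decompression method in the preconfigured key compression matching list to obtain the corresponding original key-value pairs.
For example, after the client of an application (APP) on a terminal device sends the server a request for APP-related data, the terminal device receives the transmission data returned by the server. When the APP client judges that the received transmission data is compressed transmission data, it parses the data and extracts the compressed keys of the compressed key-value pairs, and then decompresses the compressed keys in the data returned by the server based on the predetermined key decompression method in the key compression matching list preconfigured on the client, to obtain the original key-value pairs corresponding to the compressed key-value pairs in the APP transmission data.
In a preferred embodiment, the sixth judging module 1010 is further configured to judge whether the key-value pairs in the transmission data include an associated compression status flag.
When it is judged that the transmission data is compressed transmission data, the parsing and extraction module 1020 is further configured to: determine that the transmission data is compressed transmission data if the key-value pairs in the transmission data include an associated compression status flag; and extract the compressed keys of the compressed key-value pairs associated with the compression status flag in the compressed transmission data.
For example, after the APP client on the terminal device sends the server a request for APP-related data, the terminal device receives the transmission data "IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;" returned by the server. The APP client parses the transmission data and finds that its key-value pairs include the associated compression status flag "IsCompressed"; from the flag "IsCompressed=true;" it judges that the transmission data is compressed transmission data, and it then extracts the compressed keys of the compressed key-value pairs associated with the flag, obtaining "un;ag;un;ag;".
In a preferred embodiment, the apparatus further includes a receiving module 1031 (not shown) and a third generating module 1032 (not shown). The receiving module 1031 receives a configuration file that includes the key compression dictionary file; the third generating module 1032 generates the local key compression matching list according to the key compression dictionary file.
For example, the configuration file that the APP client receives from the server includes the key compression dictionary file CompFile with the content "username:un;age:ag;". From the data in CompFile, the local key compression matching list is generated with the content "un:username;ag:age;". Then, based on the predetermined key decompression method in the generated key compression matching list, the extracted compressed keys "un;ag;un;ag;" are decompressed to obtain the original keys "username;age;username;age;", so the corresponding original key-value pairs are "username:tracy;age:18;username:tom;age:32;".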
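The client-side steps above (check the flag, invert the dictionary, restore the keys) can be sketched end to end. The sketch assumes the flag is the first ";"-terminated field and that unflagged data is returned unchanged; `load_dictionary`, `build_local_list`, and `decompress` are hypothetical names.

```python
def load_dictionary(text):
    """Parse dictionary file content 'original:compressed;' into a dict."""
    entries = {}
    for item in text.split(";"):
        if item:
            original, _, short = item.partition(":")
            entries[original] = short
    return entries


def build_local_list(dictionary_text):
    """Invert the dictionary so that compressed keys map back to original keys."""
    return {short: original for original, short in load_dictionary(dictionary_text).items()}


def decompress(received, dictionary_text, flag="IsCompressed"):
    """Restore original keys if the payload is flagged as compressed."""
    head, _, body = received.partition(";")
    if head == f"{flag}=false":
        return body                      # flagged as not compressed
    if head != f"{flag}=true":
        return received                  # no flag: treat as uncompressed data
    local_list = build_local_list(dictionary_text)
    restored = []
    for item in body.split(";"):
        if item:
            key, _, value = item.partition(":")
            restored.append(f"{local_list.get(key, key)}:{value};")
    return "".join(restored)


if __name__ == "__main__":
    data = "IsCompressed=true;un:tracy;ag:18;un:tom;ag:32;"
    print(decompress(data, "username:un;age:ag;"))
```

Inverting the dictionary once and reusing it keeps each lookup constant-time, which is one simple way to realize the "predetermined key decompression method".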
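Those skilled in the art will understand that the present invention includes devices for performing one or more of the operations described in this application. These devices may be specially designed and manufactured for the required purposes, or may include known devices in general-purpose computers. These devices have computer programs stored in them that are selectively activated or reconfigured. Such computer programs may be stored in a device-readable (for example, computer-readable) medium or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus. Computer-readable media include, but are not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer).
Those skilled in the art will understand that computer program instructions can be used to implement each block of these structural diagrams and/or block diagrams and/or flow diagrams, as well as combinations of blocks in them. Those skilled in the art will understand that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the schemes specified in the block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams disclosed in the present invention are executed by the processor of the computer or other programmable data processing apparatus.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the apparatus according to the embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described here. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
For example, FIG. 11 shows a block diagram of a computing device that can implement the data compression/decompression method according to the present invention. The computing device conventionally includes a processor 710 and a computer program product or computer-readable medium in the form of a memory 720. The memory 720 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM, a hard disk, or ROM. The memory 720 has a storage space 730 that stores program code 731 for performing any of the method steps described above. For example, the storage space 730 may store individual pieces of program code 731, each implementing one of the steps of the above methods. The program code may be read from or written to one or more computer program products. These computer program products include program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such a computer program product is typically a portable or fixed storage unit such as the one shown in FIG. 12. The storage unit may have storage segments or storage space arranged similarly to the memory 720 in the computing device of FIG. 11. The program code may, for example, be compressed in a suitable form. Typically, the storage unit stores program code 731' for performing the method steps according to the present invention, that is, program code readable by a processor such as the processor 710; when run by a computing device, this program code causes the computing device to perform each step of the methods described above.
Those skilled in the art will understand that the steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that correspond to those in the various operations, methods, and processes disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.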

Claims (24)

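  1. A data compression method, comprising:
    analyzing original transmission data to determine data feature information that includes the data structure and data size of the original transmission data;
    judging, according to the data feature information, whether to compress the original transmission data;
    when the judgment result indicates that compression is to be performed, converting the keys of the original key-value pairs in the original transmission data into corresponding compressed keys based on a predetermined key compression method in a key compression matching list;
    generating, based on the compressed keys, compressed transmission data that includes the corresponding compressed key-value pairs.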
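  2. The data compression method according to claim 1, wherein judging, according to the data feature information, whether to compress the original transmission data comprises:
    judging the relationship between the data size of the original transmission data and a first predetermined data size threshold;
    if the judgment result is that the data size of the original transmission data is greater than the first predetermined data size threshold, determining that the original transmission data is to be compressed.
  3. The data compression method according to claim 1, wherein judging, according to the data feature information, whether to compress the original transmission data comprises:
    judging the relationship between the data size of the original transmission data and each of a first predetermined data size threshold and a second predetermined data size threshold, the first predetermined data size threshold being greater than the second predetermined data size threshold;
    if the judgment result is that the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold, judging, according to the data structure, whether to compress the original transmission data.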
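  4. The data compression method according to claim 3, wherein judging, according to the data structure, whether to compress the original transmission data specifically comprises:
    computing a first ratio of the number of key-value pairs in the original transmission data that share the same first key to the total number of key-value pairs in the original transmission data;
    when the first ratio is greater than a first predetermined ratio threshold and the character count of the first key is greater than a first predetermined character count threshold, judging that the original transmission data needs to be compressed.
  5. The data compression method according to claim 3, wherein judging, according to the data structure, whether to compress the original transmission data specifically comprises:
    judging whether the character count of a second key, which has the longest character length in the original transmission data, is greater than a second predetermined character count threshold;
    if the character count of the second key is greater than the second predetermined character count threshold, computing a second ratio of the character count of all occurrences of the second key in the original transmission data to the total character count of the original transmission data;
    when the second ratio is greater than a second predetermined ratio threshold, judging that the original transmission data needs to be compressed.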
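  6. The data compression method according to claim 1, further comprising:
    determining the historical occurrence frequency, within a predetermined time period, of the key of each key-value pair in the original transmission data;
    wherein judging, according to the data feature information, whether to compress the original transmission data comprises:
    judging whether to compress the original transmission data according to the data feature information in combination with the historical occurrence frequency.
  7. The data compression method according to claim 1, wherein the step of generating, based on the compressed keys, compressed transmission data that includes the corresponding compressed key-value pairs further comprises:
    generating the corresponding compressed key-value pairs based on the compressed keys, and setting an associated compression status flag;
    combining the compressed key-value pairs with the associated compression status flag to generate the compressed transmission data.
  8. The data compression method according to claim 1, further comprising:
    generating a corresponding key compression dictionary file based on the key compression matching list;
    in response to a received application acquisition request, sending the key compression dictionary file to the requester as a configuration file.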
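  9. A data decompression method, comprising:
    judging whether received transmission data is compressed transmission data;
    when it is judged that the transmission data is compressed transmission data, parsing the compressed transmission data and extracting the compressed keys of its compressed key-value pairs;
    decompressing the compressed keys based on a predetermined key decompression method in a preconfigured key compression matching list to obtain the corresponding original key-value pairs.
  10. The data decompression method according to claim 9, wherein the step of judging whether the received transmission data is compressed transmission data further comprises:
    judging whether the key-value pairs in the transmission data include an associated compression status flag;
    wherein the step of parsing the compressed transmission data and extracting the compressed keys of its compressed key-value pairs when it is judged that the transmission data is compressed transmission data further comprises:
    if the key-value pairs in the transmission data include an associated compression status flag, determining that the transmission data is compressed transmission data; and extracting the compressed keys of the compressed key-value pairs associated with the compression status flag in the compressed transmission data.
  11. The data decompression method according to claim 9 or 10, further comprising:
    receiving a configuration file that includes a key compression dictionary file;
    generating the local key compression matching list according to the key compression dictionary file.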
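  12. A data compression apparatus, comprising:
    a first determining module, configured to analyze original transmission data and determine data feature information that includes the data structure and data size of the original transmission data;
    a first judging module, configured to judge, according to the data feature information, whether to compress the original transmission data;
    a conversion module, configured to, when the judgment result indicates that compression is to be performed, convert the keys of the original key-value pairs in the original transmission data into corresponding compressed keys based on a predetermined key compression method in a key compression matching list;
    a first generating module, configured to generate, based on the compressed keys, compressed transmission data that includes the corresponding compressed key-value pairs.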
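  13. The data compression apparatus according to claim 12, wherein the first judging module specifically comprises:
    a second judging submodule, configured to judge the relationship between the data size of the original transmission data and a first predetermined data size threshold;
    a second determining submodule, configured to determine that the original transmission data is to be compressed if the judgment result is that the data size of the original transmission data is greater than the first predetermined data size threshold.
  14. The data compression apparatus according to claim 12, wherein the first judging module comprises:
    a third judging submodule, configured to judge the relationship between the data size of the original transmission data and each of a first predetermined data size threshold and a second predetermined data size threshold, the first predetermined data size threshold being greater than the second predetermined data size threshold;
    a fourth judging submodule, configured to judge, according to the data structure, whether to compress the original transmission data if the judgment result is that the data size of the original transmission data is greater than the second predetermined data size threshold and less than the first predetermined data size threshold.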
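  15. The data compression apparatus according to claim 14, wherein the fourth judging submodule specifically comprises:
    a first statistics unit, configured to compute a first ratio of the number of key-value pairs in the original transmission data that share the same first key to the total number of key-value pairs in the original transmission data;
    a first judging unit, configured to judge that the original transmission data needs to be compressed when the first ratio is greater than a first predetermined ratio threshold and the character count of the first key is greater than a first predetermined character count threshold.
  16. The data compression apparatus according to claim 14, wherein the fourth judging submodule specifically comprises:
    a second judging unit, configured to judge whether the character count of a second key, which has the longest character length in the original transmission data, is greater than a second predetermined character count threshold;
    a second statistics unit, configured to compute, if the character count of the second key is greater than the second predetermined character count threshold, a second ratio of the character count of all occurrences of the second key in the original transmission data to the total character count of the original transmission data;
    a third judging unit, configured to judge that the original transmission data needs to be compressed when the second ratio is greater than a second predetermined ratio threshold.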
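  17. The data compression apparatus according to claim 12, further comprising:
    a third determining module, configured to determine the historical occurrence frequency, within a predetermined time period, of the key of each key-value pair in the original transmission data;
    wherein the first judging module is specifically configured to judge whether to compress the original transmission data according to the data feature information in combination with the historical occurrence frequency.
  18. The data compression apparatus according to claim 12, wherein the first generating module comprises: a flag setting unit, configured to generate the corresponding compressed key-value pairs based on the compressed keys and set an associated compression status flag;
    a combination generating unit, configured to combine the compressed key-value pairs with the associated compression status flag to generate the compressed transmission data.
  19. The data compression apparatus according to claim 12, further comprising:
    a second generating module, configured to generate a corresponding key compression dictionary file based on the key compression matching list;
    a sending module, configured to send, in response to a received application acquisition request, the key compression dictionary file to the requester as a configuration file.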
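  20. A data decompression apparatus, comprising:
    a sixth judging module, configured to judge whether received transmission data is compressed transmission data;
    a parsing and extraction module, configured to parse the compressed transmission data and extract the compressed keys of its compressed key-value pairs when it is judged that the transmission data is compressed transmission data;
    a decompression processing module, configured to decompress the compressed keys based on a predetermined key decompression method in a preconfigured key compression matching list to obtain the corresponding original key-value pairs.
  21. The data decompression apparatus according to claim 20, wherein the sixth judging module is further configured to judge whether the key-value pairs in the transmission data include an associated compression status flag;
    wherein the parsing and extraction module is further configured to determine that the transmission data is compressed transmission data if the key-value pairs in the transmission data include an associated compression status flag, and to extract the compressed keys of the compressed key-value pairs associated with the compression status flag in the compressed transmission data.
  22. The data decompression apparatus according to claim 20 or 21, further comprising:
    a receiving module, configured to receive a configuration file that includes a key compression dictionary file;
    a third generating module, configured to generate the local key compression matching list according to the key compression dictionary file.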
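  23. A computer program, comprising computer-readable code which, when run on a computing device, causes the computing device to perform the method according to any one of claims 1-11.
  24. A computer-readable medium storing the computer program according to claim 23.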
PCT/CN2016/104567 2015-12-09 2016-11-04 Method and apparatus for data compression and decompression WO2017097071A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510907676.3 2015-12-09
CN201510907676.3A CN105322969B (zh) 2015-12-09 2015-12-09 Method and apparatus for data compression and decompression

Publications (1)

Publication Number Publication Date
WO2017097071A1 true WO2017097071A1 (zh) 2017-06-15

Family

ID=55249664

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/104567 WO2017097071A1 (zh) 2015-12-09 2016-11-04 Method and apparatus for data compression and decompression

Country Status (2)

Country Link
CN (1) CN105322969B (zh)
WO (1) WO2017097071A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
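CN110430144A (zh) * 2019-09-09 2019-11-08 联想(北京)有限公司 Data processing method and apparatus, electronic device, and medium
CN111526151A (zh) * 2020-04-28 2020-08-11 网易(杭州)网络有限公司 Data transmission method and apparatus, electronic device, and storage medium
CN111831211A (zh) * 2019-04-19 2020-10-27 阿里巴巴集团控股有限公司 Data transmission method, apparatus, device, and storage medium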

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
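CN105322969B (zh) * 2015-12-09 2019-06-18 北京奇虎科技有限公司 Method and apparatus for data compression and decompression
CN107994907B (zh) * 2017-12-01 2021-05-28 北京奇艺世纪科技有限公司 Method and apparatus for generating a compression dictionary
CN108011952B (zh) * 2017-12-01 2021-06-18 北京奇艺世纪科技有限公司 Method and apparatus for obtaining a compression dictionary
CN110505655A (zh) * 2018-09-10 2019-11-26 深圳市文鼎创数据科技有限公司 Data instruction processing method, storage medium, and Bluetooth shield
CN109683968B (zh) * 2018-12-18 2022-03-29 北京东土军悦科技有限公司 Fast switch startup method, switch, and storage medium
CN111277274A (zh) * 2020-01-13 2020-06-12 平安国际智慧城市科技股份有限公司 Data compression method, apparatus, device, and storage medium
CN112965934A (zh) * 2021-02-04 2021-06-15 北京高因科技有限公司 Log compression and storage method and electronic apparatus
CN115905168B (zh) * 2022-11-15 2023-11-07 本原数据(北京)信息技术有限公司 Database-based adaptive compression method and apparatus, device, and storage medium
CN115858450B (zh) * 2023-02-24 2023-05-05 深圳华龙讯达信息技术股份有限公司 Signal transmission system for a highly adaptable CPU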

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080019613A1 (en) * 2006-06-14 2008-01-24 Tetsuomi Tanaka Information processing apparatus, method of controlling same and computer program
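CN104468044A (zh) * 2014-12-05 2015-03-25 北京国双科技有限公司 Method and apparatus for data compression applied in network transmission
CN104462524A (zh) * 2014-12-24 2015-03-25 福建江夏学院 Data compression and storage method for the Internet of Things
CN104753540A (zh) * 2015-03-05 2015-07-01 华为技术有限公司 Data compression method, data decompression method, and apparatus
CN105322969A (zh) * 2015-12-09 2016-02-10 北京奇虎科技有限公司 Method and apparatus for data compression and decompression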

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103326732B (zh) * 2013-05-10 2016-12-28 华为技术有限公司 Method for compressing data, method for decompressing data, encoder, and decoder

Also Published As

Publication number Publication date
CN105322969A (zh) 2016-02-10
CN105322969B (zh) 2019-06-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16872263

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16872263

Country of ref document: EP

Kind code of ref document: A1